A tale of Template Haskell and cross compilation

25 November 2020 — by Cheng Shao, Georgios Karachalias

Template Haskell (TH) is a widely used yet controversial language extension. You have probably used it in your own code; with a single line of splice code, you can achieve tasks like deriving instances and embedding files easily. And you might also have heard the reasons why people may dislike it: it slows down compilation, breaks encapsulation, arbitrary IO at compile time is risky, etc.

But it is less well known that Template Haskell also makes cross compilation with GHC harder. In this post, we’ll show why this is a challenge, some existing solutions developed by the community, and in particular, how this problem is addressed by Asterius.

Just run some code at compile time, what can go wrong?

Conceptually, Template Haskell is a principled way of generating Haskell AST at compile time, like in the simplified example below:

{-# LANGUAGE TemplateHaskell #-}

import Data.Char
import Language.Haskell.TH.Syntax
import System.Process

gitRev :: String
gitRev =
  $( do
       rev <-
         runIO $
           filter isHexDigit <$> readProcess "git" ["rev-parse", "HEAD"] ""
       liftString rev
   )

Suppose we’d like to define a gitRev string that represents the current git revision in the project repository. This can be done using an expression splice: it is written using the $(...) syntax, and the content within $() is an expression of type Q Exp, representing a compile-time computation that returns an Exp value, which is, in this case, the current git revision as a string literal.

Splice code lives in the Q monad, which manages the context for Template Haskell and provides a rich set of interfaces. Inside Q we can query info about datatypes or functions, allocate fresh identifiers, etc. Arbitrary IO actions may also be run inside Q. Here, we run git rev-parse HEAD to obtain the git revision and then return it. When GHC compiles this module, the splice is replaced with a string literal, and compilation moves on.

So at first glance, Template Haskell is just about running user code at compile time, what can go wrong? All is right for most developers, who compile to the same platform they run GHC on, but there’s trouble ahead when you try to do cross compilation…

The what and why of cross compilation

Suppose we’d like to write a Haskell app for an Android phone or a Raspberry Pi. It’s possible to bootstrap a native GHC release on them and use it to compile stuff, but given the limited hardware resources of these machines, it’s wiser to run GHC on a proper x64 build server and emit code for these ARM devices. When we do so, we’re performing cross compilation. Some terminology:

The host platform is where we run GHC to compile stuff.
The build platform is where we compile GHC. For simplicity, we assume build=host and only use the host term from now on.
The target platform is where we run the compiled Haskell app. When host=target, the GHC is a native GHC, otherwise it’s a cross GHC.

For a native GHC, Template Haskell isn’t a problem, since GHC can link and run its emitted code just like native dynamic libraries. But this doesn’t work out-of-the-box for a cross GHC.

Over the years, people have come up with different approaches to address the cross compilation issue of Template Haskell, each coming with its own rough edges; more details follow in later sections.

Only run TH on the host platform

If we can’t run emitted code, then how about we don’t run it at all and stay with a cross GHC without TH support? We’ll preprocess the cross GHC input code, strip usages of the Template Haskell extension, and replace all TH splices with the expanded code. And the way to expand the splices would be… using a native GHC to compile it!

There’s a GHC flag -ddump-splices which dumps the expanded splices code. Unfortunately, the dump output has extra text decorations and isn’t proper Haskell source code, so it takes more work to use the dumps. Here’s a list of known implementations of the splice dump approach:

EvilSplicer uses a parsec-based parser to process the dumps for later consumption of cross GHC. It was used in the git-annex project until late 2018.
ZeroTH is a tool which does something similar, and includes a CLI and Cabal-related helper functions.
reflex-platform uses a patched native GHC which dumps the expanded splices as proper Haskell source code, and feeds into GHCJS.

However, making native/cross GHC work together is not trivial:

Unlike gcc or clang which can emit code for other platforms by simply adding relevant CLI flags, a GHC installation can only emit code for a single target platform configured at its build time. So two different GHC installations must be managed in isolated places.
Native/cross GHC must have the same version and process the same build plan to minimize the chance of emitting wrong code. Say package foo includes a TH splice that uses package bar, if native/cross GHC sees different versions (or even same version but different build plan) of bar, the splice behavior could potentially differ, expanding into wrong code that may be silently consumed by cross GHC.

Given the complexity of the required hacks and GHC/Cabal’s lack of cross compilation support, it’s common to use an external build system (e.g. Nix) to encapsulate this mechanism.

Other than saving dumps of expanded splices, there is another solution to only run TH splice code on the host platform: the same GHC always compile everything to both host/target code in one invocation! When running TH, we can just load host code just like native GHC. This requires quite some customization of GHC behavior and is only possible for 3rd-party compilers based on GHC API. In fact, GHCJS used this approach in its earliest days.

Pros and cons of running TH on the host platform

Running TH on the host platform works for pure splices, which can only do things like reifying info and generating ASTs. It should also work pretty well for side-effecting splices which reads files, spawns processes or fires missiles, since the splice behavior should be just the same as when we use a native GHC to compile stuff.

But is this the end of story? Not yet. Here’s one immediate problem: the native/cross GHC may not consume the same Haskell sources despite our best efforts.

Haskell modules may use the CPP extension with target-specific macros, so when you compile for different targets, you see different top-level definitions.
Cabal files may also check implementation/platform/etc, and end up with different flags or even different modules to be consumed by GHC.

The problems above will likely trigger compile-time errors. And there’s an even stealthier problem that may lead to generating incorrect code instead of a crash: the architecture difference of host/target, e.g. word size or endianness. For instance, a TH splice may make use of sizeOf (undefined :: Int), which is 4 on 32-bit target platforms, and if the host platform is 64-bit, then the TH splice will see 8, which sneaks into the emitted code without a single warning.

Run TH code on the target platform

As explained in earlier sections, vanilla GHC can only link and run host code. Would it be possible to teach GHC to link and run target code? The answer is yes. The key to supporting running non-native code is RPC (Remote Procedure Calls). GHC needs to call into target code to obtain the splice expansion result; the target code needs to call GHC to do reification. These calls are achieved via exchanging serialized messages between GHC and the loaded splices. Since there is a fixed set of operations allowed in the Q monad (as methods of the Quasi class), the operations and the results can be encoded as a serializable Message datatype.

This RPC approach to run TH code is standardized in the external interpreter feature. When running TH, GHC starts an external process calls iserv, pipes messages to iserv and tells it to load archives, objects, etc and link code. After a splice starts running in iserv, iserv may send queries back to GHC and get results. Finally, the splice expansion result is sent back to GHC.

The external interpreter opens up the possibility of using various emulators (e.g. wine for windows, node for js/wasm or even qemu for exotic platforms) to run target code for TH. GHC itself doesn’t need to care about how the code is actually linked and run in iserv, and TH should work as long as our target-specific iserv can properly process the messages.

This approach was pioneered by GHCJS, and later made it into upstream GHC by 7.10. Other than GHCJS, known users include:

GHC itself, even in native GHC! But why bother? Well, suppose we’re compiling a profiled library with TH usage. Since profiled code follows different runtime conventions and links with profiled runtime, in the early days, a profiled GHC executable was needed. Now, we can simply use a profiled iserv executable, and avoid the extra profiling overhead in GHC.
haskell.nix, which includes support for cross-compiling to Windows via wine emulation of TH code.
Mobile Haskell, which are ARM-targetting GHC distributions. They use Android/iOS emulators to set up the splice runtime environment. GHC talks to an iserv-proxy process via pipes, and iserv-proxy merely relays the messages to the real iserv program in the emulator via a socket.
The Eta Haskell-to-JVM compiler.
Asterius, which uses node for running the WebAssembly & JavaScript code.

Pros and cons of running TH on the target platform

Compared to running TH on the host platform, there are a few benefits to running it on the target platform:

No host/target incoherence issues, as explained in earlier sections.
Less hacky and more standardized. Although upstream GHC won’t likely contain iserv implementations for all interesting target platforms out there, developers can just roll their own if needed.
Simpler, since there isn’t a bunch of hacks to be packaged via nix anymore, and it works with vanilla cabal/stack.

It would be tempting to announce TH for cross compilation is now a solved problem! Turns out it’s not. Recall how TH enables running arbitrary IO in splices? It’s used in some popular packages, e.g. gitrev for obtaining git revisions, and file-embed for embedding files. For native GHC, the IO actions have full access to the host system: its file-system, external tooling, etc. But for cross GHC, the IO actions may be run in a sandbox without those facilities, so these packages and their dependents will then fail to compile!

Host-specific side effects in TH splices is still possible, but require case-by-case analysis and patching. Instead of runIO someOperation, we can directly add someOperation to the methods of the Quasi class, and patch the package to use it. When splices gets run, iserv will simply send a SomeOperation message to GHC, and GHC can run it on the host and deliver the serialized result back. This is used by MobileHaskell to support packages like file-embed.

Template Haskell in Asterius

Asterius uses the external interpreter approach to support Template Haskell. The execution of compiled WebAssembly code and the JavaScript runtime is done in a node process. Since node runs on the same machine along the compiler, it can access the same resources as a native GHC does, so as long as an operation is supported in the Asterius node runtime, it’ll work in a TH splice. This has worked pretty well so far, given the list of known-to-compile packages.

The current Asterius TH implementation comes with one limit though: no state is persisted across different splices in the same module, due to the lack of a true dynamic linker/runtime. Why would someone want to do this? One example is reusing an expensive resource across all splices, be it a process handle, network socket or anything else:

import Language.Haskell.TH.Syntax

data Resource

newResource :: IO Resource

freeResource :: Resource -> IO ()

useResource :: Q Resource
useResource = do
  m <- getQ
  case m of
    Just r -> pure r
    _ -> do
      r <- runIO newResource
      addModFinalizer $ runIO $ freeResource r
      putQ r
      pure r

In the example above, we have newResource/freeResource for allocating/freeing an expensive resource. Then we can implement useResource which attempts to get a Resource from the TH session, and initialize one if it’s not present. The registered finalizer will be run after all splices in the same module has been expanded.

Rest assured, cross-splice state persistence is quite rare in practice. So for typical TH scenarios, our current implementation should be sufficient.

Summary

Cross compilation isn’t a daily use case for most Haskellers out there, so they may not realize that features like TH can’t be taken for granted in a cross setting. We hope the writing above helps a bit in raising the awareness in our community.

You can help in improving the situation by thinking twice before rolling up the sleeves and reaching for things like TH with runIO, Plugins or custom Setup.hs. Would it work if GHC is targeting another platform and configured with a different toolchain? Even without actual testing in the end, this extra mindset has the potential of avoiding frustration of your project’s future users :)

For interested readers, we also recommend taking a look at the pending GHC stage hygiene proposal which includes some quality discussion.

Behind the scenes

Cheng Shao

Cheng is a Software Engineer who specializes in the implementation of functional programming languages. He is the project lead and main developer of Tweag's Haskell-to-WebAssembly compiler project codenamed Asterius. He also maintains other Haskell projects and makes contributions to GHC(Glasgow Haskell Compiler). Outside of work, Cheng spends his time exploring Paris and watching anime.

Georgios Karachalias

George is a Software Engineer with expertise in the design and implementation of functional programming languages. Prior to joining Tweag, he was in academia; during that time he designed extensions to Haskell, made contributions to GHC, and published his research to conferences such as ICFP and the Haskell Symposium. He holds an MEng in Electrical and Computer Engineering from the National Technical University of Athens, and a PhD and postdoctorate from KU Leuven. Outside of work, George enjoys reading (and occasionally writing) literature and poetry, carving wooden pipes, and hiking in forests and mountains with his partner and their dog.

Tech Group

Programming Languages and Compilers

Research, create, improve and maintain programming languages and their tooling to enhance developer productivity and to deliver reliable, maintainable, correct and performant software with minimum effort.

If you enjoyed this article, you might be interested in joining the Tweag team.

This article is licensed under a Creative Commons Attribution 4.0 International license.

← Self-references in a content-addressed Nix The Shrinks Applicative →