25 September 2019
I got the opportunity to work on Bazel's Persistent Worker Mode for Haskell GHC during my internship at Tweag. Let's begin with some context. The rules_haskell project adds support for Haskell components in software based on the Bazel build system. By default, compiling an individual Haskell target triggers a separate sandboxed GHC invocation. This approach is not optimal for two reasons: the recurring cost of compiler startups and the missed opportunities for incremental builds.
Bazel has a special mode of communication with a compiler to resolve the issue. My internship goal was to improve the method of communication between Bazel and the Haskell GHC compiler by adding support for this persistent worker mode in rules_haskell. Let's explore what I learned and what I was able to accomplish.
Consider the following example of a C++ application build script in Bazel.
    cc_library(
        name = "Lib",
        srcs = ["A.cpp", "B.cpp"],
    )

    cc_binary(
        name = "Bin",
        srcs = ["Main.cpp"],
        deps = [":Lib"],
    )
cc_library and cc_binary are rules that describe C++ build targets. We have two targets in this application, called Lib and Bin. The Lib library target depends on two source files, A.cpp and B.cpp; the Bin binary target depends on Lib and the Main.cpp source file.
Bazel controls the order in which targets are built, and this part does not depend on the programming language in question. How a target is built does, hence the language-specific rule names like haskell_binary, etc. For instance, each Haskell target is built with just a pair of calls to a compiler (one for compiling and one for linking). In the case of C++, however, every file is compiled by a separate call to a compiler. On the one hand, the one-call-per-file strategy wastes time on repetitive compiler startups, which may form a significant cost in languages like Scala or Haskell but is not a big deal for C++. On the other hand, this strategy creates an additional opportunity for improved incremental building. Let's consider each of these two observations in more detail.
The opportunity to save on startup times was pointed out in Persistent Worker Processes for Bazel, the original Bazel Blog post on the topic. The Bazel team demonstrated the significant benefits of using a persistent compiler process for Java, where startups are expensive and JIT needs runtime stats to do its job effectively. Other JVM languages, e.g., Scala and Kotlin, followed this path.
Today, many of the main Bazel-enabled languages hosted under bazelbuild have persistent workers, but not all of them benefit from warm startup and caching JIT data as much as the JVM-based languages do. Luckily, there is another way to improve performance with a persistent compiler process, namely, reusing auxiliary build artifacts that did not change since the last build. In other words, incremental builds.
Fast, correct incremental builds are such a fundamental Bazel goal they're the first thing mentioned in Why Bazel? at the top of Bazel's homepage. To fulfill this promise, though, Bazel needs sufficient knowledge about dependencies between the build artifacts. Let's get back to our example to explain this better.
Once the Bin target has been built, any change to Main.cpp would require recompiling Main.cpp and relinking Bin, but would not require rebuilding Lib. This is the inter-target incrementality supported by Bazel universally: no specific knowledge about the programming language is needed, and the logic fully translates to rules_haskell and its rules.
The difference comes when, after a full build, you make a change in, e.g., A.cpp. As we know, .cpp files are compiled separately, and this knowledge is encoded in the cc_library rule; therefore, Bazel would only recompile A.cpp, but not B.cpp. In contrast, the recompilation strategy for Haskell is rather subtle: whether you need to recompile the module B in similar circumstances roughly depends on whether there is an import of A in B. This goes beyond the knowledge of haskell_library, and the rule will simply recompile all the modules in the Lib target. The bottom line is: due to the difference in the nature of the two languages, Bazel supports sub-target incrementality for C++ components but not for Haskell components.
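To make the import-based rule concrete, here is a minimal sketch of the analysis a build tool would need: given a module import graph, compute the set of modules to recompile after one module changes. The `Imports` type and the `importersOf` and `needsRecompile` functions are hypothetical names made up for this illustration; they are not part of rules_haskell or the GHC API. The rule is also deliberately rough: it rebuilds every transitive importer, whereas GHC's own recompilation check can be more precise.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- | Hypothetical module graph: each module maps to the modules it imports.
type Imports = Map.Map String [String]

-- | Modules that directly import the given module.
importersOf :: Imports -> String -> [String]
importersOf graph m = [k | (k, deps) <- Map.toList graph, m `elem` deps]

-- | The changed module plus everything that imports it, transitively.
-- This is an over-approximation used only for illustration.
needsRecompile :: Imports -> String -> Set.Set String
needsRecompile graph changed = go Set.empty [changed]
  where
    go seen [] = seen
    go seen (m : rest)
      | m `Set.member` seen = go seen rest
      | otherwise = go (Set.insert m seen) (importersOf graph m ++ rest)

main :: IO ()
main = do
  -- Two variants of the Lib target: B independent of A, and B importing A.
  let independent = Map.fromList [("A", []), ("B", [])]
      dependent = Map.fromList [("A", []), ("B", ["A"])]
  print (Set.toList (needsRecompile independent "A")) -- prints ["A"]
  print (Set.toList (needsRecompile dependent "A"))   -- prints ["A","B"]
```

A rule like haskell_library has no access to this graph, which is why it falls back to rebuilding the whole target.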
Is it possible to get better incremental builds for Haskell components? Almost certainly, yes. In fact, this was one of the driving forces of the project. The persistent worker mode opens an opportunity for, first, sub-target dependency analysis using the GHC API and, second, caching of auxiliary build artifacts (e.g., .o files) to save work during rebuilds.
Unfortunately, it's hard to get the implementation of incremental builds right, and only a few Bazel-aware languages support sub-target incremental builds (e.g., rules_swift and one of several rules_scala forks that employ the Zinc compiler). I did not get to implementing incremental builds in my project. I spent most of my time finding the right way to integrate the worker mode into rules_haskell.
When Bazel encounters a compile-like command for the first time, it spawns a persistent process. This child process gets its stdin and stdout redirected to talk directly to Bazel. The process listens on its stdin for a compilation request and, upon processing one, sends back a response. Bazel speaks to its workers in a simple Protobuf-based protocol: a request is a list of filepaths with the corresponding file hashes; a response is an exit code and a string with other textual output the worker decides to report (e.g., the compiler's warning messages).
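The shape of this exchange can be modelled with a few plain Haskell records. The sketch below is an illustration only: the record and function names (`Input`, `WorkRequest`, `WorkResponse`, `handleRequest`, `serve`) are assumptions made for this example, the real messages are Protobuf-encoded, and the toy worker here remembers file hashes between requests, which the worker described in this post does not do. It is meant to show why a long-lived process that receives hashes could, in principle, tell which inputs changed since the previous request.

```haskell
import Data.List (mapAccumL)
import qualified Data.Map.Strict as Map

-- | One input file together with the hash reported for it.
data Input = Input {inputPath :: FilePath, inputDigest :: String}

-- | Simplified request: just the list of inputs.
newtype WorkRequest = WorkRequest {requestInputs :: [Input]}

-- | Simplified response: an exit code and free-form textual output.
data WorkResponse = WorkResponse {exitCode :: Int, output :: String}
  deriving (Eq, Show)

-- | Hashes seen so far: the only state this toy worker keeps between requests.
type Seen = Map.Map FilePath String

-- | Handle one request: report inputs whose hash is new, then remember all hashes.
handleRequest :: Seen -> WorkRequest -> (Seen, WorkResponse)
handleRequest seen (WorkRequest inputs) = (seen', WorkResponse 0 msg)
  where
    changed =
      [ inputPath i
      | i <- inputs
      , Map.lookup (inputPath i) seen /= Just (inputDigest i)
      ]
    seen' = foldr (\i -> Map.insert (inputPath i) (inputDigest i)) seen inputs
    msg = "changed inputs: " ++ unwords changed

-- | Serve a sequence of requests one at a time, threading the state through.
serve :: [WorkRequest] -> [WorkResponse]
serve = snd . mapAccumL handleRequest Map.empty

main :: IO ()
main = mapM_ (putStrLn . output) (serve [first, second])
  where
    first = WorkRequest [Input "A.hs" "hash-a1", Input "B.hs" "hash-b1"]
    second = WorkRequest [Input "A.hs" "hash-a2", Input "B.hs" "hash-b1"]
```

In the second request only A.hs carries a new hash, so it is the only file the toy worker reports as changed.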
All in all, this scheme looks straightforward, except that an IPC solution based on stdout complicates debugging by an order of magnitude. For example, GHC sometimes sends diagnostic messages to stdout instead of stderr, and sometimes you cannot mute such messages (I solved one particularly annoying instance of the problem during this work). One might hope that redirecting stdout helps, but some standard tricks may fail for all sorts of reasons; e.g., the concurrency employed by Bazel bit me when running the rules_haskell test suite under the worker strategy.
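For reference, one such standard trick looks roughly like the sketch below: keep a private duplicate of the original stdout for protocol traffic, then point the global stdout handle at stderr so that stray diagnostics cannot corrupt the protocol stream. The `withProtocolHandle` helper is a hypothetical name for this illustration; `hDuplicate` and `hDuplicateTo` come from `GHC.IO.Handle` in base. As noted above, this kind of redirection is not guaranteed to hold up in every setting, so treat it as a starting point rather than a robust solution.

```haskell
import GHC.IO.Handle (hDuplicate, hDuplicateTo)
import System.IO (Handle, hFlush, hPutStrLn, stderr, stdout)

-- | Run an action with a private handle onto the original stdout,
-- while ordinary writes to stdout are rerouted to stderr.
withProtocolHandle :: (Handle -> IO a) -> IO a
withProtocolHandle action = do
  hFlush stdout                      -- do not lose anything already buffered
  protocolOut <- hDuplicate stdout   -- private copy of the original stdout
  hDuplicateTo stderr stdout         -- stdout now writes where stderr writes
  action protocolOut

main :: IO ()
main = withProtocolHandle $ \protocolOut -> do
  -- This stray message ends up on stderr instead of the protocol stream.
  putStrLn "diagnostic: would otherwise corrupt the protocol stream"
  -- Only explicit writes to the private handle reach the original stdout.
  hPutStrLn protocolOut "protocol message"
  hFlush protocolOut
```

Running the program with stderr redirected elsewhere (for example `2>/dev/null`) is a quick way to check which message travels on which stream.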
Several libraries implement support for Protobuf in the Haskell ecosystem. We chose proto-lens, which allows us to generate Haskell definitions from a .proto description and conveniently access data with lenses.
One obstacle with proto-lens was that it silently (and unintentionally, it seems) dropped support for parsing messages from an unbounded stream. That means once you have a handle to read a message from, you have to specify the size of the bytestring you're going to read before the parser can get its hands on it. The length of a message is variable and encoded as a variable-length integer (varint) sent in front of every message. The proto-lens library had internal machinery to read varints but lacked a reasonable interface to employ it when receiving messages. I fixed this.
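The framing described above can be sketched in a few lines. A Protobuf varint stores seven payload bits per byte, least significant group first, and sets the high bit on every byte except the last. The functions below (`encodeVarint`, `decodeVarint`, `splitMessage`) are illustrative names written for this post's explanation and are not the proto-lens API; they work on plain lists of bytes rather than on a handle or a ByteString to keep the example self-contained.

```haskell
import Data.Bits (shiftL, shiftR, testBit, (.&.), (.|.))
import Data.Word (Word64, Word8)

-- | Encode a number as a varint: 7 bits per byte, low group first,
-- high bit set on every byte except the last.
encodeVarint :: Word64 -> [Word8]
encodeVarint n
  | n < 0x80 = [fromIntegral n]
  | otherwise = (fromIntegral (n .&. 0x7F) .|. 0x80) : encodeVarint (n `shiftR` 7)

-- | Decode a varint from the front of a byte stream, returning the value
-- and the remaining bytes. Fails on truncated or overlong input.
decodeVarint :: [Word8] -> Maybe (Word64, [Word8])
decodeVarint = go 0 0
  where
    go :: Int -> Word64 -> [Word8] -> Maybe (Word64, [Word8])
    go _ _ [] = Nothing
    go shift acc (b : rest)
      | shift > 63 = Nothing -- more than ten bytes cannot fit in 64 bits
      | testBit b 7 = go (shift + 7) acc' rest
      | otherwise = Just (acc', rest)
      where
        acc' = acc .|. (fromIntegral (b .&. 0x7F) `shiftL` shift)

-- | Split one length-prefixed message off the front of a byte stream.
splitMessage :: [Word8] -> Maybe ([Word8], [Word8])
splitMessage bytes = do
  (len, rest) <- decodeVarint bytes
  let n = fromIntegral len :: Int
  if length rest < n then Nothing else Just (splitAt n rest)

main :: IO ()
main = do
  print (encodeVarint 300) -- prints [172,2]
  -- A three-byte message followed by one byte of the next frame.
  print (splitMessage (encodeVarint 3 ++ [10, 20, 30, 99])) -- prints Just ([10,20,30],[99])
```

The important point is that the reader learns the message size only after consuming the prefix, which is why a parser that demands the full bytestring up front cannot be pointed at the stream directly.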
The worker application is a simple single-threaded server creating a fresh GHC session for every request. One issue I hit when applying the GHC API to the rules_haskell use case is that we use separate GHC calls to compile and then link. The Hello-World example for using the GHC API in the GHC User Guide does not cover the latter use case (where you should run GHC in the "one-shot" mode instead of the --make mode), and I ended up copying some parts of the GHC driver to support this use case, since GHC unfortunately doesn't export its driver module.
Since version 0.27 (June 2019), Bazel picks the worker strategy as the default one if it is available for an action at all. Finding a convenient way to override the default was not straightforward: I had to rework the solution several times through both of my PRs to rules_haskell.
The final version of the interface to activate the worker mode consists (as now described in the rules_haskell docs) of two steps: one has to, first, load the worker's dependencies in the WORKSPACE file and, second, pass a command-line argument when starting the build:
bazel build my_target --define use_worker=True
The --define syntax is heavyweight due to Bazel's initial reluctance to provide user-defined command-line arguments. Recently, Bazel added special support for this feature but, as I discovered, the implementation of the feature has issues in realistic applications going beyond Hello-World.
The worker mode passed the whole test suite without a glitch on the first try; this is impressive given that the test suite contains tricky examples, e.g., with GHCi, C dependencies, and even GHC plugins. There's only one gotcha to take into account: the worker is not sandboxed by default. In some cases, GHC prefers rebuilding a perfectly valid target when it has access to the target's source. This will fail if GHC is not provided with sufficient dependencies. About 4 test cases out of 96 failed due to this. The solution is simply to always use sandboxing by passing --worker_sandboxing on the command line.
We did not get to rigorous performance measurements, but there are some promising observations even for the current implementation lacking sub-target incrementality. First of all, I assembled a sample ten-module project where each module held one function calling the function from the previous module. Every module turned into a separate target in the BUILD script, forming a deep target dependency tree. For this setup, I observed a 10–15% speedup for the worker-enabled version of rules_haskell (excluding the time for building the worker). On the other hand, running the rules_haskell test suite did not show significant improvements on the worker-enabled version (2–3% speedup). I attribute this difference to two features of the test suite: first, the suite holds a fair amount of non-Haskell code, which dims the worker effect on build time; second, the suite represents a very shallow dependency graph, unlike the first experiment with ten modules. Overall, there is hope for a speedup in projects with deep dependency graphs.
There are many efforts underway to make GHC friendlier in various client-server kind of scenarios including IDEs (e.g., HIE and now hie-bios in ghcide), interactive sessions (e.g., this issue and this PR), and, finally, build systems (e.g., the recent Extended Dependency Generation proposal by David Eichmann and his work on cloud builds for the Shake-based GHC build system Hadrian). Indeed, we can now see some level of the convergence foreseen by Edward Z. Yang among compilers, build systems, and IDEs happening in the Haskellverse today.
What about incremental builds? David's findings suggest that GHC's ability to communicate detailed source file dependencies allows for fine-grained control over the build process. Under his proposal, the build system might get to decide when to call GHC and only ever call it in the "one-shot" mode. This decision logic could hardly fit in the main rules_haskell code but seems perfectly relevant to the worker implementation.
Although I did not get to incremental builds, I did some experiments with warm GHC startups. None of those ended up in the current implementation since I think there is room for improvement here. I believe one of the possible ways to improve GHC session startup times is caching package data for packages loaded from the global package database. To me, this loading stage looks like the most expensive action during GHC session initialization. Sadly, I found there were not enough utilities exported from GHC's Packages.hs to tune this behavior.
I'm grateful to Tweag for giving me an exciting opportunity to work in an industrial setting; to Mathieu Boespflug for suggesting a project that not only kindled my Haskell passion but also pushed me outside my comfort zone to learn a new thing (Bazel); to Andreas Herrmann, my mentor in this project, for providing endless insightful help and feedback; to all other Tweagers, who either directly helped me or brought me excitement by demonstrating their exceptional engineering skills as well as creativity.
Here's a summary of my contributions and some pointers to possible future directions for improvement of the persistent worker mode in rules_haskell:
Two worker pull requests to rules_haskell. The second one adds the worker sources and reworks a good part of the first, because the initial strategy for switching between the regular and the worker modes forced the user to download worker dependencies anyway. The current strategy, based on select, does not have this flaw. It can be improved when the Bazel Custom keys issue is resolved.
The initial worker repository. Unlike its replica inside rules_haskell, which just holds static Protobuf descriptions in Haskell, the repository implements proper generation of those descriptions from a .proto file. Notably, the repository contains the reuse-ghc-session branch, which explores a warm startup of a GHC session. It is blocked because once all package databases are loaded into a session, they cannot be easily unloaded with just the utilities exported from GHC's Packages.hs.
The hDuplicate issue, which makes it harder to design protocols around stdout if you want to intercept certain writes to stdout (or reads from stdin).
My PR to GHC that allows muting the Loading package environment message with -v0. It required more refactoring than one might imagine.
David Eichmann's Extended Dependency Generation (EDG) GHC proposal, which suggests that no build system or IDE should ever call GHC in the --make mode. Instead, GHC should be able to dump all the necessary information about dependencies into a machine-readable format. Once you have that file, with dependencies recorded, you only need GHC to compile individual files in the "one-shot" mode, and never in the normal --make mode. This approach would liberate us from certain shortcomings of the --make mode, such as timestamp-based recompilation checking.