When working in a sufficiently large Haskell project, one eventually faces the question of whether to implement missing libraries in Haskell or borrow the implementation from another language, such as Java.
For a long time now, it has been possible to combine Haskell and Java programs using inline-java. Calling Java functions from Haskell is easy enough.
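
```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell #-}
import Language.Java (J, JType(Class)) -- from jvm
import Language.Java.Inline (imports, java) -- from inline-java

-- Make the Java classes in this package visible to the quasiquotes below.
imports "org.apache.commons.collections4.map.*"

createOrderedMap :: IO (J ('Class "org.apache.commons.collections4.OrderedMap"))
createOrderedMap = [java| new LinkedMap() |]
```
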
The ghc and javac compilers can cooperate to build this program, provided
that both know where to find the program’s dependencies. Back in the day, when
inline-java was being born, there was no build tool capable of
pulling the dependencies of both Haskell and Java, or at least not
without additional customization. Since then, however, Tweag has invested effort into
enabling Bazel, a polyglot build system, to build Haskell
programs. In this blog post we go over the problems of integrating
build tools designed for different languages, and how they can be
addressed with Bazel, as an example of a single tool that builds them
all. More specifically, this post also serves as a tutorial for using
inline-java with Bazel, which is a requirement for the latest release of inline-java.

Suppose we rely on cabal-install or stack to install
the Haskell dependencies. This would make the inline-java and
jvm packages available. But these tools are specialized to build Haskell
packages. If our program also depends on the commons-collections4
Java package, we can’t rely on Cabal to build it. We need help from a Java-specific package manager.
We could rely on maven or gradle to install
commons-collections4. At that point we can build our project by
invoking ghc and telling
javac where to find the Java dependencies
in the system via environment variables.
With some extra work, we could even coordinate the build systems so that
one calls the other to collect all the necessary dependencies and
invoke the compilers. This is, in fact, what
inline-java did until recently.
But there is a severe limitation to this approach: neither build system can
track changes to files in the jurisdiction of the other build system.
The Cabal-based build system is not going to notice if we change the
gradle configuration to build a different version of
commons-collections4, or if we change a source file on the Java side.
Easy enough, we could run
gradle every time we want to rebuild our
program, just in case something changed on the Java side. But then,
should the Haskell build system rebuild the Haskell side? Or should it
reuse the artifacts produced in an earlier build?
We could continue to extend the integration between build systems so one can detect if the artifacts produced by the other have changed. Unfortunately, this is a non-trivial task and leads to reimplementing features that build systems already implement for their respective languages. We would be responsible for detecting changes on every dependency crossing the language boundary.
If that didn’t sound bad enough, incremental builds are not the only concern requiring coordination. Running tests, building deployable artifacts, and using remote builds and caches also involve both build systems.
A straightforward answer is to use only one build system to build all languages in the project. We chose to turn to Bazel for our building needs.
Bazel lets us express dependencies between artifacts
written in various languages. In this respect, it is similar
to make. However, Bazel comes equipped with sets of rules, such as rules_haskell,
for many languages, which know how to invoke compilers and
linkers; these rules are distributed as libraries and are reusable across projects. With
make, the user must manually encode all of this knowledge in a
Makefile herself. It’s not the subject of this blog post, but Bazel comes with
a number of other perks, such as hermeticity for reproducible
builds, and remote caching.

```starlark
# file: BUILD.bazel
load(
    "@rules_haskell//haskell:defs.bzl",
    "haskell_binary",
)

# Defined before its first use: Starlark evaluates top-level statements in order.
java_deps = [
    "@maven//:org_apache_commons_commons_collections4_4_1",
]

haskell_binary(
    name = "example",
    srcs = ["Main.hs"],
    extra_srcs = ["@openjdk//:rpath"],
    compiler_flags = [
        "-optl-Wl,@$(location @openjdk//:libjvm.so)",
        "-threaded",
    ],
    deps = [
        "//jvm",
        "//jni",
        "//:inline-java",
        "@rules_haskell//tools/runfiles",
        "@stackage//:base",
        "@stackage//:text",
    ] + java_deps,
    data = [":jar_deploy.jar"],
    plugins = ["//:inline-java-plugin"],
)
```

The load instruction is all we need to bring in the
rules_haskell library and access its various rules. Here we use the
haskell_binary rule to build our hybrid program.
Besides the fact that
Main.hs is written in both Haskell and Java,
one can tell the hybrid nature of the artifact by observing the
dependencies in the
deps attribute, which refers to both Haskell
and Java libraries.

- //jvm, //jni and //:inline-java refer to Haskell libraries implemented in the current repository.
- @rules_haskell//tools/runfiles refers to a special Haskell library defined as part of the rules_haskell rule set. More on this below.
- @stackage//:base and @stackage//:text refer to Haskell packages coming from a stackage snapshot.
- @maven//:org_apache_commons_commons_collections4_4_1 is a Java library coming from a maven repository.

Additional configuration in the WORKSPACE file
makes precise the location of dependencies coming from
stackage and maven: this is where the
stackage snapshot and the list of
maven repositories are specified.
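
As a rough illustration of what that configuration can look like, here is a minimal sketch of a WORKSPACE fragment. It assumes that the maven dependencies are fetched with the rules_jvm_external rule set, and that both rules_haskell and rules_jvm_external have already been declared earlier in the file. The snapshot name is a placeholder chosen for the example, not a value taken from the project.

```starlark
# file: WORKSPACE (fragment, not a complete file)
# Assumption: the @rules_haskell and @rules_jvm_external repositories
# are already declared above, for example with http_archive.
load("@rules_haskell//haskell:cabal.bzl", "stack_snapshot")
load("@rules_jvm_external//:defs.bzl", "maven_install")

# Haskell packages, exposed as @stackage//:base and @stackage//:text.
stack_snapshot(
    name = "stackage",
    packages = [
        "base",
        "text",
    ],
    snapshot = "lts-18.0",  # hypothetical snapshot; use the one your project targets
)

# Java packages, exposed under @maven// (the default repository name).
maven_install(
    artifacts = [
        "org.apache.commons:commons-collections4:4.1",
    ],
    repositories = [
        "https://repo1.maven.org/maven2",
    ],
)
```

A real project would list every Haskell package it needs in packages, including the dependencies of inline-java itself.
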
More information on the integration with
stackage can be found in
an earlier post and with
maven in the

While the dependencies in
deps are made available at build time,
data = [":jar_deploy.jar"] declares a runtime
dependency. Specifically, it makes the Java artifacts available
when the program runs.
A future version of
rules_haskell may generate this runtime
dependency automatically, but for the time being we need to add it manually.
The ":jar_deploy.jar" target is a
jar file produced with the following
rule from the Java rule set.

```starlark
java_binary(
    name = "jar",
    main_class = "dummy",
    runtime_deps = java_deps,
)
```

Now, when the Java dependencies change, Bazel knows that it must rebuild the Haskell artifacts, and it rebuilds them only in that case.
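
The same tracking can extend to Java sources that live in the repository. The following is a sketch under stated assumptions: the java_helpers target and the src/main/java layout are hypothetical and not part of the original example. It shows a variant of the jar target that also bundles locally compiled Java code.

```starlark
# file: BUILD.bazel (sketch; "java_helpers" and the source layout are hypothetical)

# Java sources kept in this repository, compiled against the maven dependencies.
java_library(
    name = "java_helpers",
    srcs = glob(["src/main/java/**/*.java"]),
    deps = java_deps,
)

# Variant of the "jar" target that also bundles the local Java code
# into jar_deploy.jar.
java_binary(
    name = "jar",
    main_class = "dummy",
    runtime_deps = java_deps + [":java_helpers"],
)
```

With a setup along these lines, editing a file under src/main/java would invalidate jar_deploy.jar and whatever depends on it. For inline Java code to refer to those classes at build time, ":java_helpers" would presumably also need to appear in the deps of the haskell_binary target, next to java_deps.
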
Bazel makes sure that
jar_deploy.jar is built, and stores it in an
appropriate location. But nothing, so far, tells the Haskell
program where to find this file. This is where the runfiles library
comes into play. Bazel lays out runtime dependencies, such as
jar_deploy.jar, following known rules; the runfiles library
abstracts over these rules. To complete our example, we need the
main function to call the
runfiles library and discover the location of the jar file.
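
```haskell
{-# LANGUAGE OverloadedStrings #-} -- the JVM options are ByteString literals
import qualified Bazel.Runfiles as Runfiles
import Control.Monad (void)
import Data.String (fromString)
import Language.Java (J, JType(Class), withJVM)
...

main :: IO ()
main = do
    -- Locate the runfiles tree that Bazel prepared for this binary.
    r <- Runfiles.create
    let jarPath = Runfiles.rlocation r "my_workspace/example/jar_deploy.jar"
    -- Start the JVM with the deploy jar on its class path.
    withJVM [ "-Djava.class.path=" <> fromString jarPath ] $
      void createOrderedMap
```
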
Mixing languages is challenging both from the perspective of the compiler and of the build system. In this blog post we provided an overview of the challenges of integrating build systems, and how a unifying build system can offer a more practical framework for reusing language-specific knowledge.
Bazel is a materialization of this unifying build system, where rule
sets are the units of reuse. We recently moved inline-java
to rely on Bazel, by depending strongly on the rule sets for
Haskell and Java. This implies that end users will also have to use Bazel.
Though this means departing from
the build tools that Haskellers are used to, we hope that it will
offer a better path for adopters to build their multi-language projects.
Since rules_haskell can still build
Cabal packages and
stackage snapshots via Cabal-the-library, going
with Bazel doesn’t forgo the community effort invested in curating
the many packages in the Haskell ecosystem.