A build is hermetic if it is not affected by details of the environment where it is performed. Hermeticity is a prerequisite for generally desirable features like remote caching and remote execution. While certain build systems, such as Nix, impose hermeticity through their design, others rely on their users to do the extra work and be vigilant to get it. Bazel enforces hermeticity to some extent, for example through sandboxing, but is less strict about it than Nix. In this post I’m going to try to enumerate most ways in which hermeticity of a Bazel project can be compromised.
One source of inhermeticity is the file system. If tools, such as compilers, are invoked in a way that does not limit their access to the contents of the file system, their output can be influenced by extraneous files that happen to be present during the build. Include files in languages like C or C++ are one example. Imagine a shared machine that is used to perform builds with different configurations. One build might generate some header files and place them in a directory that is later specified as an include directory in a compiler invocation performed by another build. If a generated header file happens to have the right name, it can shadow the correct header file and lead to a build failure that is hard to reproduce and understand. This is not a hypothetical example, but a real problem our client once struggled with. This is why it is important to always use some form of sandbox for your build actions. Sandboxing also guarantees that all build inputs are declared correctly, because undeclared input files will simply not be available.
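To make the shadowing scenario concrete, here is a minimal Python sketch of first-match include resolution. It is a toy model, not how any particular compiler is implemented; the directory names, the `config.h` file name, and the `resolve_include` helper are all made up for illustration.

```python
import tempfile
from pathlib import Path


def resolve_include(name, include_dirs):
    """Return the first existing file called `name`, searching directories in order."""
    for directory in include_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None


def demo():
    """Show how a stray file in an earlier search directory changes the result."""
    with tempfile.TemporaryDirectory() as root:
        project = Path(root) / "project" / "include"
        shared = Path(root) / "shared" / "generated"  # hypothetical leftover directory
        project.mkdir(parents=True)
        shared.mkdir(parents=True)
        (project / "config.h").write_text("#define MODE 1\n")

        search_path = [shared, project]
        before = resolve_include("config.h", search_path)

        # Another build drops a header with the same name into the shared directory.
        (shared / "config.h").write_text("#define MODE 2\n")
        after = resolve_include("config.h", search_path)

        return before.parent.name, after.parent.name


result = demo()
```

The same lookup returns a different file depending on what an unrelated build left behind, which is exactly the kind of hidden input that a sandbox is meant to exclude.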
The use of a sandbox is controlled by choosing an execution strategy. The following execution strategies are available:
- `local` (and `standalone`, which is the same but deprecated) causes commands to be executed as local subprocesses without sandboxing.
- `sandboxed` causes commands to be executed inside a sandbox on the local machine.
- `worker` causes commands to be executed using a persistent worker, if available.
- `docker` causes commands to be executed inside a docker sandbox on the local machine.
- `remote` causes commands to be executed remotely; this is only available if a remote executor has been configured separately.
Without going into the details of all the strategies mentioned, it must be noted that
`local` should be avoided if the build is to stay hermetic.
In addition to the strategy flags there are several ways to choose local
- Using a tag with special meaning such as
- Setting the
It should also be noted that, as of this writing, Windows has no support for sandboxing. Therefore build hermeticity on Windows cannot be enforced at that level.
Another pitfall is related to the
While using persistent workers can have performance benefits, these workers
will not use sandboxed execution by default. It must be enabled manually by
Environment variables can also be a source of inhermeticity. There are many ways to inherit the environment of the machine that executes the build:
- Setting the
`True` in invocations of
`True` in test attributes.
- Not using
- Using the `--action_env` flag to inherit the value of a given environment variable. This option can also be used with the `--action_env=name=value` syntax. Extra care must be taken in that case to guarantee that `value` stays reasonably stable (e.g. it is not an absolute path which can vary from machine to machine).
Whenever the environment of the host machine is inherited, it becomes an input to the respective build actions. Since it is very hard to ensure identical environments on different machines, especially developer machines, features like remote caching have no chance to work.
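The effect on caching can be illustrated with a toy model in Python. This is a simplified sketch, not Bazel's actual digest computation: it only assumes that a cache key is derived from the command line, the environment, and the digests of the inputs. The `action_key` helper and the example paths are made up for illustration.

```python
import hashlib
import json


def action_key(argv, env, input_digests):
    """Toy cache key: a hash over the command line, environment, and input digests."""
    payload = json.dumps(
        {
            "argv": argv,
            "env": sorted(env.items()),
            "inputs": sorted(input_digests.items()),
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


argv = ["gcc", "-c", "main.c", "-o", "main.o"]
inputs = {"main.c": hashlib.sha256(b"int main(void) { return 0; }\n").hexdigest()}

# Inherited environments: the same action yields different keys on two machines.
key_alice = action_key(argv, {"PATH": "/home/alice/bin:/usr/bin"}, inputs)
key_bob = action_key(argv, {"PATH": "/usr/local/bin:/usr/bin"}, inputs)

# A fixed environment: both machines compute the same key and can share results.
key_fixed_1 = action_key(argv, {"PATH": "/bin:/usr/bin"}, inputs)
key_fixed_2 = action_key(argv, {"PATH": "/bin:/usr/bin"}, inputs)
```

In this model, two developers with different `PATH` values never produce the same key for an otherwise identical action, so neither can reuse the other's cached result.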
While most modern Bazel rules provide a way to pin the toolchain that
is used for the build, others default to simply picking up binaries from
`PATH`. Nothing prevents these binaries from varying from machine to
machine. The built-in C and C++ rules are notorious for this kind of
behavior. It is worth paying attention to what kind of rules you are using
and what their guarantees with respect to hermeticity and reproducibility are.
Workspace status is not a bug but a feature, yet it sits in a gray
area with respect to hermeticity. Activated by the
`--workspace_status_command` command line option, it allows users to call an
arbitrary program before the build begins and then use its output to stamp
build results (e.g. the status command could return a git commit hash or a
timestamp). If an action directly depends on the output of the status command,
typically stored as
`bazel-out/stable-status.txt`, then it will likely be invalidated and rebuilt more often than intended and
will not benefit much from remote caching. Extra care must be exercised to pick only the
relevant bits of information from
`stable-status.txt`, put them in a
separate file, and depend on that file only when truly necessary.
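The extraction step could be sketched as follows in Python. The sketch assumes the status file holds one `KEY value` pair per line, with the key separated from the value by the first space; the key names `STABLE_GIT_COMMIT` and `STABLE_BUILD_HOST` and the helper functions are hypothetical and only serve the illustration.

```python
def parse_status(text):
    """Parse 'KEY value' lines into a dict; the value is everything after the first space."""
    entries = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        entries[key] = value
    return entries


def select_keys(text, wanted):
    """Render only the wanted keys, sorted by name, in the same 'KEY value' format."""
    entries = parse_status(text)
    return "".join(
        f"{key} {entries[key]}\n" for key in sorted(wanted) if key in entries
    )


# Hypothetical status file content; real keys depend on your status command.
sample = "STABLE_BUILD_HOST ci-7\nSTABLE_GIT_COMMIT 0123abc\n"

# Only the commit hash ends up in the file that downstream actions depend on.
commit_only = select_keys(sample, {"STABLE_GIT_COMMIT"})
```

An action that depends on the reduced file is only invalidated when the selected value changes, rather than whenever any entry in the status file does.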
Unfortunately, there is always a new way to shoot yourself in the foot. Here are some examples:
- Because repository rules can execute arbitrary code outside of the sandbox, they can potentially break hermeticity. For example, `npm_install` may build native components with whichever compiler is in `PATH`, linking against whichever system libraries are found. Possible solutions include avoiding such dependencies, importing them in a reproducible way (for example through rules_nixpkgs), or carefully controlling the environment during fetch.
- Performing any non-deterministic actions. Creating archives (zip, tar, etc.) is a good example: the order of directory listings as well as timestamps are usually non-deterministic. The [reproducible-builds project](https://reproducible-builds.org/docs/archives/) is a great resource to learn about these issues and how to circumvent them.
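As an illustration of the archive case, here is a minimal Python sketch that builds a tar archive whose bytes depend only on member names and contents. It uses the standard `tarfile` module; the `deterministic_tar` helper is made up for this post, and it normalizes member order, timestamps, ownership, and permissions, which may be more or less than your project needs.

```python
import io
import tarfile


def deterministic_tar(files):
    """Build an uncompressed tar archive from a mapping of member names to bytes.

    Members are written in sorted order with fixed metadata, so the same
    mapping always produces the same archive bytes.
    """
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.USTAR_FORMAT) as archive:
        for name in sorted(files):  # fixed member order
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = 0  # fixed timestamp
            info.mode = 0o644  # fixed permissions
            info.uid = info.gid = 0  # fixed ownership
            info.uname = info.gname = ""
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


first = deterministic_tar({"b.txt": b"two", "a.txt": b"one"})
second = deterministic_tar({"a.txt": b"one", "b.txt": b"two"})
```

Both calls yield identical bytes regardless of insertion order. Note that a compression layer can reintroduce variation: the gzip header, for instance, has its own timestamp field.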
In general, detecting hermeticity issues is hard. The best strategy, it seems, is to attempt building your project in different environments and have Bazel write execlogs. An execlog is the ground truth about what is going on during the build. This page about troubleshooting remote cache hits describes how to make Bazel write execlogs. Let’s summarize it:
- Run `bazel clean` in order to force the subsequent build command to perform all necessary actions so that they end up in the execlog.
- Run `bazel build //your:target --execution_log_binary_file=/tmp/exec1.log`. This will produce a binary execution log.
- Re-run the build (preceding it with a `bazel clean` invocation) in a different environment, or even in the same environment if there is a reason to suspect that something could change between two runs in the same environment.
- Compare the execution logs following the instructions from this section. The procedure involves building a special parser that can convert binary execlogs produced by Bazel into text and then diffing the obtained text files with a tool like `diff`. Differences found in this way will reveal sources of inhermeticity.
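Once both logs have been converted to text, the comparison itself can be scripted. The following Python sketch uses the standard `difflib` module as a stand-in for `diff`; the `changed_lines` helper is made up for this post, and the sample log lines are placeholders that do not reflect the real execlog format.

```python
import difflib


def changed_lines(first_text, second_text):
    """Return only the lines that differ between two text logs.

    Lines prefixed with '- ' appear only in the first log,
    lines prefixed with '+ ' appear only in the second.
    """
    diff = difflib.ndiff(first_text.splitlines(), second_text.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Placeholder content standing in for two converted execution logs.
log_one = "action compile main.c\nenv PATH=/home/alice/bin\n"
log_two = "action compile main.c\nenv PATH=/usr/bin\n"

differences = changed_lines(log_one, log_two)
```

In practice you would read the two converted log files from disk and pass their contents to the function; each reported line points at an input or environment detail that differed between the two builds.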
With this approach, the main question becomes “how to choose the environments in which builds are performed so as to detect all hermeticity issues.” There is no answer to this question that works in all cases. Varying the host name and user name might catch some problems, while others may only reveal themselves in specific circumstances. If you already know what might be a source of problems, that knowledge can help with choosing the right build environments for these tests. From a pragmatic point of view, choosing environments that are already typically used to perform builds (remote workers, build agents, local developer machines) is probably a good first step.
It is likely true that virtually all users of Bazel wish their builds to be hermetic. This blog post summarizes most ways in which hermeticity can be violated and provides some suggestions about how to avoid the common pitfalls and debug hermeticity issues.