A Tour Around Buck2, Meta's New Build System

6 July 2023 — by Andreas Herrmann

Meta recently announced they have made Buck2 open-source. Buck2 is a from-scratch rewrite of Buck, a polyglot, monorepo build system that was developed and used at Meta (Facebook), and shares a few similarities with Bazel.

As you may know, the Scalable Builds Group at Tweag has a strong interest in such scalable build systems.

We were thrilled to have the opportunity to work with Meta on Buck2 to help make the tool useful and successful in the open-source use case.

We also have a strong focus on, and a lot of experience with, the Bazel build system. To connect the two, I will give an overview of how Buck2 differs from Bazel.

High-Level Differences

Buck2 is a fresh rewrite in Rust. The codebase is structured in a very modular fashion, such that most components are independently useful and can also be published as independent packages on crates.io. For example, Buck2 uses starlark-rust to parse and evaluate Starlark; the same library is also used for Buck2 IDE integration and could be useful for Bazel IDE integration as well. The terminal UI engine used by Buck2 is split out into a reusable component called superconsole. Even the incremental computation engine is designed as a standalone component.

Bazel, on the other hand, is a much older codebase, with the first public commit dating back to 2015 and internal development going back much further. Most of it is implemented in Java, and while some components can be used independently, e.g., the remote worker, most are generally not developed or published as standalone components.

User Interface

Both build tools expose a similar user interface, taking commands of the form:

$ bazel build //some:target
$ buck2 build //some:target

Buck2’s user interface feels more modern and snappy compared to Bazel’s. The progress reporting uses modern TUI features, and the error messages tend to indicate more clearly which target is failing to build and what exactly the error is. Starlark errors are also reported more clearly, with pretty error messages of the kind one would expect from a modern programming language.

That said, Buck2 is still a relatively young tool and hasn’t been battle-tested as much outside of Meta as Bazel has been outside of Google, so one may encounter the occasional bug.

Buck2 places build outputs in a sub-directory in the current working directory, instead of Bazel’s approach of convenience symlinks such as bazel-out into a separate output base. The output paths take the form buck-out/v2/gen/PROJECT/HASH/some/file, where HASH is the configuration hash. Those familiar with Bazel’s configuration transitions will recognize some similarities. Hopefully, baking the configuration hash into every output path from the start will avoid some of the difficulties that Bazel is encountering with configuration transitions.
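
To illustrate why a configuration hash in the path keeps outputs built under different configurations apart, here is a toy Python model. The hashing scheme (a truncated SHA-256 over sorted settings), the setting names, and both function names are assumptions made for this sketch; it does not reproduce how Buck2 actually computes its hash.

```python
import hashlib


def config_hash(config: dict[str, str]) -> str:
    """Hypothetical stand-in for a configuration hash: digest of the sorted settings."""
    canonical = ";".join(f"{key}={value}" for key, value in sorted(config.items()))
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]


def output_path(project: str, config: dict[str, str], rel_path: str) -> str:
    """Build a path of the form buck-out/v2/gen/PROJECT/HASH/some/file."""
    return f"buck-out/v2/gen/{project}/{config_hash(config)}/{rel_path}"


# The same file built for two configurations lands in two different directories.
linux = output_path("root", {"os": "linux", "cpu": "x86_64"}, "some/file")
macos = output_path("root", {"os": "macos", "cpu": "arm64"}, "some/file")
```

Because the hash is part of every path, the two builds of some/file can coexist on disk without overwriting each other.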

Target Definitions

Buck2 and Bazel are very similar in how build targets are defined. Both use dedicated build files that contain restricted Starlark code to define build targets in a declarative way. For example, the following defines a C++ library and binary in Buck2:

cxx_library(
    name = "lib",
    srcs = ["lib.cc"],
    headers = ["lib.h"],
)

cxx_binary(
    name = "main",
    srcs = ["main.cc"],
    deps = [":lib"],
)

In Bazel these files are called BUILD or BUILD.bazel. In Buck2, on the other hand, they are called BUCK by default. This name is configurable, however, and you may find TARGETS or BUILD used as well.

There are a few small, more cosmetic differences. For example, public visibility is declared via visibility = ["PUBLIC"] instead of Bazel’s special targets like //visibility:public. Runtime dependencies are declared on the resources attribute instead of Bazel’s data attribute. Also, use of : as a prefix on targets is enforced more consistently in Buck2.

External Dependencies

A big difference is the management of external dependencies and what Bazel calls workspaces.

A Bazel workspace is a directory tree that contains a WORKSPACE or WORKSPACE.bazel file at its root. A Bazel project has a main workspace, and it can invoke repository rules to import third-party dependencies, or perform configuration steps, and capture the result in an external workspace.

Repository rules are distinct from regular rules: They are executed in an earlier phase than regular build actions, are not isolated, can download artefacts from the internet, can inspect the host system, and can execute arbitrary programs.

Historically, the WORKSPACE file of the main workspace was used to invoke repository rules to manage dependencies and configuration. However, Bazel is introducing a new dependency management system called bzlmod. We’ll get back to it below.

A target in the external Bazel workspace “my-workspace” is referred to using the label syntax @my-workspace//some:target.

Buck2 doesn’t have repository rules and it doesn’t have workspaces. Instead, it has something called “cells”.

A cell is essentially a sub-directory of the project that can be seen as a more-or-less independent component. The label syntax around cells is similar to Bazel’s workspace label syntax, i.e., a target in the cell “my-cell” is referred to as my-cell//some:target.

Each cell can be configured individually with a dedicated .buckconfig file at its root. This file can configure options like the build file name (e.g. BUCK or TARGETS), but also where to find other cells that the current cell depends on.

A typical .buckconfig file might look like this:

[repositories]
root = .
prelude = prelude
toolchains = toolchains

[repository_aliases]
config = prelude
ovr_config = prelude
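
Since this file uses an INI-like syntax, the way cells and aliases map to directories can be sketched with Python's standard configparser. This is only an illustration of the lookup, not Buck2's own parser; the helper names load_cells and resolve are hypothetical, and labels without a cell prefix (such as //some:target) are not handled.

```python
import configparser

BUCKCONFIG = """\
[repositories]
root = .
prelude = prelude
toolchains = toolchains

[repository_aliases]
config = prelude
ovr_config = prelude
"""


def load_cells(text: str) -> dict[str, str]:
    """Map every cell name, including aliases, to its directory."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    cells = dict(parser["repositories"])
    # An alias points at another cell name and shares that cell's directory.
    for alias, target in parser["repository_aliases"].items():
        cells[alias] = cells[target]
    return cells


def resolve(label: str, cells: dict[str, str]) -> tuple[str, str, str]:
    """Split 'cell//package:name' into (cell directory, package, target name)."""
    cell, _, rest = label.partition("//")
    package, _, name = rest.partition(":")
    return cells[cell], package, name


cells = load_cells(BUCKCONFIG)
cxx = resolve("toolchains//:cxx", cells)
```

Here the aliases config and ovr_config both end up pointing at the prelude directory, and toolchains//:cxx resolves to the target cxx at the top of the toolchains directory.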

There is no WORKSPACE file in a Buck2 project, and there are no repository rules to import external dependencies. Instead, at least for now, cells are just sub-directories. Third-party dependencies, like the Starlark “prelude”, can be imported as Git submodules.

Bazel’s new dependency management system, bzlmod, introduces proper version resolution and transitive dependency management for native Bazel projects. It also introduces a more unified and principled interface for language-specific package manager dependencies, such as Maven packages.

Such dependencies currently don’t have unified support in Buck2 projects. For example, Buck2 itself is a Rust project and as such depends on third-party Rust packages downloaded from crates.io. These Rust crates are currently imported using Reindeer in a self-hosted build, i.e., when Buck2 builds itself. Reindeer is run before the Buck2 build, discovers the required dependencies based on the Cargo configuration, and generates Buck2 targets to import and build them.

Notably, contrary to Bazel rules, Buck2 allows regular build rules to download files from the internet provided that the download is reproducible, i.e., the expected content hash is pre-declared. This means that downloads of third-party code can occur in regular build rules, reducing the need for complex features like repository rules.
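
The reproducibility condition can be pictured with a small Python sketch: a download is only accepted if its content matches a digest that was declared ahead of time. The function name verify_download and the sample payload are hypothetical, and the choice of SHA-256 is an assumption of this sketch; it models the idea rather than Buck2's download actions.

```python
import hashlib


def verify_download(content: bytes, expected_sha256: str) -> bytes:
    """Accept downloaded bytes only if they match the pre-declared digest."""
    actual = hashlib.sha256(content).hexdigest()
    if actual != expected_sha256:
        raise ValueError(f"hash mismatch: expected {expected_sha256}, got {actual}")
    return content


# Stand-in for an archive fetched from the internet.
payload = b"third-party source archive"
# In a real project this digest would be written down in the build file.
declared = hashlib.sha256(payload).hexdigest()

verified = verify_download(payload, declared)
```

With the digest fixed up front, the result of the download is fully determined by the build description, which is what lets it behave like any other cacheable build step.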

Our experience with Bazel projects has shown that one of the largest sources of pain and frustration relating to third-party dependencies comes from C/C++ packages. The reason is simply that, contrary to most other language ecosystems, C/C++ has no established standard package management or distribution mechanism.

The build systems used by existing projects are diverse and often bespoke, and Bazel projects with C/C++ dependencies cannot rely on one standard way to import these dependencies. In simple cases, it may be straightforward to invoke the dependency’s build system inside Bazel or even just compile a few C sources with Bazel’s built-in rules. But, in difficult cases, it is necessary to perform bespoke integration work; migrating the dependency to be built with Bazel is often the easiest solution.

But, while they aren’t quite as standard as, say, Maven for Java, there are package managers for the C and C++ ecosystems. We believe that developer time can be spent in better ways than migrating third-party C/C++ projects’ build systems to Bazel, Buck2, and other build systems over and over again. So, we looked for a way to reuse the packaging work that was already done by the community. To that end, we started work on Buck2 integration for the Conan package manager. Detailed motivation can be found in the linked PR, but in brief, of the available options, it is the one with the largest package set, widest range of platform support, and easiest integration.

Platforms and Toolchains

Both Bazel and Buck2 have a notion of platforms and toolchains to support multiplatform and cross-platform projects and are able to automatically select the correct toolchain for each pair of execution and target platform.

In Bazel this is done by registering toolchains using the built-in Starlark API or command-line flags. The appropriate toolchain is then resolved out of all the registered instances.

The approach in Buck2 is a bit more explicit: Rules depend on a hard-coded top-level toolchain target, e.g., C++ rules depend on toolchains//:cxx. That target can then use select, similar to Bazel, to pick the appropriate toolchain.
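
As a rough mental model, the single top-level toolchain target behaves like a lookup table keyed on properties of the platform. The Python sketch below is a simplification under stated assumptions: the constraint names and toolchain names are made up, and it returns the first matching branch, which is a simpler matching policy than a real build system applies.

```python
def select(branches: dict[str, str], platform: set[str]) -> str:
    """Return the value of the first branch whose constraint the platform satisfies."""
    for constraint, value in branches.items():
        if constraint != "DEFAULT" and constraint in platform:
            return value
    if "DEFAULT" in branches:
        return branches["DEFAULT"]
    raise LookupError(f"no branch matches {sorted(platform)}")


# Hypothetical contents of the one C++ toolchain target that all C++ rules depend on.
CXX_TOOLCHAIN = {
    "os:linux": "clang-linux",
    "os:windows": "msvc",
    "DEFAULT": "system-cxx",
}

linux_choice = select(CXX_TOOLCHAIN, {"os:linux", "cpu:x86_64"})
other_choice = select(CXX_TOOLCHAIN, {"os:freebsd", "cpu:arm64"})
```

The point of the explicit approach is that the selection logic lives in one visible place instead of being spread across a set of registered toolchains.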

A project using Buck2 will typically contain its own toolchains cell that loads and calls rules and macros provided by the prelude to configure the toolchains it needs.

We contributed a C/C++ toolchain based on Zig, together with a simple toolchain configuration example and a more involved example supporting cross-compilation.

Note that the Zig Software Foundation is planning changes to the Zig compiler that will have an impact on the implementation of this Zig-based C/C++ toolchain. The plan is to move the zig cc and zig c++ commands into an independently maintained project, instead of them being a part of the Zig compiler itself.

Remote Cache and Execution

Bazel is well known for its caching and distributed build capabilities. Buck2 also offers these features and, while Meta uses a dedicated protocol internally, we contributed support for Bazel’s remote build execution protocol to the open-source version of Buck2.

It’s worth pointing out, however, that Buck2 relies on remote execution for build isolation to enforce hermetic builds. Purely local builds are currently not sandboxed in Buck2 and therefore hermeticity cannot be enforced.

We think there is room for improvement in this area, as isolated local builds would be very valuable for open-source projects, where not every contributor may have access to remote execution infrastructure.

Extensibility

Both Buck2 and Bazel support user-defined extensions implemented in Starlark. However, Bazel still has a number of built-in rules implemented in Java, and migrating them to Starlark is an ongoing effort. In Buck2, on the other hand, all rules are implemented in Starlark. This means that, for an extension author, there is no hidden functionality only accessible to built-in rules. Anything that the existing Buck2 rules can do, your rules could do too.

Going even further, Buck2 can be extended with the Buck2 Extension Language (BXL): Starlark-based scripts that can access the build graph and issue commands like queries or builds.

The Starlark extension API is very similar to Bazel’s but has a few key differences and improvements. For example, Buck2’s Starlark supports and uses Python-like type annotations and checks, with type errors being treated as errors.

The API also tends to be more composable: Types of rule attributes are not limited to a fixed set of possible types, but can be composed using combinators, e.g., attrs.list(attrs.source(), default = []) defines an attribute whose type is a list of source files. In Bazel this would typically be defined using attr.label_list(allow_files=True).

The build action API is quite different. For running commands there is only ctx.actions.run which takes a cmd_args object. This object captures all inputs and outputs directly, and is composable. This makes it much easier to factor out and abstract functionality and write modular code. For example, a toolchain can define its compiler as follows:

compiler = cmd_args([cxx_file] + default_flags)
compiler.hidden(extra_inputs)

This captures the cxx_file object for the compiler itself, and any additional input files stemming from default_flags. The member function .hidden is used to add extra_inputs, which are required to execute the command, but should not appear on the command-line.

Rules can then invoke this compiler by simply applying run to this with additional flags like so:

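# ctx.actions.run also expects a category naming the kind of action; "compile" is an illustrative value
ctx.actions.run([toolchain.compiler, some, more, flags], category = "compile")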

Rules do not need to manually forward file dependencies of the compiler itself. This reduces the likelihood of forgotten inputs in rules. These are particularly hard to debug if they only occur in certain edge cases, as that often means that they go unnoticed by the rule developers and only surface at the use site.
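
The following toy Python model mimics this behaviour to show why composition helps. The File and CmdArgs classes and all file paths are hypothetical stand-ins written for this sketch, not Buck2's implementation: the idea is only that an argument object carries its inputs along whenever it is embedded in a larger command.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class File:
    path: str


class CmdArgs:
    """Toy stand-in for a composable command line that tracks its inputs."""

    def __init__(self, parts):
        self.argv = []       # what appears on the command line
        self.inputs = set()  # every file needed to run the command
        for part in parts:
            self.add(part)

    def add(self, part):
        if isinstance(part, CmdArgs):
            # Nesting another command forwards its arguments and its inputs.
            self.argv.extend(part.argv)
            self.inputs |= part.inputs
        elif isinstance(part, File):
            self.argv.append(part.path)
            self.inputs.add(part.path)
        else:
            self.argv.append(str(part))

    def hidden(self, files):
        # Required to run the command, but not shown on the command line.
        for file in files:
            self.inputs.add(file.path)


# Toolchain side: the compiler plus inputs it silently needs.
compiler = CmdArgs([File("bin/clang++"), "-O2"])
compiler.hidden([File("sysroot/stdlib.h")])

# Rule side: reuse the compiler without knowing about its hidden inputs.
command = CmdArgs([compiler, "-c", File("main.cc")])
```

In this model, command.inputs contains the compiler binary, the hidden header, and main.cc, even though the rule author only mentioned the last one explicitly.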

Incremental Builds

Both Bazel and Buck2 aim to provide fast and correct builds and to avoid unnecessary work by supporting incremental builds, i.e., changes should only trigger rebuilds of artefacts affected by those changes.
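
A minimal Python sketch of that idea, assuming a dependency graph given as a plain dictionary: starting from the changed files, it collects everything that transitively depends on them. The function name affected and the example graph are made up for illustration.

```python
def affected(deps: dict[str, set[str]], changed: set[str]) -> set[str]:
    """Return the changed nodes plus every node that transitively depends on them."""
    dirty = set(changed)
    progress = True
    while progress:
        progress = False
        for node, node_deps in deps.items():
            if node not in dirty and node_deps & dirty:
                dirty.add(node)
                progress = True
    return dirty


# Each target maps to the files and targets it depends on.
graph = {
    "main": {"main.cc", "lib"},
    "lib": {"lib.cc", "lib.h"},
    "docs": {"README"},
}

to_rebuild = affected(graph, {"lib.h"})
```

Changing lib.h marks lib and main for rebuilding, while docs is left alone.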

However, both build systems also aim to scale to very large projects and support fast queries across the target dependency graph without requiring a build first.

These two goals can be at odds from time to time. Some programming languages, Haskell for example, require modules to be compiled in dependency order. If we want fine-grained incremental builds and a static build graph, then we have to map the module dependency graph into our build system. You can read here how we solved this problem for Bazel.

Buck2’s answer is dynamic dependencies. At the target level, Buck2 still enforces a static dependency graph. However, it weakens this restriction at the action level. Within a rule implementation, we are allowed to declare a set of build actions whose interdependencies will only be determined dynamically at build time. It provides a rather elegant API for this:

First, we register a build action whose output defines the fine-grained dependency graph. Think of something like gcc -M generating a .d file.
Then, we have to define all the possible inputs and outputs for the dynamic actions upfront.
Finally, we define a function that will be invoked during the build once the fine-grained dependency graph is available, and that function will register the actual build actions with the correct fine-grained dependency graph.
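
The three steps above can be simulated in plain Python. This is a sketch of the control flow only, not Buck2's API: the dependency file format, the module names, and the helpers parse_depfile and register_actions are all assumptions made for this example.

```python
from graphlib import TopologicalSorter


def parse_depfile(text: str) -> dict[str, list[str]]:
    """Step 1 result: read lines of the form 'Module: Dep1 Dep2' into a graph."""
    graph = {}
    for line in text.strip().splitlines():
        target, _, deps = line.partition(":")
        graph[target.strip()] = deps.split()
    return graph


def register_actions(graph: dict[str, list[str]], outputs: dict[str, str]) -> list[dict]:
    """Step 3: called once the graph is known; emits one compile action per module."""
    actions = []
    for module in TopologicalSorter(graph).static_order():
        actions.append({
            "output": outputs[module],
            # Each module needs its own source plus the outputs of its imports.
            "inputs": [module + ".hs"] + [outputs[dep] for dep in graph[module]],
        })
    return actions


# Step 2: every possible input and output is declared before the graph is known.
modules = ["A", "B", "C"]
outputs = {module: module + ".o" for module in modules}

# Stand-in for the file produced by the dependency-discovery action of step 1.
depfile = """
A: B C
B: C
C:
"""

actions = register_actions(parse_depfile(depfile), outputs)
```

The registered actions come out in the order C, B, A, and the action for A lists only B.o and C.o as extra inputs, so a change to an unrelated module would not invalidate it.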

Relatedly, Buck2 also offers incremental build actions. In these, we can pass the outputs of a previous instance of a build action as inputs into a new instance of a build action. This is useful if your compiler already has built-in support for incremental builds. However, this feature also raises the risk of non-hermetic and non-deterministic builds. So, it should be used with great care.

In Conclusion

Bazel has proven its worth on many projects and has established principles like reproducible and hermetic builds in industry practice. However, we think the last build system has not been written yet, and there is still plenty of room for innovation and new ideas.

Therefore, we very much welcome Buck2, which draws lessons from Bazel and Buck1, as well as from years of experience in academic research on build systems, and pushes the field forward yet again. For more information, you may also find the recent Reddit AMA by the Buck2 team and BuildBuddy’s Buck2 unboxing post very interesting.

It is good and healthy to have multiple competing tools on the market for users to be able to choose the best fit for their needs and for the competing projects to push each other and the industry forwards faster.

Behind the scenes

Andreas Herrmann

Andreas is a physicist turned software engineer. He leads the Bazel team, and maintains Tweag's open source Bazel rule sets and the capability package. He is passionate about functional programming, and hermetic and reproducible builds. He lives in Zurich and is active in the local Haskell community.

This article is licensed under a Creative Commons Attribution 4.0 International license.
