Skyscope is a new tool from the Scalable Build Systems team at Tweag. You can use it to visualise and explore Bazel build graphs in your web browser. More specifically, it lets you import a snapshot of a Skyframe graph (which might contain hundreds of thousands of nodes) and then focus on a particular area of interest. For example, this image was produced by running Skyscope on its own build graph:
The Bazel documentation gives a good overview of Skyframe that is worth reading if you’re interested, so we won’t go into the details here. Essentially though, Skyframe is the underlying model that Bazel uses to determine what actions it needs to perform when you run a command. The main Skyframe data structure is a dependency graph where the nodes are entities like ConfiguredTarget, FileState and ActionExecution.
Bazel provides high-level ways to access this information (e.g.
bazel cquery) and mostly those methods are sufficient. However, it can
sometimes be helpful to get at the raw and unfiltered Skyframe graph, even if
just to learn more about the internals of Bazel. This can be done with the
bazel dump --skyframe command. The problem is that this command produces a huge
volume of fairly opaque textual data.
It would be great to feed this data into Graphviz so it could be
examined visually (like you can do with
bazel query --output graph), but with
hundreds of thousands of nodes that’s a non-starter. Graphviz would likely run
until the heat death of the universe, and even if it did finish, the layout
would be too tangled to make any sense of.
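For a small target, the Graphviz route mentioned above is still workable. The following is a minimal sketch, assuming bazel and the Graphviz dot binary are on your PATH; the function name render_query_graph and the label in the usage comment are placeholders of ours, not part of Bazel or Skyscope.

```shell
# Render the dependency graph of a single target to an SVG file.
# Only practical when the graph is small enough for Graphviz to lay out.
render_query_graph() {
  local target="$1" out="$2"
  # `bazel query --output graph` prints the graph in DOT format;
  # `dot -Tsvg` lays it out and writes SVG to stdout.
  bazel query "deps(${target})" --output graph | dot -Tsvg > "$out"
}

# Usage, from the root of a Bazel workspace (hypothetical label):
#   render_query_graph //some/package:target deps.svg
```

On anything beyond a modest target the layout step quickly becomes the bottleneck, which is the problem described above.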
The idea behind Skyscope is to mitigate these problems by focusing on a more manageable subgraph. This is done by rendering only a small subset of all the nodes and edges in the graph and hiding the rest. You can enter a search pattern to find nodes to add to this subset. You can also click on various parts of the graph to add and remove nodes interactively.
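To make the idea of a focused subgraph concrete, here is a toy sketch. It is not how Skyscope is implemented. It assumes a hypothetical input format of one "source destination" edge per line, with node names that contain no whitespace, and keeps only the edges touching a node that matches a search pattern.

```shell
# Read "src dst" edge pairs on stdin and emit a DOT graph containing only
# the edges where either endpoint matches the given regular expression.
focus_subgraph() {
  local pattern="$1"
  echo 'digraph focus {'
  awk -v pat="$pattern" \
    '$1 ~ pat || $2 ~ pat { printf "  \"%s\" -> \"%s\"\n", $1, $2 }'
  echo '}'
}

# Usage with a hypothetical edge list file:
#   focus_subgraph 'skylib' < edges.txt | dot -Tsvg > focus.svg
```

Everything that does not match stays hidden, so the output remains small enough to lay out no matter how large the full edge list is.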
There are a few different ways of getting Skyscope, but all
jq packages to be installed on your system
as a prerequisite. In this first example we will be looking at the
Skylib repository. You can adapt the commands for your own Bazel
repository, but if you want to follow along with the example you should begin
by cloning and building Skylib:
git clone https://github.com/bazelbuild/bazel-skylib.git
cd bazel-skylib && bazel build //distribution:bazel-skylib
Then the easiest method1 of getting Skyscope is to append this snippet to
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "skyscope",
    sha256 = "5544313ec77adbc96856c4cdfb3dfc6b5409e05790860ae19c7d321fb585490b",
    urls = ["https://github.com/tweag/skyscope/releases/download/v0.2.7/skyscope.zip"]
)

load("@skyscope//:repository.bzl", "configure_skyscope")

configure_skyscope()
Having done that, you can run the following commands to import a snapshot of the build graph:
# Clear the graph by stopping the Bazel server.
bazel shutdown

# Populate the graph again.
bazel build //distribution:bazel-skylib

# Import a snapshot of the graph.
bazel run -- @skyscope//:import
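If you expect to repeat these steps for different targets, they can be wrapped in a small helper. This is only a sketch: the function name skyscope_snapshot is ours, and it assumes Skyscope has already been added to the workspace as shown earlier.

```shell
# Take a fresh Skyframe snapshot for one target and import it into Skyscope.
# Any arguments after the target are passed through to the import command.
skyscope_snapshot() {
  local target="$1"
  shift
  bazel shutdown || return         # Clear the graph by stopping the server.
  bazel build "$target" || return  # Populate the graph with just this target.
  bazel run -- @skyscope//:import "$@"
}

# Usage:
#   skyscope_snapshot //distribution:bazel-skylib
```

Stopping the server first keeps nodes from earlier builds out of the snapshot.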
The fewer dependencies the target you build has, the faster the import process will be. Once it is complete you will be prompted to open a link in your browser to view the graph. In the main search box, enter rule labels or filenames to find and display as nodes:
You can freely explore adjacent nodes by clicking them. Please consult the documentation for detailed usage instructions (if nothing else, it’s worth reading the section on exploring the graph to familiarise yourself with the basic interface).
We will now take a look at a couple of common use cases across a few different repositories. The procedure to install Skyscope in these is the same as above.
1. Find a dependency path from one target to another
Much like the
somepath function, you can use Skyscope to
find dependency paths between targets. The example below is for the TensorFlow
repository and the import was prepared as follows:
# Clear the Skyframe graph.
bazel shutdown

# Populate the graph with ConfiguredTarget nodes for tf-reduce and its dependencies.
bazel cquery //tensorflow/compiler/mlir:tf-reduce

# Import a snapshot of the graph with extra context for targets under the //tensorflow package.
bazel run -- @skyscope//:import --query=//tensorflow/... --no-aquery
The import process will take a few minutes to index all the paths. Once it is
complete, you can search for the
kernels.cc targets and make
Click on Open to make the nodes on the dependency path visible. If you hover over a node, extra context will be displayed in a tooltip:
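For comparison, the somepath query function mentioned at the start of this example can be called from the command line. The sketch below wraps it in a helper whose name, some_dependency_path, is ours, and the labels in the usage comment are hypothetical.

```shell
# Print one dependency path from one target to another using Bazel's
# built-in somepath query function.
some_dependency_path() {
  local from="$1" to="$2"
  bazel query "somepath(${from}, ${to})"
}

# Usage with hypothetical labels:
#   some_dependency_path //app:server //lib:core
```

somepath prints a single, arbitrarily chosen path as a list of labels, whereas in Skyscope you see the path in its graph context and can keep exploring from it.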
2. Discover the actions needed to transform a source file into a binary
The previous example looked only at
ConfiguredTarget nodes (populated by
bazel cquery) so let’s now take a look at a graph that has some
ActionExecution nodes. In this example we will import from the Bazelisk
# Clear the Skyframe graph.
bazel shutdown

# Populate the graph with the actions needed to build bazelisk.
bazel build //:bazelisk

# Import a snapshot of the graph (with default extra context).
bazel run -- @skyscope//:import
Begin by searching for and displaying the
TargetCompletion node for
//:bazelisk. Then make the
ConfiguredTarget nodes for
bazelisk.go visible too:
If you open the dependency paths you can see the
actions that produce the
bazelisk binary. Also note that these actions have a
dependency on the
go_binary rule that created them:
The biggest limitation at the moment is probably the import speed. For small workspaces and rules with relatively few dependencies it only takes a couple of seconds, but larger workspaces can take several minutes to import and index. As yet no attention has been given to performance optimisation, so there are likely some easy gains to be had here.
A related issue is that the whole graph is loaded into memory before being written to SQLite. That was a convenient way to prototype, but now it would be good to switch to a streaming-based approach where the Skyframe data is parsed and written to the database in chunks. This would drastically reduce peak memory usage and make it possible to import from big repositories on systems with less RAM. For example, on my laptop, trying to import Envoy causes the process to get OOM killed2 at around the 10 GB mark.
As for new features, the issue tracker has a few ideas but I’d like to see if this is something people find useful before deciding what to work on next, so please give it a try if you can. Feedback and suggestions are very much welcome!
- While convenient, there are some drawbacks to this method. The Skyframe graph
will contain nodes related to running Skyscope itself, and depending on your
bazel run can cause parts of the graph to become invalidated. It should work for the examples here but if you encounter this issue, you can manually install a release instead.
- Terminated by the Out-of-Memory manager for using too much memory.