It’s a good idea to run unused_deps regularly to keep things tidy. For example, the Bazel project does not run it automatically and has nearly 1000 unneeded deps (oops). You might want to add a git hook or a CI step that runs it for every change.

I’ve heard a lot of users say they want a more comprehensive introduction to writing build extensions for Bazel (a.k.a. Skylark). One of my friends has been working on Google Classroom, which just launched, so I created a build extensions crash course there. I haven’t written much content yet (and I don’t understand exactly how Classroom works), but we can learn together! If you’re interested:

It’s free, and you can get in on the ground floor of… whatever this is. If you’ve enjoyed (or found useful) my posts on Skylark, this should be a more serious and well-structured look at the subject.

I’ll try to release content at least once a week until we’ve gotten through all the material that seems sensible, I get bored, or people stop “attending.”

In a toy example, this works fine. However, in a real project, we might have tens of thousands of .o files across the build tree. Every cc_library would create a new array and copy every .o file into it, only to move up the tree and make another copy. It’s very inefficient.

Enter nested sets. Basically, you can create a set that holds pointers to other sets, and it isn’t flattened until it’s needed. Thus, you can build up a set of dependencies using minimal memory.

To use nested sets instead of arrays in the previous example, replace the lists in the code with depset and |:
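A sketch of the change, using the Skylark API of the era (the `|` union operator was later superseded by `depset(transitive = ...)`); the rule shape and the `dotos` provider name are my own stand-ins for the toy example:

```
def _impl(ctx):
    # Start with this rule's own files (whatever objects the toy
    # example collected)...
    dotos = depset(ctx.files.srcs)
    # ...and union in each dep's set by reference, without copying.
    for dep in ctx.attr.deps:
        dotos = dotos | dep.dotos
    return struct(dotos = dotos)
```

Each union creates a small node that points at the two child sets, so building up the transitive closure stays cheap no matter how deep the tree is.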

Aspects are a feature of Bazel that are basically like fan-fic, if build rules were stories: aspects let you add features that require intimate knowledge of the build graph, but that the rule maintainer would never want to add.

For example, let’s say we want to be able to generate Makefiles from a Bazel project’s C++ targets. Bazel isn’t going to add support for this to the built-in C++ rules. However, lots of projects might want to support a couple of build systems, so it would be nice to be able to automatically generate build files for Make. So let’s say we have a simple Bazel C++ project with a couple of rules in the BUILD file:
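The BUILD file isn’t reproduced here, but given the targets used later in the walkthrough (//:bin depending on //:lib, built from bin.cc and lib.cc), it would look something like:

```
cc_library(
    name = "lib",
    srcs = ["lib.cc"],
    hdrs = ["lib.h"],  # header name is a guess
)

cc_binary(
    name = "bin",
    srcs = ["bin.cc"],
    deps = [":lib"],
)
```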

We can use aspects to piggyback on Bazel’s C++ rules and generate new outputs (Makefiles) from them. It’ll take each Bazel C++ rule and generate a .o-file make target for it. For the cc_binary, it’ll link all of the .o files together. Basically, we’ll end up with a Makefile containing:
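Reconstructed from the description above (one .o target per rule, plus a link step for the cc_binary; header dependencies and compiler flags deliberately omitted), the generated Makefile would be roughly:

```
bin: bin.o lib.o
	g++ -o bin bin.o lib.o

bin.o: bin.cc
	g++ -c -o bin.o bin.cc

lib.o: lib.cc
	g++ -c -o lib.o lib.cc
```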

(If you have any suggestions about how to make this better, please let me know in the comments; I’m definitely not an expert on Makefiles and just wanted something super-simple.) I’m assuming a basic knowledge of Bazel and Skylark (e.g., you’ve written a Skylark macro before).

This means that the aspect will follow the “deps” attribute to traverse the build graph. We’ll invoke it on //:bin, and it’ll follow //:bin’s dep to //:lib. The aspect’s implementation will be run on both of these targets.
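`attr_aspects` is what makes that happen. A minimal declaration (the aspect name here is my own) looks like:

```
makefile = aspect(
    implementation = _impl,
    # Propagate this aspect along the "deps" attribute of each rule it visits.
    attr_aspects = ["deps"],
)
```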

Add the _impl function. We’ll start by just generating a hard-coded Makefile:
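A sketch using the Skylark APIs of that era (`ctx.new_file`/`ctx.file_action`; newer Bazel versions use `ctx.actions` and `OutputGroupInfo` instead):

```
def _impl(target, ctx):
    outfile = ctx.new_file("Makefile")
    ctx.file_action(
        output = outfile,
        content = "bin: bin.cc lib.cc\n\tg++ -o bin bin.cc lib.cc\n",
    )
    # Expose the file through an output group so the build can request it.
    return struct(output_groups = {"makefiles": depset([outfile])})
```

Depending on your Bazel version, you may need to request the output group explicitly (e.g., with an `--output_groups` flag) for the file to be built.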

Bazel doesn’t print anything, but it has generated bazel-bin/Makefile. Let’s create a symlink to it in our main directory, since we’ll keep regenerating it and trying it out:

$ ln -s bazel-bin/Makefile Makefile
$ make
g++ -o bin bin.cc lib.cc
$

The Makefile works, but is totally hard-coded. To make it more dynamic, first we’ll make the aspect generate a .o target for each Bazel rule. For this, we need to look at the sources and propagate that info up.

Basically: run g++ on all of the srcs for a target. You can add a print(cmd) to see what cmd ends up looking like. (Note: We should probably do something with headers and include paths here, too, but I’m trying to keep things simple and it isn’t necessary for this example.)
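The command construction can be sketched like this; in an aspect, the visited rule’s attributes are available through `ctx.rule.attr`:

```
    # Gather source file paths from the visited rule's srcs.
    src_paths = []
    for src in ctx.rule.attr.srcs:
        src_paths += [f.path for f in src.files]
    srcs_str = " ".join(src_paths)
    # One .o make target per Bazel rule (header/include handling omitted).
    cmd = "%s.o: %s\n\tg++ -c -o %s.o %s\n" % (
        target.label.name, srcs_str, target.label.name, srcs_str)
    print(cmd)  # handy while debugging
```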

Now we want to collect this command, plus all of the commands we’ve gotten from any dependencies (since this aspect will have already run on them):
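Since the aspect has already run on every dep, each dep carries the commands it accumulated. A sketch (the provider name `transitive_cmds` matches its use later in the walkthrough):

```
    transitive_cmds = []
    for dep in ctx.rule.attr.deps:
        # Each dep already ran through this aspect and exposes its commands.
        transitive_cmds += dep.transitive_cmds
    transitive_cmds.append(cmd)
    return struct(transitive_cmds = transitive_cmds)
```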

Now we need the last “bin” target to be automatically generated, so we need to keep track of all the intermediate .o files we’re going to link together. To do this, we’ll add a “dotos” list that this aspect propagates up the deps.

This is similar to the transitive_cmds list, so add a couple lines to our deps traversal function:
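A sketch of the extended traversal, propagating both lists side by side:

```
    dotos = []
    transitive_cmds = []
    for dep in ctx.rule.attr.deps:
        dotos += dep.dotos                      # .o files from below
        transitive_cmds += dep.transitive_cmds  # their make targets
    dotos.append("%s.o" % target.label.name)    # plus this rule's own .o
```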

Documentation about aspects can be found on bazel.io. Like Skylark rules, I find aspects a little difficult to read because they are inherently recursive functions, but it helps to break it down (and use lots of prints).

AutoValue is a really handy library to eliminate boilerplate in your Java code. Basically, if you have a “plain old Java object” with some fields, there are all sorts of things you need to do to make it work “good,” e.g., implement equals and hashCode so it can be used in collections, make all of its fields final (and, ideally, immutable), make the fields private and accessed through getters, etc. AutoValue generates all of that for you.

To get AutoValue to work with Bazel, I ended up modifying cushon’s example. There were a couple of things I didn’t like about it, mainly that the AutoValue library lived in my project’s BUILD files, so I set it up to be defined in the AutoValue project instead. I figured I’d share what I came up with.

In your WORKSPACE file, add a new_http_archive for the AutoValue jar. I’m using the one in Maven, but not using maven_jar because I want to override the BUILD file to provide AutoValue as both a Java library and a Java plugin:
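The WORKSPACE entry itself isn’t shown here, but the BUILD file it supplies needs to expose both a library and a plugin. A sketch (target names are my own; the processor class name is AutoValue’s real one):

```
# BUILD file supplied for the AutoValue external repository.
java_import(
    name = "autovalue-jar",
    jars = ["auto-value.jar"],  # whatever the downloaded jar is named
)

java_plugin(
    name = "autovalue-plugin",
    processor_class = "com.google.auto.value.processor.AutoValueProcessor",
    deps = [":autovalue-jar"],
)

# Depending on this one target gets you both the annotations and the processor.
java_library(
    name = "autovalue",
    exported_plugins = [":autovalue-plugin"],
    exports = [":autovalue-jar"],
)
```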

If you ran bazel build --package_path=/home/user/gitroot/my-project:/usr/local/other-proj //..., Bazel would “see” this directory tree:

WORKSPACE
BUILD
foo/
bar/
  BUILD
If a package is defined in multiple package paths, the first path “wins” (e.g., the top-level package containing foo/ above).

Finally, source code for external repositories is tucked away in Bazel’s output base. You can see it by running:

$ ls $(bazel info output_base)/external

Execution root

Once Bazel figures out what packages a build is going to use, it creates a symlink tree called the execution root, where the build actually happens.* It basically traverses all of the packages it found and comes up with the most efficient way it can to symlink them together. For example, in the directory tree above, you’d end up with:

Sandbox

Here’s the * from the execution root section above! If you’re on Linux (and, hopefully, OS X soon), your build actually takes place in a sandbox based on the execution root. All of the files your build needs (and hopefully none it doesn’t) are mounted into their own namespace and executed in a hermetically sealed environment: no network or filesystem access (other than what you specified).

You can see what’s being mounted and where by running Bazel with a couple extra flags:

$ bazel build --sandbox_debug --verbose_failures //...

Derived roots

We’re working on improving how configurable Bazel is, but for years the main configuration options have been the platform you’re building for and what your compiler options were. If you run an optimized build, then a debug build, and then an optimized build again, you’d like your results to be cached from the first run, not overwritten each time. In a somewhat questionable design move, Bazel uses a special set of output directories that are named based on the configuration. So if you build an optimized binary, it’ll create it under execroot/my-project/bazel-out/local-opt/bin/my-binary. If you then build it as a debug binary, it’ll put it under execroot/my-project/bazel-out/local-dbg/bin/my-binary. Then, if you build it optimized again, it’ll be able to switch back to using the local-opt directory. (However, Bazel uses symlinks out the wazoo, so I don’t know why it doesn’t use symlinks to track which configuration is being used. Seems like it’d be a lot easier to have the outputs directly under execroot/my-project.)

Also, Bazel distinguishes between files created by genrule and… everything else. Almost all output is under bazel-out/config/bin, but genrules are under bazel-out/config/genfiles.

(I’m kind of bitter about these files, I’ve been working on a change for what feels like months because of this stupid directory structure.)

Note that Bazel symlinks execroot/ws/bazel-out/config/bin to bazel-bin and execroot/ws/bazel-out/config/genfiles to bazel-genfiles. These “convenience symlinks” are the outputs shown at the end of your build.

Runfiles

Suppose you build a binary in your favorite language. When you run that binary, it tries to load a file during runtime. Bazel encourages you to declare these files as runfiles, runtime dependencies of your build. If they change, your binary won’t be recompiled (because they’re runtime, not compile-time dependencies) but they will cause tests to be re-run.

Bazel creates a directory for these files as a sibling to your binary: if you build //foo:my-binary, the runfiles will be under bazel-bin/foo/my-binary.runfiles. You can explore the directory, or see a list of all of the runfiles in the runfiles manifest, also a sibling of the binary:

$ cat bazel-bin/foo/my-binary.runfiles_manifest

Note that a binary can load files from anywhere on the filesystem (they’re binaries, after all). We just recommend using runfiles so that you can keep them together and express them as build dependencies.
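Runfiles are declared through the `data` attribute; a minimal sketch (target and file names are my own):

```
cc_binary(
    name = "my-binary",
    srcs = ["main.cc"],
    # config.txt ends up under bazel-bin/foo/my-binary.runfiles/...
    data = ["config.txt"],
)
```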

If you’ve ever used the AppEngine rules, you know the pain of waiting for all 200 stupid megabytes of API to be downloaded. The pain is doubled because I already have a copy of these rules on my workstation.

To use the local rules, all I have to do is override the @com_google_appengine_java repository in my WORKSPACE file, like so:
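For example (the path and build file name are illustrative; `new_local_repository` needs a build file because the unzipped SDK doesn’t ship its own):

```
# In WORKSPACE: shadow the downloaded SDK with a local copy.
new_local_repository(
    name = "com_google_appengine_java",
    path = "/home/user/appengine-java-sdk",
    build_file = "appengine.BUILD",
)
```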

However, this is still imperfect: I don’t really want to maintain changes that basically amount to a performance optimization in my local client.

By using environment variables in the appengine_repository rule, we can do even better. I’m going to create a new rule that checks if the APPENGINE_SDK_PATH environment variable is set. If it is, it will use a local_repository to pull in AppEngine, otherwise it will fall back on downloading the .zip.

So, to start, let’s take a look at the existing rule that pulls in the AppEngine SDK. As of this writing, it looks like this:
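I won’t reproduce the upstream definition here, but the environment-variable version can be sketched like this using the repository-rule API (the build-file label and rule names are my own):

```
def _appengine_sdk_impl(repository_ctx):
    sdk_path = repository_ctx.os.environ.get("APPENGINE_SDK_PATH")
    if sdk_path:
        # Reuse the local SDK instead of downloading the zip again.
        repository_ctx.symlink(sdk_path, "sdk")
    else:
        repository_ctx.download_and_extract(
            url = "...",  # the upstream SDK zip URL goes here
            output = "sdk",
        )
    # Either way, expose the SDK through the same BUILD file.
    repository_ctx.symlink(Label("//:appengine.BUILD"), "BUILD")

appengine_sdk_repository = repository_rule(
    implementation = _appengine_sdk_impl,
)
```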

You can’t actually set APPENGINE_SDK_PATH to where Bazel downloaded the SDK the first time around ($(bazel info output_base)/external/com_google_appengine_java), which is suuuuper tempting to do. If you do, Bazel will delete the downloaded copy (because you changed the repository def) and then symlink the empty directory to itself. Never what you want.

Bazel caches the environment variable, so if you change your mind you have to run bazel clean to use a different APPENGINE_SDK_PATH. I think this is a bug, although there’s some debate about that.

I am super excited that pmbethe09 and laurentlb just put in a bunch of extra work to open source Buildifier. Buildifier is a great tool we use at Google to format BUILD files. It automatically organizes attributes, corrects indentation, and generally makes them more readable and excellent.

Rules in Bazel often need information from their dependencies. My previous post touched on a special case of this: figuring out what a dependency’s runfiles are. However, Skylark is actually capable of passing arbitrary information between rules using a system known as providers.

Suppose we have a rule, analyze_flavors, that figures out what all of the flavors are in a dish. Our build file looks like:
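The BUILD file isn’t shown here; a hypothetical version matching the rule name above:

```
# Hypothetical targets for the analyze_flavors rule; the .bzl path is a guess.
load("//:flavors.bzl", "analyze_flavors")

analyze_flavors(
    name = "sushi",
    deps = [":rice", ":fish"],
)

analyze_flavors(name = "rice")

analyze_flavors(name = "fish")
```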