
On the Performance of ReadDirectoryChangesW

Brandon Ehle is absolutely right when he says ReadDirectoryChangesW is not free. Many modern applications monitor directory trees for changes, and on a fast disk such as my Intel SSD, “svn up” can create and destroy thousands of zero-byte lockfiles in seconds. When I have Visual Studio and ibb_server running, I have watched them each peg a core just handling the resulting filesystem events.

While filesystem monitors consume a great deal of CPU in aggregate, the important metric to consider is latency. That is, it doesn’t actually matter that my CPU is pegged while I “svn up”: the factor limiting my efficiency is the sheer number of system calls to the OS. Cores are cheap and getting cheaper, but latency in computing systems isn’t improving. Put another way, each ReadDirectoryChangesW consumer adds a small constant amount of work to what is already an O(files) process: the root problem is that svn up needs to create and destroy thousands of lockfiles. (I assume a small, constant set of tools running at any given time; assigning one core per tool seems realistic enough.)

Since common-case latency is the primary constraint to optimize, converting O(n) to O(1) is a huge win; converting O(n) to O(kn) for some small constant k isn’t nearly as important.

I might also argue that O(n) use cases should be eliminated if possible. “svn up” already knows the changeset from the server – if it knew the changeset on the client without locking and scanning every directory, it would become O(changeset + localchanges).

What if you stop ibb_server and rebuild?

Then the next build is O(n) like SCons or Make, but there’s no reason it needs to be slower. Build signatures should definitely be cached in a local database so subsequent builds merely require scanning all files in the DAG. Recompiling from scratch upon every reboot would be a dealbreaker.

I previously argued that any tool whose running time is proportional to the number of files in a project scales quadratically with time. Bluem00 on Hacker News pointed me towards Tup, a scalable build system with goals similar to ibb.

Mike Shal, Tup’s author, wrote Build System Rules and Algorithms, formalizing the algorithmic deficiencies with existing build systems and describing Tup’s implementation, a significant improvement over the status quo. I would like to document my analysis of Tup and whether I think it replaces ibb.

Before we get started, I’d like to thank Mike Shal for being receptive to my comments. I sent him a draft of my analysis and his responses were thoughtful and complete. With his permission, I have folded his thoughts into the discussion below.

Is Tup suitable as a general-purpose build system? Will it replace SCons or Jam or Make anytime soon? Should I continue working on ibb?

Remember our criteria for a scalable build system, one that enables test-driven development at arbitrary project sizes:

O(1) no-op builds

O(changes) incremental builds

Accessible dependency DAG atop which a variety of tools can be built

Without further ado, my thoughts on Tup follow:

Syntax

Tup defines its own declarative syntax, similar to Make or Jam. At first glance, the Tup syntax looks semantically equivalent to Make. From the examples:

: hello.c |> gcc hello.c -o hello |> hello

Read the dependency graph from left to right: hello.c is compiled by gcc into a hello executable. Tup supports variable substitution and limited flow control.

Build systems are inherently declarative, but I think Tup’s syntax has two flaws:

Inventing a new syntax unnecessarily slows adoption: by implementing GNU Make’s syntax, Tup would be a huge drop-in improvement to existing build systems.

Even though specifying dependency graphs is naturally declarative, I think a declarative syntax is a mistake. Build systems are a first-class component of your software and your team’s workflow. You should be able to develop them in a well-known, high-level language such as Python or Ruby, especially since those languages come with rich libraries. As an example, SCons gets this right: it’s trivial for me to write CPU autodetection logic for parallel builds in a build script if that makes sense. Or I can extend SCons’s Node system to download source files from the web.
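To make that concrete, the CPU autodetection mentioned above is only a few lines of Python in a build script. A minimal sketch (detect_jobs is my own hypothetical helper, not an SCons API; the real SConstruct would pass the result to SCons’s SetOption):

```python
# Sketch: CPU autodetection for parallel builds, as you might write
# at the top of an SConstruct. detect_jobs is a hypothetical helper.
import multiprocessing

def detect_jobs():
    """Return a sensible parallel job count: one per core, minimum 1."""
    try:
        return max(1, multiprocessing.cpu_count())
    except NotImplementedError:
        return 1

# In a real SConstruct you would then write:
#   SetOption('num_jobs', detect_jobs())
```

This kind of logic is trivial in a general-purpose language and nearly impossible to express in a purely declarative syntax.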

Implementation Language

Tup is 15,000 lines of C. There’s no inherent problem with C, but I do think a community-supported project is more likely to thrive in a faster and safer language, such as Python or Ruby. Having worked with teams of engineers, I’ve found that most engineers can work safely in Python with hardly any spin-up time. I can’t say the same of C.

Git is an interesting case study: The core performance-sensitive data structures and algorithms are written in C, but many of its interesting features are written in Perl or sh, including git-stash, git-svn, and git-bisect. Unlike Git, I claim Python and Ruby are plenty efficient for the entirety of a scalable build system. Worst case, the dependency graph could live in C and everything else could stay in Python.

Scanning Implicit Dependencies

The Tup paper mentions offhand that it’s trivial to monitor a compiler’s file accesses and thus determine its true dependencies for generating a particular set of outputs. The existing implementation uses a LD_PRELOAD shim to monitor all file accesses attempted by, say, gcc, and treats those as canonical input files. Clever!
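The same trick works after the fact on a system-call trace. As an illustration (my own sketch, not Tup’s code), here is how one might recover a compiler’s true inputs from strace-style output, treating read-only opens as dependencies:

```python
# Sketch: recover a compiler's true input dependencies from an
# strace-style trace of its file accesses. Read-only opens are
# inputs; writes (the compiler's outputs) are excluded.
import re

OPEN_LINE = re.compile(r'open(?:at)?\([^"]*"([^"]+)"[^)]*O_RDONLY')

def dependencies(trace):
    """Return the sorted set of files the traced process read."""
    deps = set()
    for line in trace.splitlines():
        m = OPEN_LINE.search(line)
        if m:
            deps.add(m.group(1))
    return sorted(deps)
```

An LD_PRELOAD shim does the same thing in-process, without the overhead of tracing every system call.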

This is a great example of lateral, scrappy thinking. It has a couple huge advantages:

No implicit dependencies (such as C++ header file includes) need be specified — if all dependencies come from the command line or a file, Tup will know them all.

It’s easy to implement. Tup’s ldpreload.c is a mere 500 lines.

And a few disadvantages:

Any realistic build system must treat Windows as a first-class citizen. Perhaps, on Windows, Tup could use something like Detours. I’ll have to investigate that.

Intercepting system calls is reliable when the set of system calls is known and finite. However, there’s nothing stopping the OS vendor from adding new system calls that modify files.

Incremental linking / external PDB files: these Visual C++ features both read and write the same file in one compile command. SCons calls this a SideEffect: commands that share a SideEffect cannot parallelize. A build system that does not support incremental linking or external symbols would face resistance among Visual C++ users.

And some open questions:

I haven’t completely thought this through, but it may be important to support user-defined dependency scanners that run before command execution, enabling tools such as graph debugging.

I don’t have a realistic example, but imagine a compiler whose file accesses change spuriously from run to run; say, a compiler that checks its license file only every other day.

Stepping back, I think the core build system should not be responsible for dependency scanning. By focusing on dependency graph semantics and leaving dependency scanning up to individual tools (which may or may not use LD_PRELOAD or similar techniques), a build system can generalize to uses beyond compiling software, as I mentioned in my previous blog post.

Dependency Graph

Tup’s dependency DAG contains two types of nodes: Commands and Files. Files depend on Commands and Commands depend on other Files. I prefer Tup’s design over SCons’s DAG-edges-are-commands design for two reasons:

It simplifies the representation of multiple-input multiple-output commands.

Some commands, such as “run-test foo” or “search-regex some.*regex” depend on source files but produce no files as output. Since they fit naturally into the build DAG, commands are a first-class concept.
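A minimal sketch of this two-node-type design (my own toy types, not Tup’s actual data structures):

```python
# Files and Commands are both first-class DAG nodes; edges alternate
# File -> Command -> File. A Command with no output files still fits.
class Node:
    def __init__(self, name):
        self.name = name
        self.inputs = []  # nodes this node depends on

class File(Node):
    pass

class Command(Node):
    pass

def command(cmd_line, inputs, outputs):
    """Add a Command depending on `inputs` and producing `outputs`."""
    c = Command(cmd_line)
    c.inputs = list(inputs)
    for out in outputs:
        out.inputs.append(c)
    return c

hello_c = File("hello.c")
hello = File("hello")
cc = command("gcc hello.c -o hello", [hello_c], [hello])
run_test = command("run-test hello", [hello], [])  # no output files
```

Multiple-input, multiple-output commands fall out naturally, and so do output-free commands like test runs.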

Build Reliability

Tup, like SCons, places a huge emphasis on build reliability. This is key and I couldn’t agree more. In the half-decade I’ve used SCons, I can count the number of broken builds on one hand. Sadly, many software developers are used to typing “make clean” or clicking “full rebuild” when something is weird. What a huge source of waste! Developers should trust the build system as much as their compiler, and the build system should go out of its way to help engineers specify complete and accurate dependencies.

Reliable builds imply:

Changes are tracked by file contents, not timestamps.

The dependency graph, including implicit dependencies such as header files and build commands, is complete and accurate by default.

Compiler command lines are included in the DAG. Put another way: if the command used to build a file changes, the file must be rebuilt.
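These three requirements can be folded into a single content-based signature. A sketch of the idea (a hypothetical scheme, not Tup’s or SCons’s exact format): hash the command line together with the contents of every input, and rebuild a target whenever its signature changes.

```python
# Sketch: a content-based build signature. A changed command line or
# changed input contents forces a rebuild; a touched-but-identical
# file hashes to the same signature and triggers nothing.
import hashlib

def signature(command_line, input_contents):
    """Hash the command and each input's contents into one digest."""
    h = hashlib.sha1()
    h.update(command_line.encode())
    for contents in input_contents:
        h.update(hashlib.sha1(contents).digest())
    return h.hexdigest()
```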

Tup takes a strict functional approach and formalizes build state as a set of files and their contents. (I would argue build state also includes file metadata such as file names and timestamps, at least if the compiler uses such information.) If the build state does not change between invocations, then no work must be done.

Tup even takes build reliability one step further than SCons: If you rename a target file in the build script, Tup actually deletes the old built target before rebuilding the new one. Thus, you will never have stale target executables lying around in your build tree.

Nonetheless, there are situations where a project may choose to sacrifice absolute reliability for significant improvements in build speed, such as incremental linking discussed above.

Core vs. Community

A build system is a critical component of any software team’s development process. Since every team is different, it’s essential that a build system is flexible and extensible. SCons, for example, correctly chose to implement build scripts in a high-level language (Python) with a declarative API for specifying nodes and edges in the dependency graph.

However, I think SCons did not succeed at separating its core engine from its community. SCons tightly couples the underlying dependency graph with support for tools like Visual C++, gcc, and version control. The frozen and documented SCons API is fairly high-level while the (interesting) internals are treated as private APIs. It should be the opposite: a dependency graph is a narrow, stable, and general API. By simplifying and documenting the DAG API, SCons could enable broader uses, such as unit test execution.

Configuration

Like Tup’s author, I agree that build autoconfiguration (such as autoconf or SCons’s Configure support) should not live in the core build system. Autoconfiguration is simply an argument that build scripts should be specified in a general programming language and that the community should develop competing autoconfiguration systems. If a particular autoconfiguration system succeeds in the marketplace, then, by all means, ship it with your build tool. Either way, it shouldn’t have access to any internal APIs. Configuration mechanisms are highly environment-sensitive and are best maintained by the community anyway.

DAG post-process optimizations

Another argument for defining a build tool in a general-purpose language is to allow user-defined DAG optimizations and sort orders. I can think of two such use cases:

Visual C++ greatly improves compile times when multiple C++ files are specified on one command line. In fact, the benefit of batched builds can exceed the benefit of PCH. A DAG optimizer would search for a set of C++ source files that produce object files in the same directory and rewrite the individual command lines into one.

When rapidly iterating, it would be valuable for a build system or test runner to sort such that the most-recently-failed compile or test runs first. However, when hunting test interdependencies as part of a nightly build, you may want to shuffle test runs. On machines with many cores but slow disks, you want to schedule expensive links as soon as possible to mitigate the risk that multiple will execute concurrently and thrash against your disk.
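The first use case is easy to sketch (hypothetical helper names, and the cl.exe flags are approximate): group source files whose object files land in the same directory, then merge their individual commands into one batched invocation.

```python
# Sketch: a DAG post-pass that batches per-file compiles whose object
# files share an output directory into one command line.
import os
from collections import defaultdict

def batch_compiles(commands):
    """commands: list of (source, obj_path) pairs.
    Returns one merged command line per output directory."""
    groups = defaultdict(list)
    for source, obj in commands:
        groups[os.path.dirname(obj)].append(source)
    return {out_dir: "cl /c /Fo%s\\ %s" % (out_dir, " ".join(sources))
            for out_dir, sources in groups.items()}
```

A user-defined pass like this belongs in the build script, not the core engine, which is exactly why the script should be written in a general-purpose language.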

Conclusion

Tup is a significant improvement over the status quo, and I have personally confirmed its performance — it’s lightning fast and it scales to arbitrary project sizes.

However, without out-of-the-box Windows support, a mainstream general-purpose language, and a model for community contribution, I don’t see Tup rapidly gaining traction. With the changes I suggest, it could certainly replace Make and perhaps change the way we iterate on software entirely.

Let’s spend a minute and talk about the performance of a tool you use every day: your build system. For this purpose, they’re all the same, so let’s assume you use GNU Make. For a no-op build, what is Make’s running time? You know, that O(N) stuff.

If you said O(Files), you’re right. Every time you type make and press enter, it scans every file in the project, looking for out-of-date sources. That doesn’t sound so bad. Linear algorithms are good, right?
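That linear scan is easy to picture. A rough Python rendering of what a timestamp-based build does on every invocation, even when nothing has changed:

```python
# Every no-op build walks the whole tree and stats every file --
# O(files) system calls before the tool can conclude "nothing to do."
import os

def noop_build_cost(root):
    """Count the stat calls a Make-style no-op build performs."""
    stats = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            os.stat(os.path.join(dirpath, name))
            stats += 1
    return stats
```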

Oh, but projects grow over time, typically in proportion to the number of engineers (even if some are net-negative). Since projects grow, O(Files) is really O(Time): every month, that “cheap” linear scan gets a little slower. How slow can it get? A no-op build of Mozilla’s tree takes…

21.5 MINUTES! This is a no-op build! I changed nothing from the last build!

Mozilla’s build is an especially egregious example, but it’s not uncommon for nontrivial Make, SCons, or Jam no-op builds to exceed 15 seconds. Every time you run SCons, for example, it:

loads Python,

imports a bunch of Python code,

creates an entire dependency graph in memory,

scans all files for changes,

determines what to build,

builds it,

and throws all of those intermediate results away.

Am I the only person who thinks this situation is insane? We’re spending all of our time improving constant factors rather than addressing fundamentally quadratic build algorithms.

No-op builds should be O(1) and instantaneous, and most other builds should be O(WhateverChanged). What if I told you this is possible?

Introducing IBB: I/O-Bound Build

Dusty, one of the best engineers I know, once told me that C compilers are now fast enough that they’re I/O-bound, not CPU-bound as you’d expect. That observation inspired a build system dependency engine architecture:

A long-running server process that contains a complete dependency graph between files, including the commands required to update targets from sources.

An ultra-tiny C program that communicates build requests to the build server.

A background thread that watches for filesystem updates (via ReadDirectoryChangesW or libfam) and updates the dependency graph in memory.
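A toy model of that architecture (my own names; the real ibb is on github): a watcher thread drains filesystem events into a dirty set, and a build request only ever looks at the dirty set, so a no-op build does no per-file work at all.

```python
# Toy model of ibb's server: a background thread (standing in for
# ReadDirectoryChangesW / libfam) marks files dirty; build requests
# examine only the dirty set, so a no-op build is O(1).
import queue
import threading

class BuildServer:
    def __init__(self):
        self.dirty = set()
        self.events = queue.Queue()
        threading.Thread(target=self._watch, daemon=True).start()

    def _watch(self):
        while True:
            path = self.events.get()
            if path is not None:
                self.dirty.add(path)
            self.events.task_done()
            if path is None:
                return

    def build(self):
        """Rebuild only what changed; returns the paths rebuilt."""
        changed, self.dirty = sorted(self.dirty), set()
        return changed
```

The ultra-tiny client’s only job is to connect to this long-running server and ask for build(); the dependency graph stays resident in memory between builds.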

I have built a prototype of ibb’s architecture, and its code is available on my github.

In ibb, a no-op build takes constant time, no matter the size of the project. I tested with Noel Llopis’s build system stress test and no-op builds are merely 50 milliseconds. Take that, everyone else!

A Faster Grep

Let’s say you wanted to run a regular expression across your entire codebase. On Windows, you’ll spend all of your time in kernel calls, even if the files are in cache. However, remember that modern memory bandwidths are measured in GB/s…

ibb’s file search implementation can run a regular expression across 200 MB of PHP and JavaScript in 100 milliseconds. It copies file data into memory on the first run and rapidly scans across that memory on subsequent runs. It could just as easily mmap and rely on the OS’s disk cache to avoid kernel calls.
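The structure behind that search is simple. A sketch (hypothetical names, not ibb’s actual API): pay the I/O cost once to cache file contents, then every subsequent grep is a pure in-memory regex scan.

```python
# Sketch: in-memory code search. Load file contents once; later
# searches touch no files, only memory.
import re

class CodeSearch:
    def __init__(self):
        self.cache = {}  # path -> file contents

    def add(self, path, contents):
        self.cache[path] = contents

    def grep(self, pattern):
        """Return the paths whose cached contents match `pattern`."""
        rx = re.compile(pattern)
        return [path for path, text in self.cache.items()
                if rx.search(text)]
```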

Minimal Test Runner

Carl Masak’s inspirational addictive TDD harness is also possible with ibb. If you can determine which unit tests depend on which source files (via an __import__ hook in Python, say), you can run appropriate tests immediately after saving a source file.
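The import-hook half of that idea is small. A hedged sketch (imports_of is my name; a real runner would persist this mapping in the dependency graph): wrap builtins.__import__ while a test module loads and record every module it pulls in.

```python
# Sketch: record which modules a test imports by temporarily wrapping
# the builtin __import__ while the test runs.
import builtins

def imports_of(fn):
    """Run fn, recording every module name passed to __import__."""
    seen = set()
    real_import = builtins.__import__
    def tracing_import(name, *args, **kwargs):
        seen.add(name)
        return real_import(name, *args, **kwargs)
    builtins.__import__ = tracing_import
    try:
        fn()
    finally:
        builtins.__import__ = real_import
    return seen
```

Invert that mapping (source module → tests that imported it), and a file-save event from the watcher tells you exactly which tests to rerun.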

Version Control

I’ve watched both git and Subversion fall apart as a codebase grows. git is so faaast… until you import a 20 GB repository. Then it’s O(Files) just like everything else.

What This Means For You

Running tests, grepping code, and building binaries from source are equivalent problems. ibb’s architecture supports lightning-fast implementations of all of the above.

By converting an O(Time^2) build, test runner, or version control system to O(1), we eliminate hours of costly engineer time and encourage good habits like committing often and frequently running tests, no matter the age of the project. Frankly, instant feedback feels great.

It’s crazy to me that this architecture isn’t standard practice across development tools.

Take ibb’s concepts or code, integrate it with your engineering processes, and improve the quality of your team’s life today.