Sometimes benchmark results are skewed because code executed earlier
encounters different garbage collection overheads than code run later. bmbm
attempts to minimize this effect by running the tests twice: the first time as
a rehearsal, to get the runtime environment stable, and the second time
for real. GC.start is executed before the start of each of the real
timings; the cost of this is not included in the timings. In reality, though,
there’s only so much that bmbm can do, and the results are not guaranteed to
be isolated from garbage collection and other effects.
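
The two-pass idea described above can be sketched by hand. This is only an
illustration of the technique, not bmbm's actual implementation; the jobs
hash and its workloads are hypothetical and exist only for this example.

```ruby
require 'benchmark'

# Hypothetical workloads, chosen only for illustration.
jobs = {
  "sort"    => -> { Array.new(50_000) { rand }.sort },
  "reverse" => -> { Array.new(50_000) { rand }.reverse },
}

# Pass 1: rehearsal. Results are discarded; the run only warms up the runtime.
jobs.each_value(&:call)

# Pass 2: the real timings. GC.start runs outside Benchmark.measure,
# so the cost of the collection is not part of the reported time.
timings = jobs.map do |label, job|
  GC.start
  [label, Benchmark.measure(label) { job.call }]
end

timings.each { |label, tms| puts "#{label.ljust(8)} #{tms.real.round(6)}s" }
```

Even with the explicit GC.start, a collection can still be triggered inside
a measured block, which is why the timings are not guaranteed to be isolated.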

Because bmbm takes two passes through the tests, it can calculate the
required label width itself.
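
A minimal usage sketch follows. The array and the two sorting jobs are
assumptions made for illustration; note that no label width is passed to
bmbm.

```ruby
require 'benchmark'

# Hypothetical test data for the example.
array = (1..100_000).map { rand }

# Each report block is run twice: once in the rehearsal, once for real.
# The dup keeps the in-place sort from affecting the other job.
results = Benchmark.bmbm do |x|
  x.report("sort!") { array.dup.sort! }
  x.report("sort")  { array.dup.sort }
end
```

bmbm prints a rehearsal table followed by the real timings, and returns the
real timings as an array of Benchmark::Tms objects, one per report.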