The NoFib Benchmark Suite

Getting nofib

The NoFib suite is kept in a separate git repository (see Repositories), and it should be checked out at the top level of a GHC source tree, i.e. at the same level as the compiler and libraries directories. From your GHC tree, run:

./sync-all --nofib get

It will be pulled into a "nofib" subdirectory.

Benchmarking

Firstly, the nofib-analyse program requires that the html and regex-compat cabal packages be installed:

cabal install html regex-compat

Then run, for example:

nofib-analyse nofib-log-6.4.2 nofib-log-6.6

to generate a comparison of the runs captured in nofib-log-6.4.2
and nofib-log-6.6. When making comparisons, be careful to ensure
that the things that changed between the builds are only the things
that you wanted to change. There are lots of variables: machine,
GHC version, GCC version, C libraries, static vs. dynamic GMP library,
build options, run options, and probably lots more. To be on the safe
side, make both runs on the same unloaded machine.

To get measurements for simulated instruction counts, memory reads/writes, and "cache misses",
you'll need Cachegrind, which is part of Valgrind. You can run nofib under Valgrind like this:
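One plausible invocation is sketched below; the EXTRA_RUNTEST_OPTS variable and the -cachegrind option are assumptions here, and a single run per benchmark suffices because simulated instruction counts are deterministic:

```shell
# Hedged sketch: run each benchmark once under Cachegrind.
make EXTRA_RUNTEST_OPTS="-cachegrind" NoFibRuns=1
```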

Complete recipe
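A typical end-to-end comparison might look like the following sketch (the log-file names are examples, and the nofib-analyse path assumes it was built in place):

```shell
# From the nofib directory, with the baseline compiler in place:
make clean
make boot
make 2>&1 | tee nofib-log-before

# Rebuild GHC with your change, then repeat:
make clean
make boot
make 2>&1 | tee nofib-log-after

# Compare the two runs:
nofib-analyse/nofib-analyse nofib-log-before nofib-log-after
```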

The output of the nofib-analyse tool is quite readable, with two provisos:

Missing values in the output typically mean that the benchmark crashed and may indicate a problem with your optimisation

If a difference between the two modes is displayed as an absolute quantity instead of a percentage, it means that the difference was below the threshold at which the analyser considers it significant

If the comparison identifies any particularly bad benchmark results, you can run them individually by changing into their directory and running something like:

EXTRA_HC_OPTS="-fenable-cool-optimisation -ddump-simpl" make

You can add whatever dumping flags you need to see the output and understand what is going wrong.

Some tests may require packages that are not in the GHC tree. You can add these to the inplace package database (inplace/lib/package.conf.d) using cabal. For example, you can install parsec using the inplace compiler and inplace package database by running the following command from the top level of the GHC source tree:
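A sketch of such an invocation follows; the in-place paths are those described above, but the exact cabal flag names (e.g. --package-db vs. an older --package-conf) may vary with your cabal version:

```shell
# Hedged sketch, run from the top of the GHC source tree:
cabal install --with-compiler=`pwd`/inplace/bin/ghc-stage2 \
              --package-db=`pwd`/inplace/lib/package.conf.d \
              parsec
```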

To run the parallel benchmarks with some number of cores, you need to compile the parallel benchmarks with the -threaded option and also pass the -N RTS argument; for example, the following runs the parallel benchmarks with 4 cores (run this from the parallel directory):
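The steps above might be sketched as follows; the make variable names follow the nofib conventions seen elsewhere on this page and are assumptions here:

```shell
# Hedged sketch, run from nofib/parallel: compile with -threaded and
# run each benchmark on 4 cores via the RTS -N flag.
make clean
make boot
make EXTRA_HC_OPTS="-threaded" EXTRA_RUNTEST_OPTS="+RTS -N4 -RTS" 2>&1 | tee nofib-log-N4
```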

Many nofib programs have up to three test data sets. The mode variable tells the system which to use, thus:

make -k mode=slow
make -k mode=norm
make -k mode=fast

See mk/opts.mk. The default is mode=norm.

Other tips on measuring performance

It is often not necessary (or even useful) to do a full nofib run to assess performance changes. For example, you can tell whether compilation time has consistently increased by compiling a single file - a large one, and preferably not one of the perf tests
because those contain repeated patterns and aren't indicative of typical code. You can use nofib/spectral/simple/Main.hs for this purpose.
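A minimal sketch of such a measurement is shown below; the stage-2 compiler path is an assumption, and -fforce-recomp simply forces recompilation so repeated timings are comparable:

```shell
# Hedged sketch: time compilation of one large file with a given compiler build,
# run from the top of the GHC tree. Repeat with each build and compare.
time ./inplace/bin/ghc-stage2 -O2 -fforce-recomp nofib/spectral/simple/Main.hs
```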

Measuring backend performance

To get some insight into changes to optimisations in the backend, you can compile all the programs in codeGen/should_run both ways (unmodified GHC HEAD, and GHC HEAD plus the changes being tested), and then compare the sizes of the corresponding object files. Then investigate differences manually: this is a great way to see whether your optimisation is doing what you want it to do, and whether it has any unexpected consequences.

As an example, the sinking pass in the Cmm pipeline is the result of iterating this process many times until most of the cases of bad code generation had been squashed. When you're satisfied that the optimisation is doing something sensible on these small examples, move on to nofib and larger benchmarks.
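The size comparison can be done with a small loop like the sketch below; the obj-before/ and obj-after/ directory layout is an assumption for illustration, and the first few lines merely fabricate demo files so the loop is runnable as-is:

```shell
# Demo setup (hypothetical): fabricate two tiny "object files" per build.
# In real use, obj-before/ and obj-after/ would hold the .o files produced
# by the unmodified and patched compilers respectively.
mkdir -p obj-before obj-after
printf 'aaaa'   > obj-before/Main.o
printf 'aaaaaa' > obj-after/Main.o
printf 'bb'     > obj-before/Utils.o
printf 'bb'     > obj-after/Utils.o

# Report every object file whose size changed between the two builds.
for f in obj-before/*.o; do
  name=$(basename "$f")
  before=$(wc -c < "$f")
  after=$(wc -c < "obj-after/$name")
  if [ "$before" -ne "$after" ]; then
    echo "$name: $before -> $after bytes"
  fi
done
```

Files that show up in the report are the ones worth inspecting by hand (e.g. by disassembling or dumping Cmm for just those modules).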