Tuesday, July 15, 2014

Constant Constructors and Benchmark Scoring

The graph makes it much more obvious where things change than just looking at a table of numbers:

Loop Size   Classic   Nodes Traverse (single dispatch)   Visitor Traverses
10          131.6     122.7                              121.4
100         125.9     113.6                              120.15
1000        123.8     122.8                              120.65
10000       157.0     153.25                             120.45
100000      158.05    154.65                             121.05

Building the actual graph was a pain. If there is pain involved, it is quite likely that I am not going to make use of what I just described as very useful. So, before moving on to other learning territory, let's see if I can ease the pain.

To make the graph, I pasted the CSV into Google Docs spreadsheets, which is where the work (and pain) really started. The pain involved:

1. adding a column dividing the time by the number of loops

2. adding another sheet that averaged the numbers from different runs to create the table above

3. generating the graph with the correct labels and headers

Step #1 should be easy; I probably should have done that myself in the Dart benchmarking code. That I did not is one of the hazards of plowing ahead with a solution without thinking.

Step #2 was particularly hard because I had to manually edit the range of cells to be averaged and the cell that identified the number of loops. Copying and pasting always got the wrong column and/or row. In a 5×3 table, that is 15 edits of 3 values each. There is no way I am doing that again by hand.

Step #3 may very well need to remain in Google spreadsheets. I know of no Dart packages that can manipulate Google Sheets (really?), nor are there any Dart packages that I can find that can build pretty graphs in PNG form. I am more than happy to be corrected on either point!

Anyhow, let's see what I can do about steps #1 and #2. Ugh... strike that. Let's see what I can do about the “easy” #1. In theory, #1 ought to be easy—all I need to do is get the loopSize from my benchmark's main() entry point down to the benchmark class:
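The original snippet is not shown here, but a sketch of the idea might look like the following. The class and field names (`TreeBenchmark`, `loopSize`) are my guesses, not necessarily the post's; `BenchmarkBase` is from the benchmark_harness package:

```dart
import 'package:benchmark_harness/benchmark_harness.dart';

// Hypothetical benchmark class; the real one traverses the node graph.
class TreeBenchmark extends BenchmarkBase {
  final int loopSize;
  const TreeBenchmark(this.loopSize) : super('Tree');

  @override
  void run() {
    for (var i = 0; i < loopSize; i++) {
      // traverse the tree of nodes...
    }
  }
}

void main(List<String> args) {
  // loopSize arrives from the command line, so it is a runtime value:
  var loopSize = int.parse(args.first);
  // const TreeBenchmark(loopSize); // ERROR: loopSize is not a constant
  TreeBenchmark(loopSize).report();
}
```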

Ah, my old friend constant constructors. Wait, not “friend” in this case: the loopSize is coming in from the command line, so there is no way that it will ever be a compile-time constant. So there is no way to create a constant scorer.

So I do something bad instead: store the score in a global and make the loop responsible for printing the results:
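A hedged sketch of that workaround, again with guessed names, since the original code is not reproduced here:

```dart
import 'package:benchmark_harness/benchmark_harness.dart';

double score = 0.0; // the global -- this is the bad part

// Hypothetical benchmark class; the const constructor keeps the
// harness happy, so loopSize cannot live inside it.
class TreeBenchmark extends BenchmarkBase {
  const TreeBenchmark() : super('Tree');

  @override
  void run() {
    // traverse the tree of nodes...
  }
}

void main(List<String> args) {
  var loopSize = int.parse(args.first);
  score = const TreeBenchmark().measure();
  // The calling loop, not a score emitter, prints the per-loop result:
  print('$loopSize,${score / loopSize}');
}
```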

I can understand why benchmark harnesses rely on compile-time constant objects like the benchmarker and scorer. Const instances are canonicalized, so no matter how many times I create a “new” instance, no new memory is consumed. In other words, the benchmark harness itself will not be responsible for triggering garbage collection. At the same time, it sure makes reporting anything more than the simplest information a pain.
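That no-new-memory property is just Dart's canonicalization of compile-time constants; a tiny self-contained illustration (the `Scorer` class here is mine, for demonstration only):

```dart
class Scorer {
  final String name;
  const Scorer(this.name);
}

void main() {
  // Const instances with the same arguments are canonicalized: both
  // expressions refer to the very same object, so "creating" them
  // repeatedly allocates nothing new.
  var a = const Scorer('tree');
  var b = const Scorer('tree');
  print(identical(a, b)); // true
}
```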

Update: I do not need the scoring emitter after all. I can get away with returning the results of measure() directly from main() in my benchmark class:
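Since measure() in benchmark_harness returns the score as a double, the benchmark library's own main() can simply hand it back to whatever driver imports it. A sketch, with the same guessed class name as above:

```dart
import 'package:benchmark_harness/benchmark_harness.dart';

class TreeBenchmark extends BenchmarkBase {
  const TreeBenchmark() : super('Tree');

  @override
  void run() {
    // traverse the tree of nodes...
  }
}

// No emitter, no global: return the score directly and let the
// calling loop divide by loopSize and print.
double main() => const TreeBenchmark().measure();
```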