Benchmarks In Interpreted Mode

12 Dec 2014
tech

Work in progress.
Do tell me if you notice anything amiss, but do not be surprised.

A while ago, there was a
discussion on stackoverflow
where some guy named Evil Menace jokingly offered to disable JIT compilation
in order to get more consistent and predictable performance when running benchmarks.

Apparently, it might be not clear to everyone how exactly this is a bad idea.
Seeing as the stackoverflow guy has failed to come up with an example of that,
I have decided to do so on my own, even if it has been his plan all along.

Problem Statement

The original argument went thus:

I am not interested in how fast my code actually is. It doesn’t matter for me if
it is 10 ms, 20 ms or 100 ms fast. I just want to compare two different algorithms
and know which one is faster compared to the other.

My point is that such a comparison may well turn out to be meaningless, because the compiler
might easily turn the tables in such a comparison. So, we need to come up with two algorithms
A and B such that when benchmarked with -Xint, A is faster than B,
but when benchmarked normally, the performance evens out, or even turns the other way around.

To achieve this, we need the JIT to either greatly optimize B, but not A, or just severely
hurt the performance of A, but not B. The latter is not something that should happen
(although it might), so we will go with the former.

Such square root. Very perform. So wow.

Let us take something very simple, say, a square root, and calculate it using
Newton’s Method:

Well… Of course it is, because it involves a JNI call, and everyone knows JNI calls are expensive as hell.
Duh. Those stupid engineers, they cannot even get such a basic thing as a square root right… My job here is done!

But just for the sake of the argument, let me run the same thing with the JIT compiler enabled:

Oh, wait. What happened? Now the performance of B has become far superior to that of A… Weird, isn’t it?

Alright, enough fooling around. I have, of course, cheated by adding the empty loop of 1,000,000 iterations in the
beginning of B, which JIT has successfully eliminated as dead code.
The interpreter is obviously unable to perform such a feat.

A Plausible Example

Nobody really puts dead code that dominates the execution time into their algorithms.
So what we have above is clearly not a satisfying example, and we need to come up with something more realistic.

Off the top of my head, here are some optimizations that wouldn’t be considered cheating: