> Attached are four versions of a very silly example function.
> Loop unrolled 93023255 ops/ms
> Tail recursion and loop unrolled 636334712 ops/ms
Interestingly, if you look at the MLton SSA IL and assembly, you will see
that the expressions
addn (x-1-1-1-1-1-1-1-1) +1+1+1+1+1+1+1+1
and
tail (x-1-1-1-1-1-1-1-1) (a+1+1+1+1+1+1+1+1)
are not optimized to
addn (x-8) + 8
and
tail (x-8) (a+8)
Instead, there are 8 decrements followed by 8 increments. Reassociating
arithmetic in the presence of Overflow is tricky, but I suspect this could
be done.