Moreover, the R5RS generic arithmetic is difficult to implement as
efficiently as purely fixnum or purely flonum arithmetic.
"Interpreter-branch" disproves this assertion as far as interpreters
with immediate fixnums and boxed numbers are concerned.

No, it does not. It shows that one can make generic arithmetic for ONE
representation as fast as type-specific arithmetic.

For example, suppose we want to optimize for fixnums first and flonums
second. We write the generic arithmetic code as follows and arrange for
both conditions to be predicted true:

    if (all arguments are fixnums)
        deal with fixnums
    else if (all arguments are flonums)
        deal with flonums
    else
        deal with the general case

When you use this code with fixnums, the first branch is correctly
predicted and there is very little cost. However, when you use this code
with flonums, the first branch is incorrectly predicted and there can be
a significant cost.
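
A minimal C sketch of that dispatch may make the point concrete. Several
things in it are assumptions, not taken from the discussion above: the
tagging scheme (low bit set for an immediate fixnum, clear for a pointer
to a boxed flonum), the LIKELY macro built on the GCC/Clang
__builtin_expect hint as one possible way to arrange the prediction, and
the restriction to two numeric types, so that the general case reduces
to mixed fixnum/flonum addition. Fixnum overflow and memory management
are ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical representation: low bit 1 = immediate fixnum,
   low bit 0 = pointer to a heap-allocated (boxed) flonum. */
typedef uintptr_t value;

typedef struct { double d; } flonum_box;

/* Static hint that a condition is expected to be true. This is an
   assumed stand-in for "arrange for the condition to be predicted
   true"; it falls back to a no-op on other compilers. */
#if defined(__GNUC__)
#define LIKELY(x) __builtin_expect(!!(x), 1)
#else
#define LIKELY(x) (x)
#endif

static int is_fixnum(value v) { return (v & 1) != 0; }

static value make_fixnum(intptr_t n) { return ((uintptr_t)n << 1) | 1; }

/* Relies on arithmetic right shift of negative values, which is
   implementation-defined in C but is what common compilers do. */
static intptr_t fixnum_value(value v) { return (intptr_t)v >> 1; }

static value make_flonum(double d) {
    flonum_box *p = malloc(sizeof *p);  /* malloc alignment keeps bit 0 clear */
    assert(p != NULL);
    p->d = d;
    return (value)p;
}

static double flonum_value(value v) { return ((flonum_box *)v)->d; }

static value generic_add(value a, value b) {
    if (LIKELY(is_fixnum(a) && is_fixnum(b))) {
        /* Fixnum case: no allocation, overflow check omitted. */
        return make_fixnum(fixnum_value(a) + fixnum_value(b));
    } else if (LIKELY(!is_fixnum(a) && !is_fixnum(b))) {
        /* Flonum case: reached only after the fixnum test has failed. */
        return make_flonum(flonum_value(a) + flonum_value(b));
    } else {
        /* General case: here just mixed arguments, coerced to double. */
        double x = is_fixnum(a) ? (double)fixnum_value(a) : flonum_value(a);
        double y = is_fixnum(b) ? (double)fixnum_value(b) : flonum_value(b);
        return make_flonum(x + y);
    }
}
```

With two fixnum arguments the first test succeeds as hinted. With two
flonums the same test has to fail before the second one can succeed,
which is exactly the case where the misprediction cost shows up.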