
I have an array of function pointers
[source lang="cpp"]void a();
void b();
void c();
void d();
void (*jump[])() = {a, b, c, d};[/source]
I'm indirectly calling those functions in a loop
[source lang="cpp"]while(some_condition)
{
    int i = ...; // i is random
    jump[i]();
}[/source]
If i is consistent, not just random values, does the loop run faster?
Does the cpu predict which function to call like in branch prediction?


I doubt it. If i were random, never. In theory the optimizer could reduce this if i weren't random: it could unroll the loop and then skip the function pointer lookups, since the pointers never change, but I doubt it would go that far. Maybe if you made the array immutable. I'm no expert on the inner workings of code optimization, though.


Each time you perform a jump, the code at the jump target has to be in the L1 icache, and if it isn't there already, it has to be loaded. If you're calling the same function over and over, there's a bigger chance it will still be present in the cache. On the other hand, if you're calling random functions, there's a larger chance of cache misses.

Regarding branch prediction, it depends heavily on the CPU.
Some CPUs have a history-based predictor, where its first guess for any branch will be the target that this branch jumped to last time. In this case, as long as a/b/c/d aren't long enough (don't contain enough branches of their own) to completely flush that history table, using the same '[font=courier new,courier,monospace]i[/font]' repeatedly would help the predictor.
That said, many of these history-based schemes only store taken/not-taken values, not actual addresses, so they only work for conditional branches, not indirect calls like yours.
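
Since the answer depends so much on the CPU, the most reliable way to find out is probably to time it yourself. Here's a minimal sketch of how that could look. The bodies of a/b/c/d are hypothetical stand-ins, and make_indices and time_calls are names I made up for illustration, so treat it as a starting point rather than a proper benchmark.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-ins for a/b/c/d. Each one bumps a volatile counter so
// the call has a visible side effect and can't be optimized away.
static volatile std::uint64_t counters[4] = {0, 0, 0, 0};
void a() { counters[0] = counters[0] + 1; }
void b() { counters[1] = counters[1] + 1; }
void c() { counters[2] = counters[2] + 1; }
void d() { counters[3] = counters[3] + 1; }

static void (*jump[])() = {a, b, c, d};

// Build n indices: either all 0 (consistent target) or uniformly random in [0, 3].
std::vector<int> make_indices(std::size_t n, bool random, unsigned seed = 12345) {
    std::vector<int> idx(n, 0);
    if (random) {
        std::mt19937 rng(seed);
        std::uniform_int_distribution<int> pick(0, 3);
        for (int& i : idx) i = pick(rng);
    }
    return idx;
}

// Run one pass of indirect calls and return the elapsed time in nanoseconds.
long long time_calls(const std::vector<int>& idx) {
    // Validate the indices before the timed region so the check costs nothing there.
    for (int i : idx) assert(i >= 0 && i < 4);

    auto start = std::chrono::steady_clock::now();
    for (int i : idx) jump[i]();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Compare time_calls(make_indices(n, false)) against time_calls(make_indices(n, true)) for a large n, with optimizations on and a few repeated runs. Any difference you see is specific to your CPU and compiler, so I wouldn't assume it carries over to other hardware.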

If the distance between fetching the value at jump[i] and jumping to that value is large enough, then some CPUs may be able to fully determine the branch target before branching, meaning there's no prediction to be done. If this is the case, you might be able to help by unrolling your loop somewhat:
[source lang="cpp"]int i1=...
int i2=...
int i3=...
int i4=...
jump[i1]();
jump[i2]();
jump[i3]();
jump[i4]();[/source]
However, the best overall solution is probably to use 4 loops:
[source lang="cpp"]while(...)
    a();
while(...)
    b();
while(...)
    c();
while(...)
    d();[/source]
Edited December 8, 2012 by Hodgman
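
If the indices really do arrive in random order, one way to get to the four-loop form is to count how often each target comes up and then run each function in its own tight loop. This is only a sketch and it rests on a big assumption: the order of the calls must not matter to your program. The bodies of a/b/c/d are hypothetical stand-ins, and count_targets and run_grouped are illustrative names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for a/b/c/d that just record how often they ran.
static int calls[4] = {0, 0, 0, 0};
void a() { ++calls[0]; }
void b() { ++calls[1]; }
void c() { ++calls[2]; }
void d() { ++calls[3]; }

// First pass: count how many times each of the four targets was requested.
std::array<std::size_t, 4> count_targets(const std::vector<int>& idx) {
    std::array<std::size_t, 4> counts{};  // value-initialized to zero
    for (int i : idx) {
        assert(i >= 0 && i < 4);
        ++counts[static_cast<std::size_t>(i)];
    }
    return counts;
}

// Second pass: four loops with direct calls, so no indirect branch is left.
// Assumption: calling in grouped order instead of the original order is acceptable.
void run_grouped(const std::array<std::size_t, 4>& counts) {
    for (std::size_t n = counts[0]; n > 0; --n) a();
    for (std::size_t n = counts[1]; n > 0; --n) b();
    for (std::size_t n = counts[2]; n > 0; --n) c();
    for (std::size_t n = counts[3]; n > 0; --n) d();
}
```

Calling run_grouped(count_targets(indices)) makes the same number of calls to each function as the original loop would, just reordered. If the functions depend on each other's side effects, this rearrangement isn't safe.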