FloTHERM is powerful 3D computational fluid dynamics (CFD) software that predicts airflow and heat transfer in and around electronic equipment, from components and boards up to complete systems. Learn More →

Efficient code – a quiz

I was recently reading a set of “golden rules” for embedded programming. I am very skeptical about such proscriptive instructions as, for them to be valid, a great many assumptions must be made and clearly stated. These rules were supposed to promote the production of safe, efficient code. I am OK with “safe” – that simply means that the code does what it is supposed to do, without deviation as a result of unexpected data or unexpected side-effects. I am not so sure about “efficient”, as that depends entirely on the design parameters in force; it might mean fast or it might mean small, for example. And what about maintainability and portability being significant parameters?

So, instead of trying to outline rules myself, I thought that I might set out to find out what real developers think …

Just for fun [budgets to not stretch to offering a prize], I would like to pose a question and request answers by comment or email:

You are developing a system, where execution speed is important. At one point in the code, there is an unsigned integer variable x which needs to be divided by 8. I can immediately think of 4 ways to code this:

x /= 8;

x = x / 8;

x >>= 3;

x = x >> 3;

My question is: which of these options is the most efficient and why? Of course, if you have other suggestions about how to address the problem, I would be very interested to hear.

More Blog Posts

Comments

The second is clearest, but the first is a well-known C idiom and is quite acceptable. The others are not acceptable because they hide the arithmetic nature of the job to be done. The mechanism by which division is performed is a matter for the processor, aided in this case, possibly, by the compiler, which should optimise the code to a simple shift if this results in quicker execution than just generating a divide instruction (assuming the processor provides one, of course)

There is another important point, too, related to the C standard, but you have neatly rendered its discussion “off-topic” by making x unsigned!

I take it, by the way, that you don’t want us to divert the discussion into arguments about the use of the explicit constants 8 and 3 in your examples.

I’ve heard the saying, “If you mean to shift, then shift; if you want to multiply or divide, then multiply or divide”.

Generally speaking, on most architectures, all 4 possibilities will result in the same assembly code being generated, at least with any sort of optimization turned on.

But the first 2 versions are more expressive – you want to divide, and you are dividing. Not to mention the fact that now if you want to divide by 9, it’s a single character being changed. Also, even though most of us see “>> 3″ and “instantly” know this is a divide-by-8, some people might have trouble converting 2**3 to 8. I had to fix code once where the original coder meant to divide by 16, but everywhere in the code it was coded as “>> 5″ – the dreaded “off by one [bit]” error.

And what if the value now changes to a floating point (float, double) type? How’s that bit shift working out now?

But most importantly, the first 2 versions do not rely on any specific architecture or internal representation. Although most architectures use 2s complement representation, this is not universal; binary representation is an implementation detail of the processor.

Bottom line: write your code for human beings, and let the compiler do its job.

IMHO
Given:
You are developing a system, where execution speed is important.
Fewer Cycles = faster execution speed
Use of fewer registers is more efficient

I would code x = x >> 3; I may create an intermediate term xshiftn
The shift should take 2 cycles and use 1 register
The divide code would take 3,4, or 5 cycles to produce a result and probably use 2 or 3 registers.
Code for the machine. The code may become the machine!

I also agree with Krzysztof. Back when we first moved from Assembly to C, we used to examine the Assembly output of the compiler, and we found that certain C constructs were more efficient than others. That gave the Assembly “old guard” reason to claim C was useless and to keep coding in Assembly. Shift rather than divide and if-then-else rather than switch-case were two hand optimizations we saw. After a few years, the C compilers solved these issues and it became no longer necessary to look at the compiler output in Assembly. I would hope that those years are long behind us.

I am a recent graduate so know that my reply is coming from a position of inexperience.

Without actually looking at the assembly code I would assume that method 1 and 2 are identical and would produce the same result. The second method may be a tad bit clearer but that’s the only difference I can spot.

A possible problem with using a right shift as a multiplier, even if it did result in better optimization, may in fact come from cross platform applications. In short, I believe that the code will have portability issues when dealing with data of different sizes and could possibly produce unreliable results in certain applications.