Efficient code - a quiz

I was recently reading a set of “golden rules” for embedded programming. I am very skeptical about such proscriptive instructions as, for them to be valid, a great many assumptions must be made and clearly stated. These rules were supposed to promote the production of safe, efficient code. I am OK with “safe” – that simply means that the code does what it is supposed to do, without deviation as a result of unexpected data or unexpected side-effects. I am not so sure about “efficient”, as that depends entirely on the design parameters in force; it might mean fast or it might mean small, for example. And what about maintainability and portability being significant parameters?

Preparing Recommendations

So, instead of trying to outline rules myself, I thought that I might set out to find out what real developers think …

Just for fun [budgets to not stretch to offering a prize], I would like to pose a question and request answers by comment or email:

You are developing a system, where execution speed is important. At one point in the code, there is an unsigned integer variable x which needs to be divided by 8. I can immediately think of 4 ways to code this:

x /= 8;
x = x / 8;
x >>= 3;
x = x >> 3;

My question is: which of these options is the most efficient and why? Of course, if you have other suggestions about how to address the problem, I would be very interested to hear.

I will publish some results and my answer in a couple of weeks.

More Blog Posts

About Colin Walls

Colin WallsI have over twenty-five years experience in the electronics industry, largely dedicated to embedded software. A frequent presenter at conferences and seminars and author of numerous technical articles and two books on embedded software, I am a member of the marketing team of the Mentor Graphics Embedded Systems Division, and am based in the UK. Away from work, I have a wide range of interests including photography and trying to point my two daughters in the right direction in life. Visit The Colin Walls Blog

Follow on Twitter

More Posts by Colin Walls

Preparing Recommendations

Comments (↓ Add Your Own)

10 Comments on this Post

Commented on 9:04 AM, Sep 26, 2011
By Peter Bushell

The second is clearest, but the first is a well-known C idiom and is quite acceptable. The others are not acceptable because they hide the arithmetic nature of the job to be done. The mechanism by which division is performed is a matter for the processor, aided in this case, possibly, by the compiler, which should optimise the code to a simple shift if this results in quicker execution than just generating a divide instruction (assuming the processor provides one, of course) There is another important point, too, related to the C standard, but you have neatly rendered its discussion "off-topic" by making x unsigned! I take it, by the way, that you don't want us to divert the discussion into arguments about the use of the explicit constants 8 and 3 in your examples.

Commented on 3:00 PM, Sep 26, 2011
By Colin Walls

Thanks Peter. I will comment in a follow up blog soon.

Commented on 3:30 PM, Sep 26, 2011
By Dan S

I agree with Peter. I've heard the saying, "If you mean to shift, then shift; if you want to multiply or divide, then multiply or divide". Generally speaking, on most architectures, all 4 possibilities will result in the same assembly code being generated, at least with any sort of optimization turned on. But the first 2 versions are more expressive - you want to divide, and you are dividing. Not to mention the fact that now if you want to divide by 9, it's a single character being changed. Also, even though most of us see ">> 3" and "instantly" know this is a divide-by-8, some people might have trouble converting 2**3 to 8. I had to fix code once where the original coder meant to divide by 16, but everywhere in the code it was coded as ">> 5" - the dreaded "off by one [bit]" error. And what if the value now changes to a floating point (float, double) type? How's that bit shift working out now? But most importantly, the first 2 versions do not rely on any specific architecture or internal representation. Although most architectures use 2s complement representation, this is not universal; binary representation is an implementation detail of the processor. Bottom line: write your code for human beings, and let the compiler do its job.

Commented on 6:50 PM, Sep 26, 2011
By Colin Walls

Good points Dan.

Commented on 2:41 PM, Sep 28, 2011
By Ken Simone

IMHO Given: You are developing a system, where execution speed is important. Fewer Cycles = faster execution speed Use of fewer registers is more efficient I would code x = x >> 3; I may create an intermediate term xshiftn The shift should take 2 cycles and use 1 register The divide code would take 3,4, or 5 cycles to produce a result and probably use 2 or 3 registers. Code for the machine. The code may become the machine!

Commented on 2:45 PM, Sep 28, 2011
By Colin Walls

Thanks Ken. I will be commenting on the responses I have received in a blog post very soon.

Commented on 10:17 PM, Sep 29, 2011
By Krzysztof Wesołowski

Code should present idea, operation, not implementation details. If your compiler do not optimize dividing by 8 then there is one ultimate way to efficient program - change compiler.

Commented on 7:30 AM, Sep 30, 2011
By Colin Walls

I agree entirely Krzysztof.

Commented on 11:34 AM, Sep 30, 2011
By Lee Riemenschneider

I also agree with Krzysztof. Back when we first moved from Assembly to C, we used to examine the Assembly output of the compiler, and we found that certain C constructs were more efficient than others. That gave the Assembly "old guard" reason to claim C was useless and to keep coding in Assembly. Shift rather than divide and if-then-else rather than switch-case were two hand optimizations we saw. After a few years, the C compilers solved these issues and it became no longer necessary to look at the compiler output in Assembly. I would hope that those years are long behind us.

Commented on 3:42 AM, Oct 5, 2011
By Shaun Shippey

Hi, I am a recent graduate so know that my reply is coming from a position of inexperience. Without actually looking at the assembly code I would assume that method 1 and 2 are identical and would produce the same result. The second method may be a tad bit clearer but that's the only difference I can spot. A possible problem with using a right shift as a multiplier, even if it did result in better optimization, may in fact come from cross platform applications. In short, I believe that the code will have portability issues when dealing with data of different sizes and could possibly produce unreliable results in certain applications. Just my thoughts, - Shaun

Add Your Comment

Please complete the following information to comment or sign in.

(Your email will not be published)

Archives

Tags

RSS