Sign In
Forgot Password?
Sign In | | Create Account

Efficient code - a quiz

I was recently reading a set of “golden rules” for embedded programming. I am very skeptical about such proscriptive instructions as, for them to be valid, a great many assumptions must be made and clearly stated. These rules were supposed to promote the production of safe, efficient code. I am OK with “safe” – that simply means that the code does what it is supposed to do, without deviation as a result of unexpected data or unexpected side-effects. I am not so sure about “efficient”, as that depends entirely on the design parameters in force; it might mean fast or it might mean small, for example. And what about maintainability and portability being significant parameters?

So, instead of trying to outline rules myself, I thought that I might set out to find out what real developers think …

Just for fun [budgets to not stretch to offering a prize], I would like to pose a question and request answers by comment or email:

You are developing a system, where execution speed is important. At one point in the code, there is an unsigned integer variable x which needs to be divided by 8. I can immediately think of 4 ways to code this:

x /= 8;
x = x / 8;
x >>= 3;
x = x >> 3;

My question is: which of these options is the most efficient and why? Of course, if you have other suggestions about how to address the problem, I would be very interested to hear.

I will publish some results and my answer in a couple of weeks.

Development Tools

More Blog Posts

About Colin Walls Follow on Twitter

Colin WallsI have over thirty years experience in the electronics industry, largely dedicated to embedded software. A frequent presenter at conferences and seminars and author of numerous technical articles and two books on embedded software, I am a member of the marketing team of the Mentor Graphics Embedded Systems Division, and am based in the UK. Away from work, I have a wide range of interests including photography and trying to point my two daughters in the right direction in life. Learn more about Colin, including his go-to karaoke song and the best parts of being British: Visit The Colin Walls Blog

More Posts by Colin Walls

Comments 10

Post a Comment
The second is clearest, but the first is a well-known C idiom and is quite acceptable. The others are not acceptable because they hide the arithmetic nature of the job to be done. The mechanism by which division is performed is a matter for the processor, aided in this case, possibly, by the compiler, which should optimise the code to a simple shift if this results in quicker execution than just generating a divide instruction (assuming the processor provides one, of course) There is another important point, too, related to the C standard, but you have neatly rendered its discussion "off-topic" by making x unsigned! I take it, by the way, that you don't want us to divert the discussion into arguments about the use of the explicit constants 8 and 3 in your examples.

Peter Bushell
9:04 AM Sep 26, 2011

Thanks Peter. I will comment in a follow up blog soon.

Colin Walls
3:00 PM Sep 26, 2011

I agree with Peter. I've heard the saying, "If you mean to shift, then shift; if you want to multiply or divide, then multiply or divide". Generally speaking, on most architectures, all 4 possibilities will result in the same assembly code being generated, at least with any sort of optimization turned on. But the first 2 versions are more expressive - you want to divide, and you are dividing. Not to mention the fact that now if you want to divide by 9, it's a single character being changed. Also, even though most of us see ">> 3" and "instantly" know this is a divide-by-8, some people might have trouble converting 2**3 to 8. I had to fix code once where the original coder meant to divide by 16, but everywhere in the code it was coded as ">> 5" - the dreaded "off by one [bit]" error. And what if the value now changes to a floating point (float, double) type? How's that bit shift working out now? But most importantly, the first 2 versions do not rely on any specific architecture or internal representation. Although most architectures use 2s complement representation, this is not universal; binary representation is an implementation detail of the processor. Bottom line: write your code for human beings, and let the compiler do its job.

Dan S
3:30 PM Sep 26, 2011

Good points Dan.

Colin Walls
6:50 PM Sep 26, 2011

IMHO Given: You are developing a system, where execution speed is important. Fewer Cycles = faster execution speed Use of fewer registers is more efficient I would code x = x >> 3; I may create an intermediate term xshiftn The shift should take 2 cycles and use 1 register The divide code would take 3,4, or 5 cycles to produce a result and probably use 2 or 3 registers. Code for the machine. The code may become the machine!

Ken Simone
2:41 PM Sep 28, 2011

Thanks Ken. I will be commenting on the responses I have received in a blog post very soon.

Colin Walls
2:45 PM Sep 28, 2011

Code should present idea, operation, not implementation details. If your compiler do not optimize dividing by 8 then there is one ultimate way to efficient program - change compiler.

Krzysztof Wesołowski
10:17 PM Sep 29, 2011

I agree entirely Krzysztof.

Colin Walls
7:30 AM Sep 30, 2011

I also agree with Krzysztof. Back when we first moved from Assembly to C, we used to examine the Assembly output of the compiler, and we found that certain C constructs were more efficient than others. That gave the Assembly "old guard" reason to claim C was useless and to keep coding in Assembly. Shift rather than divide and if-then-else rather than switch-case were two hand optimizations we saw. After a few years, the C compilers solved these issues and it became no longer necessary to look at the compiler output in Assembly. I would hope that those years are long behind us.

Lee Riemenschneider
11:34 AM Sep 30, 2011

Hi, I am a recent graduate so know that my reply is coming from a position of inexperience. Without actually looking at the assembly code I would assume that method 1 and 2 are identical and would produce the same result. The second method may be a tad bit clearer but that's the only difference I can spot. A possible problem with using a right shift as a multiplier, even if it did result in better optimization, may in fact come from cross platform applications. In short, I believe that the code will have portability issues when dealing with data of different sizes and could possibly produce unreliable results in certain applications. Just my thoughts, - Shaun

Shaun Shippey
3:42 AM Oct 5, 2011

Add Your Comment

Please complete the following information to comment or sign in.

(Your email will not be published)


Online Chat