In my latest project I've needed to optimize a lot of code, and I learned that a fast way to shift a byte by 4 is to swap nybbles. But when I started looking at the compiler's assembly output, I noticed that >>4 and <<4 weren't being optimized into a swap instruction; instead, loops of single shifts were being generated. Doing four shifts in a row would be faster than that, and a single swap instruction with a mask would be faster still.
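To make the swap-plus-mask idea concrete, here is a minimal sketch in portable C. The helper names (swap_nibbles, shr4_via_swap, shl4_via_swap, check_swap_shifts) are made up for illustration, and swap_nibbles only stands in for the AVR swap instruction; it says nothing about what a particular compiler version will actually emit.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-in for the AVR swap instruction: exchange the high and
   low nibbles of a byte. Hypothetical helper, not a library call. */
static uint8_t swap_nibbles(uint8_t x)
{
    return (uint8_t)((x << 4) | (x >> 4));
}

/* x >> 4 written as swap + mask: keep only the low nibble after the swap. */
static uint8_t shr4_via_swap(uint8_t x)
{
    return (uint8_t)(swap_nibbles(x) & 0x0F);
}

/* x << 4 written as swap + mask: keep only the high nibble after the swap. */
static uint8_t shl4_via_swap(uint8_t x)
{
    return (uint8_t)(swap_nibbles(x) & 0xF0);
}

/* Compare against the plain shifts for all 256 possible byte values. */
static void check_swap_shifts(void)
{
    for (unsigned v = 0; v < 256; ++v) {
        uint8_t x = (uint8_t)v;
        assert(shr4_via_swap(x) == (uint8_t)(x >> 4));
        assert(shl4_via_swap(x) == (uint8_t)(x << 4));
    }
}
```

Calling check_swap_shifts() once confirms that the two forms agree for every byte, so the rewrite is safe whenever the operand is an unsigned 8-bit value.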

I thought maybe the issue was that the compiler was set to optimize for code size rather than speed, but since the swap takes fewer bytes I don't think that could be it. I do wonder, though, whether the compiler is set to optimize for speed or for code size. It would be nice if we could choose in the IDE.

Anyway, I was talking to some guys over on avrfreaks, and that's where I learned that the latest GCC does produce a swap in those circumstances.

The reason is probably that no one has spent the time testing it for bugs - newer versions of optimising compilers tend to have bugs. You'll have a far easier life if you update compiler versions after they've been shaken down in the wild for some time. The principle of "if it isn't broken don't fix it" applies.

If you want to try more recent gcc-avr versions, no one is stopping you.

However, the version Arduino is shipping is 4.3.2, which was released on August 27, 2008, so it is rather old. The latest release is 4.7.2, which was released on September 20, 2012. Even if you wanted to go back one major revision, the 4.6.3 release was made on March 1, 2012.

What I meant was that avr-gcc 4.3.2 and 4.7.0 behave in exactly the same way, so it isn't worth upgrading in the hope that the newer gcc will optimize the shift any better than the old one. I don't know why it doesn't use swap for a right shift, but there may be something I don't know (perhaps it was tried but caused problems). In any case it would only be useful for the fairly rare case of shifting an unsigned byte by 4 places.

It might be a bit too much to expect the compiler to know when it can use a swap - I wouldn't expect it to get x=(x>>4)|(x<<4); down to a single swap!

Actually, I believe the latest version does in fact optimize x=(x>>4)|(x<<4); down to a single swap. I saw someone use that in an example over on avrfreaks and show the assembly that was generated.

And swap isn't only useful for shifting an unsigned byte 4 places. It's also useful for shifting unsigned ints and longs. After all, if you want to shift those right by 4, then every byte in them needs to be shifted right by 4.
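As a rough illustration of the multi-byte case, the sketch below shifts a 16-bit value right by 4 by swapping the nibbles of each byte and then recombining them with masks. The function names are hypothetical, and swap_nibbles is again just a portable stand-in for the AVR swap instruction; this shows the arithmetic only, not what avr-gcc generates.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-in for the AVR swap instruction (hypothetical helper). */
static uint8_t swap_nibbles(uint8_t x)
{
    return (uint8_t)((x << 4) | (x >> 4));
}

/* 16-bit x >> 4 built from one swap per byte plus masks. */
static uint16_t shr4_u16_via_swap(uint16_t x)
{
    uint8_t lo = swap_nibbles((uint8_t)(x & 0xFF)); /* low byte, swapped  */
    uint8_t hi = swap_nibbles((uint8_t)(x >> 8));   /* high byte, swapped */

    /* New low byte: old low byte's high nibble moves down, and the old
       high byte's low nibble moves in at the top. */
    uint8_t new_lo = (uint8_t)((lo & 0x0F) | (hi & 0xF0));
    /* New high byte: only the old high byte's high nibble survives. */
    uint8_t new_hi = (uint8_t)(hi & 0x0F);

    return (uint16_t)(((uint16_t)new_hi << 8) | new_lo);
}

/* Compare against the plain shift for every 16-bit value. */
static void check_shr4_u16(void)
{
    for (uint32_t v = 0; v <= 0xFFFFu; ++v) {
        uint16_t x = (uint16_t)v;
        assert(shr4_u16_via_swap(x) == (uint16_t)(x >> 4));
    }
}
```

The same pattern extends to 32-bit values: swap every byte, then let each byte pick up the nibble that crosses over from its more significant neighbour.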

GCC has been turning x=(x>>4)|(x<<4); into a rotate for a long time. Whether that ends up as a single instruction depends on whether the backend maintainer has described that particular optimization in the target's <machine>.md file.
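For reference, this is roughly what the rotate idiom looks like when written for an arbitrary count. It is only a sketch: rotl8 is a made-up name, and whether a given compiler and target collapse it into a rotate or swap instruction is an assumption you would need to confirm by reading the generated assembly.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate an 8-bit value left by n bits (n is taken modulo 8).
   Hypothetical helper showing the shift-or pattern compilers look for. */
static uint8_t rotl8(uint8_t x, unsigned n)
{
    n &= 7u;
    /* The second shift count is masked so that n == 0 does not shift by 8. */
    return (uint8_t)((x << n) | (x >> ((8u - n) & 7u)));
}

/* Rotating by 4 is the same as exchanging the two nibbles. */
static void check_rotl8_by_4(void)
{
    for (unsigned v = 0; v < 256; ++v) {
        uint8_t x = (uint8_t)v;
        assert(rotl8(x, 4) == (uint8_t)((x >> 4) | (x << 4)));
    }
}
```

With a count of 4 the expression reduces to exactly the x=(x>>4)|(x<<4) form discussed above.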

It's worth noting that every release of the compiler BETWEEN 4.3 and 4.7 has had bugs that were very significant to Arduino software. Apparently there aren't very many other gnu-c++ users.

Or people aren't reporting the bugs. It is rather hard to fix bugs you don't know about. Many of the secondary ports have the problem that there often aren't enough people helping with the testing to make sure there is good coverage.

Given the age, the earlier branches are now closed to new submissions.