Following a previous thread (Lee's idea, in fact), I made a tiny Win32 program that takes a float and an error level as input, and outputs two integers: the first divided by the second gives you approximately the input float.
The goal is to avoid (where possible, of course) float math in your embedded applications.
Now, here's the proggy:

There are cases where the numbers will be used many, many times and speed is important. An example might be wire-frame rotation in graphics, where all the endpoints need to be re-processed every iteration. In that case, it is almost magic if you can find a close-enough ratio where the divisor (especially on AVRs with no hardware divide support) or the numerator are powers of two, so simple shifting can be used.

Lee

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

C-compiler capable of doing such magic things
========================================
Replacing multiplication and division by powers of 2 is a 'peephole optimization' done by many compilers. ImageCraft lists this one and others in its description of optimizations.

avr-gcc appears to be pretty smart about division by a power of two. It does creative byte/nibble swapping to handle the parts of the division that are by 256 or by 16, then uses LSRs for whatever power-of-2 dividing is left to be done.
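What the compiler does can be mimicked by hand (the function name is mine, not avr-gcc's):

```c
#include <stdint.h>

/* Dividing an unsigned 16-bit value by 4096: take the high byte
 * (that's the divide by 256 -- a free "byte swap" on AVR), then
 * shift right 4 more times for the remaining factor of 16. */
uint16_t div_by_4096(uint16_t x)
{
    uint16_t hi = x >> 8;   /* divide by 256: just keep the high byte */
    return hi >> 4;         /* four LSRs finish the job */
}
```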

For any pair of 8-bit numbers, on an AVR with an on-chip multiplier, you'll always be better off using the MUL instruction, right?

Well, if I do "8-bit times 63", it invokes MUL, using 5 words/cycles of code.

If I do "8-bit times 64", it creates a loop where it adds the result to itself 6 times, with 16-bit intermediate results, using 8 words of code and requiring 33 cycles. It's actually slower and larger than the "non-optimized" multiplication by a non-power-of-two above.

It's somewhat clearer what's going on with 16-bit ints:
If I do "16-bit times 45" it will invoke MUL twice, using 10 words/cycles of code.

If I do "16-bit times 64" it will double the number 6 times by repeatedly adding the result to itself, using 5 words of code, and 30 cycles.

If I do "16-bit times 63", it performs multiplication-by-64 and then subtracts one copy of the original number, using 8 words of code and 33 cycles.
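The multiply-by-63 sequence the compiler generates is equivalent to this C (a sketch; the naming is mine):

```c
#include <stdint.h>

/* x * 63 computed as (x * 64) - x: one shift-by-6 plus one subtract,
 * mirroring the multiply-by-64-then-subtract sequence emitted at -Os. */
uint16_t times_63(uint16_t x)
{
    return (uint16_t)((x << 6) - x);
}
```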

So 16-bit optimization is smaller but slower... exactly what you expect when optimizing for size. But the 8-bit optimization is actually bigger and slower...

Now, if I set it to optimize for speed (-O3):
8-bit * 64: It does something very tricksy with rotating right through the carry bit across three registers. No loops or branching. Total: 11 words/cycles.

8-bit *63: Uses MUL. Total: 5 words/cycles. Same as with -Os.

16-bit *45: Uses MUL. Total: 10 words/cycles. Same as with -Os.

16-bit *64: Does the same tricksy rotating-right through the carry bit across 3 registers business. Total: 9 words/cycles. It is slightly faster and slightly smaller than the arbitrary case above. Almost twice as big as the optimize-for-space method, but also much faster.

(Now I understand what was going on in the 8-bit version above... the compiler is internally promoting the 8-bit operation into a 16-bit operation before beginning, and then jumping through hoops beforehand and afterwards to make sure that an 8-bit input is used and an 8-bit result is returned... That doesn't sufficiently explain "why?"... but at least it starts to tackle "what?". The rotate-right-by-two-thru-carry technique ends up being functionally equivalent to the "shift-left-by-6" or the "add-self-to-self-6-times" techniques that could otherwise be used, but faster.)
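The equivalence the compiler is exploiting can be written out in C: for an 8-bit value widened to 16 bits, shifting left by 6 is the same as loading the byte into the high half and shifting right by 2 (which on AVR becomes those two rotate-rights through carry). This is a sketch with my own naming, not compiler output:

```c
#include <stdint.h>

uint16_t shl6_via_high_byte(uint8_t x)
{
    /* x << 6  ==  (x << 8) >> 2 when widened to 16 bits:
     * place x in the high byte (a free register move on AVR),
     * then two right shifts instead of six left shifts. */
    return (uint16_t)(((uint16_t)x << 8) >> 2);
}
```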

16-bit *63:
It does the same tricksy technique for multiplying by 64, then subtracting one copy of the original number. Total: 12 words/cycles. Actually larger/slower than the MUL option by 2 words/cycles.

So, I can conclude that avr-gcc tries very hard to optimize its power-of-two multiplication. But it tries too hard sometimes, and sometimes ends up using more code space and/or execution time than its own non-optimal versions of the same things!

It is a little hard to reproduce this with a test program from your description alone. Are you doing

fred = ethel * 63; or
fred *= 63; ?

Are fred/ethel in registers or SRAM to start/end?

[Note to EW: Round 51 of Compiler Wars]

Anyway, CodeVision does something similar with *63 & *64. Even with the "promote char to int" switch on, it recognizes that only 8 bits need to be fussed with. It skips the MUL for *64, but the resulting sequence takes the same number of cycles, though it is one word longer than the 5-word/6-cycle MUL for *63.

Lee


I was referring to something completely different… replacing float operations with integer operations!
=========================================
Lee was talking about a 'tricky' integer ratio where the numerator and denominator were powers of two... but that reminded me of a tricky FP multiply-by-2 done by incrementing the exponent. That's one 8-bit operation instead of calling the fpmult subroutine. Cool, huh? I thought of that.
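That exponent trick looks like this on an IEEE-754 single (a sketch only: it assumes float is IEEE-754 binary32 and ignores zero, subnormals, infinities and NaN, where the bump would give wrong answers):

```c
#include <stdint.h>
#include <string.h>

/* Multiply a float by 2 by adding 1 to the 8-bit exponent field
 * (bits 23..30). No FP multiply routine is called.
 * Sketch only: zero, subnormals, infinities and NaN are ignored. */
float double_by_exponent_bump(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* type-pun without UB */
    bits += UINT32_C(1) << 23;        /* exponent += 1 */
    memcpy(&f, &bits, sizeof f);
    return f;
}
```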

Sigh... misunderstanding maybe...
I was NOT saying to get rid of float data type!
I was only suggesting that sometimes it is possible to REPLACE float operations with INTEGER operations. For example, if you have x * 4.6, you may replace it with (x * 23) / 5. This will bring the benefit of faster & smaller code! That's all...
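In code, the replacement looks like this (multiply before dividing to keep precision; the intermediate product must not overflow):

```c
#include <stdint.h>

/* x * 4.6 done in integers: 4.6 == 23/5, so compute (x * 23) / 5.
 * The intermediate x * 23 must fit in the type, so keep x small
 * or widen the math. */
int32_t times_4_6(int32_t x)
{
    return (x * 23) / 5;
}
```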

Real men don't use backups, they post their stuff on a public ftp server and let the rest of the world make copies.

Actually, Carl, one can put in a full day of "programming AVRs" and actually be quite productive on the days the 1's are broken--if the chips are brand-new or freshly erased. Since the erased state is 0xff for EEPROM & flash -- all bits 1 -- the task is merely to decide which ones should be 0. Come to think of it, that's what us AVRFreaks do all day every day--decide which flash bits to turn to 0.

Groan. :)

Lee


No! Your English is quite good. Better than many U.S. citizens', in fact.

As I stated in other posts in the past... I do very low volume product design, usually fewer than 10 units. I always use a microcontroller with more RAM/ROM than I could possibly need. While I realize that others really need to minimize the controller size based on volume cost, I don't. As such, I don't need to concern myself about wasted space as a result of using float data types in my projects. I do, however, need to be mindful of speed for certain functions. I haven't yet run into a situation where I couldn't resolve speed-related issues on the AVR. Though I'm sure that day/project is nearing.

But comparing, say, the MC68HC11 timer functions to those of the AVR, there really is no comparison! I implemented an analog-controlled PWM/DAC on the AVR this past week that used almost no code space compared to what the same function took on the HC11. For most projects, though, I couldn't practically implement code in C in an HC11E2 product; the EEPROM (its program space) was usually too small, so I had to use assembly. But the assembly version of the ADC/PWM/DAC implementation on the AVR was surprisingly small, as was the C implementation.

I realize my needs are different. I also understand the needs of developers working at more moderate volumes.

I guess it's all for the sake of argument, really...

You can avoid reality, for a while. But you can't avoid the consequences of reality! - C.W. Livingston

On days when the 1's are broken, all the coding is done using just the 0's.

Lee

OK, OT, but this reminds me of a Dilbert strip where he and Wally are talking, going something like:

- Back in the good ol days we didn't need those sissy icons and windows stuff. All we had was ones and zeroes.
- You had zeroes? We had to use the letter O...


"Some questions have no answers."[C Baird] "There comes a point where the spoon-feeding has to stop and the independent thinking has to start." [C Lawson] "There are always ways to disagree, without being disagreeable."[E Weddington] "Words represent concepts. Use the wrong words, communicate the wrong concept." [J Morin] "Persistence only goes so far if you set yourself up for failure." [Kartman]

10. Exact estimate
9. Genuine imitation
8. Found missing
7. Butt Head
6. Military Intelligence
5. Women in programming
4. Computer security
3. Political science
2. Working vacation
And the number one top Oxymoron...
1. Micro$oft works
