_mjw_

In general, it's best to avoid floats anywhere you can use integers. For one, go browse the avr-libc handbook, particularly the math stuff. You'll find a lot of it uses floats as parameters by default. Since there's no floating-point hardware in an AVR mega or tiny, that all has to be done via software emulation, which is extra code bloat that you probably didn't need for your intended application.

Very interesting. Will have to have a look at the avr-libc handbook.

Quote

For that reason, be aware of the math functions you use and try to implement them in more efficient ways if you can. E.g., definitely use shifts to do powers of 2. You'll get a more accurate result, it'll execute faster (a relative term), and assuming you don't use floats anywhere else, it'll save all the flash space required to include the fp math library.

Agreed. I originally performed this calculation to get at each of the bits in a counter. Using a left shift (<<) or bitRead() is more direct.
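For anyone following along, here is a rough sketch of pulling individual bits out of a counter with a shift and a mask, which is essentially what the Arduino bitRead() macro does. The name read_bit is made up for illustration, and this is just one way to write it.

```cpp
#include <cassert>
#include <cstdint>

// Returns bit `pos` (0 = least significant) of `value` as 0 or 1.
// read_bit is a hypothetical helper name, not an Arduino API.
inline uint8_t read_bit(uint32_t value, uint8_t pos) {
    assert(pos < 32);  // shifting a 32-bit value by 32 or more is undefined
    return static_cast<uint8_t>((value >> pos) & 1U);
}
```

No floats are involved, so nothing from the floating-point library gets pulled in for this.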

Quote

Naturally, when you need non-integer values and can't get by with fixed-point math, well ya godda do whatcha godda do. And none of this applies in test cases and matters of curiosity like in this thread.

Yes, it is a matter of curiosity, but interesting enough to others also, I hope.
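On the fixed-point option mentioned in the quote, a minimal sketch of the idea might look like the following: keep the value as a scaled integer (millivolts here) instead of a float. The 10-bit ADC, the 5000 mV reference, the 1023 full-scale count, and the name adc_to_millivolts are all assumptions for the sake of the example, not anything from this thread.

```cpp
#include <cassert>
#include <cstdint>

// Converts a raw ADC count to millivolts using only integer math.
// Assumptions (hypothetical): 10-bit ADC, 5000 mV reference, and a
// full-scale count of 1023 mapping to the reference voltage.
inline uint16_t adc_to_millivolts(uint16_t adc) {
    assert(adc <= 1023);
    const uint32_t kRefMillivolts = 5000UL;
    const uint32_t kFullScale = 1023UL;
    // Widen to 32 bits before multiplying: 1023 * 5000 overflows 16 bits.
    // Adding half the divisor rounds to nearest instead of truncating.
    return static_cast<uint16_t>(
        (static_cast<uint32_t>(adc) * kRefMillivolts + kFullScale / 2) / kFullScale);
}
```

The result is an exact integer number of millivolts, and a decimal point only needs to be inserted when printing.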

Rounding. These are the rules of interest... http://en.wikipedia.org/wiki/IEEE_floating_point#Rounding_rules

In addition, Serial.print may use a rounding rule not on that list.

Which rule is used by your PC program? Which rule is used by AVR Libc? Which rule is used by the Arduino Run Time Library? Is it possible your PC program is using a different rule than your Arduino program?

Or, intermediates...

Quote

Quote

32 bit float vs. 64 bit float

Recompiled with the option -m32 (to enforce 32 bit calculations) and got the same results!

Incorrect. A 32-bit x86 processor (like a Pentium) using its x87 FPU does not operate on 32-bit floating point numbers directly. It can store and load 32-bit floats, but all operations are done at 64 bits (or higher) to avoid problems with intermediate values.
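One way to check what a particular PC compiler does with intermediates is the standard FLT_EVAL_METHOD macro from <cfloat>. The sketch below just maps its value to a description; the function name is hypothetical, and the value you see will depend on the compiler, target, and flags.

```cpp
#include <cfloat>

// Describes how floating-point intermediates are evaluated, based on
// the standard FLT_EVAL_METHOD macro. The function name is made up.
inline const char* eval_method_description() {
    switch (FLT_EVAL_METHOD) {
        case 0:  return "each operation uses the precision of its own type";
        case 1:  return "float and double operations are evaluated as double";
        case 2:  return "all operations are evaluated as long double";
        default: return "indeterminate or implementation-defined";
    }
}
```

Printing this from the PC program, built with and without -m32, should show whether the two builds really evaluate float expressions at the same precision.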

Because floating point results are inexact, you cannot make assumptions like this. Different implementations of pow, for instance, may special-case when the arguments happen to be integral (on the Arduino there is no room for special cases, the standard method using logarithms will be used with the expected imprecision in the LSBs).

To get integer powers of two you use the << operator. 1 << n
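As a rough illustration of the shift approach, with a rounded pow() alongside for comparison, something like this should work. The function names are made up. One thing to watch on AVR: plain int is 16 bits there, so a bare 1 << n overflows once n reaches 15, which is why the sketch uses 1UL.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Integer power of two via a shift. The 1UL literal matters on AVR,
// where plain int is 16 bits and 1 << 15 would already overflow.
inline uint32_t pow2_shift(uint8_t n) {
    assert(n < 32);  // shifting a 32-bit value by 32 or more is undefined
    return static_cast<uint32_t>(1UL << n);
}

// If pow() has to be used, round to nearest instead of truncating, so a
// result that lands just below an integer is not cut down by one.
inline uint32_t pow2_rounded(uint8_t n) {
    assert(n < 32);
    return static_cast<uint32_t>(std::llround(std::pow(2.0, static_cast<double>(n))));
}
```

The shift version never touches floating point, so it is exact by construction. The rounded version only hides the imprecision in pow(); it does not remove it.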


SirNickity

Because floating point results are inexact, you cannot make assumptions like this.

This is a profound point. To any programmers that aren't already aware of the significance of this statement, it's something you'll want to read up on.

The gist is this: Floating point values don't store exact values, they store approximations. Many values, including simple decimal fractions such as 0.1 and integers too large for the mantissa, can't be stored exactly and will instead be just slightly off. The implications of this are obvious for cumulative math, and in particular for code that deals with money. Higher precision does not solve the root problem. You may have 64 bits of precision, but all that means is you might get a whole bunch of .999999s when what you really wanted is .0.
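To make the cumulative-math point concrete, here is a small sketch that adds ten cents repeatedly, once as a float in dollars and once as an integer count of cents. The function names are hypothetical. The float total may not compare equal to the value you expect, and the exact digits depend on the platform and compiler, while the integer total is always exact.

```cpp
#include <cstdint>

// Adds $0.10 `times` times using a float. 0.10 has no exact binary
// representation, so a small error can build up with each addition.
inline float add_dimes_float(int times) {
    float total = 0.0f;
    for (int i = 0; i < times; ++i) {
        total += 0.10f;
    }
    return total;
}

// Same sum kept as integer cents: every step is exact.
inline int32_t add_dimes_cents(int times) {
    int32_t total = 0;
    for (int i = 0; i < times; ++i) {
        total += 10;
    }
    return total;
}
```

Comparing add_dimes_float(10) against 1.0f with == is exactly the kind of assumption that can't be relied on, whereas add_dimes_cents(10) == 100 always holds.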

(WOOHOO! 1000 posts. And I am no longer a God member. Edison outranks God? Well, I guess they both said "Let there be light." )

_mjw_

Because floating point results are inexact, you cannot make assumptions like this. Different implementations of pow, for instance, may special-case when the arguments happen to be integral (on the Arduino there is no room for special cases, the standard method using logarithms will be used with the expected imprecision in the LSBs).

To get integer powers of two you use the << operator. 1 << n