Strange differences in float calculations

Dear all,

I'm doing some work with images, and I've come across the strangest thing:

I have an array of float values. If I initialize an element to zero and sum a series of floats directly into that array element, I obtain one result.
On the other hand if I use an auxiliary variable as an accumulator and after summing all values I assign it to the corresponding position in my array, the result is slightly different to the former one.

I have no idea why this is and would appreciate it if anyone could give me any tips here. I'm using g++ 4.3.3.
I post below the excerpt of the code in question and briefly comment on the variables.

- nCols, nRows: numbers of columns and rows of my images.
- u, meanu, normu: different images. The data elements are of type unsigned char for 'u' and float for the others.
- _(img, x, y): this is an accessor macro. The data is just stored in an array, so the macro computes the offset corresponding to the indexes x and y.

The commented lines are the alternative that yields different results.

Re: Strange differences in float calculations

I didn't take the time to read your code, but as Lindley said, floats are just plain inexact (not to be confused with inaccurate).

When you use floats, you have to accept things like this:

Code:

a + b + c != c + b + a;   // can be true
a + b - a != b;           // can be true

As I said, I didn't read your code, but if you did ANYTHING in your code that changed the order in which you manipulated your data, then that is your explanation. Not that one version is wrong per se; they are both slightly inexact, but probably close to each other and accurate.

I'm not good at explaining why, but if you take the time to understand what a float is, it should become obvious to you.

Floats are good, but please please please make sure you understand the hows and whys of them, or you will be in a world of pain sooner or later.
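
The two inequalities above can be reproduced with a few lines of C++. This is only a sketch: the function names and the sample values are made up for illustration, and it assumes a target where float arithmetic is rounded to IEEE 754 single precision at each step (the usual case with SSE on x86-64). Near 1e8, adjacent floats are 8 apart, so adding 1.0f there is lost to rounding.

```cpp
// Hypothetical helpers, not taken from the original poster's code.
struct OrderDemo {
    float left_to_right;   // (a + b) + c
    float right_to_left;   // (c + b) + a
};

OrderDemo sum_in_both_orders(float a, float b, float c) {
    // Each partial sum is stored in a float, so it is rounded to float precision.
    float ab = a + b;
    float cb = c + b;
    return OrderDemo{ab + c, cb + a};
}

float add_then_subtract(float a, float b) {
    float sum = a + b;   // rounded to float here
    return sum - a;      // mathematically b, but rounding may already have discarded it
}
```

Under that assumption, sum_in_both_orders(1e8f, -1e8f, 1.0f) gives 1 in one order and 0 in the other, and add_then_subtract(1e8f, 1.0f) gives 0 instead of 1.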

Re: Strange differences in float calculations

Originally Posted by evertonland

Dear all,

I'm doing some work with images, and I've come across the strangest thing:

I have an array of float values. If I initialize an element to zero and sum a series of floats directly into that array element, I obtain one result.
On the other hand if I use an auxiliary variable as an accumulator and after summing all values I assign it to the corresponding position in my array, the result is slightly different to the former one.

The issue with floating point calculations has little to do with the compiler you're using, and everything to do with computer science and numerical analysis.

A binary computing machine, i.e. your computer, cannot represent most decimal floating point values exactly. If you know the math, try to represent 0.3 exactly in binary. You can't, and neither can the computer. So an approximation is made. So right there, you're in trouble.

The only decimal fractions that can be represented exactly in binary are those that are finite sums of negative powers of 2 (and short enough to fit in the available bits), e.g. 1/2, 1/4, 3/8, 1/16, etc.

If you juggle floating point calculations around, use temp variables, etc. then you're upsetting the round-off apple cart. That is the bottom line.
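
The point about 0.3 versus sums of negative powers of 2 can be checked with a small test. This is a sketch with a made-up helper name: it narrows a double to float and widens it back, and reports whether the value came through unchanged. Values such as 0.5 and 0.375 (3/8) fit exactly in either type; 0.3 is an approximation in both, and the two approximations differ.

```cpp
// Hypothetical helper: true if narrowing x to float and widening back loses nothing.
bool survives_float_round_trip(double x) {
    float narrowed = static_cast<float>(x);      // rounds to the nearest float
    return static_cast<double>(narrowed) == x;   // widening back is always exact
}
```

survives_float_round_trip(0.375) is true, while survives_float_round_trip(0.3) is false.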

Re: Strange differences in float calculations

Thank you both for the quick replies.

@Lindley: I was aware that there are compiler optimizations that affect the result of float calculations, but I hadn't thought of it optimizing floats and float arrays differently. Thanks for the idea, that might be it. Do you think this could still be the case even though I am compiling with the flag -O0, in order to turn off optimization?

@monarch_dodra: while looking around for the cause of what I'm telling you about I came across the kind of things you mention, but I think mine is stranger. I'll strip it to the bone for you:

Code:

float x = a + b;
float xArray[10];
xArray[0] = a + b;
// Observed: x != xArray[0]

Furthermore, I've checked, just in case, that adding a + b always yields the same result. The difference is in whether I assign it to a standalone float variable or to an element of an array. Is that freaky or what?

Re: Strange differences in float calculations

Originally Posted by evertonland

Again, thank you both very much; your comments were helpful.

My (quick) guess is that one of the calculations is optimized away during compile time, whereas the other one is done at run time. The way the compiler "calculates" a+b might be different from the way the compiled executable does it.

Or it might be something completely unrelated, but that is my guess.

EDIT: Could you provide us with your numeric values and code? I'd like to try it out on my machine.

Re: Strange differences in float calculations

Thanks Paul for your response.

@monarch_dodra: If you want to play around with the code, I'd be happy to let you have it. I just have to check with my boss first. That being said, the code is tied to an image processing library that you'd have to compile yourself. Let me know if your curiosity trumps the bother this represents.

Off-topic noob question: Are there private messages on this forum? I looked at the FAQ but it appears to me that they are disabled.

Re: Strange differences in float calculations

Originally Posted by evertonland

Furthermore, I've checked, just in case, that adding a + b always yields the same result. The difference is in whether I assign it to a standalone float variable or to an element of an array. Is that freaky or what?

Some CPUs (or FPUs) do the calculations in 80-bit precision on the chip, but the compiler stores doubles as 64 bits (and floats as 32 bits, etc.) in RAM.

It is possible that your standalone variable is in fact held in a register at the wider precision, while the array is obviously in RAM and so gets rounded on every store.

Found this out during a project at work; we were going a bit nuts because changing seemingly unrelated code near floating point calculations was changing the results!
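
The register-versus-RAM effect described above can be imitated portably. This is a sketch, not the original poster's code: the function names are made up, and double is used as a stand-in for the wider on-chip format. One version rounds the running total to float after every addition (as when it is written back to a float in memory each time); the other keeps the total wide and rounds once at the end. It assumes float arithmetic is actually carried out in single precision on the target.

```cpp
#include <vector>

// Rounds to float after every addition, like a total that lives in a float in RAM.
float sum_rounding_each_step(const std::vector<float>& values) {
    float total = 0.0f;
    for (float v : values) total += v;
    return total;
}

// Keeps the running total in a wider type and rounds to float only once.
// double is an assumed stand-in for an extended-precision register.
float sum_wide_then_round(const std::vector<float>& values) {
    double total = 0.0;
    for (float v : values) total += v;
    return static_cast<float>(total);
}
```

For the input {16777216.0f, 1.0f, 1.0f} the first version returns 16777216 (each +1 is rounded away, since floats are 2 apart above 2^24) and the second returns 16777218. Same data, same order, different result, purely because of where the rounding happens.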

Re: Strange differences in float calculations

Does declaring floats/doubles as volatile help alleviate that unexpected behaviour? It makes it less likely that values are stored in registers if my understanding is correct.

Define what "unexpected behavior" is. Floating point calculations are by definition inexact. Relying on things like declaring your floats volatile, in the hope that they are never kept in registers, is not only dangerous but throws you into a world of non-portability and machine/compiler/everything assumptions. The slightest change anywhere on your system could make your program explode.

Regardless of what you are doing, when you are using floating point calculations, you have to accept that your results might never be exact, or the same as another calculation that might theoretically provide the same result, but doesn't.

If you do that, then you won't even care that both results aren't the same, because your system was built to work on data that is not 100% accurate.
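
One common way to build in that tolerance is to compare floats with a margin instead of with ==. The helper below is a minimal sketch: the name and the default tolerances are assumptions, and sensible values depend entirely on your data.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper: true if a and b differ by no more than a relative
// tolerance (scaled by the larger magnitude) or a small absolute tolerance.
bool nearly_equal(float a, float b, float rel_tol = 1e-5f, float abs_tol = 1e-8f) {
    float diff = std::fabs(a - b);
    float scale = std::max(std::fabs(a), std::fabs(b));
    return diff <= std::max(rel_tol * scale, abs_tol);
}
```

With these defaults, nearly_equal(16777216.0f, 16777218.0f) is true, so two sums that differ only in the last bit or two are treated as the same result, while nearly_equal(1.0f, 1.001f) is false.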