
I have been fighting with epsilons for quite some time, and I have been hating their ambiguity - one kind of epsilon would have no impact on numbers that are too large, while having an enormous impact on smaller numbers.
What I have tried is to scale my epsilon to match the numbers under comparison.
This is what I came up with:

Edit: This is an altered version compared to the original post, the first was flawed, but this should work.
If needed, EPSILON_DEFINER can be set to something else, if you want a hit on numbers further from each other - or if you want even more precision when 64-bit floats come along. Another way could be to make it a global that the program initializes upon startup...
Ok, this is what I do:
I start by sorting my input values (a_value2 will be the biggest).
Then I check if they are within decent range of each other (I multiply the smallest number by 10 - if the large number is still larger, then they are obviously not equal).
If they are in the same range, then I create an EPSILON based on their size, before I make the standard test (notice, I know that a_value2 is the largest, since I sorted before, therefore I don't need to fabs the subtraction).
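The posted code itself isn't preserved in this archive, but the steps above could be sketched roughly like this (the value of EPSILON_DEFINER and all names here are guesses based on the description, not the actual posted code):

```cpp
#include <cmath>

// Hypothetical reconstruction of the approach described above.
// The value of EPSILON_DEFINER is an assumption.
const float EPSILON_DEFINER = 100000.0f;

// Declared inline, so there is no function-call overhead.
inline void SortValues(float& a_value1, float& a_value2)
{
    // One comparison: make sure a_value2 holds the larger value.
    if (a_value1 > a_value2)
    {
        float temp = a_value1;
        a_value1 = a_value2;
        a_value2 = temp;
    }
}

bool FloatCmp(float a_value1, float a_value2)
{
    SortValues(a_value1, a_value2);   // a_value2 is now the largest

    // Range check: magnitudes more than a factor 10 apart are
    // treated as obviously not equal.
    if (std::fabs(a_value2) > std::fabs(a_value1) * 10.0f)
        return false;

    // Create an epsilon scaled to the size of the values.
    float epsilon = std::fabs(a_value2) / EPSILON_DEFINER;

    // a_value2 >= a_value1 after sorting, so no fabs is needed
    // on the subtraction.
    return (a_value2 - a_value1) < epsilon;
}
```

Note that, as the replies discuss, negative inputs are problematic for this shape of comparison.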
Do you see any flaws?
And how do you evaluate the costs? We are talking three comparisons (one inside SortValues()), one multiplication, one division, and one subtraction. Too expensive?
(Notice that SortValues() is declared inline in its header file, so there is no overhead to the function call).
[Edited by - Mercenarey on January 1, 2005 11:24:46 AM]


First I have to say that I'm not sure exactly what you want. It seems to me like you want a sort of generic float comparison function, but why?

If you are checking whether the temperature at two locations is the same, then you might want to call 0.0 and 0.2 equal. In other situations, for example when 0.0 and 0.2 indicate amperes of current, you would certainly not call them equal. The truth is that the epsilon is different in every situation, and that you should not look at the size of the floating point numbers.

If I am wrong and you're using it for some higher purpose, have a look at this:

http://www.gamedev.net/community/forums/topic.asp?topic_id=291163

As you can see, floating point variables consist of an exponent and a mantissa. The best thing you can do is:

1. Compare the exponents. If they are 2 or more apart, then there's at least a factor 2 difference, and you would call them different (so false), else:
2. Compare the mantissae. Look at their difference; if it is bigger than some integer 'strictness' value (between 0 and 2^23), then they are not equal, else:
3. They are 'equal'.
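A sketch of that recipe, assuming IEEE 754 single precision and using bit operations to pull out the fields (the function and variable names are mine, and the sign bit is ignored for brevity):

```cpp
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Compare the exponent fields, then the mantissa fields, as in the
// three steps above. 'strictness' is an integer between 0 and 2^23.
// Assumes IEEE 754 single precision; the sign bit is ignored.
bool AlmostEqual(float a, float b, std::int32_t strictness)
{
    std::uint32_t bitsA, bitsB;
    std::memcpy(&bitsA, &a, sizeof bitsA);   // well-defined type pun
    std::memcpy(&bitsB, &b, sizeof bitsB);

    std::int32_t expA = (bitsA >> 23) & 0xFF;    // biased 8-bit exponent
    std::int32_t expB = (bitsB >> 23) & 0xFF;
    std::int32_t mantA = bitsA & 0x7FFFFF;       // 23-bit mantissa
    std::int32_t mantB = bitsB & 0x7FFFFF;

    // Step 1: exponents 2 or more apart -> at least a factor 2 difference.
    if (std::abs(expA - expB) >= 2)
        return false;

    // Steps 2 and 3: compare the mantissae against the strictness value.
    return std::abs(mantA - mantB) <= strictness;
}
```

(As the replies point out, values that straddle a power of two, like 1.999999 and 2.000001, break the plain mantissa comparison.)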

As for performance, you can first check whether the difference of the floats' exponents is <= 1, and only then do the rest; it should accelerate things somewhat.

Rant part: I see a flaw in the entire idea of making a universal "FloatCmp". There is no single reliable way to "compare floats for equality". Every "method" will fail in some uses. With bigger numbers you typically get bigger error, but how much it grows depends on what calculations you do. Usually you get a linear dependence, but in some cases you can get quadratic. It is much better if, every time you need to compare floats, you think about what maximal tolerance is acceptable _for the software user_ in that specific case. For example, say you need to check if a point is on a line. It is better to check if the point is inside some reasonable cone (for example, a cone that corresponds to pixel size), and if that fails, you need bigger precision than floats...

In most of my programs, I compare floats with a certain tolerance that depends on the specific case. For example, I do not need big precision for pixel color values.

Float comparison is a very "unnatural" operation; you should be able to avoid it. Usually you don't really need to compare floats, but need to compare against a specific well-known range. If you don't know the range, it might be a sign that you are doing something conceptually wrong.

Edit: as for exponents, it's almost as basananas said, except that you can't just compare mantissas. Example: 1.999999 and 2.000001 have different exponents, and therefore their mantissas are very different too. The first is something like 1.1111111... * 2^0, the second like 1.0000000... * 2^1, so the stored mantissa bits differ maximally even though the values are nearly equal. Branching is costly, so handling such cases specially is not really much faster.
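That straddling case is easy to check with frexp, which splits a float into a normalized mantissa in [0.5, 1) and a binary exponent (helper names here are illustrative):

```cpp
#include <cmath>

// Split out the binary exponent of a float: 1.999999 and 2.000001
// are nearly equal, yet they land in different binades, so both the
// exponents and the normalized mantissae come out very different.
int BinaryExponent(float x)
{
    int exponent;
    std::frexp(x, &exponent);   // frexp returns mantissa in [0.5, 1)
    return exponent;
}

float NormalizedMantissa(float x)
{
    int exponent;
    return std::frexp(x, &exponent);
}
```

Here BinaryExponent(1.999999f) is 1 with a mantissa near 1.0, while BinaryExponent(2.000001f) is 2 with a mantissa near 0.5.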

Edit: and strictly speaking, the float epsilon is a more or less well-defined thing, used in iterative algorithms to stop when the precision is good enough and will not improve anymore.


Quote:

Original post by basananas
1. Compare the exponents. If they don't match then there's at least a factor 2 difference, and you would call them different, else:

Not true, the difference can be much smaller than a factor 2. For instance, 1.0 and 0.99999999 have different exponents.

Yeah, that's true. You'll have to look for a difference of at least 2, and if the difference is smaller, you'll have to look at the mantissa of both of the values.

I've written an example for you (Mercenarey that is :)) Use it, tear it apart and do whatever you want to do with it.

```cpp
#include <iostream>
using namespace std;

#define STRICTNESS 1.2
// The strictness function check returns true if:
//   (A / B) < STRICTNESS, in which A is larger than B
// and A and B are the two compared floating points.
// STRICTNESS may be any value between 1.0 and 2.0.
```


With two large numbers, cc will be 1, even though the difference between the numbers is extremely small (percentage-wise), and your small epsilon will detect them as being different, even if you might wish to evaluate them as being equal.

Similarly, if you compare these:

```cpp
float aa = 0.000000000000f;
float bb = 0.000000000001f;

float cc = fabs(aa - bb);
```

cc will be extremely small, and you would need an even smaller epsilon to detect a difference. Your simple epsilon of 0.000001f will have no chance to see a difference - it will first see one when the numbers are a million times bigger! That is the epsilon problem - you cannot use the same epsilon for all magnitudes. And that is what I try to solve with my code.

For some numbers a fixed epsilon will be far too small, for others it will be far too big.
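Both failure modes of a fixed epsilon can be demonstrated together; here 0.000001f is the epsilon under discussion:

```cpp
#include <cmath>

// The naive comparison: one fixed epsilon for all magnitudes.
const float EPSILON = 0.000001f;

bool NaiveCmp(float a, float b)
{
    return std::fabs(a - b) < EPSILON;
}
```

NaiveCmp(10000000.0f, 10000001.0f) is false even though the values differ by only one part in ten million, while NaiveCmp(0.0f, 0.0000000001f) is true even though one value is zero and the other is not.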

basananas: Thanks a lot! Those ways to evaluate mantissa/exponent were just what I was looking for :)


Quote:

Original post by Dmytry
Branching is costly, so handling such cases specially is not really much faster.

Actually branching is quite fast nowadays. Maybe on an 8080 it may have been very costly, but now it's really cheap (read up on branch prediction).

Actually, on 8080 branching cost much less than on a P4 if you count the cost in additions (that is, how many additions take the same time). Also, on a P4 branching is more costly than on a P3, etc. Read about pipelines. Branch prediction doesn't always work; it often works with 50/50 chances.


AP: the first code, the one that uses "max", will not work. The second will work. In this thread, the probability of getting working code = 50/50 [grin]

Quote:

Original post by Mercenarey
DmyTry: Lol, good comment about the negatives :) I corrected my code, it should work with negatives now.

You can cut/paste from the original post, if you wanna try. Sorry for the inconvenience :)

I don't cut-and-paste, I just see the mistakes.

FloatCmp(-0.000000001,0.0000000001);

You misunderstood me. "but i suggest you not to use such things" does not mean you need to write your own "universal comparison routine" that fails. You just need to drop the idea of a universal float comparison routine. I doubt you have good reasons to do that comparison. If you still do not want to drop it, better use my code (with appropriate abs_tolerance and rel_tolerance); it is really much faster, cleaner, safer, etc. (Note that you should avoid division, as it is slow.)
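Dmytry's code itself isn't quoted in this archive; a comparison of the shape he describes, with both an abs_tolerance and a rel_tolerance and no division, might look like this (illustrative, not the original code):

```cpp
#include <algorithm>
#include <cmath>

// Equal if the difference is within an absolute floor, OR within a
// fraction of the larger magnitude. No division, and negatives are
// handled by fabs.
bool CloseEnough(float a, float b,
                 float abs_tolerance, float rel_tolerance)
{
    float diff = std::fabs(a - b);
    if (diff <= abs_tolerance)
        return true;
    float largest = std::max(std::fabs(a), std::fabs(b));
    return diff <= largest * rel_tolerance;
}
```

For example, the troublesome call CloseEnough(-0.000000001f, 0.0000000001f, 0.0000001f, 0.0000001f) comes out true, while CloseEnough(-1.0f, -100.0f, 0.0000001f, 0.0000001f) is false.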

Your code still sucks, sorry.

Note that with negatives, if you call FloatCmp(-1, -100); then after sorting -1 is the largest, and the check if (fabs(a_value2) > fabs(a_value1) * 10.0f) return false; just serves no purpose. Note that FloatCmp(-0.00000001, 0); is true, but FloatCmp(0.00000001, 0); and FloatCmp(-0.00000001, -0.0000000001); are false.


Quote:

Original post by Dmytry
Actually, on 8080 branching cost much less than on a P4 if you count the cost in additions (that is, how many additions take the same time). Also, on a P4 branching is more costly than on a P3, etc. Read about pipelines. Branch prediction doesn't always work; it often works with 50/50 chances.

I've already read about branch prediction, and it's quite good.

Although the pipeline gets in the way (a lot), conditional jumps are pretty fast. It just depends on how you use them.


DmyTry: Yes, I know that my code sucks. I didn't test it well enough before posting it here. But it still works with all those examples that you brought up where you say it would fail.

I see other problems with this approach, however: when comparing very small numbers, the epsilon will always be a lot smaller, and therefore small numbers will always compare as not equal (unless they are 100% equal). And that is a problem, considering that often you will be interested in having small numbers be equal...

What I asked for was an evaluation of this approach, and we got that. A debate about different problems and possible solutions can't be bad - it can only clarify. And our discussion has made me see a lot of problems with my approach.

But since we have already established that my code has weaknesses, why can't we just forget it and look at the concept? There is no reason to keep pointing out that the code sucks - especially since this is still an interesting conceptual discussion.

--------------

I already have a FloatCmp that takes a provided epsilon; why do you want me to write that one more time? The idea was that I didn't have to worry about epsilons and tolerances. What you suggest would just put me back on that track again.


In the code I posted, you can set

abs_tolerance = 0.0000001;
rel_tolerance = 0.0000001;

and it will work correctly with small numbers too.

Actually, I'm looking at this from a physics point of view. Values are almost never exactly equal, but you can consider them equal if they are closer than the error of your instruments. "Close enough" usually means the difference of the values, either as-is or on a logarithmic scale. The first corresponds to abs_tolerance, the second to rel_tolerance. If you *might* have negative numbers, relative tolerance is usually meaningless.

That is, if you measure temperatures in degrees Kelvin, rel_tolerance is meaningful: the error between 0.9 K versus 1 K and 9 K versus 10 K is similar. In Celsius or Fahrenheit, not so.
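For instance, a relative-only check (illustrative, with a 15% tolerance) gives both Kelvin pairs the same verdict:

```cpp
#include <algorithm>
#include <cmath>

// Relative tolerance only: are a and b within rel_tolerance of the
// larger magnitude? Sensible for Kelvin, where 0 is a true zero.
bool WithinRelTolerance(float a, float b, float rel_tolerance)
{
    float diff = std::fabs(a - b);
    float largest = std::max(std::fabs(a), std::fabs(b));
    return diff <= largest * rel_tolerance;
}
```

WithinRelTolerance(0.9f, 1.0f, 0.15f) and WithinRelTolerance(9.0f, 10.0f, 0.15f) are both true, even though the absolute differences are 0.1 and 1.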


Quote:

Original post by Nice Coder
I've already read about branch prediction, and it's quite good. Although the pipeline gets in the way (a lot), conditional jumps are pretty fast. It just depends on how you use them.

Branch prediction is good when the condition is predictable. People are talking about branching on the relative size of exponents of floats in a low-level library. That sounds like Russian roulette to me. Conditional moves are fast though, and can be used for the min/max stuff.

As Dmytry said, trying to compare to within a certain percent is futile; it depends on the situation. You should know at call time to what tolerance you need them compared, so you pass it to the function (0.000001f is just a default tolerance). There is no way to determine an appropriate tolerance given only the two values: you need knowledge of the application. Thus, the programmer must pass in an appropriate tolerance. Your battle to calculate an ideal epsilon from the input values is pointless. KISS (Keep It Simple, Stupid).
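So the simple version just takes the tolerance as a parameter, with the 0.000001f default mentioned above (the function name is illustrative):

```cpp
#include <cmath>

// The caller decides what 'equal' means in their situation;
// 0.000001f is only a default, not a universal answer.
bool Equal(float a, float b, float tolerance = 0.000001f)
{
    return std::fabs(a - b) <= tolerance;
}
```

For the temperature example earlier in the thread, Equal(0.0f, 0.2f, 0.5f) is true; for the amperes example, Equal(0.0f, 0.2f, 0.01f) is false. Same values, different verdicts, because the tolerance belongs to the caller.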

This code works with negatives.

[edit]

Quote:

I already have a FloatCmp with a provided epsilon, why do you want me to write that one more time? The idea was, that I didn't have to worry about epsilons and tolerances. What you suggest would just put me back on that track again.

Unfortunately, any solution which abstracts away the epsilon/tolerance is never going to be ideal for all circumstances.