results in winGRASS7 32bit vs 64bit: should they be identical?

Description

>>Like this you check if the difference is very small and below the
>>acceptable threshold.
>>In C, we use GRASS_EPSILON for these comparisons.
>
> now done with GRASS_EPSILON 1.0e-15
>
> r.mapcalc expression="difftestepsilon = if(abs(accum32bit@accumtest - accum64bit@accumtest) <= 1.0e-15, 1, 100)"
>
> attached a screenshot of this map
>
> accum_diff_wingrass32bitvs64bit.png
> <http://osgeo-org.1560.x6.nabble.com/file/n5243286/accum_diff_wingrass32bitvs64bit.png>
Mhh..
Can you open a ticket with all relevant bits + the difference
screenshot along with a legend and "r.univar -e" output?

Tested with the NC sample data set on the same 64bit Win7 box, running both the 32bit and the 64bit winGRASS binaries.

All these tests with 32bit and 64bit binaries were done on the same machine. I find it interesting that some calculations are identical between 32bit and 64bit (e.g. slope and aspect), while others are not (e.g. accumulation by r.watershed or r.Topic).

Though, the diff values are very low; shouldn't the results be treated as identical?

Doesn't this qualify for "basically identical results as much as different architectures can achieve"?

In theory, it should be possible to obtain identical results on any architecture using IEEE-754 floating-point for calculations using only +, -, *, / and sqrt() (no transcendental functions such as sin, cos, log, exp, etc). But that may have a noticeable performance cost (e.g. using -ffloat-store to discard excess precision, forgoing certain optimisations, etc).

Regarding the use of an "epsilon":

The accuracy of a floating-point value is proportional to its magnitude. For IEEE-754, FLT_EPSILON is ~1.192e-7 while DBL_EPSILON is ~2.22e-16. This represents the difference between 1.0 and the smallest distinct, representable value greater than 1.0.

Consequently, it provides a measure of the worst-case rounding error for values between 1.0 (inclusive) and 2.0 (exclusive). For round-to-nearest, the worst-case rounding error will be plus or minus half that value. For larger or smaller values, the maximum rounding error will be scaled accordingly. I.e. a typical "tolerant comparison" would be along the lines of

if (fabs(a-b) < N * epsilon * max(fabs(a),fabs(b))) ...

Rounding error occurs at each stage in the computation. The expected absolute error for a complex calculation will be at least the sum of the absolute errors for the individual steps.

When subtractions are involved, you can't rely upon the absolute error in the result being proportional to the magnitude of the result, as subtracting similar values will produce a result whose magnitude may be much smaller than either of the individual values, but whose absolute error is proportional to that of the individual values.

(A concrete example of that last point is that a single-pass variance calculation using sum-of-squares minus sum-squared can yield a negative value for the variance if the deviations are small relative to the mean).

Other common cases where the accuracy of the result can be far worse than that of intermediate results or the inputs include finding roots of high-degree polynomials (an extreme example) and acos(x) where x is close to 1.0 (as x tends to 1.0, the derivative of acos(x) tends to infinity while acos(x) itself tends to zero; the first magnifies the absolute error while the second further magnifies the relative error).

> Doesn't this qualify for "basically identical results as much as different architectures can achieve"?
>
> In theory, it should be possible to obtain identical results on any architecture using IEEE-754 floating-point for calculations using only +, -, *, / and sqrt() (no transcendental functions such as sin, cos, log, exp, etc). But that may have a noticeable performance cost (e.g. using -ffloat-store to discard excess precision, forgoing certain optimisations, etc).

The results are thus as precise as requested per GCC compiler options for the given architecture. The same GCC compiler options can lead to different results under different architectures.

Generally, with limited fp precision,

(a * b) / c != a * (b / c)

I tested this with r.watershed and got differences in the range also observed between 32bit and 64bit architectures. The differences between architectures are thus within the expected range of floating-point results. Identical results for different architectures could maybe be obtained with gcc -fexcess-precision=standard -std=c99.

In other words, the results obtained with WinGRASS 32bit and WinGRASS 64bit are both correct, even if they differ from each other. The results are as precise as possible / as requested.

> The results are thus as precise as requested per GCC compiler options for the given architecture. The same GCC compiler options can lead to different results under different architectures.

There should be some set of options which produce the correct result on any architecture which uses IEEE-754.

> Generally, with limited fp precision,
>
> (a * b) / c != a * (b / c)

This shouldn't be relevant. The source code won't change between architectures, and gcc shouldn't perform such approximations unless specifically requested via -ffast-math or similar (apparently -Ofast enables this, but none of the numbered optimisation levels do).

The main issue which occurs with the default options is excess precision. The x87 FPU uses 80 bits by default; you need to use e.g. -ffloat-store or -fexcess-precision=standard to force the result to be rounded to 64 bits. SSE always rounds to 64 bits.
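For reference, the flags discussed above might be combined as follows (a sketch; the file names are placeholders, and the exact set needed depends on compiler version and target):

```shell
# 32-bit x86 build: round every intermediate x87 result to 64-bit double,
# as C99 semantics require, instead of keeping 80-bit excess precision.
gcc -std=c99 -fexcess-precision=standard -O2 -o mytool mytool.c

# Alternative: bypass the x87 FPU and use SSE2 (the default on x86-64),
# which always rounds results to the declared type.
gcc -msse2 -mfpmath=sse -O2 -o mytool mytool.c
```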

> In other words, the results obtained with WinGRASS 32bit and WinGRASS 64bit are both correct, even if they differ from each other. The results are as precise as possible / as requested.

The issue is how you test for correctness. You can either build for deterministic calculation so that exact equality comparisons will work, or you can use a tolerance. The latter option requires case-by-case analysis (see point 4 in comment:9).

Some tests show that some examples in the manuals (modules and addons) seem not to work due to this issue in winGRASS 7 64-bit.