@hplus0603: seems to me we just had some communication issues and there actually is no noteworthy fundamental disagreement. Only a bit of semantic bickering ... right?

-----------------------------------------

Anyway, let me try to clarify:

"No. Floating point arithmetic is not really deterministic."

I.e. while floating point arithmetic, as far as actual CPU instructions go, is deterministic (given the same FPU state etc.), no-one really writes in CPU instructions. A higher-level language is used, like C++ ... which means the compiler is the one writing the instructions, and arithmetic written in such a language is not guaranteed to be deterministic. For example:

* FPU states differ between x86, x64 and ia64 (two of which everyone should care about).
* Different compilers (including different versions of the same compiler) are free to do things differently.
* VC specific: "/fp:precise", which is absolutely required to get any determinism to begin with, disallows meddling with the FPU state and is still not a guarantee (there are plenty of "subject to change" notices around it).

Also, the statement you made as quoted right here is clearly not factual.

That depends entirely on how you choose to understand it. Did my clarification help? I.e. what I meant IS factual. Without listing a bunch of other qualifications (a specific target architecture / strict compiler restrictions / etc.), which I did not, the statement "Floating point arithmetic is not really deterministic." is factual.

Sure sounds like you agree with it - leaving the semantic confusion aside.

That's funny, given that the original Perlin noise was specified in terms of byte-size integer values, and it has been implemented on GPUs with 8 bits per component fixed-point colors. It can also be implemented with MMX or plain integer registers in fixed point with good performance. Thus, I have to also say that I don't agree that fixed-point implementations of Perlin noise are slower than floating point.

Like I said, whether or not he uses floating point in his Perlin noise implementation is irrelevant. I could not have known - I used it only as a starting point for my warning about floating point usage. Your choice of examples is unfair, irrelevant and not something I agree with, but I am not willing to delve into it, as all of this is tangential.

Your list omitted the primary source of inconsistencies (which is odd, as I specifically mentioned it in my post): stack vs. register usage.

No, I explicitly included it:

If some CPUs run with 80 bits internal precision, and others run with 64 bits internal precision, then they will not be deterministic. You have to avoid that, too. Set the floating point control word (rounding, precision) to a known value before you start computation to control for this.

If the intermediate value "snafu" is stored on the stack then it must be rounded down (64->32), but no rounding is done if a register was used. To work around this problem:

* One must be aware that there is a problem to begin with - which is why I posted my warning.
* (VC specific) Use the "/fp:precise" compiler flag to enforce rounding after every calculation (so it won't matter whether a register or memory is used). It costs a bit of performance, but is absolutely necessary.
* Be aware that meddling with the floating point control word is then not allowed; you are stuck with whatever is used (the compiler is allowed to silently ignore the _controlfp call). IIRC, starting with win64 platforms, changing the internal precision from 64 bits is privileged and not allowed at all ... but my memory is a bit foggy about the specifics there.
* Be aware that you might now be bound to a specific compiler and its version for the lifetime of the program.

PS. Obviously, all memory variables are affected, not just those on the stack.

----------------

edit: yeah, I remembered correctly, changing the internal precision is not allowed on x64 platforms at all, and it triggers a crash:

On the x64 architecture, changing the floating-point precision is not supported. If the precision control mask is used on that platform, the invalid parameter handler is invoked, as described in Parameter Validation.

Yes, it absolutely does. If you use doubles for your variables, and doubles for your x87 registers, then whether you spill to stack or not matters on Intel x86 CPUs running in 80-bit precision mode, but does not matter on AMD CPUs, because they silently truncate 80-bit precision mode to 64-bit precision mode. This is the main reason why intermediate stores/spills matter for determinism: you can't control which CPU vendor your user chooses. (You can set the internal precision mode to 64 bits to match AMD and Intel, though.)

Spilling a float variable to stack so it truncates is not a problem for determinism, because it will do the same thing every time on all machines that run the same code. Your arguments seem to revolve around the fact that trying to run different versions of a binary together will not generate deterministic results. Similarly, trying to run "the same" program on different architectures, or with different compilers, also will not generate deterministic behavior. Those are true, and my argument is that, even when you control for those circumstances, you will run into trouble because of subtler behavior (the Intel/AMD difference, the DLL-load-sets-fpctlword problem on Windows, the signal-handlers-reset-control-words problem on Linux, etc.)