Hi,
I wrote:
> I have prepared two patches: The first marks the test skipped-on non-x86
> and adds a comment as to why, and the second one adds another test,
> skipped-on x86, that tests the libm range-reduction.
I have just committed these two. I am curious to learn if any
combinations of backend and operating system fail the second test
(the one added with commit 4de15256dd6387e5) as I tested it only
on x86-64 under Linux.
Greetings,
Lutz

Thread view

Hi,
the mentioned test is tagged as "Expected failure", currently only on
x86-64 (but also fails on PPC, see lp#964348 -- I agree with Nikodemus
that that is expected).
I am getting earnestly annoyed by this, so I investigated. It turns out
things are completely different.
The test is about range reduction for trigonometric functions on large
float arguments, like (sin (expt 2d0 70)). It was introduced with commit
ad9090dc91fc922a with a fix for lp#327192 where Gabor had noticed that
on x86 the above expression returns 0d0 and Paul committed a change
introducing range reduction into the relevant VOPs on x86.
In short: The test is bogus. Even on x86 it only doesn't fail due to an
(un)lucky choice of test values. Trigonometric functions on large float
arguments on x86 still deliver completely wrong results. The x86-64 port
gets everything right but the test punishes that.
Basically, x86 gets things wrong because it does range reduction (even
for arguments below (expt 2 63), it seems) modulo some limited-precision
approximation of 2 pi while libm uses an algorithm that delivers results
as if the infinitely precise pi was used to reduce mod 2 pi.
So what do we want:
- Document that x86 is imprecise.
- Completely revert the fix for lp#327192 on the account that delivering
0 or some random value are both equally bad (the bug that "tan"
crashes can be fixed independently; there is only an "fldz" missing).
- Change the test to test for correct results and mark x86 as failing.
- Drop the test as nobody would be using arguments this large to
trigonometric functions anyway.
- Implement a better range reduction on x86 (in Lisp, presumably) and
use it for arguments above (expt 2 20) or so.
- Convert x86 to use libm for these functions.
Any comments appreciated!
Some details:
Here are some results of (sin (expt 2d0 n)) on SBCL x86, respectively
(coerce (sin (expt 2l0 n)) 'double-float) on Clisp with the long-float
precision set to 3000 bits (the results of the first expression on SBCL
x86-64 are equal to Clisp's results and I assume that both are
mathematically correct within one ULP or so):
n = 20:
SBCL x86: 0.33049314002173596d0
Clisp: 0.3304931400217347d0
n = 40:
SBCL x86: -0.4057050128266265d0
Clisp: -0.40570501153282873d0
n = 70:
SBCL x86: 0.936042960761405d0
Clisp: -0.9981794021933068d0
So, with n = 70, x86 does not even get the sign of the result right.
Here is why the test only accidentally works on x86: For several values
it compares for example (sin value) with (sin (mod value (* 2 pi))). It
turns out that for the only value large enough to have a chance to make
"almost=" in the test fail (in light of the size of the errors seen
above), namely (* pi (expt 2d0 70)), the modular reductions by mod (Lisp)
and by fprem/fsin (x87) yield the same, namely 0d0:
(All following expressions are evaluated on x86)
* (sin (* pi (expt 2d0 70)))
0.0d0
* (mod (* pi (expt 2d0 70)) (* 2 pi))
0.0d0
* (sin (mod (* pi (expt 2d0 70)) (* 2 pi)))
0.0d0
But for some other values these are not equal (or "almost=") anymore
(and still both completely wrong, I need not mention):
* (let* ((value (* 1.625d0 pi (expt 2d0 70)))
(mod (mod value (* 2 pi))))
(values (sin value) mod (sin mod)))
0.1676180272408028d0
0.0d0
0.0d0
* (let* ((value (* 1.6d0 (expt 2d0 70)))
(mod (mod value (* 2 pi))))
(values (sin value) mod (sin mod)))
0.5720626950150568d0
262144.0d0
-0.08410702780950105d0
Greetings,
Lutz

On 28 May 2012 18:36, Lutz Euler <lutz.euler@...> wrote:
> - Document that x86 is imprecise.
To the degree that we can't fix it (or haven't yet fixed tis), we
should do this.
> - Completely revert the fix for lp#327192 on the account that delivering
> 0 or some random value are both equally bad (the bug that "tan"
> crashes can be fixed independently; there is only an "fldz" missing).
Adding the missing fldz sounds right TRT.
> - Change the test to test for correct results and mark x86 as failing.
Sounds right.
> - Drop the test as nobody would be using arguments this large to
> trigonometric functions anyway.
Someone ran across it... so I think there's some experimental proof to
the contrary.
> - Implement a better range reduction on x86 (in Lisp, presumably) and
> use it for arguments above (expt 2 20) or so.
> - Convert x86 to use libm for these functions.
I think I would prefer to use libm, but I foresee that even if we use
libm elsewhere, we may need to do something else on Windows -- but
let's not hope so...
Cheers,
-- Nikodemus

On Mon, May 28, 2012 at 5:36 PM, Lutz Euler <lutz.euler@...> wrote:
> The test is about range reduction for trigonometric functions on large
> float arguments, like (sin (expt 2d0 70)). It was introduced with commit
> ad9090dc91fc922a with a fix for lp#327192 where Gabor had noticed that
> on x86 the above expression returns 0d0 and Paul committed a change
> introducing range reduction into the relevant VOPs on x86.
>
> In short: The test is bogus. Even on x86 it only doesn't fail due to an
> (un)lucky choice of test values. Trigonometric functions on large float
> arguments on x86 still deliver completely wrong results. The x86-64 port
> gets everything right but the test punishes that.
> Basically, x86 gets things wrong because it does range reduction (even
> for arguments below (expt 2 63), it seems) modulo some limited-precision
> approximation of 2 pi.
The thing is, branching on C2 and going through FPREM/2Pi is exactly
what Intel recommends in their ISA reference: "It is up to the program
to check the C2 flag for out-of-range conditions. Source values
outside the range -263 to +263 can be reduced to the range of the
instruction by subtracting an appropriate integer multiple of 2π or by
using the FPREM instruction with a divisor of 2π." (I can't tell what
section 8.3.8 of volume 1 means when we use FLDPI's 66 bit constant
directly). If it's good enough for the official docs, I believe that's
good enough for SBCL as well. I think any issue might stem from our
using x87 in 64 bit mode.
> - Completely revert the fix for lp#327192 on the account that delivering
> 0 or some random value are both equally bad (the bug that "tan"
> crashes can be fixed independently; there is only an "fldz" missing).
Disagree. There's a lot of quibbling to be had wrt whether a FP value
denotes a point or a range, but it's pretty clear to me that I should
be able to depend on sin^2 + cos^2 = 1.
> - Change the test to test for correct results and mark x86 as failing.
It's a different test than what was originally intended, but sure.
Paul Khuong

Hi Paul,
> On Mon, May 28, 2012 at 5:36 PM, Lutz Euler <lutz.euler@...> wrote:
> > The test is about range reduction for trigonometric functions on large
> > float arguments, like (sin (expt 2d0 70)). It was introduced with commit
> > ad9090dc91fc922a with a fix for lp#327192 where Gabor had noticed that
> > on x86 the above expression returns 0d0 and Paul committed a change
> > introducing range reduction into the relevant VOPs on x86.
> >
> > In short: The test is bogus. Even on x86 it only doesn't fail due to an
> > (un)lucky choice of test values. Trigonometric functions on large float
> > arguments on x86 still deliver completely wrong results. The x86-64 port
> > gets everything right but the test punishes that.
> > Basically, x86 gets things wrong because it does range reduction (even
> > for arguments below (expt 2 63), it seems) modulo some limited-precision
> > approximation of 2 pi.
>
> The thing is, branching on C2 and going through FPREM/2Pi is exactly
> what Intel recommends in their ISA reference: "It is up to the program
> to check the C2 flag for out-of-range conditions. Source values
> outside the range -263 to +263 can be reduced to the range of the
> instruction by subtracting an appropriate integer multiple of 2π or by
> using the FPREM instruction with a divisor of 2π." (I can't tell what
> section 8.3.8 of volume 1 means when we use FLDPI's 66 bit constant
> directly).
I understand.
> If it's good enough for the official docs, I believe that's
> good enough for SBCL as well.
That is fine with me. Unfortunately, x86-64, using libm, calculates
completely different results for large inputs, which have IMO at least
the same right to be considered good enough as what x87 calculates.
(I might even say: Unfortunately, time has passed, the restrictions
necessary when x87 was conceived no longer apply (like wanting to use
only a coarse approximation to pi), so libm takes a much improved
approach and yields correct results even where x87 is completely off.)
I started this thread saying that I am annoyed. That I am only about the
fact that x86-64 fails this test. I don't mind so much what x86 does.
> > - Completely revert the fix for lp#327192 on the account that delivering
> > 0 or some random value are both equally bad (the bug that "tan"
> > crashes can be fixed independently; there is only an "fldz" missing).
>
> Disagree. There's a lot of quibbling to be had wrt whether a FP value
> denotes a point or a range, but it's pretty clear to me that I should
> be able to depend on sin^2 + cos^2 = 1.
Fine with me. The list of possible changes I sent was not meant to mean
"I want all this changed" but more "Which subset of these changes would
be sensible?".
> > - Change the test to test for correct results and mark x86 as failing.
>
> It's a different test than what was originally intended, but sure.
In light of what you wrote I don't insist on regarding x86's behaviour
as "failing".
Maybe we should just make the expected test results dependent on whether
the platform is x86 or not. With a bit of multi-precision arithmetic I
can build a reducer modulo 2 pi with sufficient precision to replicate
the results of the libm range-reduction and thus test that.
Just now I see that you enabled the test originally only for x86.
So another solution would be just to make the test x86-only again?
Greetings,
Lutz

On Tue, May 29, 2012 at 2:32 PM, Lutz Euler <lutz.euler@...> wrote:
> Maybe we should just make the expected test results dependent on whether
> the platform is x86 or not. With a bit of multi-precision arithmetic I
> can build a reducer modulo 2 pi with sufficient precision to replicate
> the results of the libm range-reduction and thus test that.
>
> Just now I see that you enabled the test originally only for x86.
>
> So another solution would be just to make the test x86-only again?
I think marking the test as requiring x86-only weirdness by whatever
means would be fine.
Long-term, +1 to just relying on libm everywhere. If some
sufficiently motivated person wants to enable fsin and friends for
appropriately-DECLARE'd code, that would be acceptable. (glibc has
basically done this.) If we run into imprecise platform libm's, it
should be more obvious with such a scheme. (Until some sufficiently
motivated person writes the appropriate routines in pure CL, of
course. ;)
-Nathan

Nathan Froyd wrote,
> On Tue, May 29, 2012 at 2:32 PM, Lutz Euler <lutz.euler@...> wrote:
> > Maybe we should just make the expected test results dependent on whether
> > the platform is x86 or not. With a bit of multi-precision arithmetic I
> > can build a reducer modulo 2 pi with sufficient precision to replicate
> > the results of the libm range-reduction and thus test that.
> >
> > Just now I see that you enabled the test originally only for x86.
> >
> > So another solution would be just to make the test x86-only again?
>
> I think marking the test as requiring x86-only weirdness by whatever
> means would be fine.
I have prepared two patches: The first marks the test skipped-on non-x86
and adds a comment as to why, and the second one adds another test,
skipped-on x86, that tests the libm range-reduction.
If no one objects, I will commit both in a few days.
Kind regards,
Lutz

Hi,
I wrote:
> I have prepared two patches: The first marks the test skipped-on non-x86
> and adds a comment as to why, and the second one adds another test,
> skipped-on x86, that tests the libm range-reduction.
I have just committed these two. I am curious to learn if any
combinations of backend and operating system fail the second test
(the one added with commit 4de15256dd6387e5) as I tested it only
on x86-64 under Linux.
Greetings,
Lutz