Comments

From: Joseph Myers <joseph@codesourcery.com>
The e500 SPE floating-point emulation code has several problems in how
it handles conversions to integer and fixed-point fractional types.
There are the following 20 relevant instructions. These can convert
to signed or unsigned 32-bit integers, either rounding towards zero
(as correct for C casts from floating-point to integer) or according
to the current rounding mode, or to signed or unsigned 32-bit
fixed-point values (values in the range [-1, 1) or [0, 1)). For
conversion from double precision there are also instructions to
convert to 64-bit integers, rounding towards zero, although as far as
I know those instructions are completely theoretical (they are only
defined for implementations that support both SPE and classic 64-bit,
and I'm not aware of any such hardware even though the architecture
definition permits that combination).
#define EFSCTUI 0x2d4
#define EFSCTSI 0x2d5
#define EFSCTUF 0x2d6
#define EFSCTSF 0x2d7
#define EFSCTUIZ 0x2d8
#define EFSCTSIZ 0x2da
#define EVFSCTUI 0x294
#define EVFSCTSI 0x295
#define EVFSCTUF 0x296
#define EVFSCTSF 0x297
#define EVFSCTUIZ 0x298
#define EVFSCTSIZ 0x29a
#define EFDCTUIDZ 0x2ea
#define EFDCTSIDZ 0x2eb
#define EFDCTUI 0x2f4
#define EFDCTSI 0x2f5
#define EFDCTUF 0x2f6
#define EFDCTSF 0x2f7
#define EFDCTUIZ 0x2f8
#define EFDCTSIZ 0x2fa
The emulation code, for the instructions that come in variants
rounding either towards zero or according to the current rounding
direction, uses "if (func & 0x4)" as a condition for using _FP_ROUND
(otherwise _FP_ROUND_ZERO is used). The condition is correct, but the
code it controls isn't. Whether _FP_ROUND or _FP_ROUND_ZERO is used
makes no difference, as the effect of those soft-fp macros is to round
an intermediate floating-point result using the low three bits (the
last one sticky) of the working format. As these operations are
dealing with a freshly unpacked floating-point input, those low bits
are zero and no rounding occurs. The emulation code then uses the
FP_TO_INT_* macros for the actual integer conversion, with the effect
of always rounding towards zero; for rounding according to the current
rounding direction, it should be using FP_TO_INT_ROUND_*.
The instructions in question have semantics defined (in the Power ISA
documents) for out-of-range values and NaNs: out-of-range values
saturate and NaNs are converted to zero. The emulation does nothing
to follow those semantics for NaNs (the soft-fp handling is to treat
them as infinities), and messes up the saturation semantics. For
single-precision conversion to integers, (((func & 0x3) != 0) || SB_s)
is the condition used for doing a signed conversion. The first part
is correct, but the second isn't: negative numbers should result in
saturation to 0 when converted to unsigned. Double-precision
conversion to 64-bit integers correctly uses ((func & 0x1) == 0).
Double-precision conversion to 32-bit integers uses (((func & 0x3) !=
0) || DB_s), with correct first part and incorrect second part. And
vector float conversion to integers uses (((func & 0x3) != 0) ||
SB0_s) (and similar for the other vector element), where the sign bit
check is again wrong.
The incorrect handling of negative numbers converted to unsigned was
introduced in commit afc0a07d4a283599ac3a6a31d7454e9baaeccca0. The
rationale given there was a C testcase with cast from float to
unsigned int. Conversion of out-of-range floating-point numbers to
integer types in C is undefined behavior in the base standard, defined
in Annex F to produce an unspecified value. That is, the C testcase
used to justify that patch is incorrect - there is no ISO C
requirement for a particular value resulting from this conversion -
and in any case, the correct semantics for such emulation are the
semantics for the instruction (unsigned saturation, which is what it
does in hardware when the emulation is disabled).
The conversion to fixed-point values has its own problems. That code
doesn't try to do a full emulation; it relies on the trap handler only
being called for arguments that are infinities, NaNs, subnormal or out
of range. That's fine, but the logic ((vb.wp[1] >> 23) == 0xff &&
((vb.wp[1] & 0x7fffff) > 0)) for NaN detection won't detect negative
NaNs as being NaNs (the same applies for the double-precision case),
and subnormals are mapped to 0 rather than respecting the rounding
mode; the code should also explicitly raise the "invalid" exception.
The code for vectors works by executing the scalar float instruction
with the trapping disabled, meaning at least subnormals won't be
handled correctly.
As well as all those problems in the main emulation code, the rounding
handler - used to emulate rounding upward and downward when not
supported in hardware and when no higher priority exception occurred -
has its own problems.
* It gets called in some cases even for the instructions rounding to
zero, and then acts according to the current rounding mode when it
should just leave alone the truncated result provided by hardware.
* It presumes that the result is a single-precision, double-precision
or single-precision vector as appropriate for the instruction type,
determines the sign of the result accordingly, and then adjusts the
result based on that sign and the rounding mode.
- In the single-precision cases at least the sign determination for
an integer result is the same as for a floating-point result; in
the double-precision case, converted to 32-bit integer or fixed
point, the sign of a double-precision value is in the high part of
the register but it's the low part of the register that has the
result of the conversion.
- If the result is unsigned fixed-point, its sign may be wrongly
determined as negative (does not actually cause problems, because
inexact unsigned fixed-point results with the high bit set can
only appear when converting from double, in which case the sign
determination is instead wrongly using the high part of the
register).
- If the sign of the result is correctly determined as negative, any
adjustment required to change the truncated result to one correct
for the rounding mode should be in the opposite direction for
two's-complement integers as for sign-magnitude floating-point
values.
- And if the integer result is zero, the correct sign can only be
determined by examining the original operand, and not at all (as
far as I can tell) if the operand and result are the same
register.
This patch fixes all these problems (as far as possible, given the
inability to determine the correct sign in the rounding handler when
the truncated result is 0, the conversion is to a signed type and the
truncated result has overwritten the original operand). Conversion to
fixed-point now uses full emulation, and does not use "asm" in the
vector case; the semantics are exactly those of converting to integer
according to the current rounding direction, once the exponent has
been adjusted, so the code makes such an adjustment then uses the
FP_TO_INT_ROUND macros.
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
---
I'll send as a followup the testcase I used for verifying that the
instructions (other than the theoretical conversions to 64-bit
integers) produce the correct results. In addition, this has been
tested with the glibc testsuite (with the e500 port as posted at
<https://sourceware.org/ml/libc-alpha/2013-10/msg00195.html>, where it
improves the libm test results.
The patch depends on my previous patch
<http://lkml.org/lkml/2013/10/4/497> to fix inexactness detection in
the rounding handler. It does not depend on
<http://lkml.org/lkml/2013/10/4/495> (fix exception clearing),
<http://lkml.org/lkml/2013/10/8/694> (math-emu: fix floating-point to
integer unsigned saturation) or <http://lkml.org/lkml/2013/10/8/700>
(math-emu: fix floating-point to integer overflow detection), in that
I believe it can be applied independently of those other patches
without causing problems, but my testing has been in conjunction with
all those other patches and it may not fully fix all the affected
cases unless they are applied as well.