Jakub Jelinek <jakub@redhat.com> wrote:
> Hi!
>
> Attached are two versions of a patch to teach VRP about the int bitop
> builtins. Both patches are identical for all builtins but
> __builtin_c[lt]z*, which are the only two from these that are documented
> to have undefined behavior on some argument (0).
>
> The first version is strict, it assumes __builtin_c[lt]z* (0) doesn't happen
> in valid programs, while the second one attempts to be less strict to avoid
> breaking stuff too much.
>
> The reason for writing the second patch is that longlong.h on various
> targets has stuff like:
> #ifdef __alpha_cix__
> #define count_leading_zeros(COUNT,X) ((COUNT) = __builtin_clzl (X))
> #define count_trailing_zeros(COUNT,X) ((COUNT) = __builtin_ctzl (X))
> #define COUNT_LEADING_ZEROS_0 64
> #else
> and documents that if COUNT_LEADING_ZEROS_0 is defined, then
> count_leading_zeros (cnt, 0) should be well defined and set
> cnt to COUNT_LEADING_ZEROS_0. While neither gcc nor glibc use
> COUNT_LEADING_ZEROS_0 macro, I'm a little bit afraid some code in the wild
> might do, and it might even have its own copy of longlong.h, so even if
> we've removed those COUNT_LEADING_ZEROS_0 macros for targets that
> use the builtins, something could stay broken. So, what the patch does
> is if an optab exists for the mode of the builtin's argument and
> C?Z_DEFINED_VALUE_AT_ZERO is defined, then it will add that value to the
> range unless VR of argument is non-zero (well, it handles only a few
> interesting commonly used values, for CLZ only precision of the mode
> (seems right now when CLZ_DEFINED_VALUE_AT_ZERO is non-zero, it sets
> it always to bitsize of the mode, and even widening or double word
> narrowing expansion should maintain this property), for CTZ -1 and
> bitsize). If there isn't an optab for it, for CLZ it still assumes
> it might be bitsize, for CTZ it just assumes it is undefined behavior
> otherwise, because if I understand the code right, for CTZ we really could
> return various values for 0 without hw support for the mode, e.g. when
> CTZ is implemented using CLZ, it might return something, if we use wider
> mode hw CTZ and it would return bitsize, that would be bitsize of the wider
> mode etc. I bet longlong.h only uses __builtin_c?z builtins for modes
> actually implemented in hw anyway (otherwise it couldn't be used safely in
> libgcc implementation of those libcalls).
>
> Both patches have been bootstrapped/regtested on x86_64-linux and
> i686-linux, which one do you prefer?
The less strict variant.
Thanks
Richard.
> Jakub

On 05/07/13 16:01, Jakub Jelinek wrote:
> Hi!
>
> Attached are two versions of a patch to teach VRP about the int bitop
> builtins. Both patches are identical for all builtins but
> __builtin_c[lt]z*, which are the only two from these that are documented
> to have undefined behavior on some argument (0).
>
> The first version is strict, it assumes __builtin_c[lt]z* (0) doesn't happen
> in valid programs, while the second one attempts to be less strict to avoid
> breaking stuff too much.
>
> The reason for writing the second patch is that longlong.h on various
> targets has stuff like:
> #ifdef __alpha_cix__
> #define count_leading_zeros(COUNT,X) ((COUNT) = __builtin_clzl (X))
> #define count_trailing_zeros(COUNT,X) ((COUNT) = __builtin_ctzl (X))
> #define COUNT_LEADING_ZEROS_0 64
> #else
> and documents that if COUNT_LEADING_ZEROS_0 is defined, then
> count_leading_zeros (cnt, 0) should be well defined and set
> cnt to COUNT_LEADING_ZEROS_0. While neither gcc nor glibc use
> COUNT_LEADING_ZEROS_0 macro, I'm a little bit afraid some code in the wild
> might do, and it might even have its own copy of longlong.h, so even if
> we've removed those COUNT_LEADING_ZEROS_0 macros for targets that
> use the builtins, something could stay broken. So, what the patch does
> is if an optab exists for the mode of the builtin's argument and
> C?Z_DEFINED_VALUE_AT_ZERO is defined, then it will add that value to the
> range unless VR of argument is non-zero (well, it handles only a few
> interesting commonly used values, for CLZ only precision of the mode
> (seems right now when CLZ_DEFINED_VALUE_AT_ZERO is non-zero, it sets
> it always to bitsize of the mode, and even widening or double word
> narrowing expansion should maintain this property), for CTZ -1 and
> bitsize). If there isn't an optab for it, for CLZ it still assumes
> it might be bitsize, for CTZ it just assumes it is undefined behavior
> otherwise, because if I understand the code right, for CTZ we really could
> return various values for 0 without hw support for the mode, e.g. when
> CTZ is implemented using CLZ, it might return something, if we use wider
> mode hw CTZ and it would return bitsize, that would be bitsize of the wider
> mode etc. I bet longlong.h only uses __builtin_c?z builtins for modes
> actually implemented in hw anyway (otherwise it couldn't be used safely in
> libgcc implementation of those libcalls).
>
> Both patches have been bootstrapped/regtested on x86_64-linux and
> i686-linux, which one do you prefer?
>
> Jakub
On ARM, CLZ has a defined result at zero (32). Furthermore, the ACLE
specification defines (in the header arm_acle.h) __clz(n) as an
intrinsic aimed at the CLZ instruction; __clz() has a defined result at
0. We want to use __builtin_clz as the implementation for __clz rather
than inventing another one; but that would require the compiler to
handle zero correctly.
R.

On Tue, Jul 09, 2013 at 10:20:18AM +0100, Richard Earnshaw wrote:
> On ARM, CLZ has a defined result at zero (32). Furthermore, the
> ACLE specification defines (in the header arm_acle.h) __clz(n) as
> an intrinsic aimed at the CLZ instruction; __clz() has a defined
> result at 0. We want to use __builtin_clz as the implementation for
> __clz rather than inventing another one; but that would require the
> compiler to handle zero correctly.
The patch that has been committed is the conservative one, so it
handles any __builtin_clz{,l,ll,imax} (0) returning the mode bitsize
(because that is pretty much the only value used for 0 by targets if
they specify it). __builtin_ctz{,l,ll,imax} (0) is a different matter,
because the value at 0 varies a lot (-1, mode bitsize, undefined)
for the cases where there is an optab, and for the case where __builtin_ctz
needs to be implemented, say, using clz, or in a wider mode, or through
library routines, you really can't expect anything meaningful.
So, for CTZ you get the target defined value for 0 if there is one only
if you have a ctz optab for that mode and CTZ_DEFINED_VALUE_AT_ZERO,
otherwise the patch treats it as undefined.
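To make those value ranges concrete, here is a minimal reference sketch in
plain C. It is not the VRP code from the patch; clz_ref, ctz_ref and
UINT_BITS are hypothetical names used only for illustration, and the sketch
assumes the argument is an unsigned int. It models CLZ with the result range
[0, bitsize] (zero argument allowed, yielding bitsize) and CTZ with the
result range [0, bitsize - 1] for nonzero arguments only.

```c
#include <limits.h>

/* Hypothetical helper: width in bits of unsigned int on this target.  */
#define UINT_BITS ((int) (sizeof (unsigned int) * CHAR_BIT))

/* Reference model of the conservative CLZ assumption: a zero argument is
   allowed and yields the bit width, so the result lies in [0, UINT_BITS].  */
static int
clz_ref (unsigned int x)
{
  int n = 0;
  unsigned int mask;

  /* Walk from the most significant bit down until a set bit is found.  */
  for (mask = 1u << (UINT_BITS - 1); mask != 0 && (x & mask) == 0; mask >>= 1)
    n++;
  return n;
}

/* Reference model of CTZ when no value at zero may be assumed: returns 0
   and leaves *result alone for x == 0, otherwise stores a count in
   [0, UINT_BITS - 1] and returns 1.  */
static int
ctz_ref (unsigned int x, int *result)
{
  int n = 0;

  if (x == 0)
    return 0;
  while ((x & 1u) == 0)
    {
      x >>= 1;
      n++;
    }
  *result = n;
  return 1;
}
```

The separate return flag in ctz_ref is a design choice of this sketch: it
avoids picking -1 or bitsize as a stand-in, since either could be mistaken
for a target-defined value at zero.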
Jakub

On Tue, 9 Jul 2013, Richard Earnshaw wrote:
> On ARM, CLZ has a defined result at zero (32). Furthermore, the ACLE
> specification defines (in the header arm_acle.h) __clz(n) as an intrinsic
> aimed at the CLZ instruction; __clz() has a defined result at 0. We want to
> use __builtin_clz as the implementation for __clz rather than inventing
> another one; but that would require the compiler to handle zero correctly.
Assuming the header will come with GCC, it can assume semantics we don't
document for user code - such as that __builtin_clz is defined at 0, if
that is the case with a given GCC version and target architecture. If the
semantics change, the header can then be changed at the same time.
(I'd still encourage user code wanting a defined value at 0 to do e.g.
static inline int clz (int n) { return n == 0 ? 32 : __builtin_clz (n); }
and hope GCC will optimize away the test for 0 when the instruction
semantics make it unnecessary - if it doesn't, it should be fixed to do
so. And it's certainly fine to put such code in an intrinsic header if
useful.)
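
A slightly more general sketch of the same wrapper idea, which avoids
hard-coding 32 by deriving the width from the argument type. The names
clz_u, clz_ull and ctz_u are hypothetical, and returning the bit width for a
zero argument is a choice made for this illustration, not something GCC
documents for the builtins themselves.

```c
#include <limits.h>

/* Leading zeros of an unsigned int; returns the bit width for n == 0.
   The builtin is only ever called with a nonzero argument.  */
static inline int
clz_u (unsigned int n)
{
  return n == 0 ? (int) (sizeof n * CHAR_BIT) : __builtin_clz (n);
}

/* Same for unsigned long long, using the matching builtin.  */
static inline int
clz_ull (unsigned long long n)
{
  return n == 0 ? (int) (sizeof n * CHAR_BIT) : __builtin_clzll (n);
}

/* Trailing zeros of an unsigned int; returns the bit width for n == 0.  */
static inline int
ctz_u (unsigned int n)
{
  return n == 0 ? (int) (sizeof n * CHAR_BIT) : __builtin_ctz (n);
}
```

Because the zero test sits in front of every builtin call, the wrappers stay
well defined regardless of what a given target does for a zero argument.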