Re: Casting ctype lookups

Alan Barrett <apb%cequrux.com@localhost> writes:
> On Wed, 14 Nov 2012, D'Arcy J.M. Cain wrote:
>>Would this be safer?
>>
>>#define toupper(c)\
>> ((int)((_toupper_tab_ + 1)[(int)(unsigned char)(c)]))
>
> No. That would break when the caller passes the int value EOF.
>
> I believe that there's nothing wrong with NetBSD's definitions of
> these functions and macros. If the caller gets a warning about them,
> then there's a problem in the caller's code.

That may be technically true, but there's a larger social problem: a lot
of code out there seems to work on other systems without producing
warnings, yet does produce warnings on NetBSD, and people generally
don't understand all the subtleties. So I think it would be really
helpful to have a public, language-lawyerly defense of why the warning
is legitimate (i.e., that it accurately indicates the code being warned
about is probably wrong), and perhaps ctype(3) is the place for it.

As I understand it, the hairy edge is that a caller passing a value held
in a 'char' (when char is signed) that lies within 0-127 is not out of
line, because the value meets the requirement that it be non-negative
and representable as unsigned char. However, a program passing an
arbitrary signed char is defective, because a negative value other than
EOF invokes undefined behavior.
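
To make the distinction concrete, here is a minimal sketch of the usual
caller-side idiom: convert the char to unsigned char before handing it
to a ctype function. The helper name upcase() is made up for
illustration and does not come from this thread.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not from the thread): upper-case a string in place. */
static void upcase(char *s)
{
    assert(s != NULL);

    for (size_t i = 0; s[i] != '\0'; i++) {
        /*
         * Convert to unsigned char first, so that a char with the high
         * bit set becomes a value in 0..UCHAR_MAX instead of a negative
         * int. Passing s[i] directly would be undefined behavior for
         * such a char when char is signed.
         */
        s[i] = (char)toupper((unsigned char)s[i]);
    }
}
```

With the cast in place the argument is always representable as unsigned
char, so the table lookup inside the macro never sees a negative index.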

It might also help if the warning read "signed char used as array
subscript", because that actually seems to be the issue. I wonder,
though, whether that is really worse than a signed int used as an array
subscript, which seems common and does not provoke warnings.

> The only improvement that I can think of would be to abort with a
> descriptive message if the value is out of range, but to do so in a
> way that does not hurt performance for non-buggy callers.

I wonder if there's a way to have a macro definition that checks the
type of the argument at compile time: if it's unsigned char or int, the
macro produces the current code, and if it's signed char, it adds a test
for being non-negative, with an abort on failure.
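
One way this might be sketched, assuming a C11 compiler with _Generic
(my assumption; nothing above names a mechanism). The names
TOUPPER_CHECKED, checked_toupper_char, plain_toupper and
toupper_checked_demo are hypothetical and are not NetBSD's definitions.
Plain char is routed to the checked path along with signed char.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Checked path: abort with a message if the char value is negative. */
static inline int checked_toupper_char(int c)
{
    if (c < 0) {
        fprintf(stderr, "toupper: negative char value %d\n", c);
        abort();
    }
    return toupper(c);
}

/* Unchecked path: the existing behavior, which also lets EOF through. */
static inline int plain_toupper(int c)
{
    return toupper(c);
}

/*
 * Select the function from the argument's type at compile time.
 * char and signed char get the range check; unsigned char, int and
 * everything else fall through to the existing code. The argument is
 * evaluated only once, because _Generic does not evaluate its
 * controlling expression.
 */
#define TOUPPER_CHECKED(c) _Generic((c),      \
        char:        checked_toupper_char,    \
        signed char: checked_toupper_char,    \
        default:     plain_toupper)(c)

/* Usage sketch: every call here takes a non-aborting path. */
void toupper_checked_demo(void)
{
    unsigned char u = 'a';
    char c = 'c';

    assert(TOUPPER_CHECKED(u) == 'A');     /* unsigned char: unchecked */
    assert(TOUPPER_CHECKED(c) == 'C');     /* char: checked, value is fine */
    assert(TOUPPER_CHECKED(EOF) == EOF);   /* int: EOF still accepted */
}
```

Non-buggy callers passing int or unsigned char pay nothing extra, and
only char-typed arguments get the comparison. Whether a system header
could rely on _Generic is a separate question.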