Dynamically generates macros for detecting special character classes in Latin-1, UTF-8, and code point forms. Macros can be set to return the length (in bytes) of the matched code point, and/or the code point itself.

To regenerate regcharclass.h, run this script from the root of the Perl source tree. No arguments are necessary.

Using WHATEVER as an example, the following macros can be produced, depending on the input parameters (how to get each is described by internal comments at the __DATA__ line):

A variant form of each of the macro types described above can be generated, in which the code point is returned by the macro, and an extra parameter (in the final position) is added, which is a pointer for the macro to set the byte length of the returned code point.

These forms all have a what_len prefix instead of is_, for example what_len_WHATEVER_safe(s,e,is_utf8,len) and what_len_WHATEVER_utf8(s,len).

These forms should not be used except on small sets of mostly widely separated code points; otherwise the code generated is inefficient. For these cases, it is best to use the is_ forms, and then find the code point with utf8_to_uvchr_buf(). This program can fail with a "deep recursion" message on the worst of the inappropriate sets. Examine the generated macro to see if it is acceptable.
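The difference between the two calling conventions can be sketched by hand. The block below is not output of this script: the set (U+00A0 and U+2028), the DEMO names, the U8 typedef, and the choice to store 0 through len on a mismatch are all assumptions made for illustration. The generated macros may be laid out quite differently.

```c
#include <stddef.h>

typedef unsigned char U8;   /* stand-in for Perl's U8; an assumption of this sketch */

/* Hypothetical set: U+00A0 (UTF-8 bytes C2 A0) and U+2028 (UTF-8 bytes E2 80 A8). */

/* is_ style: yields the byte length of the match, or 0 if there is none. */
#define is_DEMO_utf8_safe(s, e)                                               \
    ( ((e) - (s) >= 2 && (s)[0] == 0xC2 && (s)[1] == 0xA0) ? 2                \
    : ((e) - (s) >= 3 && (s)[0] == 0xE2 && (s)[1] == 0x80                     \
                      && (s)[2] == 0xA8) ? 3                                  \
    : 0 )

/* what_len style: yields the code point and stores the byte length through
 * the trailing pointer argument. Storing 0 on a mismatch is an assumption. */
#define what_len_DEMO_utf8_safe(s, e, len)                                    \
    ( ((e) - (s) >= 2 && (s)[0] == 0xC2 && (s)[1] == 0xA0)                    \
        ? (*(len) = 2, 0xA0UL)                                                \
    : ((e) - (s) >= 3 && (s)[0] == 0xE2 && (s)[1] == 0x80                     \
                      && (s)[2] == 0xA8)                                      \
        ? (*(len) = 3, 0x2028UL)                                              \
    : (*(len) = 0, 0UL) )
```

With only two members the what_len form stays small. Every additional code point adds another branch that must carry its own literal value, which is why larger or denser sets are better served by the is_ form followed by a separate decode with utf8_to_uvchr_buf().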

A variant form of each of the is_ macro types described above can be generated, in which the code point and not the length is returned by the macro. These have the same caveat as the what_len_ forms, plus they should not be used where the set contains NUL (code point 0), as 0 is returned for two different cases: a) the set doesn't include the input code point; b) the set does include it, and it is NUL.
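The ambiguity of a 0 return can be seen in a hand-written sketch. The macro name, the set {U+0000, U+000A}, and the U8 typedef below are hypothetical and chosen only to illustrate the problem; this is not generated code.

```c
typedef unsigned char U8;   /* stand-in for Perl's U8; an assumption of this sketch */

/* Hypothetical code-point-returning macro over the set {U+0000, U+000A}.
 * Yields the matched code point, or 0 when the input is not in the set. */
#define what_DEMO_latin1(s)                                                   \
    ( (s)[0] == 0x00 ? 0x00UL                                                 \
    : (s)[0] == 0x0A ? 0x0AUL                                                 \
    : 0UL )
```

Here a NUL byte (a member of the set) and the byte 'x' (not a member) both produce 0, so the caller cannot tell a match from a miss. An is_ form over the same set would return a nonzero length for the NUL and avoid the problem.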