>> Another way [...] is to supply a bit mask for each character being
>> compared. Only bits with a '1' in the mask are used in the comparison.

> I used that technic to build a LL(1) based Z80 disassembler/decompiler years
> ago, to decode "structural bits" (vs. values bits) in instructions, but I
> don't think its applicable to a AFD-based regular expression engine, before
> knowing which bits can be ignored the AFD will first need to determine a
> class for the character. it will simply result in a "tolower(c)" (or
> toupper(c) ) called on each character of the input, but expressed and coded
> in another way. The preparation phase still has to verify that there are no
> two paths leaving a state with 'a' and 'A' for example (as they would become
> a single path once the mask applied).

Yes, it would have to be built into the evaluation engine. In the
case of the Paracel FDF, it is part of the hardware, and also part of
PSL (Pattern Specification Language).

Also, the PSL compiler will figure out the optimal mask for a given
character position. If, for example, one writes [bc], where the two
characters differ in only one bit, the compiler generates 'b'
with a mask of X'3e'. (The default mask for ASCII masks off the high
bit. Case insensitivity is the default for lower case query characters.)
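
To make the idea concrete, here is a minimal sketch in Python of a
masked compare and of collapsing a small character class into a single
(pattern, mask) pair. It is not PSL and not the FDF hardware: the names
masked_equal and class_mask are hypothetical, and the convention used
(plain 8-bit codes, default mask 0x7F, a 1 bit means "compare this bit")
is an assumption for illustration. The actual mask values the PSL
compiler emits depend on its own conventions, which are not reproduced
here.

```python
from typing import Optional, Tuple


def masked_equal(c: int, pattern: int, mask: int) -> bool:
    """Compare only the bit positions where mask has a 1."""
    return (c & mask) == (pattern & mask)


def class_mask(chars: str, base_mask: int = 0x7F) -> Optional[Tuple[int, int]]:
    """Try to express a character class as one (pattern, mask) pair.

    base_mask is the assumed default mask (0x7F ignores the high bit).
    Returns None when a single masked compare would match characters
    outside the class, so the caller must fall back to separate paths.
    """
    codes = [ord(ch) & base_mask for ch in chars]
    pattern = codes[0]

    # Collect every bit in which some member differs from the first one.
    differing = 0
    for code in codes:
        differing |= code ^ pattern

    # Ignore the differing bits in addition to those base_mask ignores.
    mask = base_mask & ~differing

    # The pair is usable only if it matches exactly the class members.
    matched = {c for c in range(base_mask + 1)
               if masked_equal(c, pattern, mask)}
    if matched != set(codes):
        return None
    return pattern, mask


if __name__ == "__main__":
    print(class_mask("bc"))    # one pair: 'b' with the low bit ignored
    print(class_mask("bcBC"))  # low bit and case bit both ignored
    print(class_mask("bd"))    # None: one pair would also match '`' and 'f'
```

Under these assumptions [bc] collapses to a single compare, and so does
a case-insensitive [bc], since 'b' and 'B' differ only in bit X'20'. A
class such as [bd] does not, because ignoring both differing bits would
admit two extra characters.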

It seems to me that masking could yield more compact compiled regular
expressions that might also execute faster.