Add a case-insensitive pattern. Although we have different calls for adding case-sensitive and case-insensitive patterns, both go through a single call, with no special treatment for either case.

Detailed Description

Aho-Corasick MPM optimized for the Tilera Tile-Gx architecture.
"Efficient String Matching: An Aid to Bibliographic Search",
by Alfred V. Aho and Margaret J. Corasick
- Started with util-mpm-ac.c:
  - Uses the delta table for calculating transitions,
    instead of having separate goto and failure
    transitions.
  - If we cross 2 ** 16 states, we use 4 bytes in the
    transition table to hold each state, otherwise we use
    2 bytes.
  - This version of the MPM is heavy on memory, but it
    performs well. If you can fit the ruleset with this
    MPM on your box without hitting swap, this is the MPM
    to go for.
- Added these optimizations:
  - Compress the input alphabet from 256 characters down
    to the actual characters used in the patterns, plus
    one character for all the unused characters.
  - Reduce the size of the delta table so that each state's
    row is sized to the smallest power of two that is larger
    than the size of the compressed alphabet.
  - Specialized the search function based on state count
    (small for 8-bit, large for 16-bit) and the size of
    the alphabet, so that it is constant inside the
    function for better optimization.
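
The alphabet compression and power-of-two row sizing described in the list above can be sketched as follows. This is only an illustration under stated assumptions, not the code in this file: the names `build_alphabet`, `row_shift` and `next_state` are hypothetical, slot 0 is assumed to be the shared slot for unused bytes, and the row is rounded up to a power of two that has room for every compressed symbol (whether an exact fit is rounded up further is an assumption here).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: map every byte that occurs in a pattern to a small
 * index. Index 0 is shared by all bytes that occur in no pattern.
 * Returns the compressed alphabet size, including the shared slot. */
static int build_alphabet(const char **pats, int npats, uint8_t map[256])
{
    int next = 1; /* 0 is reserved for the unused bytes */

    memset(map, 0, 256);
    for (int i = 0; i < npats; i++) {
        for (const unsigned char *p = (const unsigned char *)pats[i]; *p; p++) {
            if (map[*p] == 0)
                map[*p] = (uint8_t)next++;
        }
    }
    return next;
}

/* Smallest shift such that a row of (1 << shift) entries can hold
 * alphabet_size symbols. */
static unsigned row_shift(int alphabet_size)
{
    unsigned shift = 0;

    assert(alphabet_size > 0 && alphabet_size <= 256);
    while ((1u << shift) < (unsigned)alphabet_size)
        shift++;
    return shift;
}

/* One transition: because the row size is a power of two, the row offset
 * is a shift and the column is OR-ed in, with no multiplication. */
static uint16_t next_state(const uint16_t *delta, unsigned shift,
                           const uint8_t map[256], uint32_t state,
                           uint8_t byte)
{
    return delta[((size_t)state << shift) | map[byte]];
}
```

With the patterns "he", "she", "his" and "hers", five distinct bytes are used, so the compressed alphabet has six symbols and each row takes eight entries instead of 256. Keeping `shift` constant inside a search function is what lets the compiler fold the index arithmetic.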

Do a proper analysis of our existing MPMs and suggest a good one based on the pattern distribution and the expected traffic (say, HTTP).

- Irrespective of whether we cross 2 ** 16 states or
not, shift to using uint32_t for the state type, so that we can
store its status as a final state or not in the
topmost byte. We are already doing this if state_count is >
2 ** 16.
- Check whether case-sensitive patterns contain any ASCII chars.
If they don't, treat them as nocase.
- Reorder the compressed alphabet to put the most common characters
first.
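
The uint32_t item in the list above, which keeps the final-state status in the topmost byte, might look like the sketch below. The exact bit layout is an assumption made for illustration (state id in the low 24 bits, one flag bit in the top byte), and `pack_state`, `state_id` and `state_is_final` are hypothetical names rather than functions from this file.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: low 24 bits hold the state id, top byte holds flags. */
#define STATE_ID_MASK   0x00FFFFFFu
#define STATE_FINAL_BIT 0x01000000u

/* Combine a state id and its final-state status into one table entry. */
static uint32_t pack_state(uint32_t id, int is_final)
{
    assert(id <= STATE_ID_MASK);
    return id | (is_final ? STATE_FINAL_BIT : 0u);
}

/* Recover the plain state id used to index the next row. */
static uint32_t state_id(uint32_t packed)
{
    return packed & STATE_ID_MASK;
}

/* Non-zero when the state has at least one pattern ending in it. */
static int state_is_final(uint32_t packed)
{
    return (packed & STATE_FINAL_BIT) != 0;
}
```

The point of such a layout is that the search loop can test for a match on the value it has just loaded, without a second lookup into a separate output table.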