ICgrep embodies a new algorithmic approach to high-performance regular expression matching. In contrast to the byte-at-a-time approach of NFA, DFA, and backtracking matchers, ICgrep processes UTF-8 input streams 128 code units at a time using SSE2 instructions, or 256 code units at a time using Intel's AVX2 instructions. Regular expressions are widely used to identify patterns in data files and data streams, with applications in security, search engines, biomedical and genome research, database systems, and a wide range of big data applications.
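To give a feel for what matching many code units at once means, here is a minimal Python sketch of the general parallel bitstream idea. It is an illustration under stated assumptions, not ICgrep's actual code: the function names (`basis_bits`, `match_byte`, `find_literal`) are hypothetical, Python's arbitrary-precision integers stand in for 128-bit or 256-bit SIMD registers, and the transposition loop that a real implementation would perform with SIMD instructions is written here as plain Python. The sketch only handles a literal byte sequence, not full regular expressions.

```python
def basis_bits(data: bytes) -> list[int]:
    """Transpose bytes into 8 bitstreams: bit i of basis[k] is bit k of data[i]."""
    basis = [0] * 8
    for i, byte in enumerate(data):
        for k in range(8):
            if (byte >> k) & 1:
                basis[k] |= 1 << i
    return basis


def match_byte(basis: list[int], byte_value: int, length: int) -> int:
    """Return a bitstream with bit i set wherever data[i] == byte_value.

    Each step is one bitwise operation over the whole stream, so every
    input position is tested at the same time.
    """
    mask = (1 << length) - 1
    result = mask
    for k in range(8):
        if (byte_value >> k) & 1:
            result &= basis[k]
        else:
            result &= ~basis[k] & mask
    return result


def find_literal(data: bytes, literal: bytes) -> list[int]:
    """Return the index of the last byte of every occurrence of literal."""
    if not literal:
        raise ValueError("literal must be non-empty")
    n = len(data)
    mask = (1 << n) - 1
    basis = basis_bits(data)
    cursors = mask  # a match may start at any position
    matched = 0
    for b in literal:
        # Keep only the cursors sitting on the expected byte ...
        matched = cursors & match_byte(basis, b, n)
        # ... then advance all surviving cursors by one position at once.
        cursors = (matched << 1) & mask
    return [i for i in range(n) if (matched >> i) & 1]


if __name__ == "__main__":
    print(find_literal(b"abcabxabc", b"abc"))  # [2, 8]
```

The point of the sketch is that the loop in `find_literal` runs once per pattern byte rather than once per input byte; all candidate match positions move forward together with a single shift and a single AND.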

While many libraries have been developed to support regular expression processing, the most commonly used utility is grep, which is built into the Linux and macOS operating systems and is a commonly downloaded utility for the Windows platform. ICgrep accepts ASCII or UTF-8 input files and provides a full suite of Unicode processing features that meet the Level 1 (Basic Unicode Support) requirements of Unicode Technical Standard #18. See Unicode Level 1 Support in ICgrep for details. Visit our Downloads page to try it out, or browse the source code and build it yourself for your own machines and environments.

ICgrep builds on the patented parallel bitstream software technology of International Characters, Inc. (IC). IC's patented technologies are dedicated for free use in open source software, experimentation, research, and teaching. (See the IC covenant for details.)