Building a Regular Expression Stream Search with the .NET Framework

As you can see, the methods above read and write byte data, as befits a byte stream. Converting bytes to a string is the job of the encoding classes in the System.Text namespace. In our solution, we used the ASCIIEncoding class. The following example illustrates how to convert bytes to a string with ASCIIEncoding.
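A minimal sketch of that conversion might look like the following. The class and method names (AsciiDecodeSketch, ReadAscii) are illustrative assumptions and not part of the article's solution; only Stream.Read and ASCIIEncoding.GetString are framework calls.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration only.
public static class AsciiDecodeSketch
{
    // Reads up to 'count' bytes from the stream and decodes them as ASCII text.
    // Returns an empty string once the stream is exhausted.
    public static string ReadAscii(Stream stream, int count)
    {
        byte[] bytes = new byte[count];

        // Read may return fewer bytes than requested, so keep the actual count.
        int read = stream.Read(bytes, 0, count);

        // Decode only the bytes that were actually read.
        ASCIIEncoding encoding = new ASCIIEncoding();
        return encoding.GetString(bytes, 0, read);
    }
}
```

Note that ASCII covers only 7-bit values, so with the default settings any byte above 127 decodes to "?". That is acceptable when the patterns you search for are plain ASCII, which is the assumption here.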

At this point, we have all the tools we need. There is, however, one more issue to address before we assemble the solution.

A Buffered Solution for Easing Ingestion

Streams can be large. Although we could have loaded an entire Stream into a string, we wanted to avoid the overhead of holding an entire Stream in memory. That leaves one last question: How do you load only portions of the Stream when you need to search the entire Stream for a pattern?

Our answer was a sliding buffer: add a portion of the Stream to the front of the buffer, and trim from the back the same number of bytes you added to the front.

The approach works well if you know how large the target search pattern can be. That is not true of every Regular Expression (quantifiers such as * and + can match text of unbounded length), but because we were looking for simple patterns, it was a safe assumption.
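The add-and-trim step could be sketched roughly as follows. This is an illustration under stated assumptions, not the article's listing: the names SlidingBufferSketch and Advance are hypothetical, the buffer is held in a StringBuilder, and new text is appended at the end while the oldest characters are dropped from the start, which is one way to read the description above.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration only.
public static class SlidingBufferSketch
{
    // Advances the buffer by up to 'trimAdd' characters: decodes the next bytes
    // from the stream, appends them, and drops the oldest characters so the
    // buffer never holds more than 'bufferSize' characters.
    // Returns the number of bytes read (0 means the stream is exhausted).
    public static int Advance(Stream stream, StringBuilder buffer, int bufferSize, int trimAdd)
    {
        byte[] bytes = new byte[trimAdd];
        int read = stream.Read(bytes, 0, trimAdd);
        if (read == 0)
        {
            return 0;
        }

        // Add the newly read portion of the stream.
        buffer.Append(Encoding.ASCII.GetString(bytes, 0, read));

        // Trim as many characters as needed to get back to the fixed size.
        int excess = buffer.Length - bufferSize;
        if (excess > 0)
        {
            buffer.Remove(0, excess);
        }
        return read;
    }
}
```

Because only bufferSize characters are held at any time, memory use stays constant no matter how long the Stream is.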

We also needed to keep the number of characters we trim and add from being too large. If the trim and add value is too large relative to the size of the buffer, each advance cuts too many characters off the end of the buffer and can miss the pattern. So, in the example string below, a buffer of 7 and a trim and add value of 6 would miss the string pattern "zabcdef" embedded in the middle of the string.

An algorithm using the values above would split the target pattern across two successive buffers once it reaches the part of the Stream that contains the pattern, so no single buffer would ever hold the whole match.
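To see the effect concretely, the windowing can be simulated on an in-memory string. The sketch below is illustrative only: the names WindowStepSketch and FoundInSomeWindow are hypothetical, and the sample text "1234zabcdef56789" used in the note that follows is an assumed stand-in, not the article's original example string.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class WindowStepSketch
{
    // Slides a window of 'bufferSize' characters over 'text' in steps of
    // 'trimAdd' and reports whether 'pattern' matches inside any single window.
    public static bool FoundInSomeWindow(string text, string pattern, int bufferSize, int trimAdd)
    {
        if (trimAdd < 1)
        {
            throw new ArgumentOutOfRangeException("trimAdd");
        }

        Regex regex = new Regex(pattern);
        for (int start = 0; start < text.Length; start += trimAdd)
        {
            // The last window may be shorter than bufferSize.
            int length = Math.Min(bufferSize, text.Length - start);
            if (regex.IsMatch(text.Substring(start, length)))
            {
                return true;
            }
        }
        return false;
    }
}
```

With the assumed text "1234zabcdef56789", a buffer of 7, and a trim and add value of 6, the windows are "1234zab", "bcdef56", and "6789". None contains "zabcdef", so the call returns false. With a trim and add value of 1, the fifth window is exactly "zabcdef" and the call returns true. More generally, for a buffer of b characters and a longest possible match of m characters, a trim and add value of at most b - m + 1 guarantees that some window contains the whole match.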

Now, it's time to look at our complete solution embodied in a single class called StreamSearchExpression.

StreamSearchExpression

Earlier, you learned about the relationship between the buffer size, the trim/add value, and the patterns you are matching. Rather than computing these values dynamically, the class takes them from its users through the constructor. The class constructor appears below.

As you can see, Check loops through the Stream, advancing the buffer until one of the patterns in the array of patterns is found. Although a Regular Expression can be written to work like an array of patterns, we opted for the array mostly to eliminate the need to write a more complicated Regular Expression.
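For readers who want something to experiment with, here is a hedged sketch of how a class along these lines could be put together. Only the names StreamSearchExpression and Check come from the text; the constructor signature, the field names, the bool return type, and the validation rules are assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Illustrative sketch; everything except the class name and Check is assumed.
public class StreamSearchExpression
{
    private readonly Regex[] _patterns;
    private readonly int _bufferSize;
    private readonly int _trimAdd;

    // The caller supplies the patterns, the buffer size, and the trim/add value.
    public StreamSearchExpression(string[] patterns, int bufferSize, int trimAdd)
    {
        if (patterns == null) throw new ArgumentNullException("patterns");
        if (bufferSize < 1) throw new ArgumentOutOfRangeException("bufferSize");
        if (trimAdd < 1 || trimAdd > bufferSize) throw new ArgumentOutOfRangeException("trimAdd");

        _patterns = new Regex[patterns.Length];
        for (int i = 0; i < patterns.Length; i++)
        {
            _patterns[i] = new Regex(patterns[i]);
        }
        _bufferSize = bufferSize;
        _trimAdd = trimAdd;
    }

    // Returns true as soon as any pattern matches the current buffer contents,
    // or false if the stream ends without a match.
    public bool Check(Stream stream)
    {
        StringBuilder buffer = new StringBuilder();
        byte[] bytes = new byte[_bufferSize];
        int wanted = _bufferSize;   // the first read tries to fill the buffer
        int read;

        while ((read = stream.Read(bytes, 0, wanted)) > 0)
        {
            // Add the new text, then trim the oldest characters.
            buffer.Append(Encoding.ASCII.GetString(bytes, 0, read));
            int excess = buffer.Length - _bufferSize;
            if (excess > 0)
            {
                buffer.Remove(0, excess);
            }

            string window = buffer.ToString();
            foreach (Regex pattern in _patterns)
            {
                if (pattern.IsMatch(window))
                {
                    return true;
                }
            }
            wanted = _trimAdd;      // later reads advance by the trim/add value
        }
        return false;
    }
}
```

If Stream.Read returns fewer bytes than requested, the buffer simply advances by a smaller step, which never causes a match to be skipped. An array such as { "abc", "xyz" } behaves like the single alternation "abc|xyz", but keeping the patterns separate leaves each expression simple.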