Introduction

One of the best-known Regular Expression development and test systems is Expresso; if you write Regular Expressions and haven't tried it, you should. I have tried it, but it does more than I need, and I recently thought of a handy feature that it doesn't offer.

This article is not an introduction to Regular Expressions and the application is not intended as a substitute for Expresso. You may find that you use Expresso for the initial development of your Regular Expressions, copy them to your program code, and thereafter use this application to test changes you make to the Regular Expressions.

Background

I don't consider myself a Regular Expressions expert, but I use them frequently enough that I spend more time than I like copying them from my source code, tweaking and testing them, and then copying them back and testing again. Because Regular Expressions often contain quotes and the backslash (\), part of the trouble is escaping and unescaping such characters (this may be more of a problem for us C# developers than for VB developers).

Example:
A Regular Expression to match a sequence of alphanumeric characters within quotes:
"\w+"
When writing a string literal in C# to contain this value, you need to escape the quotes and backslash:
"\"\\w+\""
or, using a verbatim string literal:
@"""\w+"""
Notice that the backslash no longer has to be escaped, and that the quotes are now escaped by doubling them rather than with a backslash.
In VB:
"""\w+"""

(I don't use VB, so please alert me about any errors in the VB code I present so I can fix them.)
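To make the C# example concrete, here is a minimal sketch showing that the regular and verbatim literals above produce the same string, and that the string works as the intended pattern. The class and method names (EscapingDemo, FirstQuotedWord) are hypothetical and are not part of RegexTester.

```csharp
using System.Text.RegularExpressions;

// Hypothetical demo class; the names are illustrative, not from RegexTester.
public static class EscapingDemo
{
    // Regular string literal: the quotes and the backslash are escaped with '\'.
    public const string Regular = "\"\\w+\"";

    // Verbatim string literal: the backslash is literal and each quote is doubled.
    public const string Verbatim = @"""\w+""";

    // Returns the first quoted word found in the input, or "" if there is none.
    public static string FirstQuotedWord(string input)
    {
        return Regex.Match(input, Regular).Value;
    }
}
```

Both constants hold the five-character pattern shown in the example, so EscapingDemo.Regular == EscapingDemo.Verbatim is true, and calling FirstQuotedWord on the text say "hello" now returns "hello" including its quotes.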

I began writing this little utility application with the idea of allowing it to handle the unescaping automatically so I could copy my code right from the editing window to a TextBox in the tester without concern for removing the escapes and adding them back in again. That version worked well, but because I often write Regular Expressions that are long enough that I break them onto multiple lines, copying these to and from the tester still required more effort than I liked.

Copying such a beast would still have to be done in several steps, and copying it back, ensuring that it's split at valid spots, is even more tedious. As I was working with the above Regular Expression the thought struck me -- what if I could copy the lines, complete with the surrounding quotes and concatenation operators (+) and have the test application pass it through the compiler?
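The idea of passing the pasted text through the compiler can be sketched as follows. This is only one possible approach, assuming the .NET Framework CodeDom classes (CSharpCodeProvider and CompileAssemblyFromSource); it is not necessarily how RegexTester does it, and the names LiteralCompiler, Evaluate, and Holder are hypothetical. The pasted text is wrapped in a tiny class, compiled in memory, and the resulting string is read back through reflection.

```csharp
using System;
using System.CodeDom.Compiler;
using System.Reflection;
using Microsoft.CSharp;

// Hypothetical helper; assumes .NET Framework, where CodeDom can compile source.
public static class LiteralCompiler
{
    // expressionText is C# source for a string expression, for example
    // one or more quoted literals joined with the + operator.
    public static string Evaluate(string expressionText)
    {
        // Wrap the expression in a method so the compiler evaluates it for us.
        string source =
            "public static class Holder { public static string Value() { return\n"
            + expressionText
            + "\n; } }";

        using (CSharpCodeProvider provider = new CSharpCodeProvider())
        {
            CompilerParameters options = new CompilerParameters();
            options.GenerateInMemory = true;    // no assembly file is left on disk
            options.GenerateExecutable = false; // build a library, not an EXE

            CompilerResults results =
                provider.CompileAssemblyFromSource(options, source);
            if (results.Errors.HasErrors)
            {
                // Report the first compiler error to the caller.
                throw new ArgumentException(results.Errors[0].ErrorText);
            }

            MethodInfo method =
                results.CompiledAssembly.GetType("Holder").GetMethod("Value");
            return (string)method.Invoke(null, null);
        }
    }
}
```

With this sketch, the lines can be pasted exactly as they appear in the source file, quotes and concatenation operators included, and the returned string is what the Regex constructor would receive at run time. Note that compiling pasted text means running whatever code it contains, so this only makes sense for text you trust.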

Modes of operation

RegexTester currently has four modes for passing the entered (or pasted) text from the Regex TextBox to the Regex constructor:

AsIs

The text will be passed unchanged.

Unescape

Escapes will be removed by the very simple, non-foolproof technique of replacing "" with ", \\ with \, and \" with ".

CSharp

The text will be passed through the C# compiler.

VisualBasic

The text will be passed through the VisualBasic compiler.
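The Unescape mode described above can be sketched as three chained replacements, applied in the order listed. The class and method names here are hypothetical, and the actual implementation in RegexTester may differ.

```csharp
// Hypothetical sketch of the Unescape mode; not the code from the ZIP.
public static class SimpleUnescaper
{
    public static string Unescape(string text)
    {
        return text
            .Replace("\"\"", "\"")   // "" becomes "  (verbatim and VB quote escape)
            .Replace("\\\\", "\\")   // \\ becomes \  (regular C# backslash escape)
            .Replace("\\\"", "\"");  // \" becomes "  (regular C# quote escape)
    }
}
```

Applied to the contents of either C# literal from the Background example, this returns "\w+" with its quotes. It also shows why the technique is not foolproof: the same replacements run regardless of which kind of literal the text came from, so text copied from a verbatim literal containing \\. (a backslash followed by any character) is turned into \. (a literal dot), which is a different Regular Expression.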

Whitespace

I also realized that, when viewing the results of a Regex Match, I wanted to be able to see any whitespace characters produced. So I added the Show Whitespace checkbox; when it is checked, RegexTester replaces each whitespace character in the results with a visible stand-in. To accomplish this, I chose the Arial Unicode MS typeface, which contains glyphs that show alphanumeric characters inside circles. I picked each glyph based on the character it stands for:

S: Space (' ')
0: Null ('\0')
A: Alert (Bell) ('\a')
B: Backspace ('\b')
F: Formfeed ('\f')
N: Newline (Linefeed) ('\n')
R: Return (Carriage Return) ('\r')
T: Tab (Horizontal Tab) ('\t')
V: Vertical Tab ('\v')
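The mapping above could be implemented along these lines. This sketch assumes the circled glyphs are the Unicode enclosed alphanumerics (circled capital letters U+24B6 to U+24CF and circled digit zero U+24EA); the article does not list the exact code points, and the names WhitespaceMarker and ShowWhitespace are hypothetical.

```csharp
using System.Text;

// Hypothetical sketch; the code points are an assumption, not taken from RegexTester.
public static class WhitespaceMarker
{
    // Replaces each whitespace or control character from the list above
    // with a circled letter or digit; all other characters pass through.
    public static string ShowWhitespace(string text)
    {
        StringBuilder result = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            switch (c)
            {
                case ' ':  result.Append('\u24C8'); break; // circled S
                case '\0': result.Append('\u24EA'); break; // circled 0
                case '\a': result.Append('\u24B6'); break; // circled A
                case '\b': result.Append('\u24B7'); break; // circled B
                case '\f': result.Append('\u24BB'); break; // circled F
                case '\n': result.Append('\u24C3'); break; // circled N
                case '\r': result.Append('\u24C7'); break; // circled R
                case '\t': result.Append('\u24C9'); break; // circled T
                case '\v': result.Append('\u24CB'); break; // circled V
                default:   result.Append(c);        break; // leave as-is
            }
        }
        return result.ToString();
    }
}
```

The result only looks right in a control whose font contains these glyphs, which is why a typeface such as Arial Unicode MS is needed.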

How it's done

There's really not much involved. Only three event handlers are implemented:

frmRegexTester_Load -- loads the last saved state (if any) from an XML file.
frmRegexTester_FormClosing -- stores the current state as XML in %USERPROFILE%\Local Settings\RegexTester.xml.
bGetResult_Click -- initiates the GetResult process.
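As an illustration of the load and save steps, here is a minimal sketch that uses XmlSerializer. The TesterState class, its fields, and the StateStore helper are hypothetical; the article does not describe the actual layout of RegexTester.xml, so treat the file format here as an assumption.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical state object; the real application may store different values.
public class TesterState
{
    public string Pattern = "";
    public string Input = "";
    public bool ShowWhitespace;
}

// Hypothetical helper for the Load and FormClosing handlers to call.
public static class StateStore
{
    // Expands the %USERPROFILE% variable in the path named in the article.
    public static string DefaultPath()
    {
        return Environment.ExpandEnvironmentVariables(
            @"%USERPROFILE%\Local Settings\RegexTester.xml");
    }

    public static void Save(TesterState state, string path)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(TesterState));
        using (StreamWriter writer = new StreamWriter(path))
        {
            serializer.Serialize(writer, state);
        }
    }

    // Returns a default state when no file has been saved yet.
    public static TesterState Load(string path)
    {
        if (!File.Exists(path))
        {
            return new TesterState();
        }
        XmlSerializer serializer = new XmlSerializer(typeof(TesterState));
        using (StreamReader reader = new StreamReader(path))
        {
            return (TesterState)serializer.Deserialize(reader);
        }
    }
}
```

In this sketch the Load handler would call StateStore.Load(StateStore.DefaultPath()) and copy the values into the controls, and the FormClosing handler would do the reverse with Save.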

Because GetResult can be a lengthy process, the method is executed on a different thread and the Get result button is disabled until the method completes. Consequently, a number of plumbing methods and delegates are needed to avoid cross-thread exceptions; I won't document them here, though they may be of interest to anyone who needs to see how that can be done.
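For readers who have not seen this kind of plumbing, the general pattern looks something like the sketch below: start a worker thread, and have the worker hand its result back to the UI thread through Control.Invoke. This is a generic Windows Forms illustration, not the code from RegexTester; the form, its controls, and the method names are all hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Threading;
using System.Windows.Forms;

// Hypothetical form that illustrates the cross-thread pattern only.
public class WorkerForm : Form
{
    private delegate void ResultHandler(string result);

    private readonly Button getResultButton = new Button();
    private readonly TextBox patternBox = new TextBox();
    private readonly TextBox inputBox = new TextBox();
    private readonly TextBox resultBox = new TextBox();
    private string pattern;
    private string input;

    public WorkerForm()
    {
        getResultButton.Text = "Get result";
        getResultButton.Click += OnGetResultClick;
        Control[] controls = { resultBox, inputBox, patternBox, getResultButton };
        foreach (Control control in controls)
        {
            control.Dock = DockStyle.Top;
            Controls.Add(control);
        }
    }

    private void OnGetResultClick(object sender, EventArgs e)
    {
        // Read the controls on the UI thread, then start the worker.
        pattern = patternBox.Text;
        input = inputBox.Text;
        getResultButton.Enabled = false;
        Thread worker = new Thread(DoWork);
        worker.IsBackground = true;
        worker.Start();
    }

    // Runs on the worker thread; must not touch any control directly.
    private void DoWork()
    {
        ShowResult(Compute(pattern, input));
    }

    // The potentially slow part: build the Regex and run the match.
    public static string Compute(string pattern, string input)
    {
        try
        {
            return new Regex(pattern).Match(input).Value;
        }
        catch (ArgumentException ex)
        {
            return ex.Message; // an invalid pattern is reported as the result
        }
    }

    private void ShowResult(string result)
    {
        if (InvokeRequired)
        {
            // Called from the worker thread: re-run this method on the UI thread.
            Invoke(new ResultHandler(ShowResult), result);
            return;
        }
        resultBox.Text = result;
        getResultButton.Enabled = true;
    }
}
```

The InvokeRequired check is what avoids the cross-thread exception: the worker never touches a control itself. A production version would also have to handle the form being closed while the worker is still running.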

Using the code

The ZIP file contains the source code, along with a BAT file to compile it into a .NET 2.0 WinExe. You may need to adjust the paths in the BAT file for your system; I'm running WinXP SP3 with .NET 3.5.

You are welcome to use the code in a development environment of your own choosing, but as I don't know what that is, and it may not even exist at the time of this writing, I can't help you with that.

The ZIP also contains an installer (MSI) in case you just wish to use the application; the executable in the MSI targets .NET 2.0.

Conclusion

This application neatly solves my problem with copying Regular Expressions between a code editing window and a Regex testing utility. It offers several modes for passing the text to the Regex constructor, can display whitespace in the results, and persists its state.

I consider this to be version 0.0 of this application; I've only been working on it for a week. I'd like some feedback on usability and bugs; glowing testimonials are also welcome. Additional features don't come readily to mind (other than to expand Help (F1)) and I may not add any, but if you have suggestions, please post them.