Converting Wildcards to Regexes

Introduction

Ever wondered how to do wildcards in .NET? It's not hard: all you have to do is use regular expressions. It isn't always obvious how, though; I had to dig around for a while to figure out how to do it properly.

Even though regexes are a lot more powerful, wildcards are still good in situations where you can't expect the user to know or learn the cryptic syntax of regexes. The most obvious example is in the file search functionality of practically all OSs -- there aren't many that don't accept wildcards. I personally need wildcards to handle the HttpHandlers tag in web.config files.

Note: This method is good enough for most uses, but if you need every ounce of performance with wildcards, here is a good place to start.

Using the Code

There are three steps to converting a wildcard to a regex:

Escape the pattern to make it regex-safe. Wildcards use only * and ?, so the rest of the text has to be converted to literals.

Once escaped, * becomes \* and ? becomes \?, so we have to convert \* and \? to their respective regex equivalents: .* (any run of characters, including none) and . (exactly one character).

Prepend ^ and append $ to specify the beginning and end of the pattern.
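
The three steps above can be sketched as follows. This is a minimal illustration in Python rather than .NET, with re.escape standing in for Regex.Escape; the function name wildcard_to_regex is made up for this example and is not part of the article's download.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a wildcard pattern (* and ?) into an anchored regex string.

    Hypothetical helper for illustration only.
    """
    # Step 1: escape everything so the text is treated as literal characters.
    escaped = re.escape(pattern)
    # Step 2: the wildcards were escaped too (\* and \?), so turn them back
    # into their regex equivalents.
    translated = escaped.replace(r"\*", ".*").replace(r"\?", ".")
    # Step 3: anchor the pattern so it must cover the whole input.
    return "^" + translated + "$"


if __name__ == "__main__":
    print(wildcard_to_regex("*.txt"))
    print(bool(re.match(wildcard_to_regex("file?.log"), "file1.log")))
```

Because the result is anchored at both ends, a pattern such as file?.log accepts file1.log but rejects file10.log: the ? stands for exactly one character.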

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

Comments and Discussions

According to Code Project, the license for this code is unspecified. Can you declare what license you would like this code to be considered under? We have a requirement to know the license for all code used in our projects.

To find in Word:
Open Word 2010 and paste the text.
On the Home tab, at the far right, open the Find drop-down and select Advanced Find. In the resulting dialog, type "s*a". Click the More button and check "Use wildcards". Click Find Next. It will find "she sells sea" as the first match.

The regex pattern generated for s*a is "^s.*a$"

If you test that regex pattern, it comes back with 0 matches.

The current regex pattern looks like it will only get a match when 's' is at the beginning of the string or line and 'a' is at the end of the string or line.
I'm not too good with regex and could use a solution that would find the pattern anywhere in the string. I've tried a few modifications to the existing regex pattern, but haven't reached the desired result yet.

****Update****
Found what I was looking for.
Changing the code to the following did the trick:

I removed '^' and '$', which require the match to start at the beginning and stop at the end of the string or line, and changed ".*" to ".*?", turning the greedy quantifier into a lazy one. After that, it will still come back with only 2 matches. To compensate for that, you could search the string multiple times, bumping the start point of the search each time, like below:
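
A minimal sketch of that idea in Python (not the commenter's original C# code): the anchors are dropped, * becomes the lazy .*?, and the search restarts one character past the start of each match so that overlapping matches are also reported. The function names and the sample sentence are assumptions made for this example.

```python
import re


def wildcard_to_search_regex(pattern):
    """Unanchored, lazy translation of a wildcard pattern (hypothetical helper)."""
    escaped = re.escape(pattern)
    # No ^ or $, so the match may start and end anywhere in the text.
    # ".*?" is lazy: it stops at the first character that lets the match finish.
    return escaped.replace(r"\*", ".*?").replace(r"\?", ".")


def find_all_overlapping(pattern, text):
    """Collect every match, restarting just after the start of the previous one."""
    regex = re.compile(wildcard_to_search_regex(pattern))
    matches = []
    start = 0
    while True:
        found = regex.search(text, start)
        if found is None:
            break
        matches.append(found.group())
        # Bump the start point by one so overlapping matches are not skipped.
        start = found.start() + 1
    return matches


if __name__ == "__main__":
    sample = "she sells sea shells"  # assumed sample text
    print(find_all_overlapping("s*a", sample))
```

On the assumed sample, the anchored pattern ^s.*a$ finds nothing because the text does not end in "a", while the unanchored version reports "she sells sea" first, in line with the Word result described above.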

Thanks for the code. I liked it, especially the fact that it was derived from Regex. I use Regex static methods a lot, so I added these methods to your Wildcard class so that its interface more closely matches that of .NET's Regex class. Here are these methods:
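
The general shape of such an addition can be sketched in Python; this is not the commenter's code. The idea is a one-call convenience alongside the instance method, in the spirit of .NET's static Regex.IsMatch(input, pattern). The class layout and the method names is_match and check are assumptions for this example.

```python
import re


class Wildcard:
    """Hypothetical wildcard matcher wrapping a compiled regex."""

    def __init__(self, pattern, flags=0):
        self._regex = re.compile(self.to_regex(pattern), flags)

    @staticmethod
    def to_regex(pattern):
        # Escape, restore the two wildcards, then anchor both ends.
        body = re.escape(pattern).replace(r"\*", ".*").replace(r"\?", ".")
        return "^" + body + "$"

    def is_match(self, text):
        """Instance-style use: build once, test many inputs."""
        return self._regex.search(text) is not None

    @classmethod
    def check(cls, text, pattern, flags=0):
        """Static-style use: a one-off test without keeping an instance."""
        return cls(pattern, flags).is_match(text)


if __name__ == "__main__":
    print(Wildcard("*.txt").is_match("notes.txt"))
    print(Wildcard.check("NOTES.TXT", "*.txt", re.IGNORECASE))
```

The static-style call is convenient for one-off checks; reusing an instance avoids recompiling the pattern when many inputs are tested against the same wildcard.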

I have released a new version of the RegEx Tester tool. You can download it free from http://www.codeproject.com/KB/string/regextester.aspx and http://sourceforge.net/projects/regextester

With RegEx Tester you can fully develop and test your regular expression against a target text. Its UI is designed to aid you in developing regular expressions. It uses and supports ALL of the features available in the .NET RegEx class.

I think that in some cases regular expressions are better because you can declare your intentions exactly.
For example, sometimes I want to check whether the pattern matches the whole text, and in that case you can use Exact. (*pattern* is not the same as Exact; with Exact I force the pattern to match from start to end by adding ^ and $.)
But you are probably right about StartsWith and EndsWith; they are not very useful.
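
One way to picture the modes mentioned here is the Python sketch below. Only Exact is described in the comment (it adds both ^ and $); treating StartsWith as "^ only" and EndsWith as "$ only" is an assumption, as are the enum and function names.

```python
import re
from enum import Enum


class MatchMode(Enum):
    """Hypothetical match modes named after those in the comment."""
    EXACT = "exact"              # pattern must cover the whole text
    STARTS_WITH = "starts_with"  # assumed: anchored at the start only
    ENDS_WITH = "ends_with"      # assumed: anchored at the end only


def wildcard_to_regex(pattern, mode):
    body = re.escape(pattern).replace(r"\*", ".*").replace(r"\?", ".")
    prefix = "^" if mode in (MatchMode.EXACT, MatchMode.STARTS_WITH) else ""
    suffix = "$" if mode in (MatchMode.EXACT, MatchMode.ENDS_WITH) else ""
    return prefix + body + suffix


if __name__ == "__main__":
    for mode in MatchMode:
        regex = wildcard_to_regex("s*a", mode)
        print(mode.name, regex, bool(re.search(regex, "sea shells")))
```

With the text "sea shells", the pattern s*a matches in the assumed StartsWith mode but not in Exact or EndsWith, since the text does not end in "a".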