Archive

Regular Expressions are an amazing way to go. A regular expression (regex or regexp for short) is a special text string for describing a search pattern. You can think of regular expressions as wildcards on steroids. Almost all languages support them and with a little understanding, you can ace at them. What more, they condense tens and thousands of lines of logic into just a couple of simple lines. I’ve used regular expressions both through .NET framework and in JavaScript and found them to be immensely helpful. This blog post focuses majorly on them and here’s a little outline on how I’ll be approaching the concept on hand.

A problem – We engineers are problem solvers, so a simple problem followed by how regular expressions solved it in an even simpler way

Dissecting the solution – An understanding of the above simpler solution

Link to raw regular expression resources – Unless you can make up some of those regular expressions, you can never make it easier to use this language feature

RegEx class – The power of RegEx in .NET framework, what all you can achieve

Starting with this, followed by that, then this and then a dozen of that – A Problem!

I was given, the below problem statement and a laptop to code and solve. Developer instincts are hard to ignore and eventually I started scribbling off an elegant algorithm, or so I thought.

Domestic passport numbers either begin with the letter TW followed by 12 digits, or begin with 6 digits followed by 4 other characters and ending with the letters OFTW.

Pseudo logic:

Is Starting With TW?

Are the remaining 12 characters – digits?

i. Return “Domestic”

Is Ending with OFTW?

Does it begin with 6 digits?

i. Return “Domestic”

If all above fails

Return “Foreigner”

So with all the code logic and intricacies figured out, I exactly translate the pseudo code to C# code:

That’s when I was told; don’t you follow all the language features? Especially something called Regular Expressions?

Well, not to be an all knowing buff but yet, I knew what Regular Expression was and I perfectly knew how to use it. But surprisingly it never struck me. Given a problem, my initial mode of tackling it was to analytically approach it and solve. Although regular Expression’s way of doing it was straight opposite – it was elegant; and I hit myself why my mind didn’t think of it.

Or if you want a quick stop solution/pattern, visit Mark’s http://txt2re.com/, it’s an amazing site.

The almighty RegEx class of .NET

Let’s step back a bit and focus on the RegEx class in .NET framework, and see how better we can use it!

Check if a string is looking as it should

The above solution simply does that. The Match, IsMatch methods of the RegEx class tries to fits the given string into a pattern and says True, the string matches the pattern or False, the string looks nothing like the pattern! The simplest form of the method asks only for the pattern and the string.

Replace a questionable part in a string

Take for instance, the exclaimer. He gets too excited for nothing and you just need to dial his excitement down. How?

string SampleText = "This is Outrageous!!!!!!!! Regex can't solve all my problems!!!! What if it can!!!!!";

You can never track the count of the exclamations he has used (!) nor does he use a specific number all the time. The Replace method of Regex would help you there.

PS: The plus (+), here in the pattern says that the (!) can appear once or more in the string. And the last part is the replace with part. All questionable parts matching the pattern would be replaced with the input to the method.

This would give me an elegant output like thus:

This is Outrageous! Regex can’t solve all my problems! What if it can!

Pattern matching, only simpler:

Imagine how hard it was to match patterns or find the count of substrings actually matching a pattern you’re looking for. RegEx offers you an elegant way to the same, using RegEx.Matches. Let’s take the below statement.

“Are you working on any special projects at work? I am not reading any books right now. Aren’t you teaching at the university now?”

I need to fish out all the present-continuous forms verbs in them, i.e. the words ending with “ing”. Normally this would be a heavy code. But with RegEx we can do it faster and simpler using the pattern string “(\\b)(\\w+)(ing)(\\b)”

Through this post, I’ve summarized most of the key uses of RegEx with special emphasis on .NET code. Beyond a particular point, it depends on your creativity how you take it further to tailor the RegEx solution based on your problem. You can do away with ugly for loops or unnecessary if-else constructs in code.