.NET Regex balanced matches failure evaluation

Posted on February 10, 2016

While preparing some regex demos, I noticed behavior that doesn’t seem to match the docs. I’m working with balancing group definitions as described here. In conjunction with those, it should be possible to use the syntax (?(start)(?!)) to fail the expression if the balanced elements don’t even out (more precisely, if there aren’t enough end elements). This conditional expression is documented here in these words: "Matches [] if [] a named [] capturing group has a match"
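For reference, here is a minimal sketch of how the two constructs are commonly combined. The pattern, the class name BalancedCheck and the method IsBalanced are hypothetical and not taken from the tests in this post. The sketch assumes the whole input must be consumed, which is why it is anchored with ^ and $.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BalancedCheck
{
    // Hypothetical pattern for illustration only (not the expression tested in this post).
    private static readonly Regex Balanced = new Regex(@"
        ^
        (?:
            [^()]              # any character that is not a paren
          | (?<start>\()       # opening paren: push a capture onto 'start'
          | (?<-start>\))      # closing paren: pop one capture from 'start'
        )*
        (?(start)(?!))         # if 'start' still holds a capture, force a failure
        $",
        RegexOptions.IgnorePatternWhitespace);

    public static bool IsBalanced(string input) => Balanced.IsMatch(input);

    public static void Main()
    {
        Console.WriteLine(IsBalanced("(10 * (3 + 4))"));  // True
        Console.WriteLine(IsBalanced("(10 * (3 + 4)"));   // False
    }
}
```

Because of the anchors, the engine cannot settle for a shorter match when the conditional fails, so an unbalanced input is rejected outright.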

In my tests, this does not appear to work correctly. Here are my tests and results.
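The original listing is not reproduced here, so what follows is only a hypothetical reconstruction of a test in the same spirit. The pattern, the class name FormulaDemo, and the idea of passing the trailing element as a tail parameter (rather than commenting lines of the expression in and out) are all assumptions. The balanced input string is likewise assumed: it is the broken string used later in this post with one closing paren restored.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FormulaDemo
{
    // Hypothetical reconstruction, NOT the original expression from this post.
    private const string Core = @"
        [^(]*                      # skip everything before the first opening paren
        (?<formula>
          (?:
              (?<start>\()         # push an opening paren onto 'start'
            | (?<end-start>\))     # pop 'start' and record the balanced span in 'end'
            | [^()]                # any other character
          )*
        )
        ";

    // The two optional trailing elements discussed in the text.
    public const string AlwaysFail = "(?!)";
    public const string FailIfOpen = "(?(start)(?!))";

    public static Match Run(string input, string tail) =>
        Regex.Match(input, Core + tail, RegexOptions.IgnorePatternWhitespace);

    public static void Print(string label, Match m)
    {
        Console.WriteLine(label);
        if (!m.Success)
        {
            Console.WriteLine("  Match not successful");
            return;
        }
        Console.WriteLine("  Match: " + m.Value);
        Console.WriteLine("  Formula: " + m.Groups["formula"].Value);
        Console.WriteLine("  start captures: " + m.Groups["start"].Captures.Count);
        Console.WriteLine("  end captures: " + m.Groups["end"].Captures.Count);
    }

    public static void Main()
    {
        // Assumed balanced input, followed by the broken input quoted in the post.
        string balanced = "formula: (10 * (3 + (7 - 5) + (2 + 6))) <- this is important";
        string broken = "formula: (10 * (3 + (7 - 5) + (2 + 6)) <- this is important";

        Print("balanced, no tail", Run(balanced, ""));
        Print("broken, no tail", Run(broken, ""));
        Print("broken, (?!)", Run(broken, AlwaysFail));
        Print("broken, (?(start)(?!))", Run(broken, FailIfOpen));
    }
}
```

In this sketch the formula group also swallows any text after the last paren, which the original expression may well have handled differently.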

As you can see, I have included two elements at the end of my expression, both commented out in the code above. The final line shows the documented conditional expression that should fail the expression if the start group still has content at the end, in other words, if there are more starting elements (opening parens in my case) than ending elements (closing parens).

To test this, I first “break” my input text by removing one of the closing parens, so this remains: "formula: (10 * (3 + (7 - 5) + (2 + 6)) <- this is important"

This is correct. The start group still contains one opening paren because there was no closing counterpart found in the input.

Now I remove the comment sign from the last line of my regular expression and run it again. I expect match.Success to return False now. Here's what I get instead:

Match: formula:
Formula:
start captures:
end captures:

Instead of failing the expression, the match is now empty. Two things I learn from that:

The engine has obviously reacted to the inclusion of the (?(start)(?!)) expression, since the behavior is no longer the same as it was before.

The intended result of failing the expression was not reached. The behavior I see instead seems rather inexplicable.

On that basis, I decided to check whether (?!) on its own really can fail the match, in the sense of setting Success to False. I comment out the last line of the expression again and enable the one right before it. Now the expression should always fail. And it does, regardless of the input string.

Match not successful

The next idea is that perhaps the conditional construct itself is broken. However, a quick independent check shows that under simple circumstances it works as intended.
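A check of that kind could look like the following minimal sketch. The pattern and the names ConditionalCheck and Matches are hypothetical, chosen only to exercise the conditional without any balancing groups: an optional opening paren is captured as start, and the conditional forces a failure whenever that capture exists.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ConditionalCheck
{
    // Hypothetical pattern: optional '(' captured as 'start', then word characters.
    // The conditional rejects the match whenever 'start' has captured something.
    private static readonly Regex Pattern =
        new Regex(@"^(?<start>\()?\w+(?(start)(?!))$");

    public static bool Matches(string input) => Pattern.IsMatch(input);

    public static void Main()
    {
        Console.WriteLine(Matches("abc"));   // True: 'start' did not capture
        Console.WriteLine(Matches("(abc"));  // False: 'start' captured the paren
    }
}
```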

Up to this point, the matter remains a mystery to me. I'm either missing something, or this is a special case that triggers some kind of unintended behavior in the engine. Here are a few further thoughts:

I ran the code on both .NET 4.5 and Mono; the results are the same.

I considered whether the engine might regard the match as successful for some reason, in spite of the "fail" triggered by the (?!) element. My view, though, is that the expression should definitely fail. This is supported by the fact that (?!) triggers the "fail" I want when used outside the conditional construct.

Please let me know if you have any ideas. It would be great to understand what's going on here, assuming there is a plausible reason for it!