I could have come up with a quick solution using two pattern matches, but I didn't like that. So I played around with some regex features I haven't been using up to now and I found a way to express what you want in a single regex:

Pretty cool, Marcus. I gave this a try and it worked on my example... BUT (there's always a but) I found a couple of problems:

1) Perl 5.005_03 would not run the code. Perl 5.6.0 had no problem. Which version of Perl were you using?

2) I'm writing a report parser. The number of records in the report appears at the top of the report. I thought it would be great if I could check the real number of records against the record number at the top of the report ALL from within a single regexp. Ultimately, when I placed the more complicated record regexp in the (??{}) expression the match failed. The record regexp has things like \s* and grouped expressions like (a|b|c)...

I've been trying to sort of "factor out" the a expression as I think it would make your technique more powerful. I've tried things like:

(\d+)a(??{"{$1}"})
(\d+)a{(??{"$1"})}

If there was a way to evaluate just the part inside the brackets which actually denotes the multiplier then we'd have it... so close.

This internal check would be a real bonus to my little program, but I'm running out of things to try. Thanks again for your help. Let me know if you have any other ideas.

1) Perl 5.005_03 would not run the code. Perl 5.6.0 had no problem. Which version of Perl were you using?

I'm using 5.6.0, 5.6.1 and 5.7.2. I've tested the code with 5.6.0. Perl 5.6.0 introduced the (??{...}) syntax used here; like the related (?{...}), it is still marked experimental. So this won't work with any version older than 5.6.0.
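For anyone following along, here is a minimal sketch of the kind of single-regex count check being discussed, with a version guard in front. The exact regex from the earlier post isn't repeated in this thread, so the pattern, the sample strings and the helper name count_matches are assumptions for illustration only.

```perl
use 5.006;   # stop early on Perls older than 5.6.0, which lack (??{...})
use strict;
use warnings;

# count_matches is a hypothetical helper: true if the leading number
# equals the number of 'a' characters that follow it.
sub count_matches {
    my ($input) = @_;
    # (??{...}) runs the code at match time and uses the resulting
    # string, here something like "a{3}", as a sub-pattern.
    return $input =~ /^(\d+)(??{ "a{$1}" })$/ ? 1 : 0;
}

print count_matches("3aaa") ? "count ok\n" : "count mismatch\n";
print count_matches("3aa")  ? "count ok\n" : "count mismatch\n";
```

The "use 5.006;" line makes an older interpreter fail with a clear version message instead of a confusing regex syntax error.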

In Reply To

2) I'm writing a report parser. The number of records in the report appears at the top of the report. I thought it would be great if I could check the real number of records against the record number at the top of the report ALL from within a single regexp. Ultimately, when I placed the more complicated record regexp in the (??{}) expression the match failed. The record regexp has things like \s* and grouped expressions like (a|b|c)...

Sorry, I don't really get it. Perhaps you could post the source data and the regex (or even a greater code snippet) you're trying to use.

In Reply To

I've been trying to sort of "factor out" the a expression as I think it would make your technique more powerful. I've tried things like:

Code

(\d+)a(??{"{$1}"})
(\d+)a{(??{"$1"})}

If there was a way to evaluate just the part inside the brackets which actually denotes the multiplier then we'd have it... so close.

There's no way to dynamically insert just a repetition count. Only complete regexes can be inserted dynamically, like I did in my last post. I don't know how familiar you are with regular expressions, but you may find the regex engine's debug output useful while playing around with regexes. Just insert

Code

use re 'debug';

on top of your script and you will see lots of additional information about the regexes you're using.
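A minimal sketch of a complete script with the pragma switched on, in case it helps. The sample string and pattern are made up for illustration; the debug information is written to STDERR, so the normal output of the script is unaffected.

```perl
use strict;
use warnings;
use re 'debug';   # compilation and matching details are written to STDERR

# Hypothetical sample data: a count, optional whitespace, then one record letter.
my $report = "2 b";

my $matched = $report =~ /^(\d+)\s*(a|b|c)/ ? 1 : 0;
print $matched ? "matched\n" : "no match\n";
```

Redirecting STDERR to a file (for example with 2> on the command line) makes the trace easier to read for longer patterns.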

The debugging technique was pretty helpful. I realized that the expression inside the '(??{})' is a double-quoted string, so every backslash in it has to be doubled. This is why my regexp with whitespace was failing. For instance:

$x = "3a xb xc x";
$x =~ /^(\d+)(??{"((a|b|c)\\s*x){$1}"})$/;

This regexp works, but the '\\' is necessary; otherwise the string ends up containing a plain 's' and the expression tries to match the 's' character.
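Here is a small sketch of two ways to get the backslash through, assuming a report with three records. The variable names and the sample string are only for illustration. The second variant precompiles the record pattern with qr//, where \s needs no extra escaping, and interpolates it into the generated sub-pattern.

```perl
use strict;
use warnings;

# Hypothetical sample report: a record count followed by the records.
my $report = "3a xb xc x";

# Variant 1: double the backslash so the string handed to the regex
# engine really contains \s.
my $ok_escaped = $report =~ /^(\d+)(??{ "((a|b|c)\\s*x){$1}" })$/ ? 1 : 0;

# Variant 2: build the record pattern with qr// (no doubled backslash
# needed there) and wrap it with the repetition count at match time.
my $record = qr/(?:a|b|c)\s*x/;
my $ok_qr = $report =~ /^(\d+)(??{ "(?:$record){$1}" })$/ ? 1 : 0;

print "escaped: $ok_escaped, qr: $ok_qr\n";
```

The qr// variant keeps the record pattern readable as it grows, since only the small wrapper string lives inside the (??{...}) block.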

This works really well. Thanks for your help. Now I just have to see if they'll let me use 5.6.0 in production!?!