regex inconsistency

05-Apr-2012, 01:35 PM

I have a slightly complex regex using a negative lookahead. It works in the text compare area but fails in the File Formats -> Grammar section. The regex finds the start of a ColdFusion comment and then runs to a closing comment, as long as there is no other start-of-comment marker between the two. This is to deal with nested comments. See the example below:
<!---(.(?!<!---))+.--->

Is there something I'm missing in the grammar section? Are there two different implementations of regex in BC3?
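To make the intended behaviour concrete, here is a minimal sketch that runs the pattern above in Python's `re` module. Python is only a stand-in for a PCRE-style engine (an assumption; it is not the engine BC3 uses), and the sample text is hypothetical.

```python
import re

# Pattern copied verbatim from the post above.
PATTERN = r"<!---(.(?!<!---))+.--->"

# Hypothetical sample text: one ColdFusion comment nested inside another,
# with a single closing marker.
SAMPLE = "a <!--- outer <!--- inner ---> b"

# re.DOTALL lets '.' cross line breaks, so multi-line comments are covered too.
match = re.search(PATTERN, SAMPLE, re.DOTALL)

# The negative lookahead rejects the outer opener, so the match begins at the
# innermost "<!---".
print(match.group(0))  # <!--- inner --->
```

An engine without lookahead support has no way to evaluate the `(?!` part, which would be consistent with the pattern working in one place and failing in another.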

Yes, there are two implementations. The "Find" dialog uses a fairly recent version of PCRE, so it should support anything you need. The grammar parser is our own, because we need it to parse all of the REs in parallel, and PCRE can't handle that.

We do plan to replace it with a more powerful parser in the future, but for now these are the only things it supports (assuming our comments aren't out of date).

Code:

Metacharacters
--------------
\ Escape
^ Assert start of line
$ Assert end of line
. Match any character
[ Start character set
| Start alternative
( Start subpattern
) End subpattern
? 0 or 1 iterations (equal to {0,1})
* 0 or more iterations (equal to {0,})
+ 1 or more iterations (equal to {1,})
{ Start iterator
} End iterator

Metacharacters In Character Sets
--------------------------------
\ Escape
^ Invert, only if first char
- Range, only if surrounded by chars
] End character set, only if not first char (second if first is ^)

Escaped Characters
------------------
\a Alarm/bell (0x07)
\t Tab (0x09)
\f Formfeed (0x0C)
\e Esc (0x1B)
* \cx "Control-x", where x is any character
\ddd Character with the oct code ddd, 1 to 3 oct digits
\xhh Character with the hex code hh, 0 to 2 hex digits
\x{hhh..} Character with the hex code hhh.., max of 7FFFFFFF

Escaped Character Sets
----------------------
\d Any decimal digit (equal to [0-9])
\D Any non decimal digit (equal to [^0-9])
\s Any whitespace character (equal to [\t\f ])
\S Any non whitespace character (equal to [^\t\f ])
\w Any "word" character (equal to [_a-zA-Z0-9])
\W Any "non-word" character (equal to [^_a-zA-Z0-9])

Iterators (Greedy)
------------------
{n} n iterations (equal to {n,n})
{n,} n or more iterations
{n,m} at least n but no more than m iterations
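
As a rough aid for checking a pattern against the list above before pasting it into a grammar item, here is a heuristic sketch in Python. It is not part of Beyond Compare: the function name `unlisted_constructs` and its rules are hypothetical, derived only from the list as posted (which may itself be out of date), and it flags three things the list does not mention: `(?...)` group syntax such as lookahead, lazy or possessive quantifiers, and letter escapes other than the ones listed.

```python
# Letter escapes that appear in the list above; other letter escapes get flagged.
LISTED_LETTER_ESCAPES = set("atfecxdDsSwW")


def unlisted_constructs(pattern):
    """Return (offset, text, reason) tuples for syntax missing from the list."""
    findings = []
    i = 0
    in_set = False            # currently inside [...]
    set_body_start = 0        # index of the first character of the set body
    after_quantifier = False  # previous token was ?, *, + or }
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            esc = pattern[i:i + 2]
            if esc[1:].isalpha() and esc[1] not in LISTED_LETTER_ESCAPES:
                findings.append((i, esc, "escape not in the list"))
            i += 2
            after_quantifier = False
        elif in_set:
            # ']' closes the set unless it is the first body character.
            if ch == "]" and i > set_body_start:
                in_set = False
            i += 1
        elif ch == "[":
            in_set = True
            # Skip a leading '^' when locating the first body character.
            set_body_start = i + 1 + (pattern[i + 1:i + 2] == "^")
            after_quantifier = False
            i += 1
        elif ch == "(" and pattern[i + 1:i + 2] == "?":
            findings.append((i, pattern[i:i + 3], "(?...) group syntax"))
            after_quantifier = False
            i += 2
        elif ch in "?+" and after_quantifier:
            kind = "lazy" if ch == "?" else "possessive"
            findings.append((i - 1, pattern[i - 1:i + 1], kind + " quantifier"))
            after_quantifier = False
            i += 1
        else:
            after_quantifier = ch in "?*+}"
            i += 1
    return findings


for finding in unlisted_constructs(r"<!---(.(?!<!---))+.--->"):
    print(finding)
```

Run on the pattern from the first post, this reports the `(?!` at offset 7. An empty result only means nothing obvious was found; it does not guarantee the grammar parser will accept the pattern.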


I've tried so many times to create complex regexes for what I need, regexes which work in other testers, so I'm glad to finally understand why they don't work in BC. +1 vote for upgrading this to a standard parser.
Also - this workaround for creating a grammar item for multiline comments was exactly what I needed: https://www.scootersoftware.com/vbul...-line-re-match