On Friday, 14 October 2005 16:25, you wrote:
> On Fri, Oct 14, 2005 at 04:20:24PM +0200, Wolfgang Jeltsch
> <wolfgang at jeltsch.net> wrote:
> > I always couldn't understand why one has to write regular
> > expressions as strings
>
> Because the language used inside these strings is standard,
> multi-language, widely used and documented?

Well, in my opinion, the standard regexp syntax is rather awkward, so
diverging from the standard might be a good thing. However, my proposal was
not about introducing a new syntax. If I had merely wanted a different syntax,
I would have used strings for representing regexps as well. My main point is
to not use strings for representing regexps at runtime, because this means that
parsing is done at runtime. This might result in a loss of efficiency. In
addition, no syntax checks can be done at compile time. The situation gets
worse if you try to manipulate regular expressions, since every string
operation can produce a malformed regexp.

Now let's consider using an algebraic datatype for regexps:

    data RegExp
        = Empty | Single Char | RegExp :+: RegExp | RegExp :|: RegExp | Iter RegExp

Manipulating regular expressions now becomes easy and safe: you simply cannot
create "syntactically incorrect regular expressions", since at runtime you
don't deal with syntax at all.
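
To make this a bit more concrete, here is a minimal sketch of what working with
such a datatype could look like. The fixity declarations, the helper names
(literal, oneOf, optional, matches) and the reading of Empty as "matches only
the empty string" are my assumptions for illustration; the matcher is a naive
backtracking one, not an efficient implementation.

```haskell
infixr 5 :+:
infixr 4 :|:

data RegExp
    = Empty               -- assumed here: matches only the empty string
    | Single Char         -- exactly this character
    | RegExp :+: RegExp   -- sequence
    | RegExp :|: RegExp   -- alternative
    | Iter RegExp         -- zero or more repetitions
    deriving (Eq, Show)

-- Hypothetical helpers: regexps are built by ordinary functions,
-- so there is no string syntax that could be malformed.
literal :: String -> RegExp
literal = foldr (\c r -> Single c :+: r) Empty

oneOf :: [Char] -> RegExp          -- the argument must be non-empty
oneOf = foldr1 (:|:) . map Single

optional :: RegExp -> RegExp
optional r = Empty :|: r

-- Naive backtracking matcher in continuation-passing style:
-- 'go r xs k' succeeds if r matches a prefix of xs and k accepts the rest.
matches :: RegExp -> String -> Bool
matches r s = go r s null
  where
    go :: RegExp -> String -> (String -> Bool) -> Bool
    go Empty      xs       k = k xs
    go (Single c) (x : xs) k = c == x && k xs
    go (Single _) []       _ = False
    go (a :+: b)  xs       k = go a xs (\rest -> go b rest k)
    go (a :|: b)  xs       k = go a xs k || go b xs k
    go (Iter a)   xs       k =
        -- Either stop here, or match a once more; the length check forces
        -- progress so that something like Iter Empty cannot loop forever.
        k xs || go a xs (\rest -> length rest < length xs && go (Iter a) rest k)
```

With these definitions, an optionally signed number could be written as
optional (Single '-') :+: oneOf ['0'..'9'] :+: Iter (oneOf ['0'..'9']),
and the type checker rejects any ill-formed combination before the program runs.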
In addition, the usage of a special datatype can provide more flexibility.
Representing regexps as strings means that regexps can only denote sets of
strings. In contrast, the above datatype could easily be extended to allow
arbitrary lists instead of just strings:

    data RegExp token
        = Empty | Single token | RegExp token :+: RegExp token | ...

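
As a rough sketch of what this buys you: below, the remaining constructors are
filled in by analogy with the Char version (that completion is my assumption),
and the regexp runs over a list of lexer tokens instead of characters. The Tok
type and the names matches and sumExpr are made up for the example.

```haskell
infixr 5 :+:
infixr 4 :|:

-- Constructors after 'Single' are completed by analogy with the
-- Char-only datatype; this completion is an assumption.
data RegExp token
    = Empty
    | Single token
    | RegExp token :+: RegExp token
    | RegExp token :|: RegExp token
    | Iter (RegExp token)
    deriving (Eq, Show)

-- The matcher only needs equality on tokens, nothing string-specific.
matches :: Eq token => RegExp token -> [token] -> Bool
matches r ts = go r ts null
  where
    go :: Eq token => RegExp token -> [token] -> ([token] -> Bool) -> Bool
    go Empty      xs       k = k xs
    go (Single t) (x : xs) k = t == x && k xs
    go (Single _) []       _ = False
    go (a :+: b)  xs       k = go a xs (\rest -> go b rest k)
    go (a :|: b)  xs       k = go a xs k || go b xs k
    go (Iter a)   xs       k =
        -- The length check forces progress, so iteration always terminates.
        k xs || go a xs (\rest -> length rest < length xs && go (Iter a) rest k)

-- Hypothetical token type, as a lexer might produce it.
data Tok = Num | Plus
    deriving (Eq, Show)

-- "Num (Plus Num)*": a flat sum of numbers.
sumExpr :: RegExp Tok
sumExpr = Single Num :+: Iter (Single Plus :+: Single Num)
```

Here matches sumExpr [Num, Plus, Num] holds, while matches sumExpr [Num, Plus]
does not; a string-based regexp could not express this without first encoding
the tokens as characters.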
If you really need a Perl-like syntax for regular expressions, the strings
representing the regexps should be parsed at compile-time and transformed
into expressions of a special regexp datatype like the one above.
However, I don't like the idea of extending the language with a special regexp
syntax. Why give special treatment to one specific, albeit common, syntax for
a special case of regexps (string-only)? What about things other than regexps?
Should they also get a language extension?
I'd say that the better way would be to use Template Haskell for this purpose:

    myRegExp = $(regExp "[a-z0-9]")

This way, special syntaxes are not hard-wired into the language but can be
activated by importing a corresponding module.
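
Such a module might look roughly like the sketch below. Only the name regExp
comes from my example above; parseRegExp, parseAtom, classChars and the module
name mentioned in the trailing comment are hypothetical, and the parser
deliberately handles only a tiny subset of the usual syntax (literal
characters, bracket classes with ranges, and a postfix *). It relies on GHC's
DeriveLift extension to turn the parsed value into an expression.

```haskell
{-# LANGUAGE DeriveLift #-}

import Language.Haskell.TH (Exp, Q)
import Language.Haskell.TH.Syntax (Lift (..))

infixr 5 :+:
infixr 4 :|:

data RegExp
    = Empty
    | Single Char
    | RegExp :+: RegExp
    | RegExp :|: RegExp
    | Iter RegExp
    deriving (Eq, Show, Lift)

-- Parse a tiny, illustrative subset of the usual regexp syntax.
parseRegExp :: String -> Maybe RegExp
parseRegExp [] = Just Empty
parseRegExp s  = do
    (atom, rest) <- parseAtom s
    let (r, rest') = case rest of
            '*' : more -> (Iter atom, more)
            _          -> (atom, rest)
    rs <- parseRegExp rest'
    Just (if rs == Empty then r else r :+: rs)

-- An atom is a bracket class such as [a-z0-9] or a single literal character.
parseAtom :: String -> Maybe (RegExp, String)
parseAtom ('[' : s) = case break (== ']') s of
    (body, ']' : rest) -> do
        cs <- classChars body
        if null cs
            then Nothing
            else Just (foldr1 (:|:) (map Single cs), rest)
    _ -> Nothing                       -- unterminated class
parseAtom (c : rest)
    | c `elem` "]*" = Nothing          -- stray metacharacter
    | otherwise     = Just (Single c, rest)
parseAtom [] = Nothing

-- Expand the body of a bracket class into the characters it denotes.
classChars :: String -> Maybe [Char]
classChars (lo : '-' : hi : rest)
    | lo <= hi  = ([lo .. hi] ++) <$> classChars rest
    | otherwise = Nothing              -- reversed range such as z-a
classChars (c : rest) = (c :) <$> classChars rest
classChars []         = Just []

-- Runs inside the compiler: a syntax error in the string becomes a
-- compile-time error, and the result is a RegExp expression.
regExp :: String -> Q Exp
regExp src = case parseRegExp src of
    Just r  -> lift r
    Nothing -> fail ("regExp: cannot parse " ++ show src)

-- Because of Template Haskell's stage restriction, the splice has to live in
-- a different module than regExp. Assuming the code above is saved as a
-- module named RegExpTH, a client would enable TemplateHaskell and write:
--
--     import RegExpTH
--     myRegExp :: RegExp
--     myRegExp = $(regExp "[a-z0-9]")
```

With this arrangement a typo like $(regExp "[a-z") is reported when the
client module is compiled, and no parsing is left for runtime.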
Best wishes,
Wolfgang