On Thu, 27 May 2004 14:29:17 +0200, Bruno Haible <bruno@...> wrote:
> Edi Weitz wrote:
>> Last time I checked it was comparable in speed with CLISP's regex
>> engine.
>
> Do you mean the speed when run in a Lisp implementation that
> compiles to native code, or when run in clisp?
I meant CL-PPCRE when run in CLISP. Of course it's much faster when
compiled to native code.
> Doing character-at-a-time processing in byte-compiled clisp is quite
> slow; that's why we are looking for an implementation written in
> C/C++.
While this is generally true the situation is not as black-and-white
as you seem to imply. First, regular expressions are not only about
character-at-a-time processing. There are several cases where CL-PPCRE
is actually (a lot) faster than CLISP's regex engine. (See end of mail
for some benchmarks.)
But apart from that I'd say that it is usually just fast enough[TM]
(unless you're throwing certain regular expressions at enormous
strings several times). Moreover:
1. It has more features (look-aheads, look-behinds, stand-alone
expressions, ...) than the current engine.
2. It has a syntax (the one from Perl) most users will be familiar
with.
3. It has an alternative S-expression syntax for regular
expressions. These are of course much easier to manipulate
programmatically from Lisp.
4. Because it's written in Lisp it'll use the same character encoding
that CLISP uses independently of external settings like your
locale.
5. If you get an error you get an error Lisp can handle. Disasters
like this one can't happen:
[1]> (regexp:regexp-compile "(a|(bc)){0,0}?xyz" :extended t)
*** - handle_fault error2 ! address = 0x14 not in [0x20248000,0x203d5a30) !
SIGSEGV cannot be cured. Fault address = 0x14.
Segmentation fault
6. It's written in Lisp (did I mention that already?) - for marketing
reasons it might be not too bad an idea if the regex engine used by
a Lisp implementation was also written in Lisp... :)
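To illustrate point 3, here is a minimal sketch of building a regular expression as an S-expression with ordinary list operations. The helper name TAG-REGEX is made up for this example. The keywords (:SEQUENCE, :REGISTER, :NON-GREEDY-REPETITION, :EVERYTHING) are meant to follow CL-PPCRE's parse-tree syntax, so check them against the CL-PPCRE documentation before relying on them.

```lisp
;; Hypothetical helper: build a parse tree that matches <TAG>...</TAG>
;; and captures whatever lies in between.  This is plain list
;; construction, so no backslash escaping is involved.
(defun tag-regex (tag)
  (list :sequence
        (format nil "<~A>" tag)
        '(:register (:non-greedy-repetition 0 nil :everything))
        (format nil "</~A>" tag)))

;; Show the resulting tree.
(print (tag-regex "issuer"))

;; With CL-PPCRE loaded, such a tree can be handed to the scanning
;; functions in place of a regex string, for example:
;;   (cl-ppcre:scan-to-strings (tag-regex "issuer") some-string)
;; (SOME-STRING is a placeholder.)
```

Because the pattern is just a list, it can be assembled, inspected and transformed with the usual list functions before it is ever compiled.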
Anyway, you decide...
Cheers,
Edi.
Regarding the speed of CL-PPCRE I did some simple benchmarks based on
<http://weitz.de/cl-ppcre/#bench>. The code I wrote can be found at
<http://miles.agharta.de/bench.lisp>. This is on a Debian sid system
with CLISP (2.33) from Debian.
It should be noted that CL-PPCRE has been profiled with and optimized
for CMUCL only. (With the help of Duane Rettig from Franz I've done
some preliminary work to optimize it for AllegroCL as well but that's
not part of the official distribution yet.) I'm pretty sure there are
ways to tweak it for CLISP if someone who knows CLISP better than I
do looks into it. I'll gladly accept patches.
edi@...:~$ uname -a
Linux bird 2.6.6 #1 Wed May 26 11:10:22 CEST 2004 i686 GNU/Linux
edi@...:~$ echo $LC_CTYPE
en_US.UTF-8
edi@...:~$ clisp
WARNING: *FOREIGN-ENCODING*: reset to ASCII
  i i i i i i i       ooooo    o        ooooooo   ooooo   ooooo
  I I I I I I I      8     8   8           8     8     o  8    8
  I  \ `+' /  I      8         8           8     8        8    8
   \  `-+-'  /       8         8           8      ooooo   8oooo
    `-__|__-'        8         8           8           8  8
        |            8     o   8           8     o     8  8
  ------+------       ooooo    8oooooo  ooo8ooo   ooooo   8
Copyright (c) Bruno Haible, Michael Stoll 1992, 1993
Copyright (c) Bruno Haible, Marcus Daniels 1994-1997
Copyright (c) Bruno Haible, Pierpaolo Bernardi, Sam Steingold 1998
Copyright (c) Bruno Haible, Sam Steingold 1999-2000
Copyright (c) Sam Steingold, Bruno Haible 2001-2004
;; Loading file /home/edi/.clisprc ...
;; Loaded file /home/edi/.clisprc
[1]> (lisp-implementation-version)
"2.33 (2004-03-17) (built 3289141980) (memory 3294470374)"
[2]> (require :cl-ppcre)
;; Loading file /usr/share/common-lisp/systems/cl-ppcre.asd ...
;; Loaded file /usr/share/common-lisp/systems/cl-ppcre.asd
;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/packages.fas ...
;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/packages.fas
;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/specials.fas ...
;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/specials.fas
;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/util.fas ...
;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/util.fas
;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/errors.fas ...
;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/errors.fas
;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/lexer.fas ...
;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/lexer.fas
;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/parser.fas ...
;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/parser.fas
;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/regex-class.fas ...
;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/regex-class.fas
;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/convert.fas ...
;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/convert.fas
;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/optimize.fas ...
;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/optimize.fas
;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/closures.fas ...
;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/closures.fas
;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/repetition-closures.fas ...
;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/repetition-closures.fas
;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/scanner.fas ...
;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/scanner.fas
;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/api.fas ...
;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/api.fas
0 errors, 0 warnings
T
[3]> (load (compile-file "/tmp/bench.lisp"))
Compiling file /tmp/bench.lisp ...
Wrote file /tmp/bench.fas
0 errors, 0 warnings
;; Loading file /tmp/bench.fas ...
;; Loaded file /tmp/bench.fas
T
[4]> (test)
PPCRE wins by a factor of 31.2
PPCRE wins by a factor of 257.9
PPCRE wins by a factor of 2534.2
PPCRE wins by a factor of 20842.4
PPCRE wins by a factor of 3.6
PPCRE wins by a factor of 3.8
PPCRE wins by a factor of 3.9
PPCRE wins by a factor of 4.0
PPCRE wins by a factor of 9.5
PPCRE wins by a factor of 84.5
PPCRE wins by a factor of 846.4
PPCRE wins by a factor of 6337.0
CLISP wins by a factor of 1.3
CLISP wins by a factor of 1.3
CLISP wins by a factor of 1.3
CLISP wins by a factor of 1.2
CLISP wins by a factor of 2.1
CLISP wins by a factor of 2.3
CLISP wins by a factor of 2.3
CLISP wins by a factor of 2.0
PPCRE wins by a factor of 3.0
PPCRE wins by a factor of 3.2
PPCRE wins by a factor of 3.3
PPCRE wins by a factor of 3.3
PPCRE wins by a factor of 1.2
PPCRE wins by a factor of 1.2
PPCRE wins by a factor of 1.2
PPCRE wins by a factor of 1.2
CLISP wins by a factor of 2.3
CLISP wins by a factor of 2.6
CLISP wins by a factor of 2.7
CLISP wins by a factor of 2.4
PPCRE wins by a factor of 3.0
PPCRE wins by a factor of 3.1
PPCRE wins by a factor of 3.3
PPCRE wins by a factor of 3.3
PPCRE wins by a factor of 1.3
PPCRE wins by a factor of 1.5
PPCRE wins by a factor of 1.5
PPCRE wins by a factor of 1.5
CLISP wins by a factor of 1.9
CLISP wins by a factor of 1.9
CLISP wins by a factor of 1.8
PPCRE wins by a factor of 48.6
PPCRE wins by a factor of 617.9
PPCRE wins by a factor of 6168.3
CLISP wins by a factor of 14.6
CLISP wins by a factor of 20.1
CLISP wins by a factor of 21.3
PPCRE wins by a factor of 1.2
PPCRE wins by a factor of 1.8
PPCRE wins by a factor of 1.8
CLISP wins by a factor of 14.6
CLISP wins by a factor of 21.3
CLISP wins by a factor of 21.9
PPCRE wins by a factor of 1.3
PPCRE wins by a factor of 1.8
PPCRE wins by a factor of 1.9
PPCRE wins by a factor of 1.6
PPCRE wins by a factor of 2.0
PPCRE wins by a factor of 2.1
NIL
[5]>

On Wed, May 26, 2004 at 12:02:15PM -0400, John K. Hinsdale wrote:
>
> I think this will do what you want:
>
> (setf patt "<[/]\\?issuer\\(.*\\)\\?>")
> (regexp:match patt "<issuerName>CITIZENS BANKING CORP</issuerName>")
>
> Note that the backslashes are doubled so that the literal string
> assigned to "patt" is:
>
> <[/]\?issuer\(.*\)\?>
>
> and the ? quantifier and grouping parens are preceded by backslash
>
> > I already have some useful regexes to do what I want on this data that
> > run in production under awk and/or guile - I can pretty much just cut
> > and paste expressions from one to the other.
>
> There is a module called "pcre" (Perl-Compatible Regular Expressions)
> that will use regex syntax ala perl. I prefer this actually, mainly
> because perl's syntax avoids what Larry Wall called "backslashitis",
> an overabundance of backslashes which under POSIX appear in the very
> commonly used operators. Adding to this the fact that to get a
> backslash INTO a literal string (in perl, C, lisp) you have to double
> it as "\\" then you go crazy. Perl's syntax considers the special
> operators like ? to be special by DEFAULT and if you want them
> literally THEN you must quote them.
>
> I think the POSIX designers did this because it is conceptually more
> elegant and consistent to have all regex operators be escaped w/
> backslash, but then it looks like hell when you actually use it.
>
> Sam: the syntax page at:
>
> http://clisp.cons.org/impnotes/regexp.html
>
> should probably say it's POSIX, and draw a distinction b/w
> other syntaxes (GNU grep, perl, etc.)
>
> I agree all these different syntaxes are a real pain
>
No pain, no gain! ;-) (And as we used to say in the marines: "Pain is
just weakness leaving the body!")
The reference material that for some reason made things click for me
on this was the regex.h link from the clisp extensions page for regexp
- if I am not mistaken, that is *the* description for how the engine
works. It might be 'just' a man page, but it is *much* better than
the one for the same subject on my Debian box.
http://www.opengroup.org/onlinepubs/007904975/basedefs/regex.h.html
Anyway, I think I've found the answers to about 90% of my questions in
regard to the clisp regexp package, and the last one I've got a fairly
decent workaround for. I'm beginning to get comfortable with this
package - so far, it seems to meet my needs quite well.
If you are at all interested, starting with the pattern you so kindly
suggested to me, here is what I've come up with so far.
(defvar xml-tree " <notSubjectToSection16>0</notSubjectToSection16>
<issuer>
<issuerCik>0000351077</issuerCik>
<issuerName>CITIZENS BANKING CORP</issuerName>
<issuerTradingSymbol>CBCF</issuerTradingSymbol>
</issuer>
<reportingOwner>
")
(defvar issuer-leaf
  (match-string xml-tree
                (match "<issuer>.*?</issuer>" xml-tree :extended t)))
Sets 'issuer-leaf' to just the content between and inclusive of the
<issuer/> tags, and illustrates my last remaining question about this.
If I read the entire original file into xml-tree (about 13100 bytes),
and use "<issuer>(.*)?</issuer>", the match fails. Remove the
grouping parentheses, and it works as above. *However*, if we now use
the pattern *with* the parentheses on 'issuer-leaf' as in:
(match-string issuer-leaf
              (cadr (multiple-value-list
                     (match "<issuer>(.*)?</issuer>" issuer-leaf :extended t))))
Returns:
"
<issuerCik>0000351077</issuerCik>
<issuerName>CITIZENS BANKING CORP</issuerName>
<issuerTradingSymbol>CBCF</issuerTradingSymbol>
"
The ':extended t' argument removes the need for most, if not all, of
the painful backslash escaping as well, as per the referenced man
page - _that_ makes things a bit easier, IMO.
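A quick way to see the difference is to count the backslashes that actually end up in the pattern string. The sketch below is plain Common Lisp and needs no regexp module. The variable and function names are made up for illustration, and the unescaped pattern is only my assumption of how the basic one would be written for ':extended t'.

```lisp
;; The reader turns each "\\" inside a string literal into ONE
;; backslash, so the regex engine sees half as many as the source shows.
(defparameter *basic-pattern* "<[/]\\?issuer\\(.*\\)\\?>")

;; Assumed equivalent for use with :extended t (no escapes at all).
(defparameter *extended-pattern* "<[/]?issuer(.*)?>")

(defun backslash-count (string)
  "Number of backslash characters actually stored in STRING."
  (count #\\ string))

;; PRINC prints each string the way the regex engine receives it.
(princ *basic-pattern*) (terpri)
(princ *extended-pattern*) (terpri)
(format t "basic: ~D backslashes, extended: ~D~%"
        (backslash-count *basic-pattern*)
        (backslash-count *extended-pattern*))
```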
The 'match' form in the example returns two match structures: first,
the 'overall match', then next, any groups specified in the pattern -
in this case, the stuff *between* the <issuer/> tags. Woot!
*Now* we are getting somewhere! Using this strategy it is very
straightforward to get to the juicy bits.
But why, oh why, doesn't grouping with parentheses work against the whole
xml tree, I wonder? Hmmm. I'm not going to spend much more time on
it, since I've now got a working strategy, but I *am* curious about
it.
I'd thought at first that it was due to '.* greediness', so I put in
the '?' to make them lazy instead - nope, no difference. As a nice
benefit though, the '?' seems to provide a consistent 0.01 second
speed up on my ooold, slooow box against this very tiny sample.
(BTW, I only used 'cadr' in the example because I know from
experimentation that there are two matches, and the second is the
group that I am interested in - I doubt I would use it in a real
program.)
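For what it's worth, one alternative to 'cadr' on a value list is NTH-VALUE, which picks a single return value by position. The sketch below is self-contained, so it uses a made-up stand-in, FAKE-MATCH, that returns two (start . end) pairs where the real 'match' returns match structures. GROUP-TEXT is likewise just an illustrative name.

```lisp
;; Stand-in for REGEXP:MATCH (purely hypothetical): returns two values,
;; a (start . end) pair for the whole <issuer>...</issuer> element and
;; one for the text between the tags, or NIL if there is no match.
(defun fake-match (string)
  (let* ((open "<issuer>")
         (close "</issuer>")
         (start (search open string))
         (end (and start (search close string :start2 start))))
    (when end
      (values (cons start (+ end (length close)))
              (cons (+ start (length open)) end)))))

;; NTH-VALUE 1 selects the second return value directly, without
;; consing up a list of all values first.
(defun group-text (string)
  (let ((group (nth-value 1 (fake-match string))))
    (when group
      (subseq string (car group) (cdr group)))))

(print (group-text "<issuer>CITIZENS BANKING CORP</issuer>"))
```

With the real module the same shape would presumably be (match-string str (nth-value 1 (match pattern str :extended t))), though I have not run that form.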
Aloha,
+Chris
__
No single drop of water thinks it is responsible for the flood.
-- Old adage

On Thu, May 27, 2004 at 11:15:50AM -0400, Sam Steingold wrote:
> > * Chris Hall <unyy.pw@...> [2004-05-27 00:18:24 -1000]:
> >
> > My question - how 'safe' would it be to use one of these builds?
>
> either is perfectly fine.
Yesssss! I am *so* glad to hear this!
>
> > $ less /home/rocktiger/lisp/build/clisp-2.33/deb-test/suite/*.erg
> >
> > CORRECT: (100 200)
> > CLISP : ERROR
> > LOAD: A file with name
> > /home/lisp/clocc/src/tools/ansi-test/make-load-form-demo.fas does not
> > exist
>
> does this directory exist?
> why is it used? this is very strange.
It does indeed, and there is also a make-load-form-demo.lisp file in
it. I was wondering about that as well. I've looked in my bash
environment, and the only lisp-related item I can find was CMUCLLIB.
I do (now) have a .clisprc.lisp, but that is just since I installed
(finally!) clisp 2.33, and all it does so far is load asdf.
As an aside, CLOCC is indirectly kind of responsible for my decision
to switch to clisp (long story).
One thing I have become aware of during the whole 'familiarization
process' of learning how clisp is built is the sheer quantity of
high-quality work you folks have put into this and best of all have
contributed to the world at large.
I find it to be a truly impressive achievement and an equally
impressive gesture, and I've been developing software systems of
various sizes full-time for 25+ years. You guys rock!
Thank you all, and thank you again, Sam, in particular for the time
you've taken to nurse me along.
Aloha,
+Chris
__
No single drop of water thinks it is responsible for the flood.
-- Old adage

Edi Weitz wrote:
> Last time I checked it was comparable in speed with CLISP's regex
> engine.
Do you mean the speed when run in a Lisp implementation that compiles to
native code, or when run in clisp? Doing character-at-a-time processing
in byte-compiled clisp is quite slow; that's why we are looking for an
implementation written in C/C++.
Bruno

Sam wrote:
> I am not sure what you mean here.
> At any rate, the regexp module comes with a 10 y.o. implementation
> whose origin is unclear.
The origin is simple: The code is the GNU regex 0.12 that was the common
regular expression implementation in GNU programs for years. It's only 8-bit,
though. The documentation has a part copied from GNU ed, a part from GNU
emacs with modifications, and the Lisp interface description that I wrote.
> (One big problem is unicode).
Yes, this is the big problem.
> I would prefer either finding a unicode regexp and ignoring the OS
Yes. Did you try http://sourceforge.net/projects/ustring/ ?
> or dumping the bundled regexp and requiring OS to offer one.
This doesn't help, because the glibc regex is Unicode capable only within
an UTF-8 locale.
Bruno

> I really must get into the habit of poking around in CLHS a bit more -
> esp. since I have a local copy - I might have found this for myself.
And don't forget about the comp.lang.lisp newsgroup!
--
Best regards,
Arseny mailto:ampy@...

On Wed, May 26, 2004 at 10:51:26PM -0400, Sam Steingold wrote:
> > (BTW, how do I access the second match? Lisp newbie here, I'm afraid.
> > I've tried all sort of things, but can't seem to figure it out. Use
> > 'multiple-value-something' perhaps?)
>
> yes, MULTIPLE-VALUE-BIND or MULTIPLE-VALUE-LIST
>
Woot! MULTIPLE-VALUE-LIST works for me. I had looked at
MULTIPLE-VALUE-BIND, but according to CLHS, if one doesn't know the
proper number of values ahead of time, extra ones get discarded. I
couldn't figure out how to get the count, so I stopped looking there.
I really must get into the habit of poking around in CLHS a bit more -
esp. since I have a local copy - I might have found this for myself.
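For the record, a small sketch of how the two forms differ. It uses the standard function FLOOR, which returns two values, in place of the regexp 'match' so that it runs in any Common Lisp. The names *ALL*, FIRST-ONLY and WITH-EXTRA are made up for the example.

```lisp
;; MULTIPLE-VALUE-LIST collects every returned value into a list, so
;; LENGTH gives the count when it is not known in advance.
(defparameter *all* (multiple-value-list (floor 7 2)))
(format t "~D values: ~S~%" (length *all*) *all*)

;; MULTIPLE-VALUE-BIND binds a fixed number of variables.
;; Surplus values are silently dropped ...
(defun first-only ()
  (multiple-value-bind (quotient) (floor 7 2)
    quotient))

;; ... and variables without a corresponding value are bound to NIL.
(defun with-extra ()
  (multiple-value-bind (quotient remainder extra) (floor 7 2)
    (list quotient remainder extra)))

(print (first-only))
(print (with-extra))
```

So MULTIPLE-VALUE-BIND is the better fit when the number of groups in the pattern is known, and MULTIPLE-VALUE-LIST when it is not.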
> 1. you want to use EQ to check for EOF (this is actually a bug in your code!)
>
> 2. also, you want to pass the stream itself as the 3rd argument to
> READ-LINE (this is a standard idiom).
>
> 3. if you want to read the whole file into one string, you should do it
> like this:
>
> (defun read-whole-file (name)
>   (with-open-file (s name)
>     (let ((ret (make-string (file-length s))))
>       (read-sequence ret s)
>       ret)))
>
> your method is extremely inefficient.
>
With the emphasis on *extremely* - yikes!
(Experienced lispers might want to avoid looking at this - too
painful! I sheepishly wonder if I could possibly have made it even
slower. :-})
My naive newbie version:
[133]> (time (setf xml (get-file "srchsec-xmple.xml")))
Real time: 0.644871 sec.
Run time: 0.42 sec.
Space: 4916144 Bytes
GC: 10, GC time: 0.32 sec.
Sam's *much, much, much* better version.
[134]> (time (setf xml (read-whole-file "srchsec-xmple.xml")))
Real time: 0.015718 sec.
Run time: 0.02 sec.
Space: 48392 Bytes
A million thanks, Sam - very gracious and kind of you to share your
experience and to set me straight.
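As I understand points 1 and 2, the idea can be sketched as follows. The stream object itself serves as the EOF marker, so no line of text can be mistaken for it (a file that happened to contain the line "<<<EOF" would have cut my original loop short), and EQ is the right test because the marker is the very same object. Writing into a string stream also avoids copying the whole accumulated string again on every CONCATENATE. The name SLURP-LINES is made up, and this is my sketch, not Sam's code.

```lisp
(defun slurp-lines (stream)
  "Read STREAM line by line; return the lines, each followed by a newline."
  (with-output-to-string (out)
    ;; Third argument to READ-LINE: the value returned at end of file.
    ;; Using STREAM itself guarantees it cannot collide with real data.
    (loop for line = (read-line stream nil stream)
          until (eq line stream)
          do (write-line line out))))

;; Demo on an in-memory stream; note that a line reading "<<<EOF"
;; is now treated as ordinary data.
(with-input-from-string (in (format nil "one~%<<<EOF~%three"))
  (princ (slurp-lines in)))
```

Note that WRITE-LINE adds a newline after every line, including the last, so the result always ends in a newline even if the input did not. For simply reading a whole file, READ-WHOLE-FILE above remains the more direct approach.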
I'm sure that I have a *lot* of lisp idioms such as this to learn - I
guess the best place to get started is looking at other people's code
and experimenting with my own. And CLHS, I suspect.
> -- 
> Sam Steingold (http://www.podval.org/~sds) running w2k
> <http://www.camera.org> <http://www.iris.org.il> <http://www.memri.org/>
> <http://www.mideasttruth.com/> <http://www.honestreporting.com>
> Do not tell me what to do and I will not tell you where to go.
(Another great sig, Sam)
Aloha,
+Chris
__
No single drop of water thinks it is responsible for the flood.
-- Old adage

> * Bruno Haible <oehab@...> [2004-05-26 20:11:36 +0200]:
>
> Sam wrote:
>> I am inclined to _remove_ this page at all because we already link to
>> the canonical regexp reference
>> <http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html>
>> and because lack of DocBook/XML source makes it a distribution nightmare.
>
> Please don't remove these two syntax descriptions. A manual page is
> something that a user can understand; the POSIX regexp reference is
> not readable like this.
I am not sure what you mean here.
At any rate, the regexp module comes with a 10 y.o. implementation
whose origin is unclear. (One big problem is unicode).
The man page has, apparently, the same origin.
Note that we use the OS regexp when it is available, so this page is
irrelevant most of the time.
I would prefer either finding a unicode regexp and ignoring the OS, or
dumping the bundled regexp and requiring OS to offer one.
OTOH, win32 loses then.
forget it, it's a can of worms.
--
Sam Steingold (http://www.podval.org/~sds) running w2k
<http://www.camera.org> <http://www.iris.org.il> <http://www.memri.org/>
<http://www.mideasttruth.com/> <http://www.honestreporting.com>
main(a){a="main(a){a=%c%s%c;printf(a,34,a,34);}";printf(a,34,a,34);}

On Wed, May 26, 2004 at 12:02:15PM -0400, John K. Hinsdale wrote:
>
> I think this will do what you want:
>
> (setf patt "<[/]\\?issuer\\(.*\\)\\?>")
> (regexp:match patt "<issuerName>CITIZENS BANKING CORP</issuerName>")
>
> Note that the backslashes are doubled so that the literal string
> assigned to "patt" is:
>
> <[/]\?issuer\(.*\)\?>
>
> and the ? quantifier and grouping parens are preceded by backslash
>
> > I already have some useful regexes to do what I want on this data that
> > run in production under awk and/or guile - I can pretty much just cut
> > and paste expressions from one to the other.
>
> There is a module called "pcre" (Perl-Compatible Regular Expressions)
> that will use regex syntax ala perl. I prefer this actually, mainly
> because perl's syntax avoids what Larry Wall called "backslashitis",
Like in Emacs regexes? ;-)
Module "pcre" doesn't seem to be available for 2.29, I'm still working
on getting current clisp to build on my box. At any rate, I'd prefer
to not have to add yet another piece to the list of required
supporting software for my app, and my needs, I am fairly certain, can
be met with the basic package, if I can just get my mind around it.
>
> I agree all these different syntaxes are a real pain
>
Naaaw, they just make life that much more 'interesting'. ;-D
I *am* trying to upgrade to the current clisp - I noticed that I have
gcc 3.0 installed already, so I think I will try that for building
clisp.
FWIW, I keep the http://clisp.cons.org/impnotes/modules.html#regexp
page open in my browser when I am working with regexp in clisp.
According to the man _and_ info pages for grep/egrep/fgrep, if the env
variable POSIXLY_CORRECT is set, they will behave appropriately. More
specifically on the quoting, they inform us that 'basic' expressions
require quoting by backslash, 'grep -E'/egrep do not, but from my
testing will accept them.
And thanks for the proper quoted version - as it turns out I *had*
tried it but not on a free-standing string, as in your example.
Please see the following as to what I mean by this, and why I am still
a bit confused.
I find the following to be some 'interesting' ;-) behavior.
[125]> test-rx
"<[/]\\?issuer\\(.*\\)\\?>"
[126]> test-str
" <notSubjectToSection16>0</notSubjectToSection16>
<issuer>
<issuerCik>0000351077</issuerCik>
<issuerName>CITIZENS BANKING CORP</issuerName>
<issuerTradingSymbol>CBCF</issuerTradingSymbol>
</issuer>
<reportingOwner>
"
[127]> (match test-rx test-str)
#S(REGEXP::REGMATCH_T :RM_SO 58 :RM_EO 255) ;
#S(REGEXP::REGMATCH_T :RM_SO 65 :RM_EO 254)
[128]>
So far so good. 'K. (And thanks again.)
(BTW, how do I access the second match? Lisp newbie here, I'm afraid.
I've tried all sort of things, but can't seem to figure it out. Use
'multiple-value-something' perhaps?)
Now, I'm using this (probably very naive and newbie-like) function to
read the file containing the data to be matched against.
(defun get-file (fname)
  (let ((in-data "")
        (in-name fname)
        (new-ln (coerce (list #\Newline) 'string)))
    (with-open-file (in-file in-name :direction :input)
      (do ((cur-line
            (read-line in-file nil "<<<EOF")
            (read-line in-file nil "<<<EOF")))
          ((equal "<<<EOF" cur-line))
        ; (setf in-data (concatenate 'string in-data cur-line new-ln))))
        (setf in-data (concatenate 'string in-data cur-line))))
    in-data))
(defvar xml (get-file "srchsec-xmple.xml"))
So then I try:
[128]> (match test-rx xml)
NIL
[129]>
As shown, 'get-file' *doesn't* include newlines, since read-line
doesn't pass them on, but I get the same result if I *do* include the
newlines. It is my understanding that the interaction of regexes and
newlines can sometimes be a bit subtle, so I am unsure of the
consequences.
I've also tried:
[129]> (setf xml (coerce (get-file "srchsec-xmple.xml") 'string))
[130]> (match test-rx xml)
NIL
[131]>
I noticed the following at the very start of the contents of var
'xml':
"<?xml version=\\\"1.0\\\"?>"
so I also tried (setf xml (subseq xml 24 (length xml))) to get rid of
the offending data, thinking that the quoted/escaped characters might
confuse the regexp engine somehow, but still get NIL on the match.
Any idea if those characters would adversely affect a match in some
way?
I really appreciate your patience and time in helping a clueless
newbie luser like me!
Aloha,
+Chris
-- 
Good judgment comes from experience. Experience comes from bad
judgment.
- Jim Horning