Newbie needs help with split() and "<"

I am trying to split the following line into a list of just the
numbers. It is a list of xy coordinates.

<-250,-850> <-250,800> <200,800> <200,-850> <-250,-850>

I can use split() with comma, and ">", but not "<". The following
code works, but I can not add "<" to the regular expression used by
split(). I have tried various combinations of "\<" with and without
quotes without success. Any ideas?
Thanks.

Bill wrote:
> I am trying to split the following line into a list of just the
> numbers. It is a list of xy coordinates.
>
> <-250,-850> <-250,800> <200,800> <200,-850> <-250,-850>
>
> I can use split() with comma, and ">", but not "<". The following
> code works, but I can not add "<" to the regular expression used by
> split(). I have tried various combinations of "\<" with and
> without quotes without success. Any ideas?

Since it's easier to tell what it is you want than what it is you do
not want, you'd better use the m// operator instead of split().
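A minimal sketch of that suggestion, assuming the sample line from the original post and a pattern of an optional minus sign followed by digits (the variable names are the poster's own):

```perl
use strict;
use warnings;

my $tmpline = "<-250,-850> <-250,800> <200,800> <200,-850> <-250,-850>";

# In list context, m//g returns every substring captured by the
# parentheses, so no separators need to be described at all.
my @grphdata = $tmpline =~ /(-?\d+)/g;

print "$_\n" for @grphdata;
```

This describes what to keep (signed integers) rather than what to throw away, so "<", ">", commas and spaces never have to appear in the pattern.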

Bill wrote:
> I am trying to split the following line into a list of just the
> numbers. It is a list of xy coordinates.
>
> <-250,-850> <-250,800> <200,800> <200,-850> <-250,-850>
>
> I can use split() with comma, and ">", but not "<". The following
> code works, but I can not add "<" to the regular expression used by
> split(). I have tried various combinations of "\<" with and without
> quotes without success. Any ideas?
> Thanks.
>
>
> $tmpline = "<-250,-850> <-250,800> <200,800> <200,-850> <-250,-850>";
> (@grphdata) = split(/[\,>]/,$tmpline);
> print $tmpline . "\n";
> $i2 = 0;
> while ($grphdata[$i2]){
> print $i2 . " " . $grphdata[$i2] . "\n";
> $i2++;
> }

split *is* working. It's the test in your while loop that is faulty.
If you use "<" as a separator, the first item will be an empty string,
i.e. the empty string before the initial "<". An empty string is false,
so the while loop stops before it prints anything.
See
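To illustrate the point about the loop test, here is a rough sketch (not the poster's script) that adds "<" to the character class and then walks the array by index instead of testing each element for truth. The @numbers array and the digit check are additions made for illustration only.

```perl
use strict;
use warnings;

my $tmpline = "<-250,-850> <-250,800> <200,800> <200,-850> <-250,-850>";

# With "<" in the class, the first field is the empty string that
# precedes the initial "<". A loop of the form while ($grphdata[$i])
# would stop right there, because an empty string is false.
my @grphdata = split /[,<>]/, $tmpline;

# Walk the indices instead, and skip fields that hold no digits
# (the leading empty string and the single spaces between pairs).
my @numbers;
for my $i (0 .. $#grphdata) {
    next unless $grphdata[$i] =~ /\d/;
    push @numbers, $grphdata[$i];
    print "$i $grphdata[$i]\n";
}
```

A truth test on the elements would also stop early at a coordinate of 0, which is another reason to loop over indices or use foreach.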

On Thu, 15 Jul 2004, Bill wrote:
> I am trying to split the following line into a list of just the
> numbers. It is a list of xy coordinates.
>
> <-250,-850> <-250,800> <200,800> <200,-850> <-250,-850>
>
> I can use split() with comma, and ">", but not "<". The following
> code works, but I can not add "<" to the regular expression used by
> split(). I have tried various combinations of "\<" with and without
> quotes without success. Any ideas?
> Thanks.

Here's some very basic advice. Use split() when you know exactly what you
want to throw away. Use m// when you know exactly what you want to keep.
In this case, it's far easier to define what you want to keep:
> $tmpline = "<-250,-850> <-250,800> <200,800> <200,-850> <-250,-850>";
> (@grphdata) = split(/[\,>]/,$tmpline);

(Bill) wrote in news:7dbe2fe9.0407151125.33e89ac3@posting.google.com:
> I am trying to split the following line into a list of just the
> numbers. It is a list of xy coordinates.
>
> <-250,-850> <-250,800> <200,800> <200,-850> <-250,-850>
>
> I can use split() with comma, and ">", but not "<". The following
> code works, but I can not add "<" to the regular expression used by
> split(). I have tried various combinations of "\<" with and without
> quotes without success. Any ideas?

Gunnar Hjalmarsson wrote:
>
> Bill wrote:
> > I am trying to split the following line into a list of just the
> > numbers. It is a list of xy coordinates.
> >
> > <-250,-850> <-250,800> <200,800> <200,-850> <-250,-850>
> >
> > I can use split() with comma, and ">", but not "<". The following
> > code works, but I can not add "<" to the regular expression used by
> > split(). I have tried various combinations of "\<" with and
> > without quotes without success. Any ideas?
>
> Since it's easier to tell what it is you want than what it is you do
> not want, you'd better use the m// operator instead of split().
>
> push @grphdata, $1 while $tmpline =~ /(-?\d+)/g;

(Bill) writes:
> I am trying to split the following line into a list of just the
> numbers. It is a list of xy coordinates.
>
> <-250,-850> <-250,800> <200,800> <200,-850> <-250,-850>
>
> I can use split() with comma, and ">", but not "<". The following
> code works, but I can not add "<" to the regular expression used by
> split(). I have tried various combinations of "\<" with and without
> quotes without success. Any ideas?
> Thanks.
>
>
> $tmpline = "<-250,-850> <-250,800> <200,800> <200,-850> <-250,-850>";
> (@grphdata) = split(/[\,>]/,$tmpline);

This does what I think you mean:

(@grphdata) = split(/[\,>< ]+/,$tmpline);

Without a space in the character class and without a + afterwards,
you'd be getting an extra element consisting of a single space between
each of the pairs. The character class includes the space character, and the +
gobbles up the whole thing.
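A small sketch of what that split returns, assuming the same sample line. Because the string starts with a separator, the first field still comes back empty; the grep that drops it is an addition here, not part of the reply above.

```perl
use strict;
use warnings;

my $tmpline = "<-250,-850> <-250,800> <200,800> <200,-850> <-250,-850>";

# One or more of comma, ">", "<" or space counts as a single separator,
# so "> <" between two pairs no longer produces a field of its own.
my @fields = split /[,>< ]+/, $tmpline;

# The leading "<" still yields one empty field at index 0; drop it.
my @numbers = grep { length } @fields;

print "$_\n" for @numbers;
```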

(Bill) wrote in message news:<>...
> I am trying to split the following line into a list of just the
> numbers. It is a list of xy coordinates.
>
> <-250,-850> <-250,800> <200,800> <200,-850> <-250,-850>
>
> I can use split() with comma, and ">", but not "<". The following
> code works, but I can not add "<" to the regular expression used by
> split(). I have tried various combinations of "\<" with and without
> quotes without success. Any ideas?
> Thanks.
>
>
> $tmpline = "<-250,-850> <-250,800> <200,800> <200,-850> <-250,-850>";
> (@grphdata) = split(/[\,>]/,$tmpline);
> print $tmpline . "\n";
> $i2 = 0;
> while ($grphdata[$i2]){
> print $i2 . " " . $grphdata[$i2] . "\n";
> $i2++;
> }

wow… great replies everyone. I have learned a lot just now. Thanks.
I thought the problem was with the "<", and trying to search using
"<", was driving me nuts.

Sorry about the messy code, just pieces I cut out of the script to
test the split function, I will clean it up, I promise

Interesting. I was for a while myself trying to get this to work with just

foreach ($line =~ m/<(-?\d+),(-?\d+)>/g) {
print "$1 $2\n";
}

... but for some reason that just printed the first pair of numbers over
and over (the correct total amount, though). Do you have any idea why this
would be so? Still in the above $_ is updated correctly for each match,
but $1 and $2 stay set to the first pair of numbers. I ended up with

Juha Laiho wrote:
> I was for a while myself trying to get this to work with just
>
> foreach ($line =~ m/<(-?\d+),(-?\d+)>/g) {
> print "$1 $2\n";
> }
>
> ... but for some reason that just printed the first pair of numbers
> over and over (the correct total amount, though). Do you have any
> idea why this would be so?

Gunnar Hjalmarsson <> said:
>Juha Laiho wrote:
>> I was for a while myself trying to get this to work with just
>>
>> foreach ($line =~ m/<(-?\d+),(-?\d+)>/g) {
>> print "$1 $2\n";
>> }
>>
>> ... but for some reason that just printed the first pair of numbers
>> over and over (the correct total amount, though). Do you have any
>> idea why this would be so?
>
>Try "while" instead of "foreach".

Ok, that works as I expected... but now I'm even more stumped -- could
I have a language-lawyer explanation for the differences between these
two cases? Hmm.. is it a context issue -- while apparently evaluates
its condition expression in scalar context, whereas foreach uses
list context? But still I seem to have a slight problem in fully
understanding why $1 and $2 are only set once in the foreach case
(esp. that foreach does update $_ for each round through the loop).
--
Wolf a.k.a. Juha Laiho Espoo, Finland
(GC 3.0) GIT d- s+: a C++ ULSH++++$ P++@ L+++ E- W+$@ N++ !K w !O !M V
PS(+) PE Y+ PGP(+) t- 5 !X R !tv b+ !DI D G e+ h---- r+++ y++++
"...cancel my subscription to the resurrection!" (Jim Morrison)

Juha Laiho wrote:
> Gunnar Hjalmarsson <> said:
>> Juha Laiho wrote:
>>> I was for a while myself trying to get this to work with just
>>>
>>> foreach ($line =~ m/<(-?\d+),(-?\d+)>/g) {
>>> print "$1 $2\n";
>>> }
>>>
>>> ... but for some reason that just printed the first pair of
>>> numbers over and over (the correct total amount, though). Do
>>> you have any idea why this would be so?
>>
>> Try "while" instead of "foreach".
>
> Ok, that works as I expected... but now I'm even more stumped --
> could I have a language-lawyer explanation for the differences
> between these two cases?

I think it is because foreach (or for) creates the whole list to loop
over before the loop actually starts.
> But still I seem to have slight problem in fully understanding why
> $1 and $2 are only set once in the foreach case

It's set multiple times - before the loop starts. Consequently, $1 and
$2 will contain the values from the last time the regex matches (i.e.
they contain the last pair of numbers, not the first pair as you said
in another message).
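A quick way to check this is a test line whose first and last pairs differ. The sketch below uses a made-up three-pair line and a hypothetical @seen array to record what the loop body observes on each pass.

```perl
use strict;
use warnings;

# Hypothetical test data: every pair is different.
my $line = "<1,2> <3,4> <5,6>";

# The match runs to completion in list context before the first
# iteration, so $1 and $2 already hold the captures of the last
# successful match by the time the body runs.
my @seen;
foreach my $item ($line =~ m/<(-?\d+),(-?\d+)>/g) {
    push @seen, [$item, $1, $2];
    print "$item\t$1 $2\n";
}
```

The loop variable steps through all six captured values, while $1 and $2 stay at 5 and 6 throughout.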

On Fri, 16 Jul 2004, Juha Laiho wrote:
> Gunnar Hjalmarsson <> said:
> >Juha Laiho wrote:
> >> I was for a while myself trying to get this to work with just
> >>
> >> foreach ($line =~ m/<(-?\d+),(-?\d+)>/g) {
> >> print "$1 $2\n";
> >> }
> >>
> >> ... but for some reason that just printed the first pair of numbers
> >> over and over (the correct total amount, though). Do you have any
> >> idea why this would be so?
> >
> >Try "while" instead of "foreach".
>
> Ok, that works as I expected... but now I'm even more stumped -- could
> I have a language-lawyer explanation for the differences between these
> two cases? Hmm.. is it a context issue -- while apparently evaluates
> its condition expression in scalar context, whereas foreach uses
> list context? But still I seem to have a slight problem in fully
> understanding why $1 and $2 are only set once in the foreach case
> (esp. that foreach does update $_ for each round through the loop).

You are correct. The two syntaxes are:
while (EXPR) { }
and
foreach SCALAR (LIST) { }

Using a while, you are evaluating m//g in a scalar context. Each time
through the loop, $1 and $2 are set to the captured sub patterns in that
pattern match. The /g modifier remembers where the last one left off and
starts the next match at that point.
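A short sketch of the scalar-context form, assuming a shortened version of the sample line; the @pairs array is only there to collect the results for illustration.

```perl
use strict;
use warnings;

my $line = "<-250,-850> <-250,800> <200,800>";

# Each evaluation of the condition performs exactly one match, resuming
# where the previous one stopped, so $1 and $2 are fresh on every pass.
my @pairs;
while ($line =~ m/<(-?\d+),(-?\d+)>/g) {
    push @pairs, [$1, $2];
    print "$1 $2\n";
}
```

The loop ends when the pattern fails to match again, which also resets the search position for the next use of m//g on $line.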

Using a foreach, the m//g is evaluated in list context exactly once. It
is as though you had actually said:
@matches = $line =~ m/<(-?\d+),(-?\d+)>/g;
foreach (@matches){
print "$1 $2\n";
}

As you can see, the pattern match is only executed once. Therefore $1 and
$2 are only set once - they are set to the captured parentheses that
represent the first pattern match. In a list context, however, m//g
returns all the parenthesized matches. So the foreach loop is still
executed the number of times you expect it to be.
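Since m//g in list context returns every captured value as one flat list, the coordinates can also be paired up from that list without touching $1 and $2 at all. A sketch under that assumption, with a hypothetical working copy @work consumed two elements at a time:

```perl
use strict;
use warnings;

my $line = "<-250,-850> <-250,800> <200,800>";

# Flat list of captures: x1, y1, x2, y2, ...
my @matches = $line =~ m/<(-?\d+),(-?\d+)>/g;

# Take two values at a time from a copy; splice returns an empty
# list once the copy is exhausted, which ends the loop.
my @work = @matches;
my @pairs;
while (my ($x, $y) = splice @work, 0, 2) {
    push @pairs, "$x $y";
    print "$x $y\n";
}
```

Note that a foreach over @matches runs once per captured value, so six times for this line rather than three.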

On Fri, 16 Jul 2004, Paul Lalli wrote:
> Using a foreach, the m//g is evaluated in list context exactly once. It
> is as though you had actually said:
> @matches = $line =~ m/<(-?\d+),(-?\d+)>/g;
> foreach (@matches){
> print "$1 $2\n";
> }
>
> As you can see, the pattern match is only executed once. Therefore $1 and
> $2 are only set once - they are set to the captured parentheses that
> represent the first pattern match. In a list context, however, m//g
> returns all the parenthesized matches. So the foreach loop is still
> executed the number of times you expect it to be.

Hrm. Gunnar's explanation (in another post to this thread) is correct.
$1 and $2 get the last value matched, not the first. I misunderstood my
own test case.

[captions re-ordered a bit, to cut the message to a reasonable length]

Gunnar Hjalmarsson <> said:
>Juha Laiho wrote:
>> But still I seem to have slight problem in fully understanding why
>> $1 and $2 are only set once in the foreach case
>I think it is because foreach (or for) creates the whole list to loop
>over before the loop actually starts.
....
>It's set multiple times - before the loop starts. Consequently, $1 and
>$2 will contain the values from the last time the regex matches (i.e.
>they contain the last pair of numbers, not the first pair as you said
>in another message).

Gunnar, thanks -- this does make sense. Rewriting my test case so that
the first and last value pairs were different confirms what you write
above (silly me!) -- $1 and $2 stay set to the values found with the
last match.

Also, this provides a very good insight for the rationale to have
these two loop constructs that seem so similar at first sight.

Even though this isn't a FAQ as such, I think having a FAQ entry
describing the differences in loop constructs in more detail might
make sense (and yes, I know by making the suggestion I'm setting
myself as the volunteer to provide the question and answer -- let's
see whether I can accomplish that or not).

Juha Laiho wrote:
> Even though this isn't a FAQ as such, I think having a FAQ entry
> describing the differences in loop constructs in more detail might
> make sense (and yes, I know by making the suggestion I'm setting
> myself as the volunteer to provide the question and answer -- let's
> see whether I can accomplish that or not).

I'd like to encourage you to write a suggestion. Personally I have
mixed up for and while many times, and the docs seem not to include
clear and concise descriptions of their syntax (or have I missed
something?), merely a few examples in "perldoc perlsyn".
