may apply the regex /ab/ or /ef/ (or some previous regex, if that
match didn't succeed) to $str2. (Note that "previous" here means
execution order, not linear order in the source.) If no previous
regex succeeded, an empty regex is actually used.

This is a fairly volatile feature, since any intervening code that
uses a regex will change the results (e.g. a regex in a tie method
implicitly invoked, or in the main code of a require'd file). It's
also a barrier to integrating the defined-or patch into 5.8.x (see "defined or: // and //=" and "Re: Perl 5.8.3"), since
a // where perl may be expecting either an operator or a term could
mean defined-or or could mean ($_ =~ //). Without the feature, the
latter would be overwhelmingly less likely to occur in real code.

People more often use this "feature" by accident than on purpose, with
code like m/$myregex/ where $myregex is empty (since the
"is it an empty regex" test occurs after interpolation). One
solution is to use m/(?#)$myregex/ if you anticipate that
$myregex may be empty.
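
A minimal sketch of the accidental case and the (?#) guard described above. The sample strings and the names $accidental and $guarded are made up for illustration; only $myregex comes from the text.

```perl
use strict;
use warnings;

my $myregex = "";    # hypothetical: a pattern that turned out to be empty

# Make /b/ the last successfully matched regex.
"abc" =~ /b/ or die "setup match failed";

# After interpolation the pattern is empty, so perl reuses /b/ here,
# and "xyz" contains no "b".
my $accidental = ("xyz" =~ m/$myregex/) ? "matched" : "no match";

# The (?#) comment keeps the pattern from being empty, so it is applied
# as written and matches at the start of the string.
my $guarded = ("xyz" =~ m/(?#)$myregex/) ? "matched" : "no match";

print "accidental: $accidental\n";    # no match
print "guarded:    $guarded\n";       # matched
```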

But all that is beside the point, because special treatment of //
(documented with respect to s/// and m// in perlop) is not a feature
of perl's regexes but a feature of the match and substitution
operators, and doesn't apply to split at all.

So what does happen when you say split //, $str?

Well, in general terms, split returns X pieces of a string that result
from applying a regex X-1 times and removing the parts that matched,
so split /b/, "abc" produces the list
("a","c"). (Throughout, I will ignore the effects of
placing capturing parentheses in the regex.)

Now apply the same reasoning to split //, "ac". The analytically minded
will note that there are also empty strings before and after the "ac".
Spreading it out, the regex will match at each //: "// a // c //",
making 3 divisions in the string, so you might expect split to return a
list of the four pieces produced ("","a","c",""), but instead a little
dwimmery comes into play here.
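
To make the "three divisions" concrete, here is a small sketch that records where a zero-width pattern matches in "ac" and compares that with what split actually returns. It uses /(?:)/ rather than a literal // in the match so that the last-successful-regex behaviour of m// stays out of the picture; the variable names are only for illustration.

```perl
use strict;
use warnings;

my $str = "ac";

# Collect the position of every zero-width match in the string.
my @positions;
push @positions, pos($str) while $str =~ /(?:)/g;

# What split hands back for the same string.
my @pieces = split //, $str;

print "match positions: @positions\n";            # 0 1 2
print "number of pieces: ", scalar(@pieces), "\n"; # 2
```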

Dealing first with the empty string at the end, split has a third
parameter for limiting the number of substrings to produce (which
normally defaults to 0, and where <= 0 means unlimited),
so split /b/, "abcba", 2 returns
("a","cba"). As a special case, if the limit is 0,
trailing empty fields are not returned. However, if the limit is less
than zero or large enough to include empty trailing fields, they will
be returned:
split /b/, "ab", 2 for example does return "a" and an
empty trailing field, while split /b/, "ab" returns only
an "a".
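
The limit cases above can be checked with a few lines. This is only a sketch; show() is a throwaway helper of my own (not part of split) that makes empty fields visible.

```perl
use strict;
use warnings;

# Hypothetical helper: render a list with each field quoted.
sub show { return "(" . join(",", map { qq("$_") } @_) . ")" }

my $limited  = show(split /b/, "abcba", 2);   # stop after two fields
my $kept     = show(split /b/, "ab", 2);      # limit large enough to keep the trailing ""
my $stripped = show(split /b/, "ab");         # limit defaults to 0: trailing "" dropped
my $negative = show(split /b/, "ab", -1);     # negative limit: trailing "" kept

print "$limited\n";    # ("a","cba")
print "$kept\n";       # ("a","")
print "$stripped\n";   # ("a")
print "$negative\n";   # ("a","")
```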

The same provision applies to the empty string following the
zero-width match at the end of the string. split //, "a"
returns only the "a", while split //, "a", 2 returns
("a","").

(I said "normally defaults to 0" because in one case, this
doesn't apply: if the split is the only thing on the right of an
assignment to a list of scalars, the limit will default to one more
than the number of scalars. This is intended as an optimization, but
can have odd consequences. For instance,
my ($a,$b,$c) = split //, "a" will result in the split
having a default limit of 4, overriding the usual suppression of the
empty trailing field: split will return ("a",""), leaving
$b blank and $c undefined.)
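
A sketch of the same behaviour for split //, including the list-assignment case. I use $x, $y and $z in place of $a, $b and $c from the text (to stay clear of sort's special variables), and show() is again a made-up helper.

```perl
use strict;
use warnings;

# Hypothetical helper: render a list with each field quoted.
sub show { return "(" . join(",", map { qq("$_") } @_) . ")" }

my $default  = show(split //, "a");      # limit 0: trailing "" suppressed
my $explicit = show(split //, "a", 2);   # room for the trailing ""

# Three scalars on the left, so the implicit limit becomes 4.
my ($x, $y, $z) = split //, "a";

print "$default\n";     # ("a")
print "$explicit\n";    # ("a","")
print "y is ", (defined $y ? "an empty string" : "undef"), "\n";   # an empty string
print "z is ", (defined $z ? "defined" : "undef"), "\n";           # undef
```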

But there is also an empty string before the zero-width match at the
beginning of the string. The above methodology doesn't apply to that.
If you say split /a/, "ab" it will break "ab" into two
strings: ("","b"), whether or not limit is specified (unless you limit
it to one return, which basically will always ignore the pattern and
return the whole original string).
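
A quick sketch of the leading-field behaviour for a match that has width, with and without a limit. The show() helper is my own and just makes empty fields visible.

```perl
use strict;
use warnings;

# Hypothetical helper: render a list with each field quoted.
sub show { return "(" . join(",", map { qq("$_") } @_) . ")" }

my $no_limit = show(split /a/, "ab");      # leading "" is kept
my $limit_2  = show(split /a/, "ab", 2);   # still kept with a limit
my $limit_1  = show(split /a/, "ab", 1);   # one field: the whole string

print "$no_limit\n";   # ("","b")
print "$limit_2\n";    # ("","b")
print "$limit_1\n";    # ("ab")
```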

Similarly, split //, "b" doesn't decide whether to return the
leading "" based on limit. Instead, a different rule
applies. That rule is that zero-width matches at the beginning of the
string won't pare off the preceding empty string; instead, it is
discarded. So while split /a/, "ab" does produce
("","b"), split //, "b" only produces
("b").

This rule applies not only to the empty regex //, but to any regex
that produces a zero-width match, e.g. /^/m. (While on
the topic of /^/, that is special-cased for split to mean
the equivalent of /^/m, as it would otherwise be pretty
useless.) So split /(?=b)/, "b" returns
("b"), not ("","b").
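
The zero-width rule can be seen side by side with the positive-width case in a short sketch. The input strings for the /^/ example are my own, and show() is a hypothetical helper for display only.

```perl
use strict;
use warnings;

# Hypothetical helper: render a list with each field quoted.
sub show { return "(" . join(",", map { qq("$_") } @_) . ")" }

my $empty_re  = show(split //, "b");        # zero-width match at the start: no leading ""
my $lookahead = show(split /(?=b)/, "b");   # same rule for any zero-width match
my $positive  = show(split /a/, "ab");      # a match with width keeps the leading ""

# /^/ is treated as /^/m by split, so this breaks the string into lines.
my @lines = split /^/, "x\ny\n";

print "$empty_re\n";    # ("b")
print "$lookahead\n";   # ("b")
print "$positive\n";    # ("","b")
print "lines: ", scalar(@lines), "\n";   # 2
```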

One last consideration, that also plays a part with s/// and m//: if
you match a zero-width string, why doesn't the next attempt at a match
also do so in the same place? For instance,
$_ = "a"; print "at pos:",pos," matched <$&>" while /(?=a)/g
looks as if it should loop forever, since after the first match, the position is
still at 0 and there is an "a" following. Applying this logic to
split //, you might expect the // to match over and
over without advancing in the string. To prevent this, any match that
advances through the string is only allowed to zero-width match once
at any given position. If a subsequent match would have come up with
a zero width at the same position, the match is not allowed. This
rule applies whether perl is in a match loop within a single operation
(s///, split, or list-context m//g) or in a loop in perl code
(e.g. the while /(?=a)/g loop above), or even in two independent m//g matches.

For example: $_ = "3"; /(?=\w)/g && /\d??/g && print $&;
does print "3", even though the ?? requests a 0 digit match be
preferred over 1 digit, because a 0-length match isn't allowed at that
position.
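
Both examples above can be run as a sketch with lexical variables instead of $_. The safety valve in the loop is my own addition in case the rule did not hold; it is not expected to trigger.

```perl
use strict;
use warnings;

# A zero-width match under /g: the loop body runs once, not forever.
my $str = "a";
my @hits;
while ($str =~ /(?=a)/g) {
    push @hits, pos($str);
    last if @hits >= 5;    # safety valve, should not be reached
}

# Two independent //g matches on the same string: the second one may not
# match zero-width at position 0 again, so \d?? has to take the digit.
my $digit = "3";
my $matched;
if ($digit =~ /(?=\w)/g && $digit =~ /\d??/g) {
    $matched = $&;
}

print "loop iterations: ", scalar(@hits), "\n";   # 1
print "second match: <$matched>\n";               # <3>
```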

Update: this isn't really a tutorial, or at least it's an inside out one. (That is, it's taking a single line of code and explaining how lots of different things affect (or don't affect) it, rather than setting out to explain those different things generically). If time allows, I may rewrite it as one. There's lots of good stuff to talk about with split.

It's also a barrier to integration of the defined-or patch to 5.8.x, since a // where perl may be expecting either an operator or a term could mean defined-or or could mean ($_ =~ //). Without the feature, the latter would be overwhelmingly less likely to occur in real code.

Whoa... I tried really hard, but I really didn't get this at all. It's obvious you understand what you're talking about, but I think a large number of readers here (certainly most of those who would go to the Tutorials wing where this is likely to end up) won't have a clue what "the defined-or patch to 5.8" refers to, let alone what sort of distinction you're trying to make here. If this is really an important point, provide some more detail, and perhaps some code snippet(s) with comments or contrasting outputs to clarify the point. If it's not that important, then take it out, because it isn't helping.

The rest provides some useful detail (i.e. things that folks would want to know when using split // to best effect), but there is also a bit of useless detail (i.e. pedantry), which I would not commend in a "tutorial" piece.

I'd suggest you give it a day or two, then re-read it and consider how you would write it differently...

Yes, I know what the term "defined-or" refers to, but an average perl user knowing the arcanery involved in "the defined-or patch to 5.8" is sort of like a plumber knowing the particular alloy properties that distinguish the steel in his old hammer from that of his new one. Sure, a few plumbers may know something about this...

It's also a barrier to integration of the defined-or patch to 5.8.x, since a // where perl may be expecting either an operator or a term could mean defined-or or could mean ($_ =~ //). Without the feature, the latter would be overwhelmingly less likely to occur in real code.

I'm not the world's best lexer, but I cannot imagine a situation where // could be misinterpreted. Your example, if I remember right, implies that I could write $_ =~ +;. =~ is the operator that requires a term on the RHS.

Unless, as is often the case, I'm missing something ...

------
We are the carpenters and bricklayers of the Information Age.

Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

I always used ()=$string=~/(.)/gs; to split a string into characters.
It's also sometimes useful to iterate through the characters of a
string, for which you can use the scalar-context variation:
while($string=~/(.)/gs){DO SOMETHING WITH ($1)};
-- you can't do this with split, can you?
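
For comparison, a small sketch (my own, not the poster's code) that puts the list-context match next to split //, and shows one way to visit each character with split by looping over its result. The sample string is arbitrary and includes a newline to exercise the /s flag.

```perl
use strict;
use warnings;

my $string = "a\nb";

# List-context match, as in the post: one element per character.
my @via_match = $string =~ /(.)/gs;

# The same characters via split.
my @via_split = split //, $string;

# Iterating character by character over split's result.
my @visited;
for my $char (split //, $string) {
    push @visited, $char;
}

print "match gives ", scalar(@via_match), " characters\n";   # 3
print "split gives ", scalar(@via_split), " characters\n";   # 3
```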