The first thing this tutorial will do is explain what lexical scoping means, so
as to keep things simple from the start.

Firstly, don't go to a dictionary, as that won't
help you in this particular case. In Perl, when we speak of something
being lexically scoped, we are talking about the area of code where the
given thing is visible e.g
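
A minimal sketch of such a scope, assuming nothing more than a bare block with a single my variable in it. The @seen array is only there for illustration, so that the block has something to record into:

```perl
use strict;
use warnings;

my @seen;    # file-scoped, only here so we can record what happened

{
    my $foo = 'I only exist inside these braces';
    push @seen, $foo;    # fine: $foo is visible here
}

# Mentioning $foo out here would be a compile-time error under strict,
# because the lexical scope that declared it has already ended.
print "$_\n" for @seen;
```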

In the above code $foo can only be seen between the opening and closing
braces. This is because they delimit the length of the lexical scope, and after
the ending brace that particular instance of $foo no longer exists.

So a lexical scope is a section of code where things can live temporarily.
I say they live temporarily because anything created within a lexical scope
will be deleted once the scope has been exited e.g
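
One way to watch this happen is to try using the variable after its block has closed. The sketch below does that inside a string eval (an assumption made purely so the failure can be caught and printed rather than stopping the script):

```perl
use strict;
use warnings;

# A string eval lets us try code that would otherwise stop this
# script from compiling at all.
my $ok = eval q{
    {
        my $foo = 'short lived';
    }
    my $copy = $foo;    # $foo is out of scope by this point
    1;
};
my $error = $@;

print $ok ? "compiled\n" : "failed to compile: $error";
```

Under strict the eval fails to compile, because there is no longer any $foo to refer to once the closing brace has been passed.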

There is an exception to this
rule however - if something is still referring to something created within a
lexical scope upon exit of the scope, that thing will not be deleted since it
is still being referred to by something. This does not mean you can still refer
to it directly, it just means that perl has yet to clean it up.
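
A small sketch of that exception, assuming a reference called $ref that is declared outside the block so that it outlives it:

```perl
use strict;
use warnings;

my $ref;
{
    my $foo = 'still here';
    $ref = \$foo;    # $ref now refers to the scalar that $foo names
}

# The name $foo is gone, but the scalar survives because $ref holds it.
my $value = $$ref;
print "$value\n";
```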

So we can see that $foo is still being referred to by $ref
but the user can't refer directly to it.

my variables

Notice how all the variables are being declared with my()?
This wasn't done to comply with strict (although strict
does encourage the use of lexical variables, and with good reason too),
but because my() creates lexically scoped variables, or
simply, lexical variables.

So every variable created with my() lives within the
current lexical scope. What about other variables you may ask? Well anything
that is not declared with a my() lives in the current
package (for more info on package global variables see Of Symbol Tables and Globs).

Here's a brief example to illustrate the difference between lexical variables
and package global variables
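
A sketch of the difference, assuming package main and using the fully qualified name $main::bar so that it runs under strict:

```perl
use strict;
use warnings;

{
    my $foo = 'lexical';              # lives only in this block
    $main::bar = 'package global';    # lives in main's symbol table
}

my $bar_after = $main::bar;           # still set after the block
print "bar is still: $bar_after\n";

# A package global only goes away when it is removed from the symbol table.
delete $main::{bar};
my $bar_exists = exists $main::{bar} ? 1 : 0;
print "bar in symbol table: $bar_exists\n";
```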

There $foo lives within its lexical scope, $bar lives
within the current package, so doesn't disappear until it is explicitly deleted from the symbol table.

Another thing to be noted about my() is that it is
a compile-time directive (this is because all things lexical are
calculated at compile-time).
This is the phase when the perl interpreter is putting the code together.
So once our scopes and variables
have been set they cannot be changed at runtime, like package globals can.

What this means is that lexical variables are
declared at compile-time, not initialised e.g
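
One way to see the split between declaration and initialisation is a BEGIN block, which runs while the file is still being compiled. This is only a sketch, and $seen_in_begin is a made-up name used to carry the observation out of the BEGIN block:

```perl
use strict;
use warnings;

my $seen_in_begin;
my $foo = 'initialised at runtime';

BEGIN {
    # This runs during compilation. $foo is already declared (strict
    # does not complain), but the assignment above has not run yet.
    $seen_in_begin = defined $foo ? 'defined' : 'undefined';
}

print "in BEGIN \$foo was $seen_in_begin\n";
print "at runtime \$foo is '$foo'\n";
```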

The condition of an if and the loop assignment in a foreach
are lexically scoped to the braces which delimit the respective statements.
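
To make that concrete, here is a minimal sketch in which a variable is declared in an if condition and another as a foreach loop variable. The names $n, $item and @log are illustrative only:

```perl
use strict;
use warnings;

my @log;

if ( ( my $n = 5 ) > 3 ) {
    push @log, "if sees $n";
}
else {
    push @log, "else sees $n";    # $n would be visible here too
}

foreach my $item (qw(a b)) {
    push @log, "loop sees $item";
}

# Neither $n nor $item exists out here; naming them would fail under strict.
print "$_\n" for @log;
```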

Note, however, that statement modifiers do not create a new lexical
scope (this should be obvious through their lack of braces) e.g

A lot of literature when talking about lexical variables refers to them as
private variables. This is because they cannot be seen outside their
given lexical scope. As has already been illustrated, lexical variables are
deleted once the end of their given scope is reached (exceptions notwithstanding),
so they really are private to their respective scope.

A feature which is an essential part of lexical scoping is that scopes
can be nested and inner scopes will not affect outer scopes e.g

There, the inner scope is a new scope (much like the outer scope is a new
sub scope of the file scope), so a new instance of $foo
is created leaving the outer $foo untouched when the inner scope
exits. And because the inner $foo only lives within that scope,
it is private to that scope, and nothing else can see it.
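
A sketch of two nested scopes that each declare their own $foo (with @log as an illustrative place to record what each scope sees):

```perl
use strict;
use warnings;

my @log;

{
    my $foo = 'outer';
    {
        my $foo = 'inner';    # a brand new $foo, hiding the outer one
        push @log, $foo;
    }
    push @log, $foo;          # the outer $foo is untouched
}

print "@log\n";
```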

This is not to say that nested scopes do not affect the rest of the program
(as any new scopes are just sub scopes of the file level lexical scope),
it just means that anything created within them is private to that given scope e.g

So even though we create a new scope with the if/else statement,
we're still changing $w in the scope above (which in turn is modifying
the elements of list since $w is just an alias to each element) as
we haven't created a new $w for that particular scope (and of course,
it wouldn't do us a lot of good as it would've fallen out of scope by the
time we came to print it).
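
A sketch of that situation, assuming a small word list and an arbitrary if/else test. Because $w is an alias, the assignments inside the inner scopes change the elements of @list themselves:

```perl
use strict;
use warnings;

my @list = qw(alpha beta gamma);

foreach my $w (@list) {
    if ( $w eq 'beta' ) {
        $w = uc $w;         # changes the element in @list
    }
    else {
        $w = ucfirst $w;    # so does this
    }
}

print "@list\n";
```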

local debunked

Well, we've been putting it off long enough and now it is time to face that most
confounding of functions - local.

The first thing that we absolutely must declare is that local does not create variables! Not only does it not create variables, it has
nothing to do with lexical variables.

With that said, what local does do is change the
value of an existing package global for the length of a given dynamic scope.
A dynamic scope is just like a lexical scope but is defined by the length of scope,
not the visibility of the scope. So local is
localising a package global's value for the length of a given lexical scope e.g

As we can see the value of $x is still set to 'altered state' in
foo() even though its outside of the initial lexical scope. But
because $x has been dynamically scoped with
local and foo() was called within the
surrounding lexical scope $x will stay set to 'altered state'
until the lexical scope exits.
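
A sketch of that behaviour, assuming a package global $x declared with our so the code runs under strict, and a foo() that simply records what it sees:

```perl
use strict;
use warnings;

our $x = 'original state';
my @log;

sub foo {
    push @log, "foo() sees: $x";
    return;
}

{
    local $x = 'altered state';
    foo();    # called while the local is in effect
}
foo();        # the old value has been restored

print "$_\n" for @log;
```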

You might also see examples of it being used to create private variables - this
is rather misguided, as all local is doing there is auto-vivifying the variable
(creating it upon request of its existence) e.g

In the first case we've set the input separator to undefined, so when
$fh is read, it reads right to the end of the file. And in the second
case we localise the list separator for stringified lists to a comma followed
by a space, and the original list describes its final output.
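
The two cases just described might look something like this. It is a sketch only: an in-memory filehandle stands in for a real file, and the contents of @list are made up:

```perl
use strict;
use warnings;

my $data = "line one\nline two\nline three\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";

my $contents;
{
    local $/;             # input record separator is undef in here
    $contents = <$fh>;    # so a single read returns the whole file
}
close $fh;

my @list = qw(a comma separated list);
my $joined;
{
    local $" = ', ';      # list separator for interpolated arrays
    $joined = "@list";
}

print $contents;
print "$joined\n";
```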

our variables

This is somewhat of an oddball in the world of variables in that it
creates a package level variable which is visible for the remaining lexical scope e.g

So our $x has created the package global $foo::x, but it is also visible in the remaining lexical scope which can still be seen in the package bar.
This illustrates why our is somewhat of a two-faced
function and best left alone unless the behaviour is specifically desired (at
least in this humble tutorial author's opinion).
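
A sketch of that two-faced behaviour, using the packages foo and bar. The three result variables are illustrative names only:

```perl
use strict;
use warnings;

package foo;
our $x = 'created in foo';

package bar;
# Still the same lexical scope (the file), so the short name $x
# continues to mean $foo::x even though the package has changed.
my $via_short_name = $x;
my $via_full_name  = $foo::x;
my $bar_has_x      = exists $bar::{x} ? 'yes' : 'no';

print "short name: $via_short_name\n";
print "full name:  $via_full_name\n";
print "bar has its own x: $bar_has_x\n";
```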

Scoping schmoping

Ok, you say, I can see what lexical scoping is about and have an understanding of
how it works, but what use is it to me?

Firstly, you can neatly encapsulate separate groups of operations into individual lexical
scopes to avoid namespace collision and the like (this is widely demonstrated
through the use of subroutines and modules). This in turn leads to nicely
encapsulated sections of code which can be isolated from the main body of code,
which in turn means that the variables will tie very closely to the
surrounding code.

Secondly, because lexical
scoping is determined at compile-time, if there are any errors they will be
picked up before the program can even run (this is doubly true if you're
running with strictures on, you are use()ing strict
right?).

Thirdly, at the exit of a lexical scope all the variables are destroyed
(except of course, for those that are still in use),
which means your memory won't keep growing and growing as more variables
are created. Also quite handily, any objects will have the DESTROY
method called upon exit, so you can handle how your objects are cleaned up.
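
A sketch of DESTROY firing at the end of a scope, assuming a throwaway class called Noisy (a hypothetical name) whose only job is to record when it is cleaned up:

```perl
use strict;
use warnings;

my @log;

package Noisy;

sub new {
    my ( $class, $name ) = @_;
    return bless { name => $name }, $class;
}

sub DESTROY {
    my $self = shift;
    push @log, "destroying $self->{name}";
    return;
}

package main;

{
    my $obj = Noisy->new('scoped object');
    push @log, 'inside the block';
}
push @log, 'after the block';

print "$_\n" for @log;
```

Nothing else refers to the object, so it is destroyed as the block exits and before the line after the block runs.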

Something useful

Now we're done with our learning, let's have some doing!

The example below will recurse through a given directory and will list
each .pl and .pm file with the number of lines in the file.
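
What follows is a sketch of such a program rather than a definitive version. It assumes File::Find for the recursion, keeps the count_lines() name, and adds a helper called list_perl_files() purely for illustration. Directories to scan are taken from the command line:

```perl
use strict;
use warnings;
use File::Find;

# Count the lines in one file; $fh and $count are private to this sub.
sub count_lines {
    my ($file) = @_;
    open my $fh, '<', $file or die "Can't open $file: $!";
    my $count = 0;
    while ( my $line = <$fh> ) {
        $count++;
    }
    close $fh;
    return $count;
}

# Return a hash ref mapping each .pl/.pm file under $dir to its line count.
sub list_perl_files {
    my ($dir) = @_;
    my %lines;
    find(
        sub {
            # File::Find has chdir'd for us, so $_ is the bare file name
            return unless -f $_ && /\.p[lm]\z/;
            $lines{$File::Find::name} = count_lines($_);
        },
        $dir,
    );
    return \%lines;
}

# A naked block keeps the driver's variables out of the file scope.
{
    my @dirs = @ARGV;
    for my $dir (@dirs) {
        my $found = list_perl_files($dir);
        printf "%-40s %d\n", $_, $found->{$_} for sort keys %$found;
    }
}
```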

Wow, there's quite a lot of lexical scoping going on there, both explicitly
(i.e. the naked block containing the core of the program) and implicitly
(i.e. count_lines()'s lexical scope) and at this point it should all
be pretty straightforward (and I imagine the comments help too :).

In review

A lexical scope defines an area of code in which any variables declared within
that area will live for only the duration of the execution of that area of code,
unless a variable is still referenced after the area of code has been left.
A dynamic scope is orthogonal to a lexical scope and is defined by the
length of the scope (as opposed to the visibility of the scope).

my declares lexically scoped variables at compile time, local changes a package global's value throughout a dynamic scope and our creates a package global which is visible throughout its given lexical scope.

And there we have it! I hope you've enjoyed this tutorial and gotten everything
out of it that you had intended to, and can now go forth and frolic in the
land of lexical scoping with glee and pride!

The big nit here is about destruction. Lexicals are not always destroyed when their declaring scope exits. If they were then closures wouldn't work, and neither would returning references to lexicals. While I know what you mean, it's important to describe what happens correctly. (Cheat and say that when a scope exits perl cleans up any variables that aren't otherwise in use)

Touching on the subject of recursion is important as well. It's easy enough to read this as if there were only one set of variables for a block, and each time you enter the block you get the same set over again, which isn't the case.
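
A small sketch of the recursion point, assuming a made-up countdown() sub. Each call gets its own $n and $label, and the deeper calls do not disturb the ones belonging to the calls above them:

```perl
use strict;
use warnings;

my @log;

sub countdown {
    my ($n) = @_;
    my $label = "call with n=$n";    # a new $label for every call
    countdown( $n - 1 ) if $n > 0;
    push @log, $label;               # unchanged by the deeper calls
    return;
}

countdown(2);
print "$_\n" for @log;
```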

You don't necessarily have to go into detail on either of these, but it's important to make sure you don't give folks a mistaken impression that'll get in their way when they hit the more complex stuff later.

Cheat and say that when a scope exits perl cleans up any variables that aren't otherwise in use

Well I do say something along these lines right at the beginning of the tutorial, but then later on say all variables are destroyed on exit, which of course isn't true. I'll try and clear this area up.

It's easy enough to read this as if there were only one set of variables for a block, and each time you enter the block you get the same set over again, which isn't the case.

Ok, will try and put this in towards the end of the tutorial (will probably go at the end of In private). Should be fun ;)

To follow up on Elian's comment about the misleading description of "destruction". This is an area that has always confused (and contused) me. I think you could try adding this snippet to your nested lexical scope example and then things will become clearer:
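
Something along these lines would do it. This is a sketch, and show_inner_foo() is a hypothetical name for a subroutine defined inside the inner scope:

```perl
use strict;
use warnings;

{
    my $foo = 'inner value';

    # A named sub defined inside the inner scope keeps hold of $foo.
    sub show_inner_foo {
        return $foo;
    }
}

# The block has long since exited, yet the sub can still see its $foo.
my $later = show_inner_foo();
print "$later\n";
```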

What this snippet shows is that the inner $foo remains defined to the subroutine in the inner scope even though the program has passed on through it and down to the other scopes. You could mention that this is a bit like a C static if you want ...
Note: numerous puns about finding one's inner $elf are undoubtedly available if required

The way that you have used the term "lexical scope" when talking about local doesn't match the common definition of the term. For example see lexical scope and dynamic scope in FOLDOC.

Lexical scope is a compile-time construct relating to a region of the source code. With the commonly accepted definitions things like this:

With that said, what local does do is change the value of an existing package global for the length of a given lexical scope. Once that scope exits, the package global returns to its former value e.g

and

If you've managed to successfully understand and take in the tutorial up to this point you can see that local changes the value of $foo for the length of the lexical scope, and that it reverts to its original value once the scope exits.

use "lexical scope" incorrectly. local affects dynamic scoping - and is completely orthogonal to lexical scope. Almost the essence of local is that it can affect a variable outside of its lexical scope :-)

Since I'm waiting for my laptop to finish reformatting so I can re-install Mac OS X 10.2.2 (again) I'm going to get picky - hopefully it will come out as constructive criticism (it's meant that way... honest :-)

In the above code $foo can only be seen between the opening and closing braces. This is because they delimit the length of the lexical scope, and after the ending brace that particular instance of $foo no longer exists.

From this description and the examples it might not be clear that $foo is in scope from its declaration to the end of the enclosing block or file - rather than from the first brace. For example:
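
A sketch of the point, assuming a package global that happens to share the name $foo so that there is something to see before the my is reached:

```perl
use strict;
use warnings;

our $foo = 'package global';
my @log;

{
    push @log, $foo;        # no lexical $foo yet: this is the global
    my $foo = 'lexical';
    push @log, $foo;        # from here to the closing brace: the lexical
}
push @log, $foo;            # the global again

print "$_\n" for @log;
```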

So a lexical scope is a section of code where things can live temporarily. I say they live temporarily because anything created within a lexical scope will be deleted once the scope has been exited ... There is an exception to this rule however ...

I think that the point Elian makes about destruction is still valid. The lifetime of an item is a separate issue to its scope (lexical or otherwise).

A novice could read your description and think garbage collection only applied to lexically scoped variables, when it is equally true of dynamically scoped variables (in the fact that an item will not be destroyed if a dynamically scoped variable refers to it).

The connection between a variable falling out of scope and it being destroyed will become even more tenuous when we get a proper GC in perl6. Once you have, for example, mark and sweep GC the item can be destroyed some time after it falls out of scope.

When talking about this sort of thing to novices I find it helps to keep variables and the things they label as separate concepts. In:

There is an exception to this rule however - if something is still referring to something created within a lexical scope upon exit of the scope, that thing will not be deleted since it is still being referred to by something. This does not mean you can still refer to it directly, it just means that perl has yet to clean it up.

what is "it"? The first "it" would seem to be talking about the variable, the second the item it identifies. Combining the two can be very confusing.

Variable scope (dynamic or lexical) is only tangentially related to whether the item the variable referred to will be destroyed at the end of the block or not.

The analogy I always use when explaining variables and scoping is luggage labels (the old fashioned kind - a bit of cardboard attached to a piece of string) and suitcases.

The label is the variable. It's attached to the luggage (scalar, hash, or whatever) and can be used to identify it.

You can have more than one label attached to each bit of luggage (when multiple variables identify the same scalar, hash, or whatever).

Variable scoping is about adding and removing labels - it doesn't affect the luggage.

The garbage collector will throw away any luggage without any labels (I'm sure I could get some joke in here about airports if I tried hard enough)

(yes, the analogy falls down when you start talking about references and compound objects, but I'm sure you get the idea :-)

So once our scopes and variables have been set they cannot be changed at runtime, like package globals can.

Might be clearer to say that lexical scope is defined by the structure of the code at compile time, while dynamic scope is defined by the runtime environment.

What this means is that lexical variables are declared at compile-time, not initialised

With that said, what local does do is change the value of an existing package global for the length of a given dynamic scope. A dynamic scope is just like a lexical scope but is defined by the length of scope, not the visibility of the scope. So local is localising a package globals value for the length of a given lexical scope

Not entirely sure that this is quite clear enough - especially the phrase "length of a given lexical scope". We need to define what "length" means in this context :-)

Maybe something like:

"
When a package variable is dynamically scoped with local its current value is saved, and then restored once the block containing the local is exited.
"

Hmmm... that's not very clear either... <sigh>... :-)

You might also see examples of it being used to create private variables - this is rather misguided as it is auto-vivifying (creating it upon request of its existence) the variable

Might be worth mentioning the historical context (some of us can remember the perl4 days when we didn't have lexical variables and using local was your only option ;-)

Thirdly, at the exit of a lexical scope all the variables are destroyed (except of course, for those that are still in use), which means your memory won't keep growing and growing as more variables are created.

This is, of course, equally true of dynamic scoped variables... which is why destruction is really a separate issue :-).

Anyway... time to go back to those install CDs... Hope this makes sense. If not, blame my annoyance with hard disk failures.

From this description and the examples it might not be clear that $foo is in scope from its declaration to the end of the enclosing block or file

Good point, but I do mention file-scoped lexicals later on in the tutorial and I don't like forward-referencing in learning material. Will see if I can clear it up somehow though.

I think that the point Elian makes about destruction is still valid. The lifetime of an item is a separate issue to its scope (lexical or otherwise).

This is true, but I didn't think it was necessary to go into the details of reference counting for something as simple as a lexical scoping tutorial. I was trying to keep it as straightforward as possible and adding memory management into the fray would almost certainly confuse the reader. Perhaps I should put a reference to Matts' Proxy Objects article as further reading.

Another way you can demonstrate this nicely is with a BEGIN block.

Marvellous! That illustrates the compile time vs runtime concept beautifully.

Not entirely sure that this is quite clear enough - especially the phrase "length of a given lexical scope".

I do labour the meaning of the 'length' of a lexical scope shortly after, and I can't think of another way of clearly stating how a dynamic scope is defined (perhaps a more judicious use of commenting the code would do the trick) so it'll have to do for now :)

Might be worth mentioning the historical context

Indeed, think I'll stick a line in there to elaborate on why local has such an ambiguous definition.

This is, of course, equally true of dynamic scoped variables

True, but I didn't want to mention the fact that localised dynamic variables are in fact *new* variables because I reckon it would add yet another layer of complexity that the tutorial could do without.

Thanks again for the input, it is most insightful indeed! I think the whole tutorial will need to get another revision and then be posted to Tutorials.
HTH

at beginning of BLOCK value of $r is: set in file scope
in conditional statement value of $r is: set in file scope
at end of BLOCK value of $r is: set in if condition
after BLOCK value of $r is: set in file scope

In this context I would say that a variable is composed of a label and a value. It is neither one nor the other - it is the combination. Having said this, I would then use label where you have used variable, or perhaps symbol (as in symbol table) or identifier. But for consistency here, I will use label.

To continue with your analogy... What is in the luggage is variable and which piece of luggage a label is tied to is also variable. The local function/declaration puts aside the original piece of luggage, with its contents undisturbed, and ties the label to a new, empty piece of luggage. You can put anything you like into this new piece of luggage. When execution of the program leaves the scope in which local was used, the label is removed from the new piece of luggage and tied, once again, to the original piece of luggage. The new piece of luggage may then be destroyed, if it has no other labels attached to it. Note also that the original piece of luggage may have had other labels attached to it before local was used, and these other labels can still be used to access the original luggage, even while the first label is attached to the new luggage.
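
The luggage picture can be sketched in code. This assumes a package global called $label and a reference, $other_label, taken before the local, playing the part of a second label on the original luggage:

```perl
use strict;
use warnings;

our $label = 'original luggage';
my $other_label = \$label;    # a second label tied to the original luggage

my ( $during_by_name, $during_by_ref );
{
    local $label = 'new luggage';      # the name now points at new luggage
    $during_by_name = $label;
    $during_by_ref  = $$other_label;   # the old luggage is still reachable
}

print "by name during local: $during_by_name\n";
print "by ref during local:  $during_by_ref\n";
print "by name afterwards:   $label\n";
```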

Digging a little deeper... There is even more potential for confusion when one realizes that values are stored in a set of nested data structures. Labels (entries in symbol tables and pads) are associated with globs which are associated with scalars, arrays, hashes, etc. and these scalars, arrays, hashes, etc. are associated with values (I say associated as a euphemism for refer to, point at, have or contain, as the case may be). Among all these parts, which is/are the variable? Which is/are the value? Which parts change when local is used? What changes when the variable is assigned to? This is too much detail for this tutorial, yet the terminology used in the tutorial should be consistent with and facilitate an easy transition to a more detailed understanding and discussion, as much as possible (after all, simplification and abstraction aren't if they perpetuate all the detail).

The distinctions become important when one deals with references, and even more so if the variables (or is it the values) have magic. Consider the following

In a simple model, neither the labels nor the values are variable. What is variable is the relationship between them. A label may be associated first with one and then later with another value. A value may have many labels associated with it. Thus, again, I would say the variable (noun) is the combination, not one part or the other. To refer to the parts, it is better to use the terms label (or perhaps symbol or identifier) and value. And, as noted earlier, values may themselves be complex and require further terminology to identify their parts and relationships.

As these terms are used so variably throughout the documentation, it is best to be explicit where a specific meaning is intended.

I think that it would be good if this tutorial not only explained what closures are, but also the common "variable will not stay shared" gotcha. This tends to be hit in mod_perl when people use Apache::Registry and wrap what was a working program inside of a function.

With that said, what local does do is change the value of an existing package global for the length of a given dynamic scope. A dynamic scope is just like a lexical scope but is defined by the length of scope, not the visibility of the scope.

I dropped your italics, and added <strong> to stress the part I can't understand. Could you rephrase it?

Maybe it would be easier to express the distinction as spatial versus temporal scoping, which is an approach that I think makes sense at least until you start throwing threads and processes into the mix.
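
A sketch of the kind of example that makes the distinction visible, assuming package variables $a and $b with starting values and a foo() that reports what it sees:

```perl
use strict;
use warnings;

our ( $a, $b ) = ( 'global a', 'global b' );
my @log;

sub foo {
    push @log, "foo sees a=$a b=$b";
    return;
}

{
    my $a = 1;       # spatial: only code written inside this block sees it
    local $b = 1;    # temporal: in effect until this block finishes running
    foo();
}
foo();

print "$_\n" for @log;
```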

Now, in this example, the line 'my $a = 1;' introduces a lexical variable. Any reference to $a in the statements following that introduction until the end of the scope refer to this newly introduced variable - and only those lines. This is a spatial scope: if the foo subroutine refers to a $a variable, it won't see this $a, because foo is not defined within this scope.

The line 'local $b = 1;' introduces a localisation - it tells the interpreter to save the current copy of the (package) variable $b, to be restored when execution reaches the end of the scope. Now it is a matter of time: after this localisation has been executed, anyone that looks at this package variable $b before the end of this scope is reached will see the localised version. If the foo subroutine refers to the package variable $b, it will see the localised version (for this call, at least).

Couple other questions/comments
1) Using the input record separator as a newline indicator is very interesting. I've never seen it before but it works well. Any reason? I am assuming it's just a style issue?
2) The code below from the tutorial does not work when I insert my next to the variable. Can someone explain this?

3) Can someone explain further on "lexical variables are declared at compile-time, not initialised"?
Is this because BEGIN runs during compile-time? I sneaked in a new my $foo = something inside of the BEGIN
block and execution of the code came out with foo is something during the BEGIN phase.