In the Beginning, some time around 1960, every part of your program had
access to all the variables in every other part of the program. That turned
out to be a problem, so language designers invented local variables, which
were visible in only a small part of the program. That way, programmers who
used a variable x could be sure that nobody was able to tamper with
the contents of x behind their back. They could also be sure that by
using x they weren't tampering with someone else's variable by mistake.

Every programming language has a philosophy, and these days most of these
philosophies have to do with the way the names of variables are managed.
Details of which variables are visible to which parts of the program, and
what names mean what, and when, are of prime importance. The details vary
from somewhat baroque, in languages like Lisp, to extremely baroque, in
languages like C++. Perl, unfortunately, falls somewhere towards the rococo
end of this scale.

The problem with Perl isn't that it has no clearly-defined system of
name management, but rather that it has two systems, both working at
once. Here's the Big Secret about Perl variables that most people
learn too late: Perl has two completely separate, independent
sets of variables. One is left over from Perl 4, and the other is
new. The two sets of variables are called `package variables' and
`lexical variables', and they have nothing to do with each other.

Package variables came first, so we'll talk about them first. Then we'll
see some problems with package variables, and how lexical variables were
introduced in Perl 5 to avoid these problems. Finally, we'll see how to get
Perl to automatically diagnose places where you might not be getting the
variable you meant to get, which can find mistakes before they turn into
bugs.

Consider this statement:

$x = 1;

Here, $x is a package variable. There are two important things
to know about package variables:

Package variables are what you get if you don't say otherwise.

Package variables are always global.

Global means that package variables are always visible
everywhere in every program. After you do $x = 1, any other
part of the program, even some other subroutine defined in some other
file, can inspect and modify the value of $x. There's no
exception to this; package variables are always global.

Package variables are divided into families, called packages. Every
package variable has a name with two parts. The two parts are analogous to
the variable's given name and family name. You can call the Vice-President
of the United States `Al', if you want, but that's really short for his
full name, which is `Al Gore'. Similarly, $x has a full name,
which is something like $main::x. The main part is the
package qualifier, analogous to the `Gore' part of `Al Gore'. Al Gore and Al Capone are
different people even though they're both named `Al'. In the same way,
$Gore::Al and $Capone::Al are different variables, and $main::x and $DBI::x
are different variables.

You're always allowed to include the package part of the variable's name,
and if you do, Perl will know exactly which variable you mean. But for
brevity, you usually like to leave the package qualifier off. What happens
if you do?

If you just say $x, Perl assumes that you mean the variable
$x in the current package. What's the current package? It's
normally main, but you can change the current package by writing

package Mypackage;

in your program; from that point on, the current package is Mypackage.
The only thing the current package does is affect the interpretation of
package variables that you wrote without package names. If the current
package is Mypackage, then $x really means $Mypackage::x. If
the current package is main, then $x really means $main::x.
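Here is a minimal sketch of that rule. Mypackage is the name used above; the assigned strings are invented, and strict is deliberately left off so that unqualified package variables are allowed.

```perl
# The same short name $x resolves to a different package variable
# depending on which package is current.

$x = 'set in main';            # really $main::x

package Mypackage;
$x = 'set in Mypackage';       # really $Mypackage::x

package main;                  # switch back to the default package
print "\$x            = $x\n";
print "\$main::x      = $main::x\n";
print "\$Mypackage::x = $Mypackage::x\n";
```

The first two lines of output show the same value, because in package main the short name and the full name $main::x are the same variable.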

If you were writing a module, let's say the MyModule module, you would
probably put a line like this at the top of the module file:

package MyModule;

From there on, all the package variables you used in the module file would
be in package MyModule, and you could be pretty sure that those variables
wouldn't conflict with the variables in the rest of the program. It
wouldn't matter if both you and the author of DBI were to use a variable
named $x, because one of those $xes would be $MyModule::x and the other
would be $DBI::x.

Remember that package variables are always global. Even if you're
not in package DBI, even if you've never heard of package
DBI, nothing can stop you from reading from or writing to
$DBI::errstr. You don't have to do anything
special. $DBI::errstr, like all package variables, is a
global variable, and it's available globally; all you have to do is
mention its full name to get it. You could even say
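As an illustration (the message string is invented, and this sketch creates a package called DBI on the fly rather than loading the real DBI module), switching the current package and assigning to the unqualified name reaches the same variable:

```perl
# Illustration only: no real DBI module is loaded here.
package DBI;
$errstr = 'set from elsewhere';   # unqualified, so this is $DBI::errstr

package main;
print "$DBI::errstr\n";           # set from elsewhere
```

The assignment lands in $DBI::errstr, because inside package DBI the short name $errstr means exactly that.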

There are only three other things to know about package variables, and you
might want to skip them on the first reading:

The package with the empty name is the same as main. So $::x is the same
as $main::x for any x.

Some variables are always forced to be in package main. For example, if you
mention %ENV, Perl assumes that you mean %main::ENV, even if the current
package isn't main. If you want %Fred::ENV, you have to say so
explicitly, even if the current package is Fred. Other names that are
special this way include INC, all the one-punctuation-character names like
$_ and $$, @ARGV, and STDIN, STDOUT, and STDERR.

Package names, but not variable names, can contain ::. You
can have a variable named $DBD::Oracle::x. This means the
variable x in the package DBD::Oracle; it has
nothing at all to do with the package DBD, which is
unrelated. Isaac Newton is not related to Olivia Newton-John, and
Newton::Isaac is not related to
Newton::John::Olivia. Even though it appears that they both
begin with Newton, the appearance is deceptive.
Newton::John::Olivia is in package Newton::John, not
package Newton.
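The three points above can be checked with a short sketch. The package name Fred comes from the text; the environment key DEMO_SCOPE and the assigned values are assumptions made up for the demonstration.

```perl
# 1. The empty package name is main.
$x = 'main x';
print "\$::x and \$main::x are the same\n" if $::x eq $main::x;

# 2. %ENV is forced into package main, whatever the current package is.
package Fred;
$ENV{DEMO_SCOPE} = 'yes';                  # this is %main::ENV
print "%ENV in Fred is %main::ENV\n" if $main::ENV{DEMO_SCOPE} eq 'yes';
print "%Fred::ENV is a different hash\n" unless exists $Fred::ENV{DEMO_SCOPE};

# 3. $DBD::Oracle::x lives in package DBD::Oracle, not in package DBD.
$DBD::Oracle::x = 1;
print "\$DBD::x is untouched\n" unless defined $DBD::x;
```

All four messages should appear when this is run.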

That's all there is to know about package variables.

Package variables are global, which is dangerous, because you can never be
sure that someone else isn't tampering with them behind your back. Up
through Perl 4, all variables were package variables, which was worrisome.
So Perl 5 added new variables that aren't global.

Perl's other set of variables are called lexical variables
(we'll see why later) or private variables because they're
private. They're also sometimes called my variables because
they're always declared with my. It's tempting to call them
`local variables', because their effect is confined to a small part of
the program, but don't do that, because people might think you're
talking about Perl's local operator, which we'll see later.
When you want a `local variable', think my, not
local.

The declaration

my $x;

creates a new variable, named x, which is totally
inaccessible to most parts of the program---anything outside the block
where the variable was declared. This block is called the
scope of the variable. If the variable wasn't declared in any
block, its scope is from the place it was declared to the end of the
file.

You can also declare and initialize a my variable by writing something like

my $x = 119;

You can declare and initialize several at once:

my ($x, $y, $z, @args) = (5, 23, @_);
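A small sketch of both ideas, with a hypothetical function name (demo) and invented values: the block limits where $x can be seen, and in the list form the array at the end soaks up whatever is left over.

```perl
{
    my $x = 119;
    print "inside:  x=$x\n";              # inside:  x=119
}
# Out here the name $x means the package variable $main::x, which is unset.
print "outside: x is ", (defined $x ? $x : 'undefined'), "\n";

sub demo {
    my ($first, $second, @rest) = @_;     # two scalars, then the remainder
    return "$first $second (" . scalar(@rest) . " more)";
}
print demo(5, 23, 'a', 'b'), "\n";        # 5 23 (2 more)
```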

Let's see an example of where some private variables will be useful.
Consider this subroutine:
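What follows is a reconstruction for illustration, built from the names the discussion uses (print_report, lookup_salary, @employee_list, $employee, $salary). The salary table and the print_partial_report helper are hypothetical, added only so that the sketch runs.

```perl
# Every variable here is a package variable, visible to the whole program.
%salary_of = (alice => 50_000, bob => 40_000);     # hypothetical data

sub lookup_salary        { return $salary_of{ $_[0] } }
sub print_partial_report { print "$_[0]: $_[1]\n" }  # hypothetical helper

sub print_report {
    @employee_list = @_;
    foreach $employee (@employee_list) {
        $salary = lookup_salary($employee);
        print_partial_report($employee, $salary);
    }
}

print_report('alice', 'bob');

# The function's working data is still lying around afterwards:
print "leftover salary: $main::salary\n";          # leftover salary: 40000
```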

If lookup_salary happens to also use a variable named
$employee, that's going to be the same variable as the one
used in print_report, and the works might get gummed up. The
two programmers responsible for print_report and
lookup_salary will have to coordinate to make sure they don't
use the same variables. That's a pain. In fact, in even a medium-sized
project, it's an intolerable pain.
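The way out is to declare the variables with my. Below is a sketch of the same subroutine with private variables; the salary table and the print_partial_report helper are hypothetical stand-ins so that the example runs on its own.

```perl
use strict;

my %salary_of = (alice => 50_000, bob => 40_000);   # hypothetical data

sub lookup_salary {
    my ($employee) = @_;          # private to lookup_salary
    return $salary_of{$employee};
}

sub print_partial_report {        # hypothetical helper
    my ($employee, $salary) = @_;
    print "$employee: $salary\n";
}

sub print_report {
    my @employee_list = @_;
    for my $employee (@employee_list) {
        my $salary = lookup_salary($employee);
        print_partial_report($employee, $salary);
    }
}

print_report('alice', 'bob');
```

Both functions use a variable called $employee, and neither can disturb the other's copy.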

my @employee_list creates a new array variable which is
totally inaccessible outside the print_report
function. for my $employee creates a new scalar variable
which is totally inaccessible outside the foreach loop, as
does my $salary. You don't have to worry that the other
functions in the program are tampering with these variables, because
they can't; they don't know where to find them, because the names have
different meanings outside the scope of the my
declarations. These `my variables' are sometimes called `lexical'
because their scope depends only on the program text itself, and not
on details of execution, such as what gets executed in what order. You
can determine the scope by inspecting the source code without knowing
what it does. Whenever you see a variable, look for a my
declaration higher up in the same block. If you find one, you can be
sure that the variable is inaccessible outside that block. If you
don't find a declaration in the smallest block, look at the next
larger block that contains it, and so on, until you do find one. If
there is no my declaration anywhere, then the variable is a
package variable.

my variables are not package variables. They're not part of a package,
and they don't have package qualifiers. The current package has no effect
on the way they're interpreted. Here's an example:
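This sketch is consistent with the walkthrough below; the values 17, 12, and 20 and the packages A and B come from that walkthrough, while the print statements at the end are additions so the result is visible.

```perl
my $x = 17;

package A;
$x = 12;          # still the lexical $x, not $A::x

package B;
$x = 20;          # still the lexical $x, not $B::x

package main;
print "lexical \$x = $x\n";                                            # lexical $x = 20
print "\$main::x is ", (defined $main::x ? 'defined' : 'undefined'), "\n";
print "\$A::x is ",    (defined $A::x    ? 'defined' : 'undefined'), "\n";
print "\$B::x is ",    (defined $B::x    ? 'defined' : 'undefined'), "\n";
```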

The declaration my $x = 17 at the top creates a new lexical
variable named x whose scope continues to the end of the file. This new
meaning of $x overrides the default meaning, which was that
$x meant the package variable $x in the current
package.

package A changes the current package, but because
$x refers to the lexical variable, not to the package
variable, $x=12 doesn't have any effect on $A::x. Similarly,
after package B, $x=20 modifies the lexical
variable, and not any of the package variables.

At the end of the file, the lexical variable $x holds 20, and
the package variables $main::x, $A::x, and
$B::x are still undefined. If you had wanted them, you could
still have accessed them by using their full names.

The maxim you must remember is:

Package variables are global variables.
For private variables, you must use my.

Almost everyone already knows that there's a local function that has
something to do with local variables. What is it, and how does it relate
to my? The answer is simple, but bizarre:

my creates a local variable.
local doesn't.

First, here's what local $x really does: It saves the current value of
the package variable $x in a safe place, and replaces it with a new value,
or with undef if no new value was specified. It also arranges for the old value to be
restored when control leaves the current block. The variables that it
affects are package variables, which get local values. But package
variables are always global, and a local package variable is no
exception. To see the difference, try this:
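The following sketch is reconstructed from the explanation after it: the functions A and B, the variables $lo and $m, and the values global and AAA are all taken from that explanation. Run without strict, it prints the two lines shown next.

```perl
$lo = 'global';
$m  = 'global';
A();

sub A {
    local $lo = 'AAA';    # temporary value for the package variable $lo
    my    $m  = 'AAA';    # brand-new lexical variable, private to A
    B();
}

sub B {
    print "B ", ($lo eq 'AAA' ? 'can' : 'cannot'),
          " see the value of lo set by A.\n";
    print "B ", ($m  eq 'AAA' ? 'can' : 'cannot'),
          " see the value of m set by A.\n";
}
```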

B can see the value of lo set by A.
B cannot see the value of m set by A.

What happened here? The local declaration in A saved
a new temporary value, AAA, in the package variable
$lo. The old value, global, will be restored when
A returns, but before that happens, A calls
B. B has no problem accessing the contents of
$lo, because $lo is a package variable and package
variables are always available everywhere, and so it sees the value
AAA set by A.

In contrast, the my declaration created a new, lexically
scoped variable named $m, which is only visible inside of
function A. Outside of A, $m retains its
old meaning: It refers to the package variable $m, which is
still set to global. This is the variable that B
sees. It doesn't see the AAA because the variable with that
value is a lexical variable, and only exists inside of A.

Because local does not actually create local variables, it is
not very much use. If, in the example above, B happened to
modify the value of $lo, then the value set by A would be
overwritten. That is exactly what we don't want to happen. We want
each function to have its own variables that are untouchable by the
others. This is what my does.

Why have local at all? The answer is 90% history. Early
versions of Perl only had global variables. local was very
easy to implement, and was added to Perl 4 as a partial solution to
the local variable problem. Later, in Perl 5, more work was done, and
real local variables were put into the language. But the name
local was already taken, so the new feature was invoked with
the word my. my was chosen because it suggests
privacy, and also because it's very short; the shortness is supposed
to encourage you to use it instead of local. my is
also faster than local.

Every time control reaches a my declaration, Perl creates a new, fresh
variable. For example, this code prints x=1 fifty times:

for (1 .. 50) {
  my $x;
  $x++;
  print "x=$x\n";
}

You get a new $x, initialized to undef, every
time through the loop.

If the declaration were outside the loop, control would only pass by it
once, so there would only be one variable:

{ my $x;
  for (1 .. 50) {
    $x++;
    print "x=$x\n";
  }
}

This prints x=1, x=2, x=3, ... x=50.

You can use this to play a useful trick. Suppose you have a function that
needs to remember a value from one call to the next. For example, consider
a random number generator. A typical random number generator (like Perl's
rand function) has a seed in it. The seed is just a number. When you
ask the random number generator for a random number, the function performs
some arithmetic operation that scrambles the seed, and it returns the
result. It also saves the result and uses it as the seed for the next time
it is called.

Here's typical code: (I stole it from the ANSI C standard, but it behaves
poorly, so don't use it for anything important.)
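A sketch of such a generator, using the constants from the example in the C standard. The names my_rand and $seed are the ones the discussion below uses; the starting seed of 1 is an assumption.

```perl
$seed = 1;                 # a global package variable

sub my_rand {
    # Scramble the seed, keep the result as the next seed, and return it.
    $seed = int(($seed * 1103515245 + 12345) / 65536) % 32768;
    return $seed;
}

print my_rand(), "\n";     # 16838

$seed = 1;                 # any code anywhere can reset the generator...
print my_rand(), "\n";     # 16838 again
```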

There's a problem here, which is that $seed is a global
variable, and that means we have to worry that someone might inadvertently
tamper with it. Or they might tamper with it on purpose, which could affect
the rest of the program. What if the function were used in a gambling
program, and someone tampered with the random number generator?
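One arrangement that matches the description below wraps a my declaration and the function together in a bare block. This is a sketch under the same assumptions as before (starting seed 1, constants from the C standard example); $first and $second are names invented for the demonstration.

```perl
{
    my $seed = 1;          # private to this block
    sub my_rand {
        $seed = int(($seed * 1103515245 + 12345) / 65536) % 32768;
        return $seed;
    }
}

my $first = my_rand();
$seed = 1;                 # sets the unrelated package variable $main::seed
my $second = my_rand();    # the generator carries on undisturbed

print "first:  $first\n";  # first:  16838
print "second value ", ($second == $first ? 'repeats' : 'differs'), "\n";
```

The block has to be executed before my_rand is first called, because the assignment of 1 to $seed happens when control passes through it.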

The declaration is outside the function, so it only happens once, at the
time the program is compiled, not every time the function is called. But
it's a my variable, and it's in a block, so it's only accessible to code
inside the block. my_rand is the only other thing in the block, so the
$seed variable is only accessible to the my_rand function.

$seed here is sometimes called a `static' variable, because it stays the
same in between calls to the function. (And because there's a similar
feature in the C language that is activated by the static keyword.)

You can't declare a variable my if its name is a punctuation
character, like $_, @_, or $$. You can't
declare the backreference variables $1, $2, ... as
my. The authors of my thought that that would be too
confusing.

Obviously, you can't say my $DBI::errstr, because that's
contradictory---it says that the package variable $DBI::errstr is now a
lexical variable. But you can say local $DBI::errstr; it saves the current value of $DBI::errstr and
arranges for it to be restored at the end of the block.

New in Perl 5.004, you can write

foreach my $i (@list) {

instead, to confine $i to the scope of the loop.
Similarly,
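For instance, a my declaration in the initializer of a C-style for loop is likewise confined to the loop. A minimal sketch, with invented variable names; the string eval is there only so the script can report the compile failure instead of dying from it:

```perl
use strict;

my @seen;
for (my $i = 0; $i < 3; $i++) {
    push @seen, $i;
}
print "@seen\n";                      # 0 1 2

# After the loop, $i no longer exists, so code that mentions it
# fails to compile under strict.
my $ok = eval q{ my $n = $i; 1 };
print "after the loop, \$i is ", ($ok ? 'visible' : 'not visible'), "\n";
```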

If you're writing a function, and you want it to have private variables,
you need to declare the variables with my. What happens if you forget?

sub function {
  $x = 42;        # Oops, should have been my $x = 42.
}

In this case, your function modifies the global package variable
$x. If you were using that variable for something else, it
could be a disaster for your program.

Recent versions of Perl have an optional protection against this that you
can enable if you want. If you put

use strict 'vars';

at the top of your program, Perl will require that
package variables have an explicit package qualifier. The $x
in $x=42 has no such qualifier, so the program won't even
compile; instead, the compiler will abort and deliver this error
message:

Global symbol "$x" requires explicit package name at ...

If you wanted $x to be a private my variable, you can go
back and add the my. If you really wanted to use the global package
variable, you could go back and change it to

$main::x = 42;

or whatever would be appropriate.

Just saying use strict turns on strict vars, and
several other checks besides. See perldoc strict for more
details.
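To watch the check fire without killing the script that demonstrates it, the faulty function can be compiled from a string. This is a sketch; $code and $ok are names invented here, and the exact wording after the first part of the message varies between Perl versions.

```perl
my $code = q{
    use strict 'vars';
    sub function {
        $x = 42;    # Oops, should have been my $x = 42.
    }
    1;
};

my $ok = eval $code;                  # compile (and run) the string
print "compiled fine\n"       if $ok;
print "compile failed: $@"    unless $ok;
```

The message left in $@ begins with Global symbol "$x" requires explicit package name.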

Now suppose you're writing the Algorithms::KnuthBendix
module, and you want the protections of strict vars. But
you're afraid that you won't be able to finish the module because your
fingers are starting to fall off from typing
$Algorithms::KnuthBendix::Error all the time.
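The use vars pragma covers this case: it tells strict vars to make an exception for the names you list, so the short name becomes legal inside the package. A minimal sketch, in which the assigned string is invented:

```perl
package Algorithms::KnuthBendix;
use strict 'vars';
use vars '$Error';          # exempts $Algorithms::KnuthBendix::Error

$Error = 'no error yet';    # the short name now passes strict vars

package main;
print "$Algorithms::KnuthBendix::Error\n";    # no error yet
```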

Package variables are always global. They have a name and a package
qualifier. You can omit the package qualifier, in which case Perl uses a
default, which you can set with the package declaration. For private
variables, use my. Don't use local; it's obsolete.

You should avoid using global variables because it can be hard to be sure
that no two parts of the program are using one another's variables by
mistake.

To avoid using global variables by accident, add use strict 'vars' to
your program. It checks that every variable is either declared
private, explicitly qualified with a package qualifier, or
explicitly declared with use vars.

The tech editors complained about my maxim `Never use
local.' But 97% of the time, the maxim is exactly right.
local has a few uses, but only a few, and they don't come up
too often, so I left them out, because the whole point of a tutorial
article is to present 97% of the utility in 50% of the space.

I was still afraid I'd get a lot of tiresome email from people
saying ``You forgot to mention that local can be used for
such-and-so, you know.'' So in the colophon at the end of the
article, I threatened to deliver Seven Useful Uses for
local in three months. I mostly said it to get people
off my back about local. But it turned out that I did
write it, and it was published some time later.

Here's another potentially interesting matter that I left out for
space and clarity. I got email from Robert Watkins with a program he
was writing that didn't work. The essence of the bug looked like
this:

my $x;
for $x (1..5) {
  show();
}
sub show { print "$x, " }

Robert wanted this to print 1, 2, 3, 4, 5, but it did
not. Instead, it printed , , , , , . Where did the values
of $x go?

The deal here is that normally, when you write something like this:

for $x (...) { }

Perl wants to confine the value of the index variable to inside the
loop. If $x is a package variable, it
pretends that you wrote this instead:

{ local $x; for $x (...) { } }

But if $x is a lexical variable, it pretends you wrote this instead:

{ my $x; for $x (...) { } }

This means that the loop index variable won't get propagated to
subroutines, even if they're in the scope of the original declaration.

I probably shouldn't have gone on at such length, because the
perlsyn manual page describes it pretty well:

...the variable is implicitly local to the loop and regains its
former value upon exiting the loop. If the variable was
previously declared with my, it uses that variable
instead of the global one, but it's still localized to the
loop. (Note that a lexically scoped variable can cause
problems if you have subroutine or format declarations within
the loop which refer to it.)

In my opinion, lexically scoping the index variable was probably a
mistake. If you had wanted that, you would have written for my
$x ... in the first place. What I would have liked it to do was
to localize the lexical variable: It could save the value of the
lexical variable before the loop, and restore it again afterwards.
But there may be technical reasons why that couldn't be done,
because this doesn't work either:

my $m;
{ local $m = 12;
  ...
}

The local fails with this error message:

Can't localize lexical variable $m...

There's been talk on P5P about making this work, but I gather it's
not trivial.

Added 2000-01-05: Perl 5.6.0 introduced a new our(...)
declaration. Its syntax is the same as for my(), and it is a
replacement for use vars.

Without getting into the details, our() is just like
use vars; its only effect is to declare variables so that
they are exempt from the strict 'vars' checking. It has two
possible advantages over use vars, however: Its
syntax is less weird, and its effect is lexical. That is, the
exception that it creates to the strict checking continues
only to the end of the current block:
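A sketch of that behavior follows. The second assignment is compiled from a string so that the script can report the failure rather than refuse to compile as a whole; $ok is a name invented for the demonstration.

```perl
use strict 'vars';

{
    our $x;
    $x = 1;                       # OK: the our declaration covers this block
}

# Outside the block the exemption is gone, so this does not compile.
my $ok = eval q{ $x = 2; 1 };

print "inside the block:  \$main::x = $main::x\n";    # inside the block:  $main::x = 1
print "outside the block: ",
      ($ok ? 'allowed' : 'rejected by strict vars'), "\n";
```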

So whereas use vars '$x' declares that it is OK to use
the global variable $x everywhere, our($x) allows
you to say that global $x should be permitted only in certain
parts of your program, and should still be flagged as an error if you
accidentally use it elsewhere.

Added 2000-01-05: Here's a little wart that takes people by
surprise. Consider the following program:
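A sketch of such a program, built around the backwards function that the discussion below mentions. The literal list of lines is an assumption standing in for real input.

```perl
use strict 'vars';

my @lines  = ("apple\n", "cherry\n", "banana\n");   # stand-in for input
my @sorted = sort backwards @lines;
print @sorted;                                      # cherry, banana, apple

sub backwards {
    $b cmp $a;        # $a and $b are set up by sort
}
```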

Here we have not declared $a or $b, so they are
global variables. In fact, they have to be global, because the
sort operator must be able to set them up for the
backwards function. Why doesn't strict produce a
failure?

The variables $a and $b are exempted from
strict vars checking, for exactly this reason.