References are absolutely essential for creating complex data structures. Since the next chapter is devoted solely to this topic, we will not say more here. This section lists the other advantages of Perl's support for indirection and memory management.

When you pass more than one array or hash to a subroutine, Perl merges all of them into the @_ array available within the subroutine. The only way to avoid this merger is to pass references to the input arrays or hashes. Here's an example that adds elements of one array to the corresponding elements of the other:
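A minimal sketch of such a subroutine follows. The name add_arrays and the sample data are assumptions made for illustration, not the original listing. Because each argument is a single scalar reference, the two arrays stay distinct inside @_.

```perl
use strict;
use warnings;

# Hypothetical helper: adds corresponding elements of two arrays.
# Both arguments are array references, so @_ holds exactly two scalars
# instead of one flattened list.
sub add_arrays {
    my ($ra, $rb) = @_;
    my $len = @$ra < @$rb ? @$ra : @$rb;   # stop at the shorter array
    my @sums;
    for my $i (0 .. $len - 1) {
        push @sums, $$ra[$i] + $$rb[$i];
    }
    return \@sums;                          # return by reference as well
}

my @first  = (1, 2, 3);
my @second = (10, 20, 30);
my $rsums  = add_arrays(\@first, \@second);
print "@$rsums\n";                          # prints "11 22 33"
```

Had the arrays been passed as add_arrays(@first, @second), the subroutine would have received one six-element list with no way to tell where the first array ended.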

Using references, you can efficiently pass large amounts of data to and from a subroutine.

However, passing references to scalars typically turns out not to be an optimization at all. I have often seen code like this, in which the programmer has intended to minimize copying while reading lines from a file:
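The pattern looks roughly like the following sketch. Only the names GetNextLine, $line, and $ref_line come from the text; the in-memory file and the loop body are assumptions added so that the fragment runs on its own.

```perl
use strict;
use warnings;

# Assumption for this sketch: read from an in-memory "file" so the
# example needs no external input.
my $text = "first line\nsecond line\n";
open(my $fh, '<', \$text) or die "cannot open in-memory file: $!";

# Returns a reference to the line just read, or undef at end of file.
sub GetNextLine {
    my $line = <$fh>;
    return unless defined $line;
    return \$line;              # returned by reference, hoping to avoid a copy
}

my @seen;
while (my $ref_line = GetNextLine()) {
    chomp $$ref_line;           # the caller has to dereference everywhere
    push @seen, $$ref_line;
}
print join("|", @seen), "\n";   # prints "first line|second line"
```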

You might be surprised how little effect this strategy has on overall performance, because most of the time is taken by reading the file and subsequently working on $line. Meanwhile, the user of GetNextLine is forced to deal with indirections ($$ref_line) instead of the more straightforward buffer $line.[4]

The standard Benchmark module defines a subroutine called timethis that takes a piece of code, runs it as many times as you tell it to, and prints out the elapsed time. We'll cover the use statement in Chapter 6, Modules.
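As a hedged illustration of timethis, the sketch below times a subroutine that receives a large array by reference. The subroutine sum_by_ref, the data, and the iteration count are assumptions made for this example; the timings printed will vary by machine, and Benchmark may note that the count is too small for a reliable measurement.

```perl
use strict;
use warnings;
use Benchmark qw(timethis);

my @data  = (1 .. 1000);
my $total = 0;

# Hypothetical workload: sum an array that is passed by reference.
sub sum_by_ref {
    my ($rdata) = @_;
    my $sum = 0;
    $sum += $_ for @$rdata;
    return $sum;
}

# Run the code 1000 times; timethis prints the elapsed time and
# returns a Benchmark object describing it.
my $timing = timethis(1000, sub { $total = sum_by_ref(\@data) });
```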

This notation not only allocates anonymous storage, it also returns a reference to it, much as malloc(3) returns a pointer in C.

What happens if you use parentheses instead of square brackets? Recall again that Perl evaluates the right side as a comma-separated expression and returns the value of the last element; $ra contains the value "hello", which is likely not what you are looking for.
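The difference between the two bracketing characters can be sketched as follows. The values and the variable name $not_a_ref are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Square brackets build an anonymous array and yield a reference to it.
my $ra = [10, "hello"];
print ref($ra), "\n";        # prints "ARRAY"
print "$$ra[1]\n";           # prints "hello"

# Parentheses build no array: in a scalar assignment the comma operator
# evaluates each operand and keeps only the last one.
my $not_a_ref;
{
    no warnings 'void';      # silence the "useless use of a constant" warning
    $not_a_ref = (10, "hello");
}
print "$not_a_ref\n";        # prints "hello"
print ref($not_a_ref) ? "reference\n" : "plain scalar\n";   # prints "plain scalar"
```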

Both these notations are easy to remember since they represent the bracketing characters used by the two datatypes - brackets for arrays and braces for hashes. Contrast this to the way you'd normally create a named hash:

# An ordinary hash uses the % prefix and is initialized with a list
# within parentheses.
%hash = ("flock" => "birds", "pride" => "lions");
# An anonymous hash is a list contained within curly braces.
# The result of the expression is a scalar reference to that hash.
$rhash = {"flock" => "birds", "pride" => "lions"};
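For comparison, here is a short sketch of how the two hashes are then used. The added key "school" is an assumption for illustration.

```perl
use strict;
use warnings;

my %hash  = ("flock" => "birds", "pride" => "lions");   # named hash
my $rhash = {"flock" => "birds", "pride" => "lions"};   # reference to an anonymous hash

print "$hash{flock}\n";        # prints "birds"
print "$$rhash{pride}\n";      # prints "lions"

# The anonymous hash grows like any other hash, but only through the reference.
$$rhash{school} = "fish";
my @names = sort keys %$rhash;
print "@names\n";              # prints "flock pride school"
```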

What about dynamically allocated scalars? It turns out that Perl doesn't have any notation for doing something like this, presumably because you almost never need it. If you really do, you can use the following trick: Create a reference to an existing variable, and then let the variable pass out of scope.

{
    my $a = "hello world"; # 1
    $ra = \$a;             # 2
}
print "$$ra \n";           # 3

The my operator tags a variable as private (or localizes it, in Perl-speak). You can use the local operator instead, but there is a subtle yet very important difference between the two that we will clarify in Chapter 3. For this example, both work equally well.

Now, $ra is a global variable that refers to the local variable $a (not the keyword local). Normally, $a would be deleted at the end of the block, but since $ra continues to refer to it, the memory allocated for $a is not thrown away. Of course, if you reassign $ra to some other value, this space is deallocated before $ra is prepared to accept the new value.
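The same trick can be wrapped in a subroutine so that every call hands back a fresh scalar. This is a sketch under assumptions: the helper name new_scalar and the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $ra;
{
    my $text = "hello world";
    $ra = \$text;            # the reference keeps the scalar alive
}
print "$$ra\n";              # prints "hello world"

# Hypothetical helper: every call creates a new private scalar and
# returns a reference to it.
sub new_scalar {
    my $value = shift;
    return \$value;
}

my $r1 = new_scalar("one");
my $r2 = new_scalar("two");
print "$$r1 $$r2\n";         # prints "one two"
```

The two references returned by new_scalar point at different storage, so changing $$r1 leaves $$r2 untouched.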

You can create references to constant scalars like this:

$r = \10; $rs = \"hello";

Constants are statically allocated and anonymous.
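One consequence worth seeing is that a literal reached through such a reference cannot be modified. The sketch below assumes nothing beyond the two references shown above; the eval block is there only to catch the run-time error.

```perl
use strict;
use warnings;

my $r  = \10;
my $rs = \"hello";

my $next = $$r + 1;          # reading through the reference is fine
print "$next\n";             # prints "11"
print "$$rs\n";              # prints "hello"

# Writing through the reference fails, because the literal is read-only.
my $ok  = eval { $$r = 20; 1 };
my $err = $ok ? "" : $@;
print "assignment refused: $err" unless $ok;
```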

A reference variable does not care to know or remember whether it points to an anonymous value or to an existing variable's value. This is identical to the way pointers behave in C.

Incidentally, this example illustrates a convention known to Microsoft Windows programmers as "Hungarian notation."[5] Each variable name is prefixed by its type ("r" for reference, "rh" for reference to a hash, "i" for integer, "d" for double, and so on). Something like the following would immediately trigger some suspicion:

$$rh_collections[0] = 10; # RED FLAG : 'rh' being used as an array?

You have a variable called $rh_collections, which is presumably a reference to a hash because of its naming convention (the prefix rh), but you are using it instead as a reference to an array. Sure, Perl will alert you to this by raising a run-time exception ("Not an ARRAY reference at - line 2."). But it is easier to check the code while you are writing it than to painstakingly exercise all the code paths during the testing phase to rule out the possibility of run-time errors.
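The run-time exception can be observed safely by wrapping the offending statement in eval. This is a sketch; the hash contents are assumptions made for illustration.

```perl
use strict;
use warnings;

my $rh_collections = { stamps => 120, coins => 45 };   # a hash reference

# The statement compiles without complaint; the mistake surfaces
# only when it is executed.
my $ok  = eval { $$rh_collections[0] = 10; 1 };
my $err = $ok ? "" : $@;
print "caught: $err";
```

The message stored in $err begins with "Not an ARRAY reference", followed by the file name and line number of the failing statement.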

[5] After Charles Simonyi, who started this convention at Microsoft. This convention is a topic of raging debates on the Internet; people either love it or hate it. Apparently, even at Microsoft, the systems folks use it, while the application folks don't. In a language without enforced type checking such as Perl, I recommend using it where convenient.

Earlier, while discussing precedence, we showed that $$rarray[1] is actually the same as ${$rarray}[1]. It wasn't entirely by accident that we chose braces to denote the grouping. It so happens that there is a more general rule.

The braces signify a block of code, and Perl doesn't care what you put in there as long as it yields a reference of the required type. Something like {$rarray} is a straightforward expression that yields a reference readily. By contrast, the following example calls a subroutine within the block, which in turn returns a reference:
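A sketch of that idea is shown here. The subroutine names scalar_ref and array_ref and the data are assumptions made for illustration, not the original listing.

```perl
use strict;
use warnings;

my $count  = 10;
my @colors = qw(red green blue);

# Hypothetical subroutines that return references.
sub scalar_ref { return \$count }
sub array_ref  { return \@colors }

my $copy   = ${ scalar_ref() };       # block yields a scalar reference
my $second = ${ array_ref() }[1];     # block yields an array reference
my @all    = @{ array_ref() };        # same rule with the @ prefix

${ scalar_ref() } = 11;               # the block form can also be assigned to
print "$copy $second $count\n";       # prints "10 green 11"
print scalar(@all), "\n";             # prints "3"
```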

To summarize, a block that yields a reference can occur wherever the name of a variable can occur. Instead of $a, you can have ${$ra} or ${$array[1]} (assuming $array[1] has a reference to $a), for example.

Recall that a block can have any number of statements inside it, and the last expression evaluated inside that block represents its result value. Unless you want to be a serious contender for the Obfuscated Perl contest, avoid using blocks containing more than two expressions while using the general dereferencing rule stated above.

While we are talking about obfuscation, it is worth talking about a very insidious way of including executable code within strings. Normally, when Perl sees a string such as "$a", it does variable interpolation. But you now know that "a" can be replaced by a block as long as it returns a reference to a scalar, so something like this is completely acceptable, even within a string:

print "${foo()}";

Replace foo() by system('/bin/rm *') and you have an unpleasant Trojan horse:

print "${system('/bin/rm *')}"

Perl treats it like any other function and trusts system to return a reference to a scalar. The parameters given to system do their damage before Perl has a chance to figure out that system doesn't return a scalar reference.
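The effect can be demonstrated harmlessly. In the sketch below, foo is a hypothetical stand-in that only bumps a counter; the point is that the subroutine runs as a side effect of building the string.

```perl
use strict;
use warnings;

my $calls   = 0;
my $message = "done";

# Hypothetical stand-in for foo(): harmless, but with a visible side effect.
sub foo {
    $calls++;
    return \$message;       # a scalar reference, as the block form expects
}

my $string = "result: ${foo()}";   # foo() runs during interpolation
print "$string\n";                 # prints "result: done"
print "$calls\n";                  # prints "1"
```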

Moral of the story: Be very careful of strings that you get from untrusted sources. Use the taint-mode option (invoke Perl as perl -T) or the Safe module that comes with the Perl distribution. Please see the Perl documentation for taint checking, and see the index for some pointers to the Safe module.