AboutTheAuthor:[A short biography of the author]

Abstract:[Here you write a little summary]

Perl part I gave a general overview of Perl.
In Perl part II we wrote the first useful program. In part III we now take a closer look at
arrays.

ArticleIllustration:[This is the title picture for your article]

ArticleBody:[The article body]

Arrays

An array is a list of variables which can be accessed by an
index. We have seen that "normal variables", also called scalar
variables, start their name with a dollar sign ($). Array names start with
an @-sign. The data inside the array, however, consists of scalar variables,
so you must again write a dollar sign when you refer to an individual
element of the array. Let's look at an example:
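
The listing that belongs here is not available. The following is a minimal
sketch consistent with the description below, not the original code. The
name @myarray comes from the text; the values and the extra fourth element
are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Declare an array and initialize it with three strings.
my @myarray = ("data1", "data2", "data3");

# The whole array is written with @, a single element with $ and an index.
print "whole array: @myarray\n";
print "first element: $myarray[0]\n";

# Assigning to a new index extends the array automatically.
$myarray[3] = "data4";
print "number of elements: ", scalar(@myarray), "\n";
```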

As you can see we write @myarray when we refer to the whole array
and $myarray[0] when we refer to an individual element.
Perl arrays start at index 0. New indices are created automatically
as soon as you assign data to them, so you do not have to know at
declaration time how big your array will be. As you can see above you can
initialize an array with several values at once by
listing them, separated by commas, inside parentheses.
("data1","data2","data3")
is really a list without a name. You can index it directly and write
("data1","data2","data3")[1]
to get the second element from this list:
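
A short sketch of both ideas: indexing a list directly and walking over an
array with foreach. It is an illustration under assumptions, not the original
listing. The variable $second and the counter $i are made up here; $lvar is
the loop variable name that the text below refers to.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Index the list directly: the element at index 1 is the second one.
my $second = ("data1", "data2", "data3")[1];
print "second element: $second\n";

my @myarray = ("data1", "data2", "data3");
my $i = 0;
# foreach puts each element of @myarray into $lvar in turn.
foreach my $lvar (@myarray) {
    print "element number $i is $lvar\n";
    $i++;
}
```

The foreach loop in this sketch prints: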

element number 0 is data1
element number 1 is data2
element number 2 is data3

The foreach statement takes each element of the array in turn and puts it
into the loop variable ($lvar in the example above). It is important
to note that the values are not copied out of the array into the loop
variable. Instead the loop variable is an alias for the current element, and
modifying the loop variable modifies the element in the array.
The following program makes all elements in the array upper case.
The Perl operator tr/a-z/A-Z/ is similar to the Unix command "tr". In this
case it translates all lower case letters to upper case.
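
The program itself is not available here. A minimal sketch that behaves as
described, assuming the same @myarray and $lvar names as before, could look
like this:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @myarray = ("data1", "data2", "data3");

print "before:\n";
foreach my $lvar (@myarray) {
    print "$lvar\n";
    # $lvar is an alias for the array element, so this changes @myarray.
    $lvar =~ tr/a-z/A-Z/;
}

print "after:\n";
foreach my $lvar (@myarray) {
    print "$lvar\n";
}
```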

When you run the program you can see that in the second loop @myarray
contains only upper case values:

before:
data1
data2
data3
after:
DATA1
DATA2
DATA3

The command line

We have seen in Perl II that the function &getopt can be used to read
the command line and any options provided on it.
&getopt is a library function and works like its C equivalent. In Perl the
values from the command line are assigned to an array called @ARGV;
&getopt just takes this @ARGV and evaluates its elements.
Unlike in C, the first element of the array is not
the program name but the first command line argument. If you want
to know the name of the Perl program you need to read $0, but
that is not the subject of this article. Here is an
example program called add. It takes two numbers
from the command line and adds them:
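
The original add listing is not available. The sketch below is one way to
write it with pop and a while loop, as the following paragraph describes.
The helper name add_numbers and the wording of the messages are assumptions,
and the sketch accepts two or more numbers rather than exactly two.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: adds up all numbers in the list it is given.
sub add_numbers {
    my @numbers = @_;
    my $sum = 0;
    # pop takes the last element off the array; the loop ends when it is empty.
    while (@numbers) {
        $sum += pop(@numbers);
    }
    return $sum;
}

if (@ARGV < 2) {
    # $0 holds the name of the running program.
    print "USAGE: $0 number1 number2\n";
} else {
    print "The sum is: ", add_numbers(@ARGV), "\n";
}
```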

The pop function removes the last element from the array and returns it. The
while loop runs until the array is empty.

Reading directories

Perl offers the functions opendir, readdir and closedir to read the
content of a directory. readdir returns a list with all the file names when
it is used in a list context such as a foreach loop.
Using a foreach loop you can iterate over all the file names and
search for a given name. Here is a simple program that searches for
a given filename in the current directory:
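
The listing is not available, so what follows is a sketch under assumptions
rather than the original program. The search is wrapped in a hypothetical
sub called find_matches, and the usage message is printed in an if/else
instead of calling exit. Because the pattern is a sub parameter that may
differ from call to call, the sketch uses the i modifier alone and leaves out
the o modifier.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the names in the current directory
# that match the regular expression in $pattern, ignoring case.
sub find_matches {
    my ($pattern) = @_;
    my @found;
    opendir(DIRHANDLE, ".") or die "ERROR: cannot read the current directory: $!\n";
    # No loop variable is given, so each name ends up in $_.
    foreach (readdir(DIRHANDLE)) {
        push(@found, $_) if (/$pattern/i);
    }
    closedir(DIRHANDLE);
    return @found;
}

if (@ARGV != 1) {
    print "USAGE: $0 search_pattern\n";
} else {
    print "$_\n" foreach find_matches($ARGV[0]);
}
```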

Let's look at the program. First we check that the user provided a
command line argument. If not, we print usage information and exit.
Next we open the current directory ("."). opendir is similar to
the open function for files. The first argument is a directory handle that
you need to pass to the readdir and closedir functions. The second
argument is the path to the directory.

Next comes the foreach loop. The first interesting thing is that
the loop variable is missing. In this case Perl does something magic
for you and uses a variable called $_ as the loop
variable. readdir(DIRHANDLE) returns a list of names and we use foreach to
look at each of them.

/$ARGV[0]/io matches the regular expression contained in $ARGV[0]
against the variable $_.
The io means: search case insensitively and compile the regular expression
only once. The latter is an optimization which makes the program faster.
You can use it when you have a variable inside a regular expression and
you can guarantee that this variable does not change at run time.
Let's try it. Assuming we have the files article.html, array1.txt
and array2.txt in the current directory then searching for "HTML"
will print:

As you can see the readdir function found 2 more files: "." and
"..". These are the names of the current and the
parent directory.

A file finder

I would like to finish this article with a more complex and useful
program: a file finder. We call it pff (perl file
finder). It works basically like the program above but also searches
sub-directories. How can we design such a program? Above we have some code
that reads the current directory and searches for files in it. We need
to start with the current directory, but if one of the entries (except
. and ..) is itself a directory then we need to search in there too. This
is a typical recursive algorithm:
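
The outline of the algorithm is not available here. As a sketch of the
recursion on its own, without any searching, the hypothetical sub list_dirs
below returns every sub-directory under a starting directory. The name, the
lexical directory handle and the command line handling are assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns all directories found below $dir.
sub list_dirs {
    my ($dir) = @_;
    my @dirs;
    # A lexical handle, so that every level of the recursion has its own.
    opendir(my $dh, $dir) or return @dirs;
    my @entries = readdir($dh);
    closedir($dh);
    foreach my $entry (@entries) {
        # Skip the current and the parent directory.
        next if ($entry eq "." || $entry eq "..");
        my $path = "$dir/$entry";
        # Real directory (not a symlink to one): remember it, then descend.
        if (-d $path && ! -l $path) {
            push(@dirs, $path, list_dirs($path));
        }
    }
    return @dirs;
}

# Runs only when a start directory is given, for example "." .
if (@ARGV) {
    print "$_\n" foreach list_dirs($ARGV[0]);
}
```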

In Perl you can test whether a file is a directory and not a symlink to a directory by using
if (-d "$dir/$_" && ! -l "$dir/$_"){....}. Now we have all the functionality
that we need and we can write the actual code (pff.gz).
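
The pff source is not reproduced here. The following is a sketch of how such
a program could be put together from the pieces above, not the contents of
pff.gz. The sub name search_file_in_dir comes from the text; its parameters,
its return value and the help message are assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Recursively searches $dir for names matching $pattern (case insensitive)
# and returns the paths of all matches.
sub search_file_in_dir {
    my ($dir, $pattern) = @_;
    my @found;
    opendir(my $dh, $dir) or return @found;   # skip unreadable directories
    my @entries = readdir($dh);
    closedir($dh);
    foreach (@entries) {
        next if ($_ eq "." || $_ eq "..");
        push(@found, "$dir/$_") if (/$pattern/i);
        # Descend into real directories, but never follow symlinks.
        if (-d "$dir/$_" && ! -l "$dir/$_") {
            push(@found, search_file_in_dir("$dir/$_", $pattern));
        }
    }
    return @found;
}

if (@ARGV != 1 || $ARGV[0] eq "-h") {
    print "USAGE: pff [-h] search_pattern\n";
} else {
    print "$_\n" foreach search_file_in_dir(".", $ARGV[0]);
}
```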

Let's look at the program a bit. First we test whether the user has
provided an argument on the command line. If not, this
is an error and we print a little help text. We also print the help
text if the option -h was given.
Otherwise we start to search in the current directory. We use the
recursive algorithm described above: read the directory, search the files,
test if a file is a directory and, if yes, call search_file_in_dir() again.

In the statement where we check for directories we also check that
the entry is not a link to a directory. We need to do that because someone may
have created a symlink to "..". Such a link would cause the program
to run forever if we did not have that check.

The statement next if ($_ eq "." || $_ eq ".."); is one which we have not discussed yet.
The "eq" operator is the Perl string comparison operator. Here we
test if the content of the variable $_ is equal to ".." or ".".
If it is, the "next" statement is executed. "next" inside a foreach loop means: start again at the top of the loop with the next
element in the array. It is similar to the C statement "continue".
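
To see "next" and "eq" in isolation, here is a small sketch that works on a
plain array instead of a directory. The array names @names and @kept and
their contents are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @names = (".", "..", "article.html", "array1.txt");
my @kept;
foreach (@names) {
    # Jump straight to the next element for "." and "..".
    next if ($_ eq "." || $_ eq "..");
    push(@kept, $_);
}
print "kept: @kept\n";
```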