Copyright Notice

This text is copyright by CMP Media, LLC, and is used with
their permission. Further distribution or use is not permitted.

This text has appeared in an edited form in
SysAdmin/PerformanceComputing/UnixReview magazine.
However, the version you are reading here is as the author
originally submitted the article for publication, not after their
editors applied their creativity.

Symbolic links were not present in the first version of Unix that I
used. That would be Unix V6, back in 1977, when the Unix kernel size
was under 32K. It's hard to imagine anything being under 32K
associated with Unix these days.

But somewhere in the bowels of the University of California at
Berkeley, in the early 80s, the boys working on BSD concocted a scheme
to rectify two of the biggest problems with hard links: they couldn't
be made to a directory, and they couldn't point into another
mounted filesystem. And their solution was that now-common feature,
a symbolic link.

A symbolic link is essentially a text string that sits in place of a
file. When the symbolic link's filename is accessed, the Unix kernel
replaces the filename with its text value instead, like a macro
expansion. This all happens transparently to the executing program
(unlike some other popular operating systems).

From the shell, symbolic links are easy enough to create:

ln -s /usr/lib/perl5 ./Lib

which makes a reference to Lib in the current directory hop over
to /usr/lib/perl5/. From Perl, this same step is:

symlink("/usr/lib/perl5", "./Lib") or die "$!";

And we can see this is so with:

ls -l

which will show something like:

..... Lib -> /usr/lib/perl5

indicating this redirection is going on. And that same fact is apparent
to Perl like so:

my $where = readlink("Lib");
print "Lib => $where\n";
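To try both calls without touching a real directory, here's a minimal sketch that does the round trip in a scratch directory. The use of File::Temp and the variable names are my own choices for illustration, not part of the program developed below.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Work in a scratch directory that is removed when the program exits.
my $scratch = tempdir(CLEANUP => 1);
my $link    = "$scratch/Lib";

# The target need not exist: a symlink only stores the text it is given.
symlink("/usr/lib/perl5", $link) or die "symlink: $!";

my $is_link = -l $link ? 1 : 0;   # -l looks at the link itself
my $where   = readlink($link);    # the stored text, not yet expanded
print "Lib => $where\n";
```

Note that readlink hands back exactly the string that was stored, whether or not anything lives at that path.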

But what if /usr/lib itself is also a symbolic link, say to
/lib? Well, the system nicely picks that up when it's looking down
the steps from /usr to /usr/lib, and redirects that to /lib,
and continues from there to look for perl5.

Thus, following a symlink may involve multiple expansions. There's a
limit to the number of expansions in a path to prevent runaway loops,
but generally it's enough that you won't worry about it.
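Here's a small sketch of both behaviors: a chain of links that the kernel follows to the end, and a loop that it refuses to follow. The file names are made up, and I'm assuming a Linux-like or BSD-like system, where a loop is reported as ELOOP; the exact expansion limit varies from system to system.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $scratch = tempdir(CLEANUP => 1);

# A two-step chain: first -> second -> real
mkdir "$scratch/real" or die "mkdir: $!";
symlink("real",   "$scratch/second") or die "symlink: $!";
symlink("second", "$scratch/first")  or die "symlink: $!";

# readlink reports only one step of the chain...
my $one_step = readlink("$scratch/first");
# ...while -d lets the kernel follow the whole chain.
my $reaches_dir = -d "$scratch/first" ? 1 : 0;

# A loop: ping -> pong -> ping. The kernel gives up instead of spinning.
symlink("pong", "$scratch/ping") or die "symlink: $!";
symlink("ping", "$scratch/pong") or die "symlink: $!";
my $loop_error = "";
unless (open my $fh, "<", "$scratch/ping") {
    $loop_error = $!{ELOOP} ? "ELOOP" : "$!";
}

print "first -> $one_step, reaches a directory: $reaches_dir\n";
print "opening ping failed with: $loop_error\n";
```

The gap between what readlink reports (one step) and where the kernel ends up (the final target) is exactly what the program below fills in.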

What's the easiest way to really know where the symlink ends up then?
Well, you could keep typing a lot of ls -l invocations, and take
careful notes, or just write a Perl program to do the expansion for
you.

And while we're at it, let's also make this work recursively from a
starting directory in a filetree, dumping out all the symlinks and
their ultimate expansions in all directories contained within. Cool.

So, here's a program that does just that, presented a few lines at a time.

#!/usr/bin/perl -w
use strict;
$|++;

These first three lines tell us where to find Perl, and enable
warnings and the usually good compiler restrictions. We'll also
disable buffering on STDOUT, so I can see how far the program has
gotten during a long run.

use File::Find;
use Cwd;

Next, we'll pull in two modules from the standard Perl distribution
library. File::Find helps us recurse through a directory hierarchy
without thinking too hard about it, and Cwd gets the current
working directory, usually without forking off a child process.

my $dir = cwd;

Now, we'll get the current directory via cwd (imported from
Cwd). We'll need this to properly expand relative names into
absolute names.

find sub {
##### contents here presented below
}, @ARGV;

Next, the outer part of the body of the program. We'll call find
(imported from the File::Find module), passing it an anonymous
subroutine reference, and the command-line argument array @ARGV.
The subroutine (whose contents are defined below) will be called
for each file or directory in all directories and subdirectories
starting at the top-level directories named in @ARGV.

Now for the guts of the subroutine. In the real program, these are
really located where ##### is marked above.

return unless -l;

When this subroutine is called, $_ is set to the name of the file
or directory of interest, and the current directory is set to the
directory that contains this item. Here, we'll end up returning if
the item is not a symbolic link.
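To see what the callback receives, here's a self-contained sketch that builds a tiny scratch tree and collects only the symbolic links. The tree layout and the @seen array are invented for the demonstration.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Build a tiny tree: one subdirectory, one plain file, two symlinks.
my $top = tempdir(CLEANUP => 1);
mkdir "$top/sub" or die "mkdir: $!";
open(my $fh, ">", "$top/sub/plain.txt") or die "open: $!";
close $fh;
symlink("sub",       "$top/to_dir")      or die "symlink: $!";
symlink("plain.txt", "$top/sub/to_file") or die "symlink: $!";

my @seen;
find sub {
    return unless -l;                       # -l tests $_ by default
    push @seen, [ $File::Find::name, $_ ];  # full path, then bare name
}, $top;

for my $pair (sort { $a->[0] cmp $b->[0] } @seen) {
    print "$pair->[0] (seen as $pair->[1])\n";
}
```

By default find does not descend through to_dir, since it is a link rather than a real directory, so each link is reported exactly once.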

The next two lines set up the core of the routine. I'm gonna have a
@left and an @right variable. Think of @left as ``where in
the filetree am I so far?'' and @right as ``where else am I being
told to go?''. The basic task is to take one element at a time from
the front of @right, and try to glue it onto the end of @left,
until we have no more @right to go. If at any step the path of
@left plus that element turns out to be a symlink, however, we'll
have to expand it and start again from there. Also, if the element
being examined from @right is a dot, we'll skip it, and if it's
dot-dot, we'll need to back up on @left instead.

my @right = split /\//, $File::Find::name;

The variable $File::Find::name has the full pathname starting from
the kind of name we gave on the command line. If that was a relative
name, this will also be a relative name to the original working
directory (now saved in $dir). Here, I'm splitting the name apart
into individual elements.

This is a bit more complicated, so I'll take it slowly. We're setting
up @left to be the value of this expression coming from a do
block. If the first element of @right is empty, then the original
string began with a slash, and we need to be relative to the root
directory. That's handled by moving that empty element from the
beginning of @right to become the only element of @left.
Otherwise, we had a relative name, and we'll preload @left with a
split-apart version of the initial working directory.
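The initialization just described can be sketched along these lines. This is my reading of the description, not necessarily the exact listing: split_start is a hypothetical wrapper so the step can be tried on its own, and $dir is hard-coded where the real program uses the value saved from cwd.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the directory the real program saves from cwd.
my $dir = "/home/user";

sub split_start {
    my ($name) = @_;
    my @right = split /\//, $name;
    my @left = do {
        (@right && $right[0] eq "")
            ? shift(@right)          # absolute: the empty element stands for the root
            : split(/\//, $dir);     # relative: start from the saved directory
    };
    return (\@left, \@right);
}

for my $name ("/usr/lib/Lib", "lib/Lib") {
    my ($left, $right) = split_start($name);
    print "$name: left=[", join(",", @$left), "] right=[", join(",", @$right), "]\n";
}
```

Either way, the first element of @left is the empty string, so joining @left with slashes always produces an absolute path.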

while (@right) {

Now, as long as we have items to keep walking, we'll do this...

my $item = shift @right;
next if $item eq "." or $item eq "";

This grabs the next step, and discards it if it's just an empty string
or a single dot, meaning that we would have stayed at the current
directory.

if ($item eq "..") {
  pop @left if @left > 1;
  next;
}

And if it's dot-dot, we'll have to pop up a level on our current
position (unless it would have us back up over the top).
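With the symlink handling left out, the dot and dot-dot rules can be tried on plain strings. This is a sketch only: walk_dots is a hypothetical helper, and it assumes an absolute path so that @left can start at the root.

```perl
use strict;
use warnings;

sub walk_dots {
    my ($path) = @_;
    my @right = split /\//, $path;
    my @left  = ("");                  # assumption: $path is absolute
    while (@right) {
        my $item = shift @right;
        next if $item eq "." or $item eq "";
        if ($item eq "..") {
            pop @left if @left > 1;    # never pop the root's empty element
            next;
        }
        push @left, $item;
    }
    return join "/", @left;
}

print walk_dots("/usr/./lib/../local//bin"), "\n";   # prints /usr/local/bin
print walk_dots("/usr/../../etc"), "\n";             # prints /etc
```

Popping @left for a dot-dot is only safe because, in the real loop, everything already on @left has been checked and is known not to be a symlink.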

my $link = readlink (join "/", @left, $item);

Now, if the path of @left, together with the next step, forms a
symbolic link, the value of $link will be defined to be what we
need to replace $item with. Otherwise, readlink returns undef, and
we can just slide along.

So, if it's a symbolic link, we'll split it apart. If it's absolute,
@left gets reset to the top. Otherwise, @left stays as is.
Either way, we'll put whatever we got in front of the remainder of
@right, as it will influence the interpretation of that remaining
path.
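The splicing step just described might look something like the following. Again this is a sketch from the description rather than the exact listing: splice_link is a hypothetical helper taking references to the two arrays, and testing for a leading slash is one way among several to recognize an absolute target.

```perl
use strict;
use warnings;

sub splice_link {
    my ($left, $right, $link) = @_;
    # An absolute target throws away our position and restarts at the root.
    @$left = ("") if $link =~ m{^/};
    # The target's elements go ahead of whatever is still waiting in @right.
    unshift @$right, grep { length } split /\//, $link;
}

# /usr/lib is a link to /lib, and perl5 is still to be walked.
my @left  = ("", "usr");
my @right = ("perl5");
splice_link(\@left, \@right, "/lib");
print "left=[", join(",", @left), "] right=[", join(",", @right), "]\n";

# A relative target keeps @left where it is.
my @left2  = ("", "home", "user");
my @right2 = ("x");
splice_link(\@left2, \@right2, "../shared/Lib");
print "left=[", join(",", @left2), "] right=[", join(",", @right2), "]\n";
```

Since the spliced-in elements go back through the same loop, any dot-dots or further symlinks inside the target get handled by the rules already in place.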

} else {
  push @left, $item;
  next;
}

If it wasn't a symbolic link at this step, it's simple; we just move
along to that point in @left.

}
print "$File::Find::name is ", join("/", @left), "\n";

When the loop is over, we'll dump out the resulting path of @left.

And there you have it. It's a bit tricky, since the macro expansion
of a symbolic link is somewhat recursive, but Perl's data structures
and full access to the right system calls give us a straightforward
way of interpreting symbolic links.
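As a cross-check, the same walk can be pulled out of the find callback into a function and compared against abs_path from the Cwd module, which also resolves symbolic links. This is a sketch and not the program above verbatim: the name resolve, the hop limit of 40, and the scratch tree are all my own assumptions.

```perl
use strict;
use warnings;
use Cwd qw(cwd abs_path);
use File::Temp qw(tempdir);

# resolve() is a hypothetical name for the @left/@right walk.
sub resolve {
    my ($name, $dir) = @_;
    my @right = split /\//, $name;
    my @left  = $name =~ m{^/} ? ("") : split(/\//, $dir);
    my $hops  = 0;
    while (@right) {
        my $item = shift @right;
        next if $item eq "." or $item eq "";
        if ($item eq "..") {
            pop @left if @left > 1;
            next;
        }
        my $link = readlink(join "/", @left, $item);
        if (defined $link) {
            # Arbitrary cap so a looping link cannot spin forever.
            die "too many symlinks in $name\n" if ++$hops > 40;
            @left = ("") if $link =~ m{^/};
            unshift @right, split /\//, $link;
            next;
        }
        push @left, $item;
    }
    return @left > 1 ? join("/", @left) : "/";
}

# Scratch tree: Lib -> (absolute) lib/perl5, and lib -> (relative) real
my $top = tempdir(CLEANUP => 1);
mkdir "$top/real"       or die "mkdir: $!";
mkdir "$top/real/perl5" or die "mkdir: $!";
symlink("real",           "$top/lib") or die "symlink: $!";
symlink("$top/lib/perl5", "$top/Lib") or die "symlink: $!";

my $got  = resolve("$top/Lib", cwd);
my $want = abs_path("$top/Lib");
print "$top/Lib is $got\n";
print "abs_path ", ($got eq $want ? "agrees" : "says $want"), "\n";
```

If all you need is the final answer for one path, abs_path is the shorter route; walking it by hand like this is handy when you want to see, or report on, each expansion along the way.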

Now you'll never have to wonder where those links point again.
Until next time, enjoy!