Just a note before you continue: this is based on my personal
experience with Perl. I know other people have other opinions of this
language and they are welcome to them. I just want to present mine,
because I see lots of messages on Usenet from people who seem to be
about to learn Perl and I keep wanting to tell them that perhaps it's
not a good idea. So, I wrote this article to get this off my chest
once and for all.

If you think that anything in this article is objectively wrong, then
please email me about it. I'd like this article to be as factually
correct as possible. If you just disagree with me you can tell me that
too.

I should perhaps explain why I refer to Perl as "the Camel": the Bible
on all things Perl is Larry Wall's "Programming Perl", which is
published by O'Reilly. O'Reilly usually puts a nice 19th century
engraving of an animal on the cover of their books. "Programming Perl"
got a camel and has been known as "the camel book" ever since. Larry
Wall also often refers to Perl as the camel.

The article should be up to date as of Python 1.5.2 and Perl 5.005.
If anything has changed since those versions, feel free to tell me
about it.

I learned Perl 5 in early '97. I downloaded Patrick M. Ryan's
excellent introduction to Perl and found that a lot of the things
I'd been doing the hard way with C, Pascal and Java were much, much
easier in Perl. For text processing and access to system functions
Perl looked like a real godsend. Great, I thought, and bought myself
"Learning Perl", by Randal Schwartz. (Also known as the "llama
book".)

I read it pretty quickly and sucked up all these wonderful new
features. The first program I made read a web server log and counted
the number of times each page had been accessed. I wrote it in half an
hour and it worked immediately! Not only did it work, but Perl also
seemed to be able to overlook unimportant errors instead of crashing
or aborting like C/Pascal/Java programs would.

(That program was released in many versions, and had quite a few
users. So Perl isn't useless, just inconvenient.)

I was really enchanted with this language and started using it more
and more. However, as I kept using it I kept discovering new things I
didn't like. After a while it added up to a pretty sizable list.

One of the first things I discovered I didn't like was the syntax.
It's very complex and there are lots of operators and special
syntaxes. This means that you get short, but complex code with many
syntax errors that take some time to sort out.

It also means that reading someone else's code is difficult. You can't
easily understand someone else's scripts and adapt them to your own
needs, nor can you easily take over the maintenance of someone else's
program.

I write pretty clean Perl code, because I stay away from most of
the obfuscating features, but even so it gets pretty hard to
read. This is one ordinary example:

However, even this isn't really bad. The Perl Journal has conducted an
Obfuscated Perl contest, and the winning entries have been published.
Be warned, though. These programs give the word unreadable entirely
new and previously unimagined meanings. (And no, this isn't an
argument for Perl being unreadable, but mainly included as a funny and
curious item.)

Some people reading this have complained that 'But anyone can write
unreadable code in any language!' and this is certainly true. However,
some languages seem to encourage hard-to-read code, while others seem
to discourage it.

From what I have seen of my own and other people's code, Perl really
encourages hard-to-read programs. The Soundex example above comes from
the Perl distribution and was found by just randomly looking through 2
or 3 files. Looking through it again now I see many other examples
(like lib/pod/text.pm and lib/file/copy.pm), even though most scripts
are too short to be hard to read.

Some have argued that Perl is more like natural languages than most
programming languages, and this certainly seems correct to me. And to
me that is a disadvantage: natural language is extremely complex,
ambiguous and full of subtle nuances and meanings. I certainly don't
want my programs to be like that, but it seems that some do. I guess
the reader will have to find out for him/herself which category s/he
belongs in.

Many of Perl's features are built right into the language core itself,
instead of being placed in separate libraries. One example of this is
the handling of regular expressions, for which there is a special
syntax. This has the advantage that there are some convenient ways of
doing the things that are done most often, but it also means that you
don't get the advantages of objects.

To take one example, Perl has a special construct called formats,
which are a kind of template you can use to generate nice textual
reports. Quite handy, but built into the language. So, you can't
create a list of formats, return them from functions and so on, which
will in many cases be a serious inconvenience.

I think you can do these things with file handles, but since they are
also handled as special cases I've never been able to figure out how.
I tried using references, but never made it work.
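
For contrast, here is a minimal sketch of how this looks when regular expressions and file handles are ordinary objects, as they are in Python. It is written for a current Python 3 rather than 1.5.2, and the names PATTERNS, make_matcher, open_report and count_matches are made up for illustration; io.StringIO merely stands in for a real file.

```python
import io
import re

# Compiled patterns are ordinary values: they can live in a list.
PATTERNS = [
    re.compile(r"\d+"),        # runs of digits
    re.compile(r"[A-Za-z]+"),  # runs of letters
]

def make_matcher(word):
    """Build and return a pattern object, like any other return value."""
    return re.compile(r"\b" + re.escape(word) + r"\b")

def open_report():
    """Return a file-like object; io.StringIO stands in for a real file."""
    return io.StringIO()

def count_matches(patterns, text):
    """Apply every pattern in a list to the same text."""
    return [len(p.findall(text)) for p in patterns]

counts = count_matches(PATTERNS, "page 12 was hit 340 times")
report = open_report()
report.write("digits=%d words=%d\n" % (counts[0], counts[1]))
```

Both the pattern list and the report handle are passed around and returned without any special syntax.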

In the Perl documentation there is a separate manual page for how to
create arrays within arrays and hashes within hashes. And it's really
necessary. I had a lot of pain trying to figure out how to do
this, even after reading it several times. This is something that
really shook me, because in other languages this is something you just
do, without thinking about it.

In Lisp, this sets the variable a to a list:

(setq a '(1 2 3 4))

Here we create list b where the first element is another list:

(setq b '((0.8 0.9 1) 2 3 4))

Here's the first list in Perl:

@a=(1,2,3,4);

and here's the second:

@b=((0.8,0.9,1),2,3,4);

(The @s before
the variable names tell Perl that these are array variables.) That
wasn't so bad, was it? Well, let's try to use this.

To pick out the first element of the first list in Lisp, you just
write

(first a)

and Lisp gives you

1

To get the first element of the second list you
write

(first b)

and Lisp gives you

(0.8 0.9 1)

Let's try this in Perl.

$a[0]

gives us

1

The $ before the variable name tells Perl that we
want a single value (scalar in Perl lingo), not an array. The [0]
tells Perl that we want the first value of the array. Perl, like many
other languages and APIs, counts from 0.

Then we do

$b[0]

and Perl happily gives us

0.8

That's right, Perl has broken into the
list inside the b list and retrieved the first value of it. Or,
rather, it flattened b into one list when we created it, so it's now
really one consecutive list with 6 elements.

To do this right we should have written

@b=([(0.8,0.9,1)],2,3,4);

when we created the
list. The []s enter a reference to the inner list as the first element
of the outer list instead of flattening the inner list into the outer
one.

OK. So we try again:

$b[0]

gives us

ARRAY(0xb75eb0)

So obviously we manage to find the
array, but something still goes wrong along the way. The problem is
that we use $b, which makes Perl think that we want a scalar and so it
gives us a reference to the array instead of the array itself (which
is not a scalar).

Aha! Of course! We must use

@b[0]

because @
tells Perl we want an array value. Not so. We get

ARRAY(0xb75eb0)

once again. I've never managed to
understand why this is so and at this point I gave up on the entire
thing.

Some weeks later I saw a helpful posting on no.perl: one should
dereference the array reference, like this

@{$b[0]}

which actually gives us

(0.8 0.9 1)

So now I can write code with arrays
inside arrays and hashes inside hashes.

Now, ask yourself: do you really think you should have to go through
all this in order to put one list inside another?
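
For comparison, the same two lists can be sketched in Python, where nesting needs no reference syntax. The variable names first_of_a, first_of_b, inner_first and hits, and the page name in the dictionary, are made up for illustration.

```python
a = [1, 2, 3, 4]
b = [[0.8, 0.9, 1], 2, 3, 4]   # the first element is itself a list

first_of_a = a[0]      # a plain number
first_of_b = b[0]      # the inner list, not its first element
inner_first = b[0][0]  # index again to reach inside the inner list

# Hashes inside hashes work the same way with dictionaries.
hits = {"index.html": {"GET": 10, "HEAD": 2}}
hits["index.html"]["GET"] += 1
```

The outer list keeps its four elements; nothing is flattened.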

Another major disadvantage to Perl is that of function (or subroutine,
in Perl lingo) signatures, or rather, the lack of signatures. In most
programming languages when you declare a function you also declare its
signature, listing the names of the parameters and in some languages
also their types. Perl doesn't do this.

So what in Java is

public String substring(String str, int from, int to) {

becomes

sub substring {
    my ($str, $from, $to) = @_;

in Perl. In other words, you have to manually decode the parameter
list. Perl has lately been extended with the notion of prototypes,
which means that you can write

sub substring($$$) {
    my ($str, $from, $to) = @_;

and have Perl check that the number of arguments is correct. This is
not required, though, and there is much Perl code that does not use
this syntax.

The disadvantages don't stop there, though. Many programmers don't
destructure the parameter array as in the example above, which makes
the code much harder to read at a glance and also makes it
impossible to automatically generate good documentation.
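
To illustrate what a declared signature buys you, here is a small Python sketch. The substring function is a made-up counterpart to the Java and Perl fragments above, and inspect.signature is a facility of current Python 3, not of Python 1.5.2.

```python
import inspect

def substring(text, start, end):
    """Return the characters of text from index start up to, not including, end."""
    return text[start:end]

# The parameter names are part of the function object itself,
# so a documentation tool can read them without parsing the body.
signature = inspect.signature(substring)
parameter_names = list(signature.parameters)
```

Calling substring with the wrong number of arguments raises a TypeError at the call, without any checking code in the function body.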

And, what's more, you don't get the advantages more advanced languages
have from features such as keyword arguments (without re-implementing
them yourself with a hash). For example, when you want to create a
hash table in Common Lisp you call the make-hash-table function, which
takes the following keyword arguments:

test (what function to use to test for key equality),

size (a suggestion of the number of entries expected),

rehash-size (hints for rehashing the table),

rehash-threshold (how full the table can get before being
rehashed)

This means that all of the following invocations will create hash
tables correctly:
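
The same calling style can be sketched in Python, which also has keyword arguments. Note that make_hash_table below is a hypothetical Python stand-in, not the Common Lisp function: it only mirrors the four keyword names listed above, and its default values are invented for the example.

```python
def make_hash_table(test=None, size=16, rehash_size=1.5, rehash_threshold=0.75):
    """Hypothetical stand-in: records its options and returns an empty table."""
    options = {
        "test": test,
        "size": size,
        "rehash_size": rehash_size,
        "rehash_threshold": rehash_threshold,
    }
    return {"options": options, "entries": {}}

# Any subset of the keywords, in any order, is a valid call.
t1 = make_hash_table()
t2 = make_hash_table(size=100)
t3 = make_hash_table(rehash_threshold=0.9, size=100)
t4 = make_hash_table(test=str.lower, rehash_size=2.0)
```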

It also means that you can have functions that take a large number of
parameters (make-array takes 7) and still keep both readability and
ease of use. You can do the same in Perl, but you are certainly not
encouraged to, documentation tools won't understand it, readers may
not either and it certainly isn't convenient compared to the way it is
in Common Lisp:

Although object-orientation is not as fantastic as many would like us
to think, Perl does support it. However, it does so only
half-heartedly, since objects were added rather late in the life of
the language.

The result of this is that normal files, sockets, hashes and lists are
not objects, which means that the interfaces to them are not as
convenient as they could have been. Newer versions of Perl come with
object-oriented modules with wrappers for these kinds of objects,
which means that Perl has a protocol for such objects and you can
write your own implementations of these protocols. However, it also
means that you need to distinguish between ordinary file handles and
file objects, which is a bit inconvenient.

Another thing is the fact that when creating objects you need to
explicitly manage the internals of your objects. In Perl, object
creation is manual. A class is declared as a package, and the
functions in the package then become the methods of the class. To
create an object, you make a hash table and then bless it (using the
built-in function 'bless') to make it an object. The 'perlobj' man
page, which explains the Perl object features, recommends this form of
object initializer:

There are other ways of doing object initialization, some of which
cause problems for inheritance. Personally, I find it amazing that
this sort of thing should be necessary at all. The above is equivalent
to this Python code:

class MyClass:
    pass

The result is that one can easily get object construction wrong (such
as by not catering for inheritance), defining classes is awkward and
it's hard to tell from code when a class is defined (for both human
readers and software documentation tools).
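
As a slightly larger illustration in Python, here is a sketch of a class with an initializer and a subclass that inherits it. The classes Animal and Camel and the object amelia are invented for this example.

```python
class Animal:
    """A class with an initializer; no manual blessing step is needed."""

    def __init__(self, name):
        self.name = name

    def describe(self):
        # self.__class__ is the class the object was actually created from.
        return "%s the %s" % (self.name, self.__class__.__name__.lower())

class Camel(Animal):
    """Inherits the initializer from Animal without any extra code."""

    def legs(self):
        return 4

amelia = Camel("Amelia")
```

The class statement marks where each class is defined, and the subclass gets a working constructor without the initializer having to cater for inheritance.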

In general, what this means is that Perl is a large and complex
language, which takes a long time to learn properly. In my opinion,
this complexity is unnecessary and a simpler language would have been
much better. I think this also means that many non-expert Perl
developers write suboptimal code.

I also think few Perl developers (percentage-wise) write general and
reusable modules, because you need to learn the language well before
doing so, which is relatively hard and takes time. Nor does the
language itself encourage this.

Programming languages teach you not to want what they cannot provide.
You have to think in a language to write programs in it, and it's hard
to want something you can't describe. When I first started programming
- in BASIC - I didn't miss recursion, because I didn't know there was
such a thing. I thought in BASIC. I could only conceive iterative
algorithms, so why should I miss recursion?

--Paul Graham, ANSI Common Lisp.

So, after discovering all these bad things about Perl, what did I do?
I kept using it. After all, as bad as it was, it was still better than
doing text processing and web programming with C, Pascal or Java, and
there were no better alternatives.

Or so I thought. At the University bookstore there was this book
called "Internet Programming with Python". Being both a language freak
and an internet freak I thought this was interesting and picked it
up. Somewhere in the beginning of the book there was an anecdote
about a Python programmer who wrote all his Python programs so that
when an error occurred (i.e. when an exception was thrown) the error
handler called his beeper.

Wow, I thought. This sounds interesting. So I went home, found the
Python tutorial, printed it out and started playing with Python. That
night I wrote a POP3 client library in Python. Just like that. After
going to bed I had an idea: wouldn't it be a lot nicer if I cached the
messages and made this invisible to the user? In the morning I added
that in half an hour.

I've since used this library to delete email without downloading it,
to move 150 emails from one POP account to another and for many other
things. (Yes, I made a small SMTP library as well.) I can even use it
as an email client using the Python interpreter as a command line.

I kept using Python more and more after this. I wrote a link checker
that went over my web pages checking them for errors and kept adding
more protocols and features to it. After a while I thought: this
program spends a lot of time waiting for server responses. Maybe I can
speed it up by using multi-threading so that it can wait for several
servers at the same time?

I'd never really used multithreading before, but knew the theory
behind it. I added this to the link checker in an hour and that
includes the digging in the library documentation and removing the few
bugs I did introduce. (Multithreading is much more complex than it
sounds at first because things happen simultaneously. That's not
Python's fault, though.)
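
The idea of waiting for several servers at once can be sketched as follows. This is not the link checker described above, only a minimal illustration using the threading module of current Python 3; check_link is a hypothetical stand-in that sleeps instead of contacting a server, and the URLs are placeholders.

```python
import threading
import time

def check_link(url):
    """Hypothetical stand-in for a network check: wait, then report a result."""
    time.sleep(0.05)  # simulates waiting for the server to respond
    return url.startswith("http")

def check_all(urls):
    """Check every URL in its own thread and collect the results."""
    results = {}
    lock = threading.Lock()

    def worker(url):
        ok = check_link(url)
        with lock:  # protect the shared dictionary
            results[url] = ok

    threads = [threading.Thread(target=worker, args=(url,)) for url in urls]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait until every check has finished
    return results

urls = ["http://example.com/", "http://example.org/", "ftp-broken-link"]
results = check_all(urls)
```

Because the threads wait in parallel, the total time is close to that of the slowest check rather than the sum of all of them.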

Having read this you may now be convinced that I'm a master
programmer, rather than that Python is a great programming language
for this kind of thing. Personally, I don't think this is true.
(Remember, I'm the guy who can't even make a Perl subroutine return a
file handle.) Also, from what I hear, many other people have had
similar experiences with Python. Here's one example:

In my first 15 minutes programming Python I wrote a program which
would download all the articles in a newsgroup into an mh folder for
me - and comp.lang.python was my first newsgroup!

Did it stop there? It certainly didn't. Since then I've discovered
these things in the Python libraries:

Support for serializing objects

Support for storing serialized objects in simple databases

A Python parser

Support for simple text databases (Perl has this too)

Libraries for ZIP file compression and decompression

A profiler (and a really nice one too!)

A CGI library

A URL parser library

A general URL connection library

Simplified HTML, SGML and XML parsers

A simple web server with CGI support

POP, IMAP and SMTP libraries (Python 1.5.1; I started with 1.4)

And these are things that come with the standard Python distribution!
Perl also has most of this stuff, but it doesn't come with the
interpreter and the quality of documentation and interfaces varies.
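
As one example from the list above, object serialization takes only a couple of calls. The sketch below uses the pickle module as it works in current Python 3; the cache dictionary and its contents are made up for illustration.

```python
import pickle

# Hypothetical message cache built from plain dictionaries and lists.
cache = {
    1: {"from": "alice@example.com", "subject": "Camels", "lines": ["Hello", "World"]},
    2: {"from": "bob@example.com", "subject": "Pythons", "lines": []},
}

data = pickle.dumps(cache)     # serialize the whole structure to bytes
restored = pickle.loads(data)  # rebuild an equal, independent copy
```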

In Python you not only get these things delivered with the
interpreter, complete with documentation, they are also extremely
simple to use and provide exactly the sort of things one commonly
wants. Say you're writing a web robot and the robot has the URL to
the current page in a string (cur_url) and a relative URL from this
page to the next page in another (next_url) and you want to compute
the absolute URL of the next page. This is the code:

next_url=urlparse.urljoin(cur_url,next_url)
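
A runnable version of that one-liner might look like this. One assumption to note: in current Python 3 the function lives in urllib.parse rather than in the urlparse module named above, and the example URLs are placeholders.

```python
from urllib.parse import urljoin

cur_url = "http://www.example.com/docs/index.html"
next_url = "../images/camel.html"

# Resolve the relative URL against the page it was found on.
next_url = urljoin(cur_url, next_url)

# A relative URL without "../" stays in the same directory.
sibling_url = urljoin(cur_url, "page2.html")
```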

Python also supports GUI programming via Tk on Win32, Mac and Unix.
It's really easy to install, but not too well documented. There are
also many other ways to do GUI programming with Python. (Yes, there
are for Perl as well.)

Python may become as big as Perl or Tcl. It is more "lovable"
than those -- though perhaps also more controversial. It has an
extremely supportive user community, and that is what will make it big.

--Guido van Rossum, creator of Python

When they want to create new libraries or "standards", the Pythoners
form Special Interest Groups of volunteers, which anyone can join. So
far the results include a common API for database modules (which means
you can use the Sybase module, then exchange it for the Oracle one and
change only 2 lines of code), an IDL-to-Python mapping for CORBA and a
common text format for documentation strings. Common tools and APIs
for XML parsing are under development.

Python is the most readable language I've ever programmed in. It took
me half an hour to understand Medusa (even though it's pretty weird in
concept) and another half-hour to change it so that I could map URL
paths in the web server to Python functions. An hour after that I
could read my Gnus mail boxes through the web server. Another hour,
and I could read news through it.

Here's my own implementation of the soundex function, written in
November '97, when I was still new at this:

# no_tbl is an array I've constructed that maps characters to numbers
# is_letter is written by me, but I've since discovered that Python
# has it
def soundex(string):
    """Returns the Soundex code of the string. Characters not A-Z skipped."""
    string=lower(string)
    if not is_letter(string[0]): string=string[1:]
    last=no_tbl[ord(string[0])-97]
    res =upper(string[0]) # This is where the result will end up
    for char in string[1:]:
        if is_letter(char):
            new=no_tbl[ord(char)-97]
            if (new!="0" and new!=last):
                res=res+new
            last=new
    if len(res)<4:
        return res+"0"*(4-len(res))
    else:
        return res[:4]

Turning off output buffering isn't blindingly obvious, but not too
difficult, either. You pretty quickly learn that sys.stdout, sys.stdin
and sys.stderr are file objects that represent standard out, standard
in and standard error. So, since stdout is an ordinary file object, it
should behave as one with respect to output buffering as well. And it
does. These objects have no methods to turn off buffering, but you can
flush data with the flush method:

sys.stdout.flush()

If you find that awkward there is a command-line switch for the
interpreter (-u) that lets you turn off buffering. However, you have to run
the interpreter with 'python -?' to find this out, so I couldn't
really claim that this is too well documented.
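
A small sketch of explicit flushing, assuming current Python 3: show_progress is a made-up helper, and io.StringIO is used here only so the output can be inspected instead of going to the terminal.

```python
import io
import sys
import time

def show_progress(steps, stream=sys.stdout):
    """Write one dot per step, flushing so each dot appears immediately."""
    for _ in range(steps):
        stream.write(".")
        stream.flush()    # push the buffered character out now
        time.sleep(0.01)  # stands in for real work
    stream.write("\n")
    return steps

buffer = io.StringIO()
show_progress(3, buffer)
```

Called without the second argument, the same function writes its dots to standard out.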

Files are also objects, as are Python modules and most other things.
This means that you can pass them as parameters, stuff them in lists,
subclass them and even create classes with the same methods and use
them where code expects to see a file object or a module.
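
Here is a minimal sketch of that last point, written for current Python 3. UppercaseWriter and log_hits are hypothetical names: the class implements only write() and flush(), which is all that the code using it needs.

```python
class UppercaseWriter:
    """Hypothetical file-like class: implements only write() and flush()."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text.upper())
        return len(text)

    def flush(self):
        pass  # nothing is buffered, so there is nothing to do

def log_hits(stream, hits):
    """Write one line per page to anything that has a write() method."""
    for page, count in sorted(hits.items()):
        stream.write("%s %d\n" % (page, count))

writer = UppercaseWriter()
streams = [writer]  # file-like objects can sit in a list like any value
for stream in streams:
    log_hits(stream, {"index.html": 3, "about.html": 1})

# The built-in print function accepts the same object as its output file.
print("done", file=writer)
output = "".join(writer.chunks)
```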