The @data array was very big, and the subroutine above actually makes a copy of it. I didn't appreciate this when I first wrote it, but it was slow and was consuming silly amounts of memory (even by Perl standards). Changing the relevant lines as follows makes it run faster, and it now uses about half as much memory.

I am now a happy Perl-loving bunny, and I can read in my very big files ;-)

--
Anthony Staines

PS - What's happening here, for monks whose Perl skills are closer to my own, is that I passed the array as a reference ($data_ref) to the subroutine. There, partly from force of habit, I was turning it back into an array. This, as was obvious with hindsight, doubled the memory requirements of the program. Now @data was big, 10 MB to 50 MB or so, and the program was not working well. Accessing the array through the reference only fixed the problem, hence the @{$data_ref} bit.
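
For anyone who wants to see the difference side by side, here is a minimal sketch. The subroutine names and the summing task are hypothetical stand-ins (the original subroutine is not reproduced here); only $data_ref and @{$data_ref} come from the post above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the slow version: dereferencing into a
# new array copies every element of the caller's data.
sub total_with_copy {
    my ($data_ref) = @_;
    my @data = @{$data_ref};        # full copy of the caller's array
    my $sum = 0;
    $sum += $_ for @data;
    return $sum;
}

# Same calculation, but the data is only ever reached through the reference.
sub total_by_reference {
    my ($data_ref) = @_;
    my $sum = 0;
    $sum += $_ for @{$data_ref};    # walks the original array, no second array
    return $sum;
}

my @data = (1 .. 1000);
print total_with_copy(\@data), "\n";       # 500500
print total_by_reference(\@data), "\n";    # 500500
```

Both give the same answer; the second simply never holds two copies of the data at once.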

Well, I guess I have that in me from years spent coding C/C++, but: always pass references to data instead of replicating it. Undoubtedly this is a rather trivial skill to grasp, and anyone who has spent at least 6 months coding would *I hope* know that.

Perl is actually quite good when it comes to references. Certainly much easier to handle than C pointers for example. Take a look at this snippet:

Perl is fairly well optimized, and in this for loop no memory gets duplicated. Instead, the s/// operator works on the data in its original place (the storage allocated for the a, b, c variables).
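
A minimal sketch of that kind of loop, assuming three hypothetical scalars $x, $y and $z in place of the a, b, c variables (the names $a and $b are avoided here because sort uses them):

```perl
use strict;
use warnings;

my ($x, $y, $z) = ('  alpha ', ' beta', 'gamma  ');

# Inside the loop $_ is an alias for each variable in turn,
# so s/// edits the originals rather than temporary copies.
for ($x, $y, $z) {
    s/^\s+//;    # strip leading whitespace
    s/\s+$//;    # strip trailing whitespace
}

print "[$x] [$y] [$z]\n";    # [alpha] [beta] [gamma]
```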

There are numerous other examples one could bring up. Not the least is the ease with which you can pass references around in Perl. In fact, I believe this is one of the first basic things that any novice Perl book would mention to its reader. For those who don't have a book, however, I suggest referring to this excellent 'article' on Perl references here (written by fair brother dominus).

Perl really needs better aliasing features. Currently, you can only create an alias with a package variable (using a glob) or in a for-loop. Other methods require you to use tie() or some other less elegant means.
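
To illustrate the two built-in forms mentioned, here is a small sketch. The variable names are made up for the example.

```perl
use strict;
use warnings;

# 1. Glob alias: only works for package variables.
our @big = (1, 2, 3);
our @alias;
*alias = \@big;              # @alias is now another name for @big
push @alias, 4;
print scalar(@big), "\n";    # 4

# 2. for-loop alias: the loop variable is each element itself.
my @lexical = (10, 20, 30);
for my $element (@lexical) {
    $element *= 2;           # modifies @lexical in place
}
print "@lexical\n";          # 20 40 60
```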

Using arrow notation makes it easier for me to recognize
that the thingys in question are references (I find that
the extra leading sigil gets lost in the code soup to my
eyes). It also makes doing nested dereferences easier:
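
A minimal sketch of that comparison, using a hypothetical nested $config structure (not from the original post):

```perl
use strict;
use warnings;

my $config = {
    servers => [
        { name => 'alpha', ports => [80, 443] },
        { name => 'beta',  ports => [8080] },
    ],
};

# Arrow notation: read left to right, one level per step.
my $port_arrow = $config->{servers}[0]{ports}[1];

# The same lookup written with leading sigils and braces.
my $port_sigil = ${ ${ ${ ${$config}{servers} }[0] }{ports} }[1];

print "$port_arrow $port_sigil\n";    # 443 443
```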

Anyway, IMHO the $$something notation is the worst to read or understand again after some time... If you code alone it's up to you which one to choose, but when I'm part of a team I try to be polite and write code that is more readable and easier for my co-workers to understand...

The notation is what I always use - I find it easier to stick to one form throughout. Don't get me wrong, I appreciate Perl's flexibility, but consistency makes my life simpler. I make no claim for superiority here! However, I do agree that the $$ notation is hard to follow, and I never use it.

Maybe it's just me, but I find reading function arguments with shift is usually pointless since you can just declare the works in a single line. It's also nice to have the function argument declaration in a similar format to how you call it, so you can see if things match up.
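
Roughly what I mean, as a sketch with made-up subroutine names and arguments:

```perl
use strict;
use warnings;

# One shift per argument: three lines to read before you know the signature.
sub describe_with_shift {
    my $name  = shift;
    my $count = shift;
    my $unit  = shift;
    return "$name: $count $unit";
}

# List assignment from @_: the declaration looks like the call.
sub describe_with_list {
    my ($name, $count, $unit) = @_;
    return "$name: $count $unit";
}

print describe_with_shift('disk', 3, 'GB'), "\n";    # disk: 3 GB
print describe_with_list('disk', 3, 'GB'), "\n";     # disk: 3 GB
```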

As of Perl 5.6 or thereabouts, you can open a filehandle that is put into a scalar (glob reference) using the open function:

These filehandles are a lot easier to pass back and forth than the regular globs. If you're even thinking about using them as function arguments, don't use globs.
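
A small sketch of passing such a filehandle to a subroutine. The count_lines name is hypothetical, and to keep the example self-contained it reads from an in-memory string, which I believe needs Perl 5.8 or later; with a real file you would pass a filename instead of \$text.

```perl
use strict;
use warnings;

# The filehandle arrives as an ordinary scalar argument.
sub count_lines {
    my ($fh) = @_;
    my $lines = 0;
    while (my $line = <$fh>) {
        $lines++;
    }
    return $lines;
}

my $text = "one\ntwo\nthree\n";

# open() fills the undefined scalar $fh with a new filehandle.
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";
print count_lines($fh), "\n";    # 3
close($fh);
```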

As a note: ${$foo} and $$foo are equivalent. It's often simpler to just leave off the extra braces since they don't serve any practical purpose. There are occasions like ${$foo->{bar}} where they are more appropriate.
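
A quick sketch of both cases, with hypothetical variables $value and $record added for illustration:

```perl
use strict;
use warnings;

my $value = 42;
my $foo   = \$value;

# Simple case: the braces are optional.
print ${$foo}, " ", $$foo, "\n";    # 42 42

# Nested case: the hash element holds a scalar reference.
my $record = { bar => \"hello" };

# The braces are needed here. Without them, $$record->{bar} would be
# read as ${$record}->{bar}, which treats $record as a scalar reference.
my $greeting = ${ $record->{bar} };
print "$greeting\n";                # hello
```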