I assumed that it would read one line at a time, but apparently it tries to read the whole file into memory. I don't have that much memory; in fact, I don't even have that much hard drive space, so that's not a good idea, and I don't see why Perl would think it is.

It seems to work. Unfortunately, it's still a bit slow: it initially processes about 10 MB per minute, which means it would take upwards of four days to process the whole thing. And that's assuming it actually continues at a constant speed.

I tried changing the program so that I can pause it and continue later, for example to run it at night. The program tells me which line it's on, and then takes that as input the next time it starts. But of course finding that line again on restart takes a certain amount of time, which appears to grow more than linearly with the number of lines, so that's not going to work.
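One possible workaround for the restart cost, sketched here as an assumption rather than something taken from the thread: remember a byte offset (from tell) instead of a line number, and jump straight back to it with seek. The sub name process_chunk and its parameters are made up for illustration.

```perl
use strict;
use warnings;

# Process at most $max_lines lines of $path, starting at byte $offset.
# Returns the byte offset at which the next run should resume.
# (process_chunk is a hypothetical helper, not part of any module.)
sub process_chunk {
    my ($path, $offset, $max_lines, $callback) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    seek $fh, $offset, 0 or die "Cannot seek in $path: $!";   # 0 = SEEK_SET
    my $count = 0;
    while (defined(my $line = <$fh>)) {
        $callback->($line);
        last if ++$count >= $max_lines;
    }
    my $next = tell $fh;    # position just after the last line read
    close $fh;
    return $next;
}
```

Saving the returned offset to a small state file and passing it back in on the next run means the restart does not depend on how far into the file you already are.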

Is there some way to convince Perl to read one line at a time? Or some other clever workaround?

Aside from the excellent advice above about using while rather than for, you may also want to consider looking at the File::SortedSeek module.

It's not directly applicable to what you're doing, but you could use it for inspiration - if your XML is in any kind of sorted order, you can save ENORMOUS amounts of time by doing binary searches if you only need to process a small subset of the file.
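To illustrate the binary search idea, here is a hand-rolled sketch that uses only seek and readline. It does not use File::SortedSeek itself, and it assumes a file whose lines are in plain string (cmp) order; the names next_line and first_line_ge are invented for this example.

```perl
use strict;
use warnings;

# Read one line and strip its newline; returns undef at end of file.
sub next_line {
    my ($fh) = @_;
    my $line = <$fh>;
    chomp $line if defined $line;
    return $line;
}

# Return the first line that is ge $key in a string-sorted file, or undef.
sub first_line_ge {
    my ($path, $key) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my ($lo, $hi) = (0, (-s $fh) || 0);
    # Find the smallest byte offset whose *following* line is ge $key
    # (or does not exist).
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        seek $fh, $mid, 0 or die "Cannot seek in $path: $!";
        next_line($fh);                  # skip the line containing byte $mid
        my $line = next_line($fh);       # the line that starts after it
        if (!defined $line or $line ge $key) { $hi = $mid }
        else                                 { $lo = $mid + 1 }
    }
    seek $fh, $lo, 0 or die "Cannot seek in $path: $!";
    my $first  = next_line($fh);         # can only match when $lo is 0
    my $second = next_line($fh);
    close $fh;
    return $first if defined $first and $first ge $key;
    return $second;
}
```

Each step halves the byte range, so the number of seeks grows with the logarithm of the file size rather than with the number of lines.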

lanX's comment at Re: Going through a big file excellently explains the difference: for slurps the whole lot into an array, but while reads it line by line.
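A small sketch of that difference, with made-up sub names, assuming a plain text file: both subs count lines, but only the while version keeps a single line in memory at a time.

```perl
use strict;
use warnings;

# while evaluates <$fh> in scalar context: one line per iteration.
sub count_lines_streaming {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $count = 0;
    while (my $line = <$fh>) {      # Perl adds an implicit defined() test here
        $count++;
    }
    close $fh;
    return $count;
}

# for evaluates <$fh> in list context: the whole file is read into a
# temporary list before the first iteration starts.
sub count_lines_slurping {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $count = 0;
    for my $line (<$fh>) {
        $count++;
    }
    close $fh;
    return $count;
}
```

The results are identical; only the memory use differs, which is what matters for a file that does not fit in RAM.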
The only comment I would add is: why bother opening the files manually when it can be done implicitly?

does the job. Arguments other than the file(s) you want to process should be flags. In fact, you don't even need to write this code as it is automatically assumed when you run perl with flags such as -n or -p. See perlrun for more information.
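The implicit open described here is presumably the diamond operator, <>, which reads line by line from every file named in @ARGV (or from STDIN if there are none). A minimal sketch, where count_matches and the pattern are hypothetical:

```perl
use strict;
use warnings;

# Count the lines matching $pattern across @files, letting <> open each
# file in turn. (count_matches is an illustrative name only.)
sub count_matches {
    my ($pattern, @files) = @_;
    local @ARGV = @files;           # <> takes its file list from @ARGV
    my $n = 0;
    while (my $line = <>) {
        $n++ if $line =~ $pattern;
    }
    return $n;
}
```

In a standalone script you would normally leave @ARGV alone and just write the while loop. From the command line, something like perl -ne 'print if /<record\b/' huge.xml wraps your code in the same loop for you (the element name and file name are placeholders).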


I'd recommend following the advice pointed to by Anonymous Monk's question, and investigating a CPAN module to assist you with the XML processing. For example, I know XML::Twig can handle huge XML files, and I'm certain there are others available, too.

I love regular expressions, and they've helped me solve all sorts of nightmare formats, but XML is already a structured data source. Using a tool that lets you take advantage of that structure will usually make things easier on you and more robust.
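As a rough sketch of what that can look like with XML::Twig (which has to be installed from CPAN): a handler is called once for each element with a given tag, and purge then discards what has already been parsed, so memory use stays flat. The sub name count_elements and the tag you pass in are assumptions, since the actual structure of the file isn't shown here.

```perl
use strict;
use warnings;
use XML::Twig;

# Count the elements named $tag in $path without holding the whole
# document in memory. (count_elements is an illustrative name only.)
sub count_elements {
    my ($path, $tag) = @_;
    my $count = 0;
    my $twig = XML::Twig->new(
        twig_handlers => {
            $tag => sub {
                my ($t, $elt) = @_;
                $count++;           # real processing of $elt would go here
                $t->purge;          # free everything parsed so far
            },
        },
    );
    $twig->parsefile($path);
    return $count;
}
```

Inside the handler, $elt gives you the element's attributes and children through methods such as att and first_child_text, so there is no need to pick them out with regular expressions.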