No include

Some simple languages, like PHP, offer an include() to include a file literally. Perl does not. Then how can you still load code from other files?

There are many ways to do this in Perl, but there is no easy way to get close to what most people expect of an include(). People who want to include files usually come from simpler languages, or have no programming experience at all. The hardest thing they will have to learn is that you should not want to include a file.

That may sound harsh, but it's a Perl reality. In Perl, we don't include or link libraries, and code reuse isn't usually done by copying and pasting. Instead, Perl programmers use modules. The keywords use and require are meant for use with modules. Only in the simplest of situations can you get away with abusing require as if it were some kind of include(), but it's easy to get bitten by how it works: it only loads any given file once.
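The load-once behaviour can be seen in a small sketch. The file name snippet.pl and the counter $main::loads are made up for this illustration; the file is written to a temporary directory so the example is self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

our $loads = 0;

# Hypothetical file to load; it bumps a counter every time it is executed.
my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'snippet.pl');
open my $out, '>', $file or die "Cannot write $file: $!";
print {$out} "\$main::loads++;\n1;\n";
close $out or die "Cannot close $file: $!";

require $file;    # executes the file and records it in %INC
require $file;    # already in %INC, so nothing happens
print "after two requires: $loads\n";

do $file;         # do executes the file every time
print "after one more do: $loads\n";
```

The second require is silently skipped, which is exactly what surprises people who expected an include().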

Creating a module

What is a module, anyway? perlmod defines it as: just a set of related functions in a library file, i.e., a Perl package with the same name as the file.

Creating a module is EASY, but it is a little more work than just creating a file with code in it. First, you need to think of a module name. This is the basis for both the filename and the package name, and these have to be synchronised for some features to work. For project-specific modules, I always create a new top-level namespace with the name of the project. For example, NameOfTheProject/SomeModule.pm is a good filename (in real life, use something a little more meaningful). The corresponding package is NameOfTheProject::SomeModule. Use CamelCaps like this, because everyone else does too.

The file must be in a directory that is listed in @INC. To find out what your @INC is, run perl -V. On perls older than 5.26, the current working directory (listed under its symbolic name ., a single dot) should be listed; newer perls leave it out by default. To begin with, putting the module in the script's directory is a good idea. It is the easiest way to keep things organized. If you want to put the module somewhere else (or . is not in your @INC), you will have to update @INC, so that perl knows where to find it. An easy way to do that is to put code like this in the main script:

use lib 'path/to/the/modules';
# The example module could be path/to/the/modules/NameOfTheProject/SomeModule.pm
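A relative path given to use lib is resolved against the current working directory, not against the directory the script lives in. A minimal sketch of one common way around that, using the core FindBin module; the lib subdirectory is only an assumption for this example.

```perl
use strict;
use warnings;
use FindBin qw($Bin);    # $Bin holds the directory of the running script
use lib "$Bin/lib";      # assumed layout: modules live in a lib/ directory next to the script

# use lib puts its directories at the front of @INC.
print "Modules are searched first in: $INC[0]\n";
```

With this, the script finds its modules no matter which directory it is started from.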

Just list all the subs as you usually would, put a package statement at the top and a true value at the bottom (1 will suffice). Obviously, use strict is a good idea. You should never code a module that does not have this. Beware: A use strict; statement in the main script has no effect on the module.

A module runs only once. That means that for code to be reusable, it must be in a sub. In fact, it's considered very bad style to do anything (except declaring and initializing some variables) in the module's main code. Just don't do that.
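Put together, a module file following these rules might look like the sketch below. The sub greet and the variable $greeting are invented for illustration.

```perl
# File: NameOfTheProject/SomeModule.pm
package NameOfTheProject::SomeModule;

use strict;
use warnings;

# The module's main code runs only once, when the file is first loaded,
# so it is limited to declaring and initializing a variable.
my $greeting = 'Hello';

# Reusable code lives in subs.
sub greet {
    my ($name) = @_;
    return "$greeting, $name!";
}

1;    # a module file must end with a true value
```

A script would load this with use NameOfTheProject::SomeModule; and, as long as nothing is exported, call the sub by its full name: NameOfTheProject::SomeModule::greet('world').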

With Exporter, functions listed in @EXPORT_OK are exported only when the user of the module asks for them. To have a function exported automatically, use @EXPORT instead of @EXPORT_OK. This will eventually bite you if the function names are generic. For example, many people get bitten by LWP::Simple's get function. It is not unlikely that you already have one.
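A minimal sketch of exporting on request with the core Exporter module. To keep the example in a single runnable file, the module is defined inline in a BEGIN block and marked as loaded in %INC; in a real project it would live in its own .pm file. The subs greet and shout are hypothetical.

```perl
use strict;
use warnings;

BEGIN {
    package NameOfTheProject::SomeModule;
    use Exporter 'import';
    our @EXPORT_OK = qw(greet shout);    # exported only when asked for

    sub greet { my ($name) = @_; return "Hello, $name!" }
    sub shout { my ($text) = @_; return uc $text }

    # Pretend the file was loaded, so that use does not go looking for it.
    $INC{'NameOfTheProject/SomeModule.pm'} = __FILE__;
}

# Only greet is requested, so only greet lands in the main package.
use NameOfTheProject::SomeModule qw(greet);

print greet('world'), "\n";
print "shout was not imported\n" unless main->can('shout');
```

Moving a name from @EXPORT_OK to @EXPORT would import it into every script that says use NameOfTheProject::SomeModule; without a list.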

There are more ways to export functions. I of course prefer to use my own module Exporter::Tidy. Not only because it's smaller and often faster, but mostly because it lets the user of a module define a prefix to avoid clashes. Read its documentation for instructions.

For the export/import mechanism, it is very important that the filename, the package name and the name used with use are equal. This is case sensitive. (Ever wondered why under Windows, use Strict; doesn't enable strict, but also doesn't emit any warning or error message? It has everything to do with the case insensitive filesystem that Windows uses.)

Stubbornly still wanting to include

Sometimes, a module just isn't logical. For example, when you want to use an external configuration file. (Many beginners and people who post code online put configuration in the script itself for ease of use, but this makes upgrading the script harder.) There are many configuration file reader modules you can use, but why use one of those if you can just use bare Perl?

This is where do comes in. What do does is very close to what an include would do, but with a very annoying exception: the new file gets its own lexical scope. In other words: a variable declared with my is not accessible externally. This follows all logical rules attached to lexical variables, but can be very annoying. Fortunately, this does not have to be a problem. do returns whatever the included script returned, and if you make that script just the contents of a hash, here's my favourite way to offer configurability:
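A minimal sketch of that approach. The keys name and interval are invented for the example, and the config file is written to a temporary directory first so that the sketch runs on its own; normally config.pl would simply sit next to the script. An absolute path is passed to do, which avoids any dependence on @INC.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Create a hypothetical config.pl containing just the contents of a hash.
my $dir         = tempdir(CLEANUP => 1);
my $config_file = File::Spec->catfile($dir, 'config.pl');
open my $out, '>', $config_file or die "Cannot write $config_file: $!";
print {$out} <<'END_CONFIG';
# Comments work, because this is plain Perl.
name     => 'example',
interval => 24 * 60 * 60,    # one day, in seconds
END_CONFIG
close $out or die "Cannot close $config_file: $!";

# do returns the value of the last expression in the file: here, a list.
my %config = do $config_file;
die "Could not load $config_file: ", $@ || $! unless %config;

print "$config{name} runs every $config{interval} seconds\n";
```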

Because we used only the return value of the script, and never even touched a variable in config.pl, the inaccessibility of lexical variables is no longer a problem. Besides that, the code looks very clean and we have a very powerful config file format that automatically supports comments and all kinds of useful functions. How about interval => 24 * 60 * 60, for self-documentation? :)

Still not good enough

do updates %INC, which you may or may not want. To avoid this, use eval read_file($filename) instead (read_file as provided by, for example, File::Slurp). To find out if you want this, read perlvar.
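A sketch of the eval variant that uses only core Perl. The slurp helper is a hypothetical stand-in for read_file, and the config file is again created in a temporary directory just for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical stand-in for read_file: returns the whole file as one string.
sub slurp {
    my ($filename) = @_;
    open my $in, '<', $filename or die "Cannot open $filename: $!";
    local $/;    # undefined input record separator: read everything at once
    return scalar <$in>;
}

my $dir         = tempdir(CLEANUP => 1);
my $config_file = File::Spec->catfile($dir, 'config.pl');
open my $out, '>', $config_file or die "Cannot write $config_file: $!";
print {$out} "interval => 60,\n";
close $out or die "Cannot close $config_file: $!";

my %config = eval slurp($config_file);
die "Error in $config_file: $@" if $@;

print "interval: $config{interval}\n";
print exists $INC{$config_file} ? "recorded in %INC\n" : "not recorded in %INC\n";
```

Unlike do, a string eval can see the lexical variables of the surrounding scope, so this is not a drop-in replacement in every situation.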

There is a way to get an include() the way other languages have it. This is a very ugly hack that uses an internal exception made for Perl's debugger, and is possibly not future proof. As said before, you should not want to include a file. Still, because it is possible, I feel I have to tell you how. Just don't actually use it.

If you read the documentation for eval (which you of course should (don't use an operator without having read its documentation first)), you see that if it is called from within the DB package, it is executed in the caller's scope. This means that the caller's lexical variables are visible and the file behaves as a code block.

Here is an example to get an include() function that actually works the way most people expect:
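A sketch of such a function, following the description above. It reads the file with core Perl only and is called by its full name; the snippet file and the $counter variable are made up for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

{
    package DB;

    # The sub is compiled in package DB, which is what triggers the special
    # case in eval. Its name is fully qualified so that it is installed as
    # Acme::Include::include rather than DB::include.
    sub Acme::Include::include {
        my ($filename) = @_;
        open my $in, '<', $filename or die "Cannot open $filename: $!";
        # The #line directive makes messages refer to the included file.
        my $code = qq[#line 1 "$filename"\n] . do { local $/; <$in> };
        return eval $code;    # compiled in the scope of the first non-DB caller
    }
}

# Hypothetical file to include; it touches a lexical it does not declare.
my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'snippet.pl');
open my $out, '>', $file or die "Cannot write $file: $!";
print {$out} '$counter += 10;', "\n";
close $out or die "Cannot close $file: $!";

my $counter = 1;
Acme::Include::include($file);
print "counter is now $counter\n";
```

Note that the included code is compiled in package DB, so any subs or package variables it declares end up there.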

This example Acme::Include does not have any error checking. In practice, you will want to check $@ somewhere (but you also want to retain the value returned by the included file, and context to propagate properly. Good luck!).

Learning more

I wrote this tutorial to have an answer ready for the nth time someone in EFnet's #perlhelp asks why require works only once, or asks how to really include a file. Explaining the same thing over and over gets annoying over time. This is not a good guide to writing modules. For that, read chapter 10 of Beginning Perl and perlmod and perlnewmod. Of course, good code always comes with good documentation; so learn POD in 5 minutes.

One last thing

If you name your module Test, don't be surprised if it doesn't work. The current working directory comes last in @INC, so the Test module that is in the Perl distribution is probably loaded instead. This bites me at least once per year, this time while writing this tutorial :).

Couldn't one write Acme::Include based on source filters
instead of the DB way you gave above?

Source filters are a bad idea in general, but note that
this one does not attempt to parse Perl code,
so it cannot be fooled by weird-looking Perl code like
some source filters can be.
I still don't say that doing such things would be a good idea,
but it might be cleaner than the DB hack.

Here's my example, which is just a quick draft
and does not handle line numbers etc.

IMO, the DB hack is a cleaner solution, because it is faster, uses a documented feature and has clean syntax for using it. To achieve such clean syntax with a source filter, you need to write a complex regex.

Besides that, source filters don't work everywhere (like in eval) and I really think the included file should by itself be syntactically correct. Including code is bad, but including partial expressions is, IMHO, even worse.

On the other hand, the source filter really literally includes, while the DB hack only loads and runs during runtime.

IMO, the DB hack is a cleaner solution, because it is faster, uses a
documented feature ...

I don't think any of them would be much faster than the other.
The source filter is documented too.

and has clean syntax for using it. To achieve such clean syntax with a source filter, you need to write a complex regex.

It's not that simple.
The difference is that the code I gave above does the include
at compile time, and its syntax is use Filter::Include "file"
(the do is not needed when including complete statements).
The DB way includes the code at run time, which is why
a simple subroutine call, include "file", is possible.
It is indeed not possible to make my solution work with such
a simple syntax without actually interpreting the code
(incidentally, that's what the actual Filter::Include CPAN module does).
But if you wanted to modify the DB solution so that it includes
the code at compile time (which can be a difference
in semantics, depending on what the include file contains),
you'd have to use a use or BEGIN
syntax too, or try to parse the code.

Besides that, source filters don't work everywhere (like in eval)...

That's true. More generally, source filters can be used
only at compile time, not at run time. Also, it seems source filters
cannot be used from the command line (-e).

and I really think the included file should by itself be syntactically correct.
Including code is bad, but including partial expressions is, IMHO, even
worse.

True. I just wanted to show that this is really including the file.

Finally, let me note that an include facility of sorts is already
built into perl: the -P switch. If the file third
contains

Yeah, well, I can't speak for the person you're sighing about but I did read the article. It says "Only in the simplest of situations you can get away with abusing require as if it were some kind of include(), but it's easy to get bitten by how it works: it only loads any given file once."

I don't get it. That's *exactly* how I want includes to work. In fact, in C I have to add macro definitions and ifs to get include files to include only once. So now tell me again why I'm not supposed to just use require to read in a library of functions? I guess I'm just simple. So what am I missing other than modules are a cooler way to do it?

You provide list context to read_file, only to then glue all the lines back together anyway. Except that you didn't chomp them, so joining with \n doubles all EOLs which throws off your line numbers. You want a simple concatenation instead.
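The effect is easy to reproduce in a small sketch; the two sample lines stand in for what a list-context read of a file returns.

```perl
use strict;
use warnings;

# Lines as returned by a list-context read: each still ends in a newline.
my @lines = ("first\n", "second\n");

my $joined       = join "\n", @lines;    # adds an extra newline between lines
my $concatenated = join '',   @lines;    # plain concatenation

my $newlines_joined       = () = $joined       =~ /\n/g;
my $newlines_concatenated = () = $concatenated =~ /\n/g;

print "join with \\n:  $newlines_joined newlines\n";
print "concatenation: $newlines_concatenated newlines\n";
```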

You provide list context to read_file, only to then glue all the lines back together anyway.

Oops. That join is a leftover bit of an earlier, more complex attempt. Because Perl doesn't care about double newlines, and I didn't test with multi-line strings, and haven't even looked at line numbers, I never noticed that anything was wrong.

You are of course right that simple concatenation is better here. I'll update the node right away.

Did I get it right that, basically, there is a difference between the "current" directory and the directory of the .pl file? Or, put another way: if I keep all my project files in the same directory, as we do during development, and both that directory and the directory I execute the script from may change, will "use lib" work "as expected"?
