Embperl: Modern Templates

Mr. Lerner introduces us to a template system for Perl: what it is, how it works and how to use it.

Earlier this year, I described mod_perl,
a module for the Apache web server that embeds a full version of
Perl inside Apache. Not only does this allow you to write CGI-style
programs that overcome CGI's bottleneck problems, but it also gives
you access to Apache's innards, letting you customize how the server
handles requests. A number of developers have begun to take
advantage of this flexibility in clever ways.

One such clever idea is Embperl, written by Gerald Richter
(richter@dev.ecos.de). Embperl allows you to create hybrid pages of
HTML and Perl. As we have seen in several previous columns,
templates allow designers and programmers to modify their
respective parts of a web site without getting in each other's way.
If the programmer wants to modify the logic, he or she can do so by
modifying the Perl parts of a template. By the same token,
designers can modify the look and feel of a page without having to
ask the programmer to change a few print statements in a CGI
program.

Embperl is but one of several template systems available for
mod_perl. Another contender for this role is ePerl, about which I
have read quite a bit, but haven't yet had a chance to try. Another
solution, which uses Perl but doesn't depend on mod_perl or Apache,
is Text::Template, a module I have used in previous columns when
discussing templates. Finally, PHP is an embedded scripting
language that resembles C and Perl in many ways and is designed to
be interspersed with HTML inside documents. For more information
about all of these, including URLs, see Resources.

How does Embperl work?

Before we can use Embperl, it's important to understand how
HTTP requests and responses are formed, and how a web server
performs its job. When you click on a web page link, your browser
connects to the host name in the URL and sends a short request to
the server. The request consists of a verb (typically GET or POST),
the name of the document being requested, and the version of HTTP
that the browser supports. For example, to request the root
document from a web server, a browser will typically send

GET / HTTP/1.0

to the server. It is the server's responsibility to handle
the request, responding with an error message or a document.
Depending on the request and on which version of HTTP is in use, the
server might also keep the connection open to serve several
documents, demand some sort of user authentication before
continuing, or redirect the user's browser to a different URL.
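
To make the three parts of the request line concrete, here is a minimal sketch, written in Python purely for illustration, that splits a line such as the one shown earlier into its verb, document name and HTTP version. The names RequestLine and parse_request_line are hypothetical, and a real server does considerably more validation than this.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    verb: str     # typically GET or POST
    path: str     # name of the document being requested
    version: str  # version of HTTP that the browser supports


def parse_request_line(line: str) -> RequestLine:
    """Split a request line such as 'GET / HTTP/1.0' into its three parts."""
    parts = line.strip().split()
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {line!r}")
    verb, path, version = parts
    if not version.startswith("HTTP/"):
        raise ValueError(f"unexpected protocol version: {version!r}")
    return RequestLine(verb, path, version)


if __name__ == "__main__":
    print(parse_request_line("GET / HTTP/1.0"))
```

Everything the server decides afterward (return a document, an error or a redirect) starts from these three fields.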

In many cases, though, the server will not return a stored document
at all. Instead, it will run a program and return the program's
output rather than its contents. This is how CGI programs work:
the server is configured such that all files in a certain directory
are treated as programs, rather than documents to be retrieved
verbatim. (Indeed, security concerns arise when users can retrieve
programs' contents, rather than seeing their output.) As far as the
browser is concerned, it requested a document and received one in
response. The magic happens on the server side, where the program
is executed and produces its output.
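
The difference between returning a file's contents and returning its output can be sketched as follows. This is an illustration in Python under simplifying assumptions, not how Apache implements CGI: the function names are hypothetical, and the environment variables and HTTP headers that real CGI involves are left out.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Tuple


def serve_static(path: Path) -> bytes:
    """Return the file verbatim, as for an ordinary document."""
    return path.read_bytes()


def serve_program(path: Path) -> bytes:
    """Run the file in a separate process and return what it prints."""
    result = subprocess.run(
        [sys.executable, str(path)], capture_output=True, check=True
    )
    return result.stdout


def demo() -> Tuple[bytes, bytes]:
    """Serve the same small script both ways and return both results."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "hello.py"
        script.write_text('print("<p>Hello from a program</p>")\n')
        return serve_static(script), serve_program(script)


if __name__ == "__main__":
    contents, output = demo()
    print("contents:", contents)
    print("output:  ", output)
```

The first result exposes the program's source, which is the security concern mentioned above; the second is what the browser should actually receive.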

CGI programs carry a cost above and beyond their execution time:
because the web server starts a separate process for each CGI
request, and Perl (like other popular scripting languages) can have a
long start-up time, it often takes longer for the program to get
started than for it to actually run.

For this reason, many web servers offer a native API that lets
programs bind more closely to the server's internal code than is
possible with CGI. Netscape's NSAPI and Microsoft's ISAPI are two
examples of such proprietary systems, and mod_perl gives Perl
programmers similar access to Apache's own API. With mod_perl
installed in your server, programs run much faster, because the
server compiles each program once rather than every time it is run.
In addition, because running the program no longer requires creating
a separate process, the overhead of each execution is relatively
low.
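
The compile-once idea can be illustrated with a small sketch. It is written in Python and only mimics the principle; it is not mod_perl code, and the names CompileOnceRunner and SOURCE are hypothetical. The program text is compiled the first time it is requested, kept in memory, and reused on every later request inside the same process.

```python
from typing import Any, Dict

# Stands in for the source of a CGI-style program.
SOURCE = "result = sum(range(n))"


class CompileOnceRunner:
    """Keep compiled programs in memory so each is compiled only once."""

    def __init__(self) -> None:
        self._cache: Dict[str, Any] = {}
        self.compile_count = 0

    def run(self, name: str, source: str, **variables: Any) -> Dict[str, Any]:
        code = self._cache.get(name)
        if code is None:
            # First request for this program: pay the compilation cost once.
            code = compile(source, name, "exec")
            self._cache[name] = code
            self.compile_count += 1
        # Every request gets a fresh namespace but reuses the compiled code.
        namespace: Dict[str, Any] = dict(variables)
        exec(code, namespace)
        return namespace


if __name__ == "__main__":
    runner = CompileOnceRunner()
    for n in (4, 5, 6):
        print(runner.run("sum_program", SOURCE, n=n)["result"])
    print("compilations:", runner.compile_count)
```

Plain CGI corresponds to throwing the cache away after every request, and starting a new process each time as well.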

Mod_perl is perhaps best known for allowing programmers to
write very fast CGI-like programs. However, since Apache's
internals are available via mod_perl, it is possible to write Perl
programs that change one or more steps in Apache's processing of
outgoing documents. These can range from the mundane to the fancy;
in Embperl's case, we are setting a special PerlHandler for
particular documents. In the Apache world, a “handler” is code
that Apache calls to do something special with certain files, such
as those in a given directory, before they are returned to the HTTP
client. You can think of a
handler as a middleman between Apache and the file; the handler
grabs the file and modifies it as necessary, handing the finished
product to Apache. Apache then takes this finished product and
returns it to the user's browser in the HTTP response.
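
The middleman role can be sketched in a few lines. This is a Python illustration of the idea only, not Apache's handler API and not Embperl's template syntax: the directory name "templates", the $site placeholder and the function names are all assumptions made for the example.

```python
import tempfile
from pathlib import Path
from string import Template
from typing import Callable, Dict, Tuple

Handler = Callable[[Path], str]


def default_handler(path: Path) -> str:
    """Return the file verbatim, as for an ordinary document."""
    return path.read_text()


def template_handler(path: Path) -> str:
    """Grab the file, modify it, and hand back the finished product."""
    values = {"site": "Example Site"}  # hypothetical data for the page
    return Template(path.read_text()).safe_substitute(values)


# Hypothetical configuration: files in a directory named "templates"
# pass through the special handler; everything else is served as-is.
HANDLERS: Dict[str, Handler] = {"templates": template_handler}


def respond(path: Path) -> str:
    """Play the server's part: pick a handler, then return its result."""
    handler = HANDLERS.get(path.parent.name, default_handler)
    return handler(path)


def demo() -> Tuple[str, str]:
    """Serve the same text from a handled and an unhandled directory."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "templates").mkdir()
        (root / "static").mkdir()
        handled = root / "templates" / "page.html"
        plain = root / "static" / "page.html"
        for page in (handled, plain):
            page.write_text("<h1>$site</h1>")
        return respond(handled), respond(plain)


if __name__ == "__main__":
    print(demo())
```

The server never needs to know what the handler did to the file; it simply sends whatever text comes back.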