Search

Embperl: Modern Templates

Earlier this year, I described mod_perl,
a module for the Apache web server that embeds a full version of
Perl inside Apache. Not only does this allow you to write CGI-style
programs that overcome CGI's bottleneck problems, but it also gives
you access to Apache's innards, letting you configure your server
in many new ways. A number of developers have begun to take
advantage of this flexibility, configuring Apache in new and clever
ways.

One such clever idea is Embperl, written by Gerald Richter
(richter@dev.ecos.de). Embperl allows you to create hybrid pages of
HTML and Perl. As we have seen in several previous columns,
templates allow designers and programmers to modify their
respective parts of a web site without getting in each other's way.
If the programmer wants to modify the logic, he or she can do so by
modifying the Perl parts of a template. By the same token,
designers can modify the look and feel of a page without having to
ask the programmer to change a few print
statements in a CGI program.

Embperl is but one of several template systems available for
mod_perl. Another contender for this role is ePerl, about which I
have read quite a bit, but haven't yet had a chance to try. Another
solution, which uses Perl but doesn't depend on mod_perl or Apache,
is Text::Template, a module I have used in previous columns when
discussing templates. Finally, PHP is an embedded scripting
language that resembles C and Perl in many ways, and is designed to
be interspersed with HTML inside of documents. To find more
information about all of these, including URLs, see
Resources.

How does Embperl work?

Before we can use Embperl, it's important to understand how
HTTP requests and responses are formed, and how a web server
performs its job. When you click on a web page link, your browser
connects to the host name in the URL and sends a short request to
the server. The request consists of a verb (typically GET or POST),
the name of the document being requested, and the version of HTTP
that the browser supports. For example, to request the root
document from a web server, a browser will typically send

GET / HTTP/1.0

to the server. It is the server's responsibility to handle
the request, responding with an error message or a document.
Depending on which version of HTTP the browser is running, the
server might return multiple documents in the same response, demand
some sort of user authentication before continuing, or redirect the
user's browser to a different URL.

In many cases, though, the server will not return a document
at all. Instead, it will run a program, returning the program's
output, rather than its contents. This is how CGI programs work:
the server is configured such that all files in a certain directory
are treated as programs, rather than documents to be retrieved
verbatim. (Indeed, security concerns arise when users can retrieve
programs' contents, rather than seeing their output.) As far as the
browser is concerned, it requested a document and received one in
response. The magic happens on the server side, where the program
is executed and produces its output.

A price is paid for CGI programs, above and beyond their
execution times: because web servers fork a separate process for
each CGI program, and Perl (and other popular scripting languages)
can have a long start-up time, it often takes longer for the
program to get started than for it to actually run.

For this reason, each web server has developed its own native
API that allows programs to bind more closely to the server's
internal code than would be possible with CGI. Netscape's NSAPI and
Microsoft's ISAPI are two examples of such proprietary systems, and
Apache's mod_perl is an example of how similar functionality can be
given to Perl programmers. With mod_perl installed in your server,
operations speed up tremendously, because the server compiles the
program once, rather than each time it is run. In addition, because
the program never requires creating a separate process, the
overhead associated with executing such programs is relatively
low.

Mod_perl is perhaps best known for allowing programmers to
write very fast CGI-like programs. However, since Apache's
internals are available via mod_perl, it is possible to write Perl
programs that change one or more steps in Apache's processing of
outgoing documents. These can range from the mundane to the fancy;
in Embperl's case, we are setting a special PerlHandler for
particular documents. In the Apache world, a “handler” is a
program that does something special with the files in a directory
before returning them to the HTTP client. You can think of a
handler as a middleman between Apache and the file; the handler
grabs the file and modifies it as necessary, handing the finished
product to Apache. Apache then takes this finished product and
returns it to the user's browser in the HTTP response.

Installing Embperl

Note that installing Embperl can be a bit tricky. The
documentation is generally good and describes all of the steps
necessary to install it on your own computer. I have installed it
several times and found that each time required several tries
before I managed to follow the directions correctly. I will
describe the procedure here in some detail, but you might want to
look at the FAQ file that comes with Embperl for more information.
(Many of the following instructions are based on that FAQ.)

Before you begin, you should install or update the latest
versions of several packages: LWP (the library for Web client
programming), HTML::HeadParser (used for parsing HTML document
heads), CGI.pm (the super-module that handles everything having to
do with CGI), and MIME::Base64 (which handles the encoding
information to and from Base64, which is used in the MIME
standard). All of these are available from CPAN (see
Resources).

Both mod_perl and Apache must be recompiled in order to get
Embperl running. It is possible for Embperl to run as an external
CGI program, rather than from within mod_perl, but you will then
lose the speed benefits of mod_perl. I strongly suggest going the
mod_perl route, unless you are using a web server other than
Apache, or if you would rather not recompile things just
now.

For starters, then, you will need the source for Apache (from
http://www.apache.org/), mod_perl (from CPAN, at
http://www.perl.com/CPAN/) and Embperl (also from CPAN, as
HTML-Embperl-1.0.0). On my machine, these packages were named as
follows:

HTML-Embperl-1.0.0.tar.gz
apache_1.3.0.tar.gz
mod_perl-1.12.tar.gz

I am sure that newer versions of these programs will be
available by the time you read this article. However, you should be
able to follow this discussion by updating the version numbers as
appropriate.

First, unpack all of the files using the command:

for file in `ls *gz`; do tar -zxvf $file; done

Before you can start to compile the components, you will have
to set some of the configurations and modify Makefiles. First of
all, go into the mod_perl directory and edit
src/modules/perl/Makefile:

/downloads/mod_perl-1.12/src/modules/perl/Makefile

You will have to make three changes to this file. First, add the
HTML::Embperl to the definition of STATIC_EXTS that will be grabbed
by the mod_perl configuration system. That is, edit the line (line
98, in mod_perl-1.12):

#STATIC_EXTS = Apache Apache::Constants

and change it to:

#STATIC_EXTS = Apache Apache::Constants HTML::Embperl

Next, look for the line that begins with OBJS=
(line 131 in mod_perl-1.12). Just before that line, define the
variable EPDIR so that it points to your Embperl
build directory. For instance, assuming that we are building
Embperl in /downloads/HTML-Embperl-1.0.0, we will set it to:

EPDIR=/downloads/HTML-Embperl-1.0.0

We will now modify the OBJS variable such that it creates the
object files for Embperl as well as mod_perl:

Don't forget to put backslashes at the end of each continued line,
so that make doesn't think the second and third
lines should stand on their own.

The hardest part is over. All we have to do now is configure
and compile the various components. Make sure to do them in the
right order, though, or things might not work correctly.

First, enter the mod_perl directory and create the Makefile
using the standard Perl command perl
Makefile.PL.

If you want some or all of mod_perl's capabilities, now is
the time to specify that. I tend to activate all of them (except
for two that need explicit activation), so I enter perl
Makefile.PL EVERYTHING=1. This will begin the mod_perl
configuration process. You will be asked if you want to use the
Apache source code in the parallel directory, and then if you want
mod_perl to build httpd for you.
Answer “yes” to both questions.

When the configuration script has finished running, go into
the Embperl directory (/downloads/HTML-Embperl-1.0.0) and configure
the module using the same command:

perl Makefile.PL

Once again, the system will perform a variety of
configurations. You will be asked if you want Embperl to support
Apache, and then if it should use the Apache source code in the
parallel directory. Again, answer “yes” to both questions.
Finally, you will be asked for a path name to the copy of httpd
that will be used in testing. Check that the default is correct,
and correct it if necessary.

Now we can actually create Embperl by typing
make in its directory. After the compilation is
complete, switch back to the mod_perl directory and create mod_perl
and Apache by typing make.

Congratulations. You should now have working copies of
Apache, mod_perl and Embperl. At this point, we could run
make install in each of the three directories to
install the software, or we can test Embperl. If you are interested
in testing your Embperl compilation without installing it, I
suggest that you read the FAQ. The directions are not that
difficult to follow, but they are more complex than I can describe
in the space provided here.

Configuring Apache to Allow Embperl

Now that Embperl is part of your copy of Apache, what can you
do with it? Not much at this point, since we have not yet defined
the handler for our Embperl files. Now we will have to modify
Apache's configuration files, which might be in a number of
possible places. When I installed Apache, I accepted the default
installation locations (under /usr/local/apache), and my path names
will reflect that.

In order to get Embperl working, we will need to modify two
of Apache's configuration files. (Each of the three files can
actually contain any of the configuration directives, but certain
items are traditionally put in certain files.) I told the server to
redirect URLs beginning with /embperl to
/usr/local/apache/share/embperl by adding the following lines to
the srm.conf file:

Alias /embperl /usr/local/apache/share/embperl

Next, I told Apache to install Embperl as the handler for
that directory. As I mentioned above, this means that HTML::Embperl
will be called each time Apache is asked to retrieve a document
from /embperl. The file will be read from disk, handled by Embperl,
and finally given to Apache, which returns it to the user's
computer. I added the following to the access.conf file:

Embperl files look just like HTML files, with a minor
difference: square brackets signify special sections of code, which
are interpreted separately. In other words, you can put stock HTML
files in an Embperl directory, although I would tend to advise
against doing so, because of the additional overhead involved. Why
force Embperl to look at a file unnecessarily? For that reason,
some sites have decided to use a special suffix—.htmpl,
perhaps—and then to configure Apache so that all files with that
suffix, regardless of directory, are interpreted. That allows HTML
files to be mixed in with their Embperl counterparts.

The following file, when retrieved from within a directory
defined for Embperl, will print the current time:

However, Embperl has several advantages over a CGI program. For
one, running it under mod_perl gives it a distinct speed advantage.
Of course, we could modify our CGI program and/or Apache
configuration so that the program would run under Apache::Registry,
the mod_perl module that handles CGI-like programs.

The biggest advantage, though, is the clean separation
between static and dynamic content. No longer does the programmer
become the bottleneck, slowing down design and content changes—now
the site's designers and editors can modify the HTML, so long as
they stay away from the Perl inside square brackets. There will
obviously be times when the embedded Perl code will affect the
design, and the programmer can be included in such cases. But for
the most part, such a separation allows everyone to do what they do
best.

We have already seen one form of the Embperl brackets, namely
[+ and +]. Anything in
square-plus brackets is evaluated as Perl code, with the results
inserted into the HTML document and passed to the browser. Remember
that the result of evaluating a Perl variable or string is the
value of that variable or string. It's very common for square-plus
brackets to contain a single variable name, whose contents are
inserted into the document at the indicated point. Don't use the
print function to insert things into the Embperl
document, because print sends output to STDOUT,
and then returns a result indicating whether it was successful.
Each set of square-plus brackets can contain as much or as little
Perl as you might like, although most Embperl programmers seem to
prefer keeping the lines short.

Output from square-plus brackets is placed directly into the
file without any additional formatting. If you want something to be
in paragraph tags, boldface, italics or a different font, it is
your responsibility to make sure that happens. Most often, you will
want to surround the square-plus brackets with the appropriate HTML
tags, so that the resulting output will be correctly formatted.
That is, you could make a variable's value italic by saying

[+ "<i>$variable</i>" +]

but, for the sake of maintenance and separating static and
dynamic content, it's better to say:

<i>[+ $variable +]</i>

What if you don't want the results of your code to be inserted into
the document? You could end each set of Perl expressions with the
empty string, as in:

[+ $counter++; &get_user_info($id); "" +]

which will insert the empty string into the document, since it was
the last element to be evaluated. But a better solution would be
square-minus brackets ([- and -]), which do that for you
automatically. For example:

Output from the above Embperl will look the same as the one without
square-minus brackets, since the output from operations performed
in square-minus brackets aren't inserted into the HTML. This is
useful when making variable assignments, as well as when importing
other Perl modules. For example:

As you can see, variable assignments are kept across square
brackets, meaning that you can assign a variable in one block and
refer to it later. Variables are global by default, but you can use
Perl's “my” convention to create temporary variables, which go
out of scope at the end of the block.

One of the nice things about mod_perl is that it compiles
programs once, caching them for future invocations. Not only do you
save the overhead of forking a new process, but the program runs
much faster since it only needs to be interpreted. In many cases,
you want variable values to remain intact across several
invocations of a program. Such persistence allows you to log into a
database server only once, keeping a connection open through the
duration of many HTTP requests.

This raises the question of what happens to variables you
define in an Embperl document—do they also keep their values
across invocations, or do they disappear? The answer is that each
Embperl document is processed in its own package, and the variables
defined in that package are reset by default upon each invocation.
However, variables defined in other packages are kept across
invocations. The following Embperl document demonstrates how this
works:

If you try this on your system, you may well discover that
$counter always remains at 1, while
$remain::counter is incremented with each
invocation. However, if you are running more than a single copy of
Apache, $remain::counter probably jumps around,
as if several different copies of it were being incremented. This
is indeed the case, since each copy of Apache is running its own
copy of mod_perl and Embperl. If you rely on persistent variables
across invocations, remember that a given user might connect to
more than one copy of Apache, and you cannot rely on the same copy
always being available to the same user.

However, persistent variables can be useful when making
connections with other than the user's computer. In particular, DBI
(the Perl database interface) can take advantage of this with the
Apache::DBI module. This module
opens a connection to a database server when it is first invoked,
and then continues to use that connection throughout the life of
the Apache process, immediately sending each query to the database
server. Because the persistence is between Apache and the database
server, it works regardless of whether a user connects to the same
httpd process each time.

When defining subroutines inside of Embperl documents, it's
probably best to use another kind of square brackets, with
exclamation points as the special characters. Square-bang brackets
([! !]) are the same as square-minus brackets, except that the Perl
code contained within is executed only upon the document's first
invocation. If you are running Embperl under mod_perl, defining
subroutines inside of square-bang brackets means they will be
defined and compiled a single time, further increasing the speed of
your program.

Embperl Meta-commands

Finally, we come to square-dollar brackets ([$ $]), which
allow you to enter Embperl meta-commands. These meta-commands, as
you might imagine from the name, are actually part of a small
programming language with which you can tell Embperl what to
do.

Meta-commands allow you to make sections of HTML and Perl
conditional, or to loop over them a given number of times. The same
tasks could be performed inside of a normal Embperl block, since
Perl is a full-fledged programming language and can handle
conditionals and looping just fine. But by using the Embperl
meta-commands, you can place even more HTML outside of the Perl
blocks, making the Perl blocks somewhat smaller and easier to
read.

For example, let's say we run a web site that requires
registration. Assuming we have a function called
&is_registered that returns
“true” or “false”, depending on whether a user is registered
with our system, we could print an appropriate greeting with the
following code:

[+ &registered($user_id) ? "You are known" : "You are
unknown" +]

Once you start to deal with the formatting associated with
those strings, the menus you might want to display for new users
and the personalized home pages that registered users should see,
the block of Perl inside of square-plus brackets becomes quite
large. It's thus easier to use square-dollar brackets and Embperl
meta-commands:

[$ if &registered($user_id) $]
You are known, your registered home page:
[+ &output_home_page($user_id) +]
[$ else $]
Welcome, new user! We would like to ask you a few
questions:
[+ &output_questionnaire +]
[$ endif $]

The above, which I have indented in the style of a programming
language, is easier to understand than a large block of Perl code.
It is also more easily understood and modified by non-programmers
on your site, who can clearly see the difference between HTML and
other items.

Automatic Table Generation

Embperl has many features, far too many to describe here. My
favorite feature is its ability to create HTML tables
automatically, filling them in as necessary. Embperl looks for the
beginning of an HTML table, marked with a
<TABLE> tag, before filling it in. In
order to do this, Embperl expects you to use a number of “magic”
variables within the table. You can set the exact behavior with
Embperl's $tabmode, but the basic idea is that
within a table, $row (a magic variable) begins
at 0 and increments until it reaches $maxrows
(another magic variable). When an expression within the table
returns “undefined”, Embperl exits from the table loop and stops
incrementing $row. We can thus get a nicely
formatted listing of environment variables with this code:

Notice how we first defined each array outside of the table
definition. We then used $row (which is
incremented automatically by Embperl) to retrieve each element from
@keys.

Using the magic table fill-in procedure can be extremely
powerful, but it requires you to change your programming style
somewhat. Nevertheless, the potential uses for it in database
applications are tremendous, since it greatly cuts down on the
amount of necessary coding.

If you look at the list of environment variables, you might
notice QUERY_STRING is unset. When invoking
programs, QUERY_STRING is normally set by
appending a question mark (?) and a string to a URL, but there is
no reason why we cannot use the same syntax with Embperl documents,
as in
http://localhost/embperl/env.html?foo.

If the above environment-printing Embperl file is called
env.html, then invoking it with the foo
parameter should give QUERY_STRING a
value.

Indeed, we can even use Embperl documents as the “action”
of a CGI program. Grabbing values from the %fdat hash, Perl blocks
within our document can retrieve form values, use them, and even
construct a document based on them.

Embperl does require a slightly different style of
programming than is usual in Perl. Typically, Perl is written in
blocks of code, with each code returning a value. Embperl is much
terser, with pairs of square brackets occurring much more often
than Perl's curly braces. Of course, the style presented in the
Embperl documentation and the above examples does not have to be
your own; you can put entire Perl programs between square
brackets.

The trend seems to be toward using templates and databases to
design web sites, with more and more products appearing on the
market that claim to do such things. The combination of Linux,
Apache, mod_perl and Embperl not only makes for a cost-effective
solution, but also a powerful combination of programming tools that
is hard to beat. Next month, we will look at Embperl a bit more
closely, and learn how we can use it with databases to easily
create personalized home pages.

Reuven M. Lerner
is an Internet and Web
consultant living in Haifa, Israel, who has been using the Web
since early 1993. In his spare time, he cooks, reads and volunteers
with educational projects in his community. You can reach him at
reuven@netvision.net.il.