The Apache::Request module gives you an easy way to get form
content, including uploaded files. In order to add file upload
functionality to your form, you need to add two things.

First, you'll need to add a form field of type file. This will
put a browse button on the form that allows the user to choose a
file to upload.

Second, you'll need to make sure to add the following attribute to
the form tag:

enctype="multipart/form-data"

You won't be able to upload a file unless you have added this to the
form tag.
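A minimal example of such a form (the action URL and the field name
are placeholders you should adapt):

```html
<form method="POST" action="/path/to/your/handler"
      enctype="multipart/form-data">
  <input type="file" name="UPLOAD">
  <input type="submit" value="Upload">
</form>
```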

In your code, you'll need to take a few extra steps to actually
retrieve the file that has been uploaded. Using the following
form() method will allow you to have a standard function that
handles all of your forms, and does the right thing in the event that
a file was uploaded. You can put this function in your
mod_perl handler, or in whatever module you want.

In your code, you can get the contents of the form by calling this
function:

my $form = Your::Class::form(); # Wherever you put this function

The value returned from this function is compatible with CGI.pm and
other modules such as CGI::Lite. Which is to say, the function
returns a hashref. The keys of the hash are the field names in your
form. The values in the hash are the values entered in those fields,
with the exception that a multiple select list with several items
selected will return a listref of the selected values.

If your form contained a file upload element, then $form->{UPLOAD}
will contain a file upload object, whose methods you can call.
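The form() method referred to above is not reproduced in this text;
a simplified sketch of it, based on Apache::Request, might look like
this (the package name is arbitrary, and the UPLOAD key follows the
convention used in this section):

```perl
package Your::Class;
use Apache ();
use Apache::Request ();

sub form {
    my $apr = Apache::Request->new(Apache->request);
    my %form;
    for my $key ($apr->param) {
        my @values = $apr->param($key);
        # multiple selections produce a listref, single values a scalar
        $form{$key} = @values > 1 ? \@values : $values[0];
    }
    # if a file was uploaded, stash the upload object under UPLOAD
    if (my $upload = $apr->upload) {
        $form{UPLOAD} = $upload;
    }
    return \%form;
}
1;
```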

That should give you the general idea of how this works. This lets you
have a generic form handler that does "normal" forms as well as file
upload forms, in mod_perl, without having to mess with CGI.pm, and
without having to do custom things when you have a file upload.

You will need to see the documentation for Apache::Upload for more
information about how to deal with the file upload object once you
have it. Note that the Apache::Upload docs are embedded in the
Apache::Request documentation, so you'll need to look there for
that information.

Many error conditions result in an exception (or a signal -- at the
operating system level the two play a similar role), which is raised
in order to indicate that a condition has arisen which needs more
urgent attention than can be given by other means. One of the most
familiar ways of raising a signal is to hit Ctrl-C on your terminal's
keyboard. The signal interrupts the processor. In the case of Ctrl-C
the INT signal is generated and the interrupt is usually trapped by a
default signal handler supplied by the OS, which causes the operating
system to stop the process currently attached to the terminal.

Under mod_perl, a Perl runtime error causes an exception. By default
this exception is trapped by the default mod_perl handler, which
logs information about the error (such as the date and time that the
error occurred) to error_log. If you want to send this
information to the client instead of to error_log, you have to take
the responsibility yourself, by writing your own exception handler to
implement this behaviour. See the section "Exception Handling for
mod_perl" for more information.

The code examples below can be useful with your own exception handlers
as well as with the default handlers.


The CGI::Carp package implements handlers for these signals. To trap
(almost) all Perl run-time errors and send the output to the client
instead of to Apache's error_log, add this line to your script:

use CGI::Carp qw(fatalsToBrowser);

Refer to the CGI::Carp man page for more detailed information.

You can trap individual exceptions: for example you can write custom
__DIE__ and __WARN__ signal handlers. The special %SIG hash
contains references to signal handlers. A signal handler is just a
subroutine; in the example below it is called "mydie". To install the
handler we assign a reference to our handler to the appropriate
element of the %SIG hash. After that, mydie() is called with the
error message as its argument whenever die() is invoked while our
script is executing.

Do not forget the local keyword! If you do, then after the signal
handler has been installed it will be called whenever die() is called
by any script executed by the same process. That's probably not
what you want. If it is, you can put the assignment statement in any
module, as long as it will be executed at the right time.

Here is an example of a handler which I wrote because I wanted users
to know that there was an error, without displaying the error message;
instead it is emailed to me. If the error is caused by the user
(e.g. uploading an image whose size is bigger than the limit I had
set) I want to tell them about it. I wrote this handler for the
mod_perl environment, but it works correctly when called from the
shell. The code shown below is a stripped-down version with
additional comments.

The following code must be added to the script:

  # Using the local() keyword restricts the scope of the directive to
  # the block in which it is found, so this line must be added at the
  # right place in the right script.  It will not affect other blocks
  # unless the local() keyword is removed.  Usually you will want the
  # directive to affect the entire script, so you just place it near
  # the beginning of the file, where the innermost enclosing block is
  # the file itself.
  local $SIG{__DIE__} = \&mydie;

  # The line above assumes that the subroutine "mydie" is in the same
  # script.  Alternatively you can create a separate module for the
  # error handler.  If you put the signal handler in a separate module,
  # e.g. Error.pm, you must explicitly give the package name to set the
  # handler in your script, using a line like this instead of the one
  # above:
  local $SIG{__DIE__} = \&Error::mydie;
  # (again within the script!)

  # Do not forget the local(), unless you want this signal handler to
  # be invoked every time any script dies (including events where this
  # treatment may be undesirable).

  my $max_image_size = 100 * 1024;   # 100kB
  my $admin_email    = 'foo@example.com';

  # Here is the handler itself.  It is called with the error message
  # as its single scalar argument.
  sub mydie {
      my $why = shift;
      chomp $why;
      my $orig_why = $why;   # an ASCII copy for the email report

      # handle the shell execution case (so we will not get all the HTML)
      print("Error: $why\n"), exit unless $ENV{MOD_PERL};

      my $should_email = 0;
      my $message      = '';

      $why =~ s/[<&>]/"&#".ord($&).";"/ge;   # entity-escape the message

      # Now we need to trap the various kinds of errors that come from
      # CGI.pm.  We don't want these errors to be emailed to us, since
      # they aren't programming errors.
      if ($orig_why =~ /Client attempted to POST (\d+) bytes/o) {
          $message = qq{
              You cannot POST messages bigger than
              $max_image_size bytes.<BR>
              You have tried to post $1 bytes.<BR>
              If you are trying to upload an image, make sure its
              size is no bigger than $max_image_size bytes.<P>
              Thank you!
          };
      }
      elsif ($orig_why =~ /Malformed multipart POST/o) {
          $message = qq{
              Have you tried to upload an image in the wrong way?<P>
              To successfully upload an image you must use a browser
              that supports image upload and use the 'Browse' button
              to select that image.  DO NOT type the path to the image
              into the upload field.<P>
              Thank you!
          };
      }
      elsif ($orig_why =~ /closed socket during multipart read/o) {
          $message = qq{
              Have you pressed a 'STOP' button?<BR>
              Please try again!<P>
              Thank you!
          };
      }
      else {
          $message = qq{
              <B>You need take no action, since the error report has
              already been sent to the webmaster.</B><BR><P>
              <B>Thank you for your patience!</B>
          };
          $should_email = 1;
      }

      # note the empty line terminating the HTTP headers
      print qq{Content-type: text/html

          <HTML><BODY BGCOLOR="white">
          <B>Oops, something went wrong.</B><P>
          $message
          </BODY></HTML>
      };

      # send an email report if appropriate
      if ($should_email) {
          # load our sendmail wrapper (a local Mail module, not from CPAN)
          require Mail;
          # prepare the email error report:
          my $subject = "Error Report";
          my $body    = qq|
              An error has happened:
              $orig_why
          |;
          # send the error report to the admin
          Mail::send_mail($admin_email, $admin_email, $subject, $body);
          print STDERR "[".scalar(localtime)."] [SIGDIE] Sending Error Email\n";
      }

      # log to error_log so we will know there was an error
      print STDERR "[".scalar(localtime)."] [SIGDIE] $orig_why\n";
      exit 1;
  } # end of sub mydie

You may have noticed that I trap CGI.pm's die() calls here. I
don't see any reason why my users should see ugly error messages, but
that's the way CGI.pm is written. The workaround is to trap them
yourself.

Please note that as of version 2.49, CGI.pm provides the cgi_error()
method to report errors, and won't die() unless you want it to.

What happens if you need to access the POSTed data more than once, say
to reuse it in subsequent handlers of the same request? POSTed data
comes directly from the socket, and at the low level data can only be
read from a socket once. So you have to store it to make it available
for reuse.

There is an experimental option for Makefile.PL called
PERL_STASH_POST_DATA. If you turn it on, you can get at it again
with $r->subprocess_env("POST_DATA"). This is not enabled
by default because it adds a processing overhead for each POST
request.
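If that option has been compiled in, retrieving the saved data later
in the same request is a one-liner:

```perl
my $post_data = $r->subprocess_env("POST_DATA");
```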

But what do we do with large multipart file uploads? Because POST
data is not all read in one clump, this is a problem that's not easy
to solve in a general way. A transparent way to do it is to switch
the request method from POST to GET, and store the POST data in the
query string. This handler does exactly that:
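The handler in question is not included in this text; a minimal
sketch of the technique might look like this (the package name is
arbitrary):

```perl
package Apache::POSTtoGET;
use Apache::Constants qw(M_GET OK DECLINED);

sub handler {
    my $r = shift;
    return DECLINED unless $r->method eq 'POST';
    # read the POSTed content and store it in the query string
    $r->args(scalar $r->content);
    # turn the request into a GET internally
    $r->method('GET');
    $r->method_number(M_GET);
    $r->headers_in->unset('Content-length');
    return OK;
}
1;
```

It would be installed early in the request cycle, e.g. with a
PerlInitHandler Apache::POSTtoGET directive.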

To save a few more cycles, you can arrange for the handler to be
called only for POST requests, for example by wrapping its
configuration directive in a <Limit POST> section.

Effectively, this trick turns the POST request into a GET request
internally. Now when CGI.pm, Apache::Request or whatever module
parses the client data, it can do so more than once since
$r->args doesn't go away (unless you make it go away by
resetting it).

If you are using Apache::Request, it solves this problem for you
with its instance() class method, which allows Apache::Request to
be a singleton. This means that whenever you call
Apache::Request->instance() within a single request you always
get the same Apache::Request object back.
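For example (a sketch):

```perl
use Apache::Request ();

# the first call parses the client data and caches the object
my $apr = Apache::Request->instance($r);

# any later call within the same request returns the same object,
# so the already-parsed parameters remain available
my $same_apr = Apache::Request->instance($r);
```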

Under mod_cgi it's not easy to redirect POST requests to some other
location. With mod_perl it's simple. All you have to do is read in
the content, set the method to GET, populate args() with the
content to be forwarded and finally do the redirect:
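The steps just listed might be sketched as follows (the subroutine
name and the target URI are placeholders):

```perl
use Apache::Constants qw(M_GET);

sub redirect_post {
    my ($r, $new_uri) = @_;
    my $content = $r->content;            # read in the POSTed content
    $r->method('GET');                    # set the method to GET
    $r->method_number(M_GET);
    $r->headers_in->unset('Content-length');
    $r->args($content);                   # forward the content via args()
    $r->internal_redirect($new_uri);      # finally do the redirect
}
```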

If you redirect, that's most likely telling the web browser to fetch
the new page. This makes it a totally new request, so no environment
variables are preserved.

However, if you're using internal_redirect(), you can make
environment variables visible to the sub-request via
subprocess_env(). The only nuance is that the %ENV keys will be
prefixed with REDIRECT_.
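A sketch of the idea (FOO is a placeholder variable name):

```perl
# in the handler performing the internal redirect:
$r->subprocess_env(FOO => 'bar');
$r->internal_redirect('/new/uri');

# in the code serving /new/uri the value arrives as:
#   $ENV{REDIRECT_FOO}
```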

Many people use relative paths for require, use, etc., and when
they open files in their scripts they make assumptions about the
current directory. This will fail if you don't chdir() to the
correct directory first (as could easily happen if you have another
script which calls the first script by its full path).
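For example, imagine a script installed as /home/httpd/perl/test.pl
that opens a data file by a relative path (the filename foo.txt
comes from the discussion below):

```perl
#!/usr/bin/perl -w
use strict;

# this works only if the current directory contains foo.txt
open my $fh, '<', 'foo.txt' or die "Cannot open foo.txt: $!";
print <$fh>;
close $fh;
```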

The script works when it is run from /home/httpd/perl, since
foo.txt is located in the current directory. But when the current
directory isn't /home/httpd/perl -- for example when we call the
script by its full path from somewhere else:

% /home/httpd/perl/test.pl

then the script will fail to find foo.txt. Think about
crontabs!

Notice that you cannot use the FindBin package here, as you would
in regular Perl scripts, because it relies on a BEGIN block and
therefore won't work under mod_perl: it is loaded and executed only
for the first script executed inside the process, and all the others
will use the cached value, which will probably be incorrect if they
reside in different directories. Perl 5.9.1 provides a new function,
FindBin::again, which does the right thing. Also the CPAN module
FindBin::Real provides a working alternative under mod_perl.

I wrote this script a long time ago, when I had to debug my CGI
scripts but didn't have access to the error_log file. I asked
the admin to install this script and have used it happily since then.

If your scripts are running on these 'Get-free-site' servers, and you
cannot debug your script because you can't telnet to the server or
can't see the error_log, you can ask your sysadmin to install this
script.

Note that it was written for plain Apache, and isn't prepared to
handle the complex multiline error and warning messages generated by
mod_perl. It also uses a system() call to do the main work with the
tail() utility; a more efficient pure-Perl implementation is possible
(take a look at the File::Tail module). You are welcome to fix it and
contribute it back to the mod_perl community. Thank you!

On occasion you will need to set a cookie and then redirect the user
to another page. This is probably most common when you want a
Location to be password protected: if the user is unauthenticated you
display a login page, otherwise you display another page, but both
at the same URL.

Let's say that we are writing a handler for the location /dealers
which is a protected area to be accessed only by people who can pass a
username / password authentication check.

We will use Apache::Cookie here as it runs pretty fast under
mod_perl, but CGI::Cookie has pretty much the same syntax, so you
can use that if you prefer.

For the purposes of this example, we'll assume that we already have
any passed parameters in a %params hash, the authenticate()
routine returns true or false, display_login() shows the
username and password prompt, and display_main_page() displays the
protected content.
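Under those assumptions, the handler might be sketched like this (the
package name, cookie name, and the get_params() helper are
placeholders for your own code):

```perl
package My::Dealers;                    # hypothetical package name
use Apache::Constants qw(OK REDIRECT);
use Apache::Cookie ();

sub handler {
    my $r = shift;
    my %params = get_params($r);        # however you collect form data

    if ($params{username} && authenticate(\%params)) {
        # set the cookie and redirect back to the same URL, so that
        # the next request arrives already carrying the cookie
        my $cookie = Apache::Cookie->new(
            $r,
            -name  => 'dealer_session', # placeholder cookie name
            -value => $params{username},
            -path  => '/dealers',
        );
        $r->err_headers_out->add('Set-Cookie' => $cookie->as_string);
        $r->headers_out->set(Location => $r->uri);
        return REDIRECT;
    }

    my %cookies = Apache::Cookie->fetch;
    if ($cookies{dealer_session}) {
        display_main_page($r);
    }
    else {
        display_login($r);
    }
    return OK;
}
1;
```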

The handler prepares the data in the hash %my_data and calls the
pnotes() method to store the data internally for other handlers to
reuse. All the subsequently called handlers can retrieve the stored
data in this way:
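A minimal sketch of both sides (the note name my_data is an
assumption):

```perl
# in the first handler: store a reference to the hash
$r->pnotes(my_data => \%my_data);

# in any later handler of the same request: retrieve it
my $data = $r->pnotes('my_data');
```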

The next handler retrieves the username from the notes (using
$r->main->notes), uses some getquota() function to get the
quota-related data, and then sets the acquired data in the notes for
the PHP code. The PHP code reads the data from the notes and
proceeds with setting $message if $quota is set.

So Apache modules of any kind can communicate with each other over
the Apache notes() mechanism.

You can use notes along with the sub-request methods lookup_uri() and
lookup_filename() too. To make it work, you need to set a note in the
sub-request. For example if you want to call a PHP sub-request from
within mod_perl and pass it a note, you can do it in the following
way:
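A sketch of the technique (the PHP script path and the note name are
assumptions):

```perl
my $subr = $r->lookup_uri('/test.php');   # create the sub-request
$subr->notes(answer => 42);               # set a note on it
$subr->run;                               # run the PHP sub-request
```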

As of the time of this writing you cannot access the parent request's
tables from a PHP handler, therefore you must set this note on the
sub-request. If, on the other hand, the sub-request is running in the
mod_perl domain, you can always keep the notes in the parent request's
notes table and access them via the main() method:

Just as $ENV{MOD_PERL} is checked to see whether the code is running
under mod_perl, $ENV{HTTPS} is set by SSL modules and therefore can
be used to check whether a request is running over an SSL connection.
For example:

print "SSL" if $ENV{HTTPS};

If the PerlSetupEnv Off setting is in effect, $ENV{HTTPS} won't be
available, and then:

print "SSL" if $r->subprocess_env('https');

should be used instead.

Note that it's also possible to check the scheme:

print "SSL" if Apache::URI->parse($r)->scheme =~ m/^https/;

but it's not one hundred percent reliable unless you control the
server and you know that you run the secure server on port 443.

There is nothing special about sending email from mod_perl, it's just
that we do it a lot. There are a few important issues. The most
widely used approach is starting a sendmail process and piping the
headers and the body to it. The problem is that sendmail is a very
heavy process and it makes mod_perl processes less efficient.

If you don't want your process to wait until delivery is complete, you
can tell sendmail not to deliver the email straight away, but
either do it in the background or just queue the job until the next
queue run. This can significantly reduce the delay for the mod_perl
process which would otherwise have to wait for the sendmail process
to complete. This can be specified for all deliveries in
sendmail.cf or on each invocation on the sendmail command line:

-odb (deliver in the background)

-odq (queue-only) or

-odd (queue, and also defer the DNS/NIS lookups).

The trend is to move away from sendmail(1) and to switch to lighter
mail delivery programs like qmail(1) or postfix(1). You should check
the manpage of your favorite mailer for the equivalents of the
sendmail(1) configuration presented here.

The most efficient approach is to talk directly to the SMTP server.
Luckily the Net::SMTP module makes this very easy. The only problem
is when Net::SMTP fails to deliver the mail, because the destination
peer server is temporarily down. On the other hand, Net::SMTP
allows you to send email much faster, since you don't have to
spawn a dedicated process for each message. Here is an example of a
subroutine that sends email.
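The subroutine itself is not reproduced in this text; the following
is a sketch of the direct-SMTP approach (the mail server name is an
assumption you must adjust):

```perl
use strict;
use Net::SMTP ();

# a sketch: send a message by talking directly to the SMTP server
sub send_mail {
    my ($from, $to, $subject, $body) = @_;

    my $mail_server = 'localhost';   # assumption: a local SMTP server

    my $smtp = Net::SMTP->new($mail_server, Timeout => 60)
        or die "Cannot connect to SMTP server '$mail_server': $!";

    $smtp->mail($from);
    $smtp->to($to);

    $smtp->data;
    $smtp->datasend("From: $from\n");
    $smtp->datasend("To: $to\n");
    $smtp->datasend("Subject: $subject\n");
    $smtp->datasend("\n");           # header/body separator
    $smtp->datasend($body);
    $smtp->dataend;

    $smtp->quit;
}
```

It would be called as, for example,
send_mail('admin@example.com', 'user@example.com', 'Hi', 'Hello!').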

We can easily implement everything mod_rewrite does in Perl. We do
this with the help of a PerlTransHandler, which is invoked at the
beginning of request processing. For example, suppose that we need to
perform a redirect based on the query string and the URI; the
following handler does that.
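The handler itself is not included in this text; a sketch of the
technique might look like this (the package name, URL layout, and
target script are assumptions):

```perl
package My::Trans;                       # hypothetical package name
use Apache::Constants qw(DECLINED);

sub handler {
    my $r   = shift;                     # retrieve the request object
    my $uri = $r->uri;                   # ... and the URI

    # extract the id with a regular expression
    # (assumed URL layout: /store/item/1234)
    if ($uri =~ m|^/store/item/(\d+)$|) {
        # set the new URI and the arguments string
        $r->uri('/perl/item.pl');        # assumed target script
        $r->args("id=$1");
    }

    # let the default Apache translation handler do the URI to
    # filename remapping
    return DECLINED;
}
1;
```

It would be installed with a PerlTransHandler My::Trans directive in
httpd.conf.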

The handler code retrieves the request object and the URI. Then it
retrieves the id using the regular expression. Finally it sets the
new value of the URI and the arguments string. The handler returns
DECLINED so the default Apache transhandler will take care of URI
to filename remapping.

Notice the technique to set the arguments. By the time the
Apache-request object has been created, arguments are handled in a
separate slot, so you cannot just push them into the original
URI. Therefore the args() method should be used.

After declaring the package name and importing constants, we set a
translation table of MIME types and corresponding handlers to be
called. Then comes the handler, where the request object is retrieved
and if its MIME type is found in our translation table we set the
handler that should handle this request. Otherwise we do nothing. At
the end we return DECLINED so some other fixup handler could take
over.
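The handler described above is not reproduced here; a sketch of it
might look like this (the package name and the handler names in the
table are placeholders):

```perl
package My::MimeDispatch;               # hypothetical package name
use Apache::Constants qw(DECLINED);

# translation table: MIME type => content handler to assign
my %mime_handlers = (
    'text/html'  => 'My::HTMLHandler',  # placeholder handler names
    'text/plain' => 'My::PlainHandler',
);

sub handler {
    my $r = shift;
    # if the request's MIME type is in our table, set the handler
    if (my $h = $mime_handlers{ $r->content_type }) {
        $r->handler('perl-script');
        $r->push_handlers(PerlHandler => $h);
    }
    # otherwise do nothing; either way let other fixup handlers run
    return DECLINED;
}
1;
```

It would be installed with a PerlFixupHandler My::MimeDispatch
directive.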

This will have all the URLs starting with /some/url proxied to
the other server at the same URL. It will append the REMOTE_HOST
header as a query string argument (QSA = Query String Append, P =
Proxy). There is probably a way to remap it as an X-Header of some
sort, but if a query string is good enough for you, then this should
work really nicely.
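The rule described above might look something like this (the server
name is a placeholder):

```apache
RewriteEngine On
RewriteRule ^/some/url(.*)$ http://other.example.com/some/url$1?%{REMOTE_HOST} [QSA,P]
```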

You can provide your own mechanism to authenticate users, instead of
the standard one. If you want to make Apache think that the user was
authenticated by the standard mechanism, set the username with:

$r->connection->user('username');

Now you can use this information, for example, during logging, so
that your "username" is passed along as if it had been transmitted to
Apache through HTTP authentication.

But you'll find that while the browser redirects itself, mod_perl
logs the result code as 200. It turns out that status() only touches
the Apache response, and the log message is determined by the Apache
return code.

Aha! So we'll change the exit() in redir() to exit(REDIRECT). This
fixes the log code, but causes a bogus "[error] 302" line in the
error_log. That comes from Apache::Registry:

So you see that any time the return code causes $@ to be true,
we'll get an error line. If we don't want this, what can we do?

We can hope that a future version of mod_perl will allow us to set the
HTTP result code independent from the handler return code (perhaps a
log_status() method? or at least an Apache::LOG_HANDLER_RESULT
config variable?).

In the meantime, there's Apache::RedirectLogFix, distributed with
mod_perl.

Add to your httpd.conf:

PerlLogHandler Apache::RedirectLogFix

and take a look at the source code below. Note that it requires us to
return the HTTP status code 200.

Now, if we wanted to do the same sort of thing for an error 500
handler, we could write another PerlLogHandler (call it
ServerErrorLogFix). But we'll leave that as an exercise for the
reader, and hope that it won't be needed in the next mod_perl release.
After all, it's a little awkward to need a LogHandler to clean up
after ourselves....

Perl uses the shell (sh) to execute system() and piped open()
calls when the command contains shell metacharacters. So if you want
to set a temporary environment variable when you call a script from
your CGI, you can do something like this:

open UTIL, "USER=stas script.pl | " or die "...: $!\n";

or

system "USER=stas script.pl";

Note that the variable assignment must directly prefix the command
(with no semicolon in between), so that the shell places it in the
environment of script.pl.

This is useful, for example, if you need to invoke a script that uses
CGI.pm from within a mod_perl script. We are tricking the Perl script
into thinking it's a simple CGI, which is not running under mod_perl.

Make sure that the parameters you pass are shell safe -- all "unsafe"
characters like single-quote and back-tick should be properly escaped.

Unfortunately this approach fork()s a new process to run the script,
so you have probably thrown most of the performance gained from using
mod_perl out of the window. To avoid the fork, change script.cgi to a
module containing a subroutine which you can then call directly from
your mod_perl script.

This is somewhat off-topic, but since many of us use MySQL or some
other RDBMS in our work with mod_perl driven sites, it's good to
know how to back up and restore the databases in case of database
corruption.

First we should tell MySQL to log all the statements that modify the
databases (we don't care about SELECT queries for database
backups). Modify the safe_mysqld script by adding the
--log-update option to the mysql server startup parameters and
restart the server. From now on all the non-SELECT queries will be
logged to the /var/lib/mysql/www.bar.com logfile. Your hostname
will show up instead of www.bar.com.

Now create a dump directory under /var/lib/mysql/. That's where
the backups will be stored (you can name the directory as you wish of
course).

Prepare the backup script and store it in a file, e.g.
/usr/local/sbin/mysql/mysql.backup.pl
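The script itself is not shown above; here is a sketch of what it
might look like, using mysqldump (the database names, paths and dump
options are assumptions you must adapt):

```perl
#!/usr/bin/perl -w
use strict;
use POSIX qw(strftime);

# where the dumps go (the dump directory created earlier)
my $dump_dir  = '/var/lib/mysql/dump';
# the databases to back up -- adjust to your setup
my @databases = qw(mydb);

# build the mysqldump command line for one database
sub backup_command {
    my ($db, $dir, $stamp) = @_;
    return "mysqldump --opt $db > $dir/$db.$stamp.sql";
}

my $stamp = strftime "%Y%m%d", localtime;
for my $db (@databases) {
    my $cmd = backup_command($db, $dump_dir, $stamp);
    system($cmd) == 0
        or warn "Backup of $db failed: $?\n";
}
```

A real script would also flush and rotate the update log after a
successful dump, so that the log contains only the statements issued
since the last backup.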

Now make the script executable and arrange the crontab entry to run
the backup script nightly. Note that the disk space used by the
backups will grow without bound and you should remove the old backups.
Here is a sample crontab entry to run the script at 4am every day:

0 4 * * * /usr/local/sbin/mysql/mysql.backup.pl > /dev/null 2>&1

So now at any moment we have the dump of the databases from the last
execution of the backup script and the log file of all the statements
that have updated the databases since then. If a database gets
corrupted we have all the information needed to restore it to the
state it was in at our last backup. We restore it with the following
script, which I put in /usr/local/sbin/mysql/mysql.restore.pl