NAME
ModPerl2::Tools - a few hopefully useful tools
SYNOPSIS
use ModPerl2::Tools;
ModPerl2::Tools::spawn +{keep_fd=>[3,4,7], survive=>1}, sub {...};
ModPerl2::Tools::spawn +{keep_fd=>[3,4,7], survive=>1}, qw/bash -c .../;
ModPerl2::Tools::safe_die $status;
$r->safe_die($status);
$f->safe_die($status);
$content=ModPerl2::Tools::fetch_url $url;
$content=$r->fetch_url($url);
INSTALLATION
perl Makefile.PL
make
make test
make install
DESCRIPTION
This module is a collection of functions and methods that I found useful
when working with "mod_perl". I work mostly under Linux. So, I don't
expect all of these functions to work on other operating systems.
Forking off long running processes
Sometimes one needs to spawn off a long running process as the result of
a request. Under modperl this is not as simple as calling "fork" because
that way all open file descriptors would be inherited by the child and,
more subtle, the long running process would be killed when the
administrator shuts down the web server. The former is usually
considered a security issue, the latter a design decision.
There is already $r->spawn_proc_prog that serves a similar purpose as
the "spawn" function. However, "spawn_proc_prog" is not usable for long
running processes because it kills the children after a certain timeout.
Solution
$pid=ModPerl2::Tools::spawn \%options, $subroutine, @parameters;
or
$pid=ModPerl2::Tools::spawn \%options, @command_line;
"spawn" expects as the first parameter an options hash reference. The
second parameter may be a code reference or a string.
In case of a code ref no other program is executed but the subroutine is
called instead. The remaining parameters are passed to this function.
Note, the perl environment under modperl differs in certain ways from a
normal perl environment. For example %ENV is not bound to the C-level
"environ". These modifications are not undone by this module. So, it's
generally better to execute another perl interpreter instead of using
the $subroutine feature.
The options parameter accepts these options:
keep_fd => \@fds
here an array of file descriptor numbers (not file handles) is
expected. All other file descriptors except for the listed and file
descriptor 2 (STDERR) are closed before calling $subroutine or
executing @command_line.
survive => $boolean
if passed "false" the created process will be killed when Apache
shuts down. if true it will survive an Apache restart.
The return code on success is the PID of the process. On failure "undef"
or an empty string is returned.
The created process is not related as a child process to the current
apache child.
Serving "ErrorDocument"s
Triggering "ErrorDocument"s from a registry script or even more from an
output filter is not simple. The normal way as a handler is
return Apache2::Const::STATUS;
This does not work for registry scripts. An output filter even if it
returns a status can trigger only a "SERVER_ERROR".
The main interface to enter standard error processing in Apache is
"ap_die()" at C-level. Its Perl interface is hidden in Apache2::HookRun.
There is one case when an error message cannot be sent to the user. This
happens if the HTTP headers are already on the wire. Then it is too
late.
The various flavors of "safe_die()" take this into account.
"safe_die" won't return. Instead it calls ModPerl::Util::exit(0) which
raises an exception.
ModPerl2::Tools::safe_die $status
This function is designed to be called from registry scripts. It
uses Apache2::RequestUtil->request to fetch the current request
object. So,
PerlOption +GlobalRequest
must be enabled.
Usage example:
ModPerl2::Tools::safe_die 401;
$r->safe_die($status)
$f->safe_die($status)
These 2 methods are to be used if a request object or a filter
object are available.
Usage from within a filter:
package My::Filter;
use strict;
use warnings;
use ModPerl2::Tools;
use base 'Apache2::Filter';
sub handler : FilterRequestHandler {
my ($f, $bb)=@_;
$f->safe_die(410);
}
The filter flavor removes the current filter from the request's
output filter chain.
$r->headers_sent
This function checks if the HTTP_HEADER output filter is still
present. If so, it returns an empty list, true otherwise.
The presence of this filter means no output has yet been written to
the client. The HTTP status code and header fields can still be
modified.
Fetching the content of another document
Sometimes a handler or a filter needs the content of another document in
the web server's realm. Apache provides subrequests for this purpose.
The 2 "fetch_url" variants use a subrequest to fetch the content of
another document. The document can even be fetched via "mod_proxy" from
another server.
"ModPerl2::Tools::fetch_url" needs
PerlOption +GlobalRequest
Usage:
$content=ModPerl2::Tools::fetch_url '/some/where?else=42';
$content=$r->fetch_url('/some/where?else=42');
($content, $headers)=
$r->fetch_url('http://what.is/the/meaning/of?life=42');
If "mod_proxy" is available "fetch_url" can use it to fetch a document
from another web server. If "mod_ssl" is configured to allow proxying
SSL (see "SSLProxyEngine") even the "https" scheme works. Another subtle
point, "ProxyErrorOverride" may affect the output in case of an error.
Further, if "fetch_url" is passed a subroutine as the last argument the
content is not accumulated in a single variable but passed brigade-wise
to the function:
($content, $headers)=
$r->fetch_url('http://what.is/the/meaning/of?life=42', sub {
my ($subr, @brigade)=$_;
...
});
The subroutine is called with the subrequest as the first parameter and
a list of non-empty strings. The list itself may be empty if all buckets
of the brigade do not contain data.
On success the resulting $content will be the empty string in this case.
"fetch_url()" normally strips almost all input HTTP header fields from
the subrequest before running it. However, if the $r request object has
a "Host" header field it is passed on. Also, a "User-Agent" header is
set for the subrequest containing "ModPerl2::Tools/$VERSION" where
$VERSION is the module's version.
If you need to pass more fields pass an array reference as the 2nd
parameter to "fetch_url()":
($content, $headers)=
$r->fetch_url('http://what.is/the/meaning/of?life=42', [qw/
X-MyHeader my-value
X-MyNextHeader my-next-value
/]);
or even:
($content, $headers)=
$r->fetch_url('http://what.is/the/meaning/of?life=42', [qw/
X-MyHeader my-value
X-MyNextHeader my-next-value
/], sub {
my ($subr, @brigade)=$_;
...
});
If "Host" or "User-Agent" headers are passed this way they overwrite the
default ones.
Note, though, the header fields are assigned to the subrequest just
before the response handler is run. Earlier phases will see a copy of
the main request's headers.
How does it work?
If the passed URL starts with "https://" or "http://" a subrequest for
the URI "/" is initiated via "$r->lookup_uri('/')". Before the
subrequest is run it is changed into a proxy request for the passed URL.
One precondition for this to work is that there are no access
restrictions for the URL "/".
Otherwise it is simply a subrequest for the passed URL.
Then "ModPerl2::Tools::Filter::fetch_content_filter" is installed as
output filter for the subrequest. After that the subrequest is run.
The output filter collects all output.
When the request is done its "$r->headers_out" is copied into a normal
hash and in list context the output string and this hash are returned.
In scalar context only the string is returned.
HTTP header names are case insensitive. Their names are all converted to
lower case in the $headers hash. There are 2 hash members in upper case.
"STATUS" contains the HTTP status code of the subrequest and
"STATUSLINE" the status line.
Useful functions for similar cases
Note, it is always better to process data one chunk at a time. Try hard
to do that! Collecting data in memory should only be a last resort.
ModPerl2::Tools::Filter::read_bb $bucket_brigade, \@buffer
"read_bb" collects the data of a bucket brigade in the @buffer
array. If an "EOS" bucket has been seen it returns true otherwise
false.
A simple output filter that collects all data could look like:
sub filter {
my ($f, $bb)=@_;
my @buffer;
do_something(join '', @buffer)
if ModPerl2::Tools::Filter::read_bb $bb, \@buffer;
return Apache2::Const::OK;
}
ModPerl2::Tools::Filter::fetch_content_filter
This function is a "FilterRequestHandler". Is is controlled by 2
elements of "$r->pnotes", "out" and "force_fetch_content". "out"
must be an array reference. It is passed to "read_bb" to collect the
output. "force_fetch_content" is a flag. If false the filter does
nothing and removes itself if the "$r->status" on the first
invocation of the filter is not "HTTP_OK".
Usage:
my $subr=$r->lookup_uri(...);
my $output=[];
@{$subr->pnotes}{qw/out force_fetch_content/}=($output,1);
$subr->add_output_filter
(\&ModPerl2::Tools::Filter::fetch_content_filter);
$subr->run;
do_something(join '', @$output)
EXPORTS
None.
TODO
Look at APR to see what it provides to make things easier. For example
"apr_proc_create()"
SEE ALSO
AUTHOR
Torsten Förtsch,
COPYRIGHT AND LICENSE
Copyright (C) 2010 by Torsten Förtsch
This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.