PERLFAQ9

NAME

VERSION

DESCRIPTION

This section deals with questions related to running web sites,
sending and receiving email as well as general networking.

Should I use a web framework?

Yes. If you are building a web site with any level of interactivity
(forms / users / databases), you
will want to use a framework to make handling requests
and responses easier.

If there is no interactivity then you may still want
to look at using something like Template Toolkit <https://metacpan.org/module/Template>
or Plack::Middleware::TemplateToolkit
so maintenance of your HTML files (and other assets) is easier.

Which web framework should I use?

There is no simple answer to this question. Perl frameworks can run everything
from basic file servers and small scale intranets to massive multinational
multilingual websites that are the core to international businesses.

Below is a list of a few frameworks with comments which might help you in
making a decision, depending on your specific requirements. Start by reading
the docs, then ask questions on the relevant mailing list or IRC channel.

Catalyst

Strongly object-oriented and fully-featured with a long development history and
a large community and addon ecosystem. It is excellent for large and complex
applications, where you have full control over the server.

Dancer

Young and free of legacy weight, providing a lightweight and easy-to-learn API.
It has a growing addon ecosystem, is best suited to smaller projects, and is
very approachable for beginners.

Mojolicious

Fairly young with a focus on HTML5 and real-time web technologies such as
WebSockets.

Web::Simple

Currently experimental, strongly object-oriented, built for speed and intended
as a toolkit for building micro web apps, custom frameworks or for tying
together existing Plack-compatible web applications with one central dispatcher.

What is Plack and PSGI?

PSGI is the Perl Web Server Gateway Interface Specification, a
standard that many Perl web frameworks use. You should not need to
understand it to build a web site; the part you might want to use is Plack.

Plack is a set of tools for using the PSGI stack. It contains
middleware <https://metacpan.org/search?q=plack%3A%3Amiddleware>
components, a reference server and utilities for Web application frameworks.
Plack is like Ruby's Rack or Python's Paste for WSGI.

You could build a web site using Plack and your own code,
but for anything other than a very basic web site, using a web framework
(that uses Plack) is a better option.
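
For orientation, a PSGI application is a code reference that takes the
request environment as a hash reference and returns an array reference
of status, headers and body. The following is only a minimal sketch; the
greeting text is made up for illustration.

```perl
use strict;
use warnings;

# A PSGI application: a code ref that receives the request environment
# and returns [ status, [ headers ], [ body chunks ] ].
my $app = sub {
    my ($env) = @_;
    my $path = $env->{PATH_INFO} // '/';
    return [
        200,
        [ 'Content-Type' => 'text/plain' ],
        [ "Hello from $path\n" ],
    ];
};
```

Saved in a file such as app.psgi with $app as the final statement, a
sketch like this can be served with Plack's plackup utility.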

How do I remove HTML from a string?

Use HTML::Strip, or HTML::FormatText, which not only removes HTML
but also attempts to do a little simple formatting of the resulting
plain text.
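
A minimal sketch of the HTML::Strip approach follows. The input string
is invented for illustration, and the exact whitespace in the result
depends on the module's defaults, so none is promised here.

```perl
use strict;
use warnings;
use HTML::Strip;

# Made-up input for illustration.
my $raw_html = '<p>Hello, <b>world</b>!</p>';

my $stripper = HTML::Strip->new();
my $text     = $stripper->parse($raw_html);   # text with the tags removed
$stripper->eof;                               # reset state before reuse

print "$text\n";
```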

How do I extract URLs?

HTML::SimpleLinkExtor will extract URLs from HTML. It handles anchors,
images, objects, frames, and many other tags that can contain a URL.
If you need anything more complex, you can create your own subclass of
HTML::LinkExtor or HTML::Parser. You might even use
HTML::SimpleLinkExtor as an example for something specifically
suited to your needs.

You can use URI::Find to extract URLs from an arbitrary text document.
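
As a rough sketch of the URI::Find approach: the module calls your
callback once per URL it finds and replaces the URL in the text with
whatever the callback returns. The sample text and variable names below
are invented for illustration.

```perl
use strict;
use warnings;
use URI::Find;

# Made-up text for illustration.
my $text = 'Docs are at http://www.example.com/docs and a mirror '
         . 'is at https://mirror.example.org/perl for now';

my @found;
my $finder = URI::Find->new(sub {
    my ($uri, $original_text) = @_;
    push @found, "$uri";       # stringify the URI object
    return $original_text;     # leave the text unchanged
});

# find() takes a reference to the text and returns the number of URLs found.
my $count = $finder->find(\$text);

print "$_\n" for @found;
```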

How do I fetch an HTML file?

(contributed by brian d foy)

Use the libwww-perl distribution. The LWP::Simple module can fetch web
resources and give their content back to you as a string.

If you need to do something more complicated, you can use the
LWP::UserAgent module to create your own user-agent (e.g. browser)
to get the job done. If you want to simulate an interactive web
browser, you can use the WWW::Mechanize module.
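
For the simple case, a sketch using LWP::Simple might look like the
following. The fetch_html helper is a hypothetical name introduced here;
get() itself returns undef on failure, so the helper turns that into an
exception.

```perl
use strict;
use warnings;
use LWP::Simple qw(get);

# Hypothetical helper: fetch a URL and return its content as a string.
sub fetch_html {
    my ($url) = @_;
    my $html = get($url);          # undef if the request failed
    die "Could not fetch $url\n" unless defined $html;
    return $html;
}

# Fetch the URL given on the command line, if there is one.
print fetch_html($ARGV[0]) if @ARGV;
```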

How do I automate an HTML form submission?

If you are doing something complex, such as moving through many pages
and forms on a web site, you can use WWW::Mechanize. See its
documentation for all the details.

If you're submitting values using the GET method, create a URL and encode
the form using the "query_form" method of a URI object.
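
A minimal sketch of building such a URL with the URI module follows. The
host, path and form field names are placeholders, not a real service.

```perl
use strict;
use warnings;
use URI;

# Placeholder URL and form fields for illustration.
my $url = URI->new('http://www.example.com/cgi-bin/search');
$url->query_form( module => 'DB_File', readme => 1 );

print "$url\n";
# prints http://www.example.com/cgi-bin/search?module=DB_File&readme=1
```

The resulting URL can then be passed to whatever you use to fetch pages,
such as LWP::Simple.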

How do I make sure users can't enter values into a form that cause my CGI script to do bad things?

(contributed by brian d foy)

You can't prevent people from sending your script bad data. Even if
you add some client-side checks, people may disable them or bypass
them completely. For instance, someone might use a module such as
LWP to submit to your web site. If you want to stop SQL injection
and other sorts of attacks (and you should want to), you must not
trust any data that enters your program.

The perlsec documentation has general advice about data security.
If you are using the DBI module, use placeholders to fill in data.
If you are running external programs with "system" or "exec", use
the list forms. There are many other precautions that you should take,
too many to list here, and most of them fall under the category of not
using any data that you don't intend to use. Trust no one.
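
The two specific suggestions above can be sketched as follows. The
find_user_id helper, the users table and its columns are hypothetical,
and the helper assumes you already have a connected DBI handle. The
second half runs the current perl ($^X) as the external program only so
that the sketch does not depend on any particular system command.

```perl
use strict;
use warnings;

# Hypothetical helper: the "?" placeholder keeps $name out of the SQL
# text, so quotes in it cannot change the statement.
sub find_user_id {
    my ($dbh, $name) = @_;
    my $sth = $dbh->prepare('SELECT id FROM users WHERE name = ?');
    $sth->execute($name);
    my ($id) = $sth->fetchrow_array;
    return $id;
}

# Pretend this arrived from a form field; note the shell metacharacters.
my $untrusted = 'report.txt; echo INJECTED';

# List form of system: no shell is involved, so the whole string reaches
# the child program as a single argument.
my @command = ( $^X, '-e', 'print "argument: $ARGV[0]\n"', $untrusted );
my $status  = system(@command);
die "command failed\n" if $status != 0;
```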

How do I parse a mail header?

Use the Email::MIME module. It's well-tested and supports all the
craziness that you'll see in the real world (comment-folding whitespace,
encodings, comments, etc.).
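
A short sketch of reading headers with Email::MIME follows. The message
text is invented for illustration; in practice it would come from a
file or a mail server.

```perl
use strict;
use warnings;
use Email::MIME;

# Made-up message for illustration.
my $raw_message = join "\n",
    'From: Alice Example <alice@example.com>',
    'To: bob@example.com',
    'Subject: Lunch on Friday?',
    '',
    'Are you free at noon?',
    '';

my $message = Email::MIME->new($raw_message);
my $subject = $message->header('Subject');
my $from    = $message->header('From');

print "Subject: $subject\n";
print "From:    $from\n";
```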

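How do I check a valid mail address?

This question has two parts: (a) is the address correctly formatted,
and (b) does it reach a valid recipient? Without sending mail to the
address and seeing whether there's a human on the other end to answer
you, you cannot fully answer part b, but the Email::Valid module will
do both part a and part b as far as you can in real-time.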
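
As a sketch of the format check, Email::Valid's address method returns
the address when it passes and undef when it does not. The candidate
strings are made up, and only the module's default checks are used.

```perl
use strict;
use warnings;
use Email::Valid;

# Made-up candidates for illustration.
my @candidates = ( 'alice@example.com', 'alice.example.com' );

my %verdict;
for my $candidate (@candidates) {
    my $address = Email::Valid->address($candidate);
    $verdict{$candidate} = defined $address ? 'well formed' : 'rejected';
    print "$candidate: $verdict{$candidate}\n";
}
```

Passing this check says nothing about whether anyone reads mail at the
address.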

Our best advice for verifying a person's mail address is to have them
enter their address twice, just as you normally do to change a
password. This usually weeds out typos. If both versions match, send
mail to that address with a personal message. If you get the message
back and they've followed your directions, you can be reasonably
assured that it's real.

A related strategy that's less open to forgery is to give them a PIN
(personal ID number). Record the address and PIN (best that it be a
random one) for later processing. In the mail you send, include a link to
your site with the PIN included. If the mail bounces, you know it's not
valid. If they don't click on the link, either they forged the address or
(assuming they got the message) following through wasn't important so you
don't need to worry about it.

How do I decode a MIME/BASE64 string?

The MIME::Base64 package handles this as well as the MIME/QP encoding.
Decoding base 64 becomes as simple as:

use MIME::Base64;
my $decoded = decode_base64($encoded);

The Email::MIME module can decode base 64-encoded email message parts
transparently so the developer doesn't need to worry about it.

How do I find the user's mail address?

Ask them for it. There are so many email providers available that it's
unlikely the local system has any idea how to determine a user's email address.

The exception is for organization-specific email (e.g. foo@yourcompany.com)
where policy can be codified in your program. In that case, you could look at
$ENV{USER}, $ENV{LOGNAME}, and getpwuid($<) in scalar context, like so:

my $user_name = getpwuid($<);

But you still cannot make assumptions about whether this is correct, unless
your policy says it is. You really are best off asking the user.

How do I send email?

By default, Email::Sender::Simple will try `sendmail` first, if it exists
in your $PATH. That is often not the case. If there's a remote mail
server you use to send mail, consider investigating one of the Transport
classes. At time of writing, the available transports include:

Email::Sender::Transport::Sendmail

This is the default. If you can use the mail(1) or mailx(1)
program to send mail from the machine where your code runs, you should
be able to use this.

Email::Sender::Transport::SMTP

This transport contacts a remote SMTP server over TCP. It optionally
uses SSL and can authenticate to the server via SASL.

Email::Sender::Transport::SMTP::TLS

This is like the SMTP transport, but uses TLS security. You can
authenticate with this module as well, using any mechanisms your server
supports after STARTTLS.

Telling Email::Sender::Simple to use your transport is straightforward.
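
One way to do it, sketched here under assumptions, is to pass a
transport object in the options hash of sendmail. The send_via_smtp
helper is a hypothetical name, and the host and port are placeholders
for your own server.

```perl
use strict;
use warnings;
use Email::Sender::Simple qw(sendmail);
use Email::Sender::Transport::SMTP;

# Hypothetical helper: send one message through a given SMTP server.
# $email can be an Email::MIME object or anything else sendmail accepts.
sub send_via_smtp {
    my ($email) = @_;

    # Placeholder host and port; substitute your own server's details.
    my $transport = Email::Sender::Transport::SMTP->new({
        host => 'smtp.example.com',
        port => 25,
    });

    # sendmail throws an exception if delivery fails.
    sendmail( $email, { transport => $transport } );
    return;
}
```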

How do I use MIME to make an attachment to a mail message?

Email::MIME directly supports multipart messages. Email::MIME
objects themselves are parts and can be attached to other Email::MIME
objects. Consult the Email::MIME documentation for more information,
including all of the supported methods and examples of their use.
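
A minimal sketch of a message with one text part and one attachment
follows. The addresses, file name and contents are invented for
illustration.

```perl
use strict;
use warnings;
use Email::MIME;

# Each part is itself an Email::MIME object.
my @parts = (
    Email::MIME->create(
        attributes => {
            content_type => 'text/plain',
            charset      => 'UTF-8',
            encoding     => 'quoted-printable',
        },
        body_str => "The report is attached.\n",
    ),
    Email::MIME->create(
        attributes => {
            content_type => 'text/plain',
            disposition  => 'attachment',
            filename     => 'report.txt',
            name         => 'report.txt',
            encoding     => 'base64',
        },
        body => "Made-up report contents.\n",
    ),
);

# The top-level message wraps the parts in a multipart container.
my $email = Email::MIME->create(
    header_str => [
        From    => 'alice@example.com',
        To      => 'bob@example.com',
        Subject => 'Monthly report',
    ],
    parts => \@parts,
);

print $email->as_string;
```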

How do I find out my IP address?

To get the IP address, you can use the "gethostbyname" built-in function
to turn the host name into a packed address. To turn that address into
the dotted octet form (a.b.c.d) that most people expect, use the
"inet_ntoa" function from the Socket module, which comes with perl.
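
A minimal sketch of that lookup follows. The ip_for_host helper is a
hypothetical name, the use of Sys::Hostname to obtain the local host
name is an assumption of this example, and the approach covers IPv4
addresses only.

```perl
use strict;
use warnings;
use Socket qw(inet_ntoa);
use Sys::Hostname qw(hostname);

# Hypothetical helper: return the dotted-quad IPv4 address for a host
# name, or undef if the name cannot be resolved.
sub ip_for_host {
    my ($name) = @_;
    my $packed = gethostbyname($name);   # scalar context: packed address
    return undef unless defined $packed;
    return inet_ntoa($packed);
}

my $host = hostname();
my $ip   = ip_for_host($host);
print defined $ip ? "$host is $ip\n" : "could not resolve $host\n";
```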

How can I do RPC in Perl?

AUTHOR AND COPYRIGHT

This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.

Irrespective of its distribution, all code examples in this file
are hereby placed into the public domain. You are permitted and
encouraged to use this code in your own programs for fun
or for profit as you see fit. A simple comment in the code giving
credit would be courteous but is not required.