Practical web programming in Haskell

This page is under construction. Feel free to help out. If you make substantial edits, please add your name to the authors list at the bottom of this page, so that you can be credited if this is ever published in another medium.

1 Introduction

This tutorial aims to get you started with writing web applications
in Haskell. We describe a relatively light-weight
approach to Haskell web programming
which uses a CGI library and an XHTML combinator library.

We think that while the approach we describe here is not as sophisticated
or innovative as some other approaches, it is simple, portable and easy
to understand if you are already familiar with web programming in other
languages.

The tutorial starts with preliminaries such as how to install the
necessary software and how to compile and run your web
applications. We then show a number of working small example programs
which introduce the basic features of the CGI and XHtml libraries. We
then move on to how to use monad transformers to add application
specific functionality such as sessions to the CGI monad, and how to
create database-driven web applications.
We also present FastCGI, and an approach to using dynamically
loaded Haskell code.

1.2 Assumed knowledge

This tutorial is not meant as an introduction to Haskell or web programming.
We will assume that you have some familiarity with the following
concepts:

1.2.1 Haskell

If you are new to Haskell and want to learn about the language in
general, have a look at the lists of books and tutorials. You may want
to start with Haskell in 5 steps.

1.2.2 (X)HTML

The combinators in the XHtml library do not make much sense unless you
understand at least some parts of HTML.

1.2.3 CGI

CGI (Common Gateway Interface) programs are programs which run on the
web server. They are given input which comes from the user's browser,
and their output is given to the browser.

To really understand how the CGI library works, you probably need to know
a thing or two about CGI. The authoritative resource on CGI is the
CGI specification.

2 Required software

2.1 Haskell compiler

GHC, the Glasgow Haskell
Compiler, is the Haskell implementation that we will use in this tutorial.
However, any Haskell implementation that supports Haskell98 and multi-parameter
type classes should work.

2.2 Libraries: xhtml and cgi

If your Haskell implementation does not come with the xhtml and
cgi packages, download them from
HackageDB.

2.3 Web server

You need to have access to a web server on which you can run CGI programs.
The most convenient way to do this when learning and developing is to run
a web server on your development machine. If you run the programs on some
other machine, you need to make sure that you compile your programs
so that they can run on that machine. This normally means that the machines
must have the same architecture and run the same operating system.

2.3.1 Deploying statically linked applications

Linking your applications statically by giving the flags -static
-optl-static to GHC will avoid problems with missing libraries on
the web server.

The -static flag in GHC 6.8.2 does not link the libraries in the correct order, resulting in a link failure (which you can hack around if you have to by shuffling -lpthread after -lrt in the gargantuan linker invocation). This problem should disappear with GHC 6.8.3.

If linking fails with missing symbols, you may need to add extra-libraries fields to the .cabal files of the libraries involved. Note that many linkers are sensitive to the order of the -l arguments, so the order of libraries in this field matters.

3 Compiling and running web applications

Use GHC to produce a binary executable called prog.cgi from the Haskell
source code file prog.hs:

ghc --make -package cgi -package xhtml -o prog.cgi prog.hs

Put the compiled program in the cgi-bin directory,
or give it the extension .cgi, depending on the configuration
of the web server.

To run the compiled program, visit the URL of the CGI
program with your web browser.

4 Simple examples

4.1 Hello World

Here is a very simple example which just outputs some static HTML.
The type signatures in this code are optional. We show them here
for clarity, but omit them in some later examples.

The page function constructs an HTML document which consists
of a body containing a single header element which contains the text
"Hello World". The CGI-action cgiMain renders the HTML document
as a string, and produces that string as output. The main function
runs cgiMain, using the normal CGI protocol for input and output.
It also uses handleErrors to output an error page in case cgiMain
throws an exception.
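The program described above might look like the following minimal sketch. It assumes the cgi package's Network.CGI module and the xhtml package's Text.XHtml module; the names page and cgiMain are the ones used in the description.

```haskell
import Network.CGI
import Text.XHtml

-- An HTML document: a body containing one header element.
page :: Html
page = body << h1 << "Hello World"

-- Render the document to a string and send it as the response.
cgiMain :: CGI CGIResult
cgiMain = output $ renderHtml page

-- Run the action using the CGI protocol, with a default error page.
main :: IO ()
main = runCGI $ handleErrors cgiMain
```

Saved as prog.hs, this can be compiled with the ghc command shown in section 3.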

Fans of one-liners may like this version better (handleErrors has been
omitted since this simple program will not throw any exceptions):
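A sketch of such a one-liner, under the same assumptions about Network.CGI and Text.XHtml:

```haskell
import Network.CGI
import Text.XHtml

-- Build, render and output the page in a single expression.
main :: IO ()
main = runCGI $ output $ renderHtml $ body << h1 << "Hello World"
```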

4.4 File uploads

We first output a file upload form, which should use the HTTP POST method
and the multipart/form-data content type. Here we see an example of the use of
HTML attributes, added with the ! operator.

For efficiency reasons, we use Data.ByteString.Lazy to represent the file contents.
getInputFPS gets the value of an input variable as a lazy ByteString.
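The following is a minimal sketch of a file upload program along the lines described above, not a definitive implementation. The input name "file", the uploadDir path and the helper names fileForm, saveFile and basename are assumptions made for illustration. The program shows the form when no file has been submitted, and saves the uploaded file otherwise.

```haskell
import Network.CGI
import Text.XHtml
import qualified Data.ByteString.Lazy as BS

-- Hypothetical upload directory; it must exist and be writable
-- by the web server.
uploadDir :: FilePath
uploadDir = "../upload"

-- The upload form: POST method, multipart/form-data content type.
-- Attributes are attached with the ! operator.
fileForm :: Html
fileForm = form ! [method "post", enctype "multipart/form-data"]
             << [afile "file", submit "" "Upload"]

-- Strip any directory part that the browser sent with the file name.
basename :: FilePath -> FilePath
basename = reverse . takeWhile (`notElem` "/\\") . reverse

-- Write the uploaded contents (a lazy ByteString) to uploadDir.
saveFile :: String -> CGI Html
saveFile name = do
    mcontents <- getInputFPS "file"
    case mcontents of
      Nothing -> return $ paragraph << "No file contents received."
      Just contents -> do
        let path = uploadDir ++ "/" ++ basename name
        liftIO $ BS.writeFile path contents
        return $ paragraph << ("Saved as " ++ path ++ ".")

cgiMain :: CGI CGIResult
cgiMain = do
    mname <- getInputFilename "file"
    h <- maybe (return fileForm) saveFile mname
    output $ renderHtml $ body << h

main :: IO ()
main = runCGI $ handleErrors cgiMain
```

Using basename keeps a client-supplied name such as C:\photos\cat.png from escaping the upload directory.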

4.5 Error handling

handleErrors catches all exceptions and
outputs a default error page with some information about the exception.
You can write your own exception handler if you want to do something else
when an exception is thrown. It can be useful to set
the response code, e.g. 404.
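As a hedged sketch, a program with its own exception handler and an explicit 404 response could look like this. It assumes that Network.CGI exports catchCGI, outputError and outputNotFound, which is the case in the versions of the cgi package we are aware of; the exact type of catchCGI has varied between releases. The "page" input and the handler name are made up for the example.

```haskell
import Control.Exception (SomeException)
import Network.CGI

-- Serve one known page; answer 404 Not Found for anything else.
cgiMain :: CGI CGIResult
cgiMain = do
    mpage <- getInput "page"
    case mpage of
      Just "home" -> output "Welcome home.\n"
      _           -> outputNotFound "the requested page"

-- Custom handler: report any exception as a 500 response.
handler :: SomeException -> CGI CGIResult
handler e = outputError 500 "Internal Server Error" [show e]

main :: IO ()
main = runCGI (cgiMain `catchCGI` handler)
```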

4.6 Returning non-HTML

Of course we do not have to output HTML. Use setHeader to set the value
of the Content-type header, and you can output whatever string you like.
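For instance, a minimal sketch of a program that returns plain text rather than HTML:

```haskell
import Network.CGI

main :: IO ()
main = runCGI $ do
    -- Tell the browser that the body is plain text, not HTML.
    setHeader "Content-type" "text/plain"
    output "This is plain text.\n"
```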

4.7 Setting response headers

You can use the setHeader function to set arbitrary HTTP response headers.
You can also set the response code, as seen above.

Example: output raw file data (with last-modified)
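A possible sketch of this example is shown below. The file path and content type are hypothetical, and the code assumes a version of the directory package in which getModificationTime returns a UTCTime, and a version of the time package that exports defaultTimeLocale. The file is read as a lazy ByteString and sent with outputFPS.

```haskell
import Network.CGI
import qualified Data.ByteString.Lazy as BS
import Data.Time
import System.Directory (getModificationTime)

-- Format a time the way HTTP date headers expect it.
httpDate :: UTCTime -> String
httpDate = formatTime defaultTimeLocale "%a, %d %b %Y %H:%M:%S GMT"

cgiMain :: CGI CGIResult
cgiMain = do
    let path = "data/logo.png"  -- hypothetical file to serve
    modified <- liftIO $ getModificationTime path
    contents <- liftIO $ BS.readFile path
    setHeader "Content-type" "image/png"
    setHeader "Last-modified" (httpDate modified)
    outputFPS contents

main :: IO ()
main = runCGI $ handleErrors cgiMain
```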

5 Going further

This section explores some of the possibilities beyond basic web
application programming.

5.1 Extending the CGI monad with monad transformers

At this point, you should be able to create many useful CGI scripts.
As your scripts get more ambitious, however, you may find yourself
needing to pass "global" parameters to your CGI actions (e.g. database
connections, session information). Rather than explicitly passing
these values around, you can extend the CGI monad to do this work for
you.

The Network.CGI.Monad module defines a CGI monad transformer, allowing
us to build a new monad that does everything the CGI monad does -- and
more!

For example, let's define a new CGI monad that provides a database
connection (in this example, we use the Database.HSQL.PostgreSQL module
for our database). Since it will be used by the CGI application, we
will call the new monad "App".

So now we have an App monad that gives us all the functionality of
CGI, but also carries around a database connection. The last step is
to define the function that creates the monad so we can run actions
inside it.

The function that runs the monad should also make sure that the
database connection gets released properly when the monad ends or if
an exception is thrown.
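Putting the pieces of this section together, here is a minimal sketch of an App monad. To keep it self-contained it does not use Database.HSQL.PostgreSQL: the Connection type and the connect and disconnect functions below are hypothetical stand-ins for a real database API, and the names runApp and appMain are likewise assumptions. The App monad is a ReaderT layer over the CGI monad, and bracket ensures that the connection is released.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Control.Exception (bracket)
import Control.Monad.Reader
import Control.Monad.Trans (lift)
import Network.CGI
import Network.CGI.Monad (MonadCGI(..))

-- Hypothetical stand-in for a real database connection.
data Connection = Connection { connName :: String }

connect :: String -> IO Connection
connect = return . Connection

disconnect :: Connection -> IO ()
disconnect _ = return ()

-- CGI plus a read-only Connection that every action can ask for.
newtype App a = App (ReaderT Connection (CGIT IO) a)
    deriving (Functor, Applicative, Monad, MonadIO, MonadReader Connection)

-- Delegate the CGI primitives to the underlying CGI monad.
instance MonadCGI App where
    cgiAddHeader n v = App $ lift $ cgiAddHeader n v
    cgiGet f         = App $ lift $ cgiGet f

-- Open the connection, run the action, and always disconnect,
-- even if an exception is thrown.
runApp :: App CGIResult -> IO ()
runApp (App a) =
    bracket (connect "appdb") disconnect $ \c -> runCGI (runReaderT a c)

appMain :: App CGIResult
appMain = do
    c <- ask
    setHeader "Content-type" "text/plain"
    output ("Connected to " ++ connName c ++ "\n")

main :: IO ()
main = runApp appMain
```

Because App is an instance of MonadCGI, ordinary functions such as setHeader, getInput and output can be used inside it unchanged.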

5.2 FastCGI

FastCGI is a standard for CGI-like programs that are not restarted
for every request. This reduces the overhead involved in handling each
request, and reduces the server's response time for each request.
The overhead involved in starting a new process for each
request can also include the need to set up new DB connections
every time. With FastCGI, DB connections can be reused.

To use FastCGI, install the FastCGI library and get a web server which
can run FastCGI programs. In your program, import Network.FastCGI and
run your CGI action with runFastCGI.
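A minimal FastCGI program might look like the sketch below. It assumes the fastcgi package, whose Network.FastCGI module re-exports Network.CGI, so no separate import of Network.CGI is needed.

```haskell
import Network.FastCGI

-- The action passed to runFastCGI is run once per request,
-- while the process itself stays alive between requests.
main :: IO ()
main = runFastCGI $ handleErrors $ output "Hello from FastCGI\n"
```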

5.3 SCGI

SCGI is a simpler alternative to FastCGI for writing CGI-like programs in persistent processes, external to the web server. SCGI is less featureful than FastCGI, but has the advantage that it does not require an external library.

Install SCGI, import Network.SCGI, and use runSCGI. Everything else is then done inside a CGI monad as above.

5.4 URL rewriting

5.5 Dynamic loading

6 Database-driven web-applications

6.1 Persistent DB connections with FastCGI

FastCGI programs aren't restarted for each request; only the action passed to runFastCGI is re-run. Everything you set up outside of that loop (handles, data structures etc.) will be persistent. However, you need to handle errors in that setup code yourself, because it runs outside of handleErrors.
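The sketch below illustrates the idea without a real database: an IORef counter, created once before runFastCGI, stands in for a persistent connection. The counter is an assumption made for illustration; a real program would open its database connection at the same point and handle any errors from doing so itself.

```haskell
import Data.IORef
import Network.FastCGI

main :: IO ()
main = do
    -- Set up once per process, outside the request loop.
    counter <- newIORef (0 :: Int)
    -- Run once per request; the counter survives between requests.
    runFastCGI $ handleErrors $ do
        n <- liftIO $ atomicModifyIORef' counter (\k -> (k + 1, k + 1))
        setHeader "Content-type" "text/plain"
        output ("Request number " ++ show n ++ " in this process\n")
```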