Note: This page is currently being written and is in an intermediate state.

Note 2: There is currently a WAI package on Hackage which is actively maintained and very close to the proposal described here. I recommend we remove this page. -- Michael Snoyman

Latest revision as of 17:21, 22 June 2010

As Haskell becomes more widely used, more people will want to use it for writing web applications. Web applications have so far been a domain dominated by dynamic languages such as PHP, Python and Ruby. To write a web application you generally need two things:

A high quality web server. Production web servers need to be stable, have good performance and be easy to configure and manage. Writing such a web server takes considerable effort.

A framework. A framework helps the user to render HTML pages, persist state between requests, etc. There are several web frameworks available for Haskell, such as HAppS, WASH and HSP, to name a few.

At the moment, picking a framework limits the choice of usable web servers, and vice versa. By creating a standardized interface between web servers and frameworks we could separate the choice of framework from the choice of web server. This frees the server and framework developers to focus on working on their preferred area.

This specification proposes a standardized interface between web servers and web applications or frameworks: the Haskell Web Application Interface (WAI).

A standardized interface is of no use if no one implements it. Since there are currently no web servers or frameworks that implement WAI, it must be made easy to implement so that the investment in supporting it is low, making it more likely to be widely used.

In addition to keeping the interface simple it also needs to avoid adding new dependencies (except maybe for a package defining the types defined in the interface) to the web servers and applications implementing it.

Finally, it is not a goal of the interface to provide a nice API for writing web applications; that is the job of the framework running on top of the web server.

The WAI interface has two sides: the "server" side, and the "application" or "framework" side. The server calls a function that is provided by the application side. How the server is configured with this function is not a part of this specification.

An application is simply a function from an environment (described later) to a response. We use the word "application" here in a very general sense: WAI is not intended for writing applications directly, but as an interface on which to build frameworks. However, we will continue to use the word "application" to refer to both applications and frameworks. A simple application might look like this:
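A minimal sketch of such an application follows. All type and function names here (<code>Enumerator</code> as a newtype, <code>Headers</code>, the one-field <code>Environment</code>, <code>singleChunk</code>, <code>helloApp</code>) are assumptions made for illustration and are not fixed by this specification; the response is the status/headers/body tuple described later in this page.

```haskell
{-# LANGUAGE RankNTypes #-}
import qualified Data.ByteString.Char8 as B

-- Hypothetical stand-ins for the types this specification describes.
-- The enumerator is wrapped in a newtype so it can be stored in a tuple.
newtype Enumerator = Enumerator
  { runEnumerator :: forall a. (a -> B.ByteString -> IO (Either a a)) -> a -> IO a }

type Headers = [(B.ByteString, B.ByteString)]

-- Only one field is modelled here; a real environment carries more.
newtype Environment = Environment { pathInfo :: B.ByteString }

type Application = Environment -> IO (Int, Headers, Enumerator)

-- An enumerator that hands exactly one chunk to the consumer.
singleChunk :: B.ByteString -> Enumerator
singleChunk chunk = Enumerator (\step seed -> either id id <$> step seed chunk)

-- Responds to every request with a fixed plain-text body.
helloApp :: Application
helloApp _env =
  return (200, [(B.pack "Content-Type", B.pack "text/plain")], singleChunk body)
  where
    body = B.pack "Hello, world!\n"
```

A server would call <code>helloApp</code> once per request and then run the returned enumerator with a step function that writes each chunk to the client socket.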

Note that a function may play the role of a server with respect to some application(s) while also acting as an application with respect to some server(s). Such "middleware" components can perform functions such as:

Routing requests to different applications based on the request URL. This can be used to have several applications or frameworks run side by side in the same server.

Performing content postprocessing.

Loading an application dynamically (using for example the GHC API) on every request, removing the manual recompilation step during development.
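The first kind of middleware, URL-based routing, could be sketched as below. The trimmed-down <code>Environment</code>, the abstract response type and the <code>route</code> function are hypothetical names used only for this example. The sketch matches on raw byte prefixes, so a mount point of <code>/blog</code> would also match <code>/blogger</code>; a real router would likely match on path segment boundaries.

```haskell
import qualified Data.ByteString.Char8 as B

-- Hypothetical, trimmed-down environment: only the two path fields.
data Environment = Environment
  { scriptName :: B.ByteString
  , pathInfo   :: B.ByteString
  } deriving (Eq, Show)

-- The response type is left abstract, since routing never inspects it.
type Application resp = Environment -> IO resp

-- Dispatch to the first application whose mount point is a prefix of
-- pathInfo, moving the matched prefix from pathInfo to scriptName.
-- Falls back to the given application when nothing matches.
route :: [(B.ByteString, Application resp)] -> Application resp -> Application resp
route mounts fallback env =
  case [ (prefix, app) | (prefix, app) <- mounts, prefix `B.isPrefixOf` pathInfo env ] of
    (prefix, app) : _ ->
      app env { scriptName = scriptName env `B.append` prefix
              , pathInfo   = B.drop (B.length prefix) (pathInfo env)
              }
    [] -> fallback env
```

Because <code>route</code> both receives an environment and calls other applications, it acts as an application towards the server and as a server towards the mounted applications.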

The application function takes one argument of type Environment. The environment contains the different parts of the HTTP request and optionally some extra pieces of information describing the web server and the environment in which it is run (such as the shell's environment variables).

When called by the server the application function must return a tuple containing a valid HTTP status code, a list of HTTP response headers, and an enumerator (described below) containing the body of the response.

The server must transmit the body (which it receives by using the enumerator provided by the application) in an unbuffered fashion, completing the transmission of each chunk of bytes before requesting another one from the application. This means that applications should perform their own buffering.

To be able to represent an HTTP message we need a type to represent bytes. Although some parts of a message can be represented by other types, be it integers or strings, some parts are properly viewed as a sequence of bytes (e.g. the message body). Haskell has three different types that could be and are used for this purpose:

String - Used both to represent bytes (e.g. in the Socket API) and text. Has an inefficient memory representation giving it a larger memory footprint and poor cache behavior. Using it for storing bytes is also considered bad style since it is intended to represent Unicode code points.

[Word8] - Has the same properties as String except for being explicitly intended to only contain binary data.

ByteString - A fast and memory efficient representation. Used in this proposal because it can easily be converted to the other two types, while the reverse is not possible without a performance penalty.

=== The Enumerator Type ===

The web server needs to provide the application with the data in the request body, and the application needs to provide the web server with a response body (e.g. an HTML page). They could do so using bytestrings. However, if the amount of data to send is large (e.g. a big file), all of it would have to be kept in memory, leading to unnecessarily high memory usage. A way to stream data between the server and the client is needed.

Streams can be represented in Haskell using lists or some optimized representation like lazy bytestrings. However, using either of these two options is problematic in a web server serving hundreds or even thousands of requests per second, for the following reason: when the web application opens a file (or some other resource) for sending to the client, it needs to free this resource when it is no longer needed. Stream I/O using lists or lazy bytestrings uses unsafeInterleaveIO together with a finalizer that is run by the garbage collector to free the resource (i.e. the file) when it is no longer needed. But since the garbage collector runs at some unpredictable time in the future, the server might run out of resources (e.g. file handles) before it runs, causing the server to crash or become unresponsive.

To avoid this problem resources need to be freed as soon as they are no longer needed. There are (at least) two different ways to achieve this. The first is to use an iterator type interface that provides an explicit closing of the underlying resource:
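A minimal sketch of such an iterator interface is given here. The record and its field names (<code>Iterator</code>, <code>next</code>, <code>close</code>) and the helpers <code>fileIterator</code> and <code>drain</code> are hypothetical and chosen only to illustrate the idea of explicit closing.

```haskell
import Control.Exception (finally)
import qualified Data.ByteString as B
import System.IO (IOMode (ReadMode), hClose, openBinaryFile)

-- Hypothetical iterator: the caller pulls chunks and must call close itself.
data Iterator = Iterator
  { next  :: IO (Maybe B.ByteString)  -- Nothing once the input is exhausted
  , close :: IO ()                    -- releases the underlying resource
  }

-- Build an iterator over a file, reading at most chunkSize bytes per call.
fileIterator :: Int -> FilePath -> IO Iterator
fileIterator chunkSize path = do
  h <- openBinaryFile path ReadMode
  let readChunk = do
        chunk <- B.hGet h chunkSize
        return (if B.null chunk then Nothing else Just chunk)
  return (Iterator readChunk (hClose h))

-- Consume the whole iterator and return the number of bytes seen.
-- The consumer is responsible for closing, even if an exception occurs.
drain :: Iterator -> IO Int
drain it = go 0 `finally` close it
  where
    go total = do
      mchunk <- next it
      case mchunk of
        Nothing    -> return total
        Just chunk -> go $! total + B.length chunk
```

Note that correctness depends on every consumer remembering to call <code>close</code>; forgetting to do so leaks the file handle.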

This is the solution used in most imperative languages, e.g. Python and Java. The other option is to use an enumerator-style (e.g. for-each) interface and have the enumerator free the resource automatically when the iteration is finished. Oleg showed how this can be implemented using a left fold.

This particular enumerator, which will be used for all data streaming in this specification, is a normal, strict left fold with some support for early termination. If the consumer wants to signal that it doesn't want to consume more data, it can return <code>Left seed</code>. If it wants to continue, it returns <code>Right seed</code> instead. The consumer will likely want to perform some I/O during each iteration, e.g. to send a part of a large file over the network.
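The left fold with early termination described above might be sketched as follows. The names <code>Step</code>, <code>enumFile</code> and <code>countUpTo</code> are assumptions for this example; the enumerator type of the specification would correspond to <code>Step a -> a -> IO a</code> for every <code>a</code>.

```haskell
import qualified Data.ByteString as B
import System.IO (IOMode (ReadMode), withBinaryFile)

-- Assumed shape of the consumer's step function: take the current seed and
-- a chunk, and return Left to stop early or Right to continue.
type Step a = a -> B.ByteString -> IO (Either a a)

-- Enumerate a file in chunks of at most chunkSize bytes. withBinaryFile
-- closes the handle when the fold finishes, stops early, or throws.
enumFile :: Int -> FilePath -> Step a -> a -> IO a
enumFile chunkSize path step seed0 =
  withBinaryFile path ReadMode (\h -> loop h seed0)
  where
    loop h seed = do
      chunk <- B.hGet h chunkSize
      if B.null chunk
        then return seed                    -- end of input
        else do
          result <- step seed chunk
          case result of
            Left seed'  -> return seed'     -- consumer asked to stop
            Right seed' -> loop h $! seed'  -- continue with a forced seed

-- A consumer that counts bytes and stops once the limit is reached.
countUpTo :: Int -> Step Int
countUpTo limit total chunk
  | total' >= limit = return (Left total')
  | otherwise       = return (Right total')
  where
    total' = total + B.length chunk
```

Here the resource is released by the enumerator itself, so a consumer such as <code>countUpTo</code> cannot leak the file handle, whether it reads to the end or stops early.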

=== The Environment Type ===

The <code>requestMethod</code> field corresponds directly to the HTTP request method and is given as an enumerated data type, since applications are likely to pattern match on it to decide how a request should be handled.

scriptName

The initial portion of the request URL's "path" that corresponds to the application object, so that the application knows its virtual "location". This may be an empty string, if the application corresponds to the "root" of the server.

pathInfo

The remainder of the request URL's "path", designating the virtual "location" of the request's target within the application. This may be an empty string, if the request URL targets the application root and does not have a trailing slash.

queryString

The portion of the request URL that follows the "?", if any. May be empty or absent.

requestProtocol

The version of the protocol the client used to send the request. Typically this will be something like "HTTP/1.0" or "HTTP/1.1" and may be used by the application to determine how to treat any HTTP request headers.

headers

Variables corresponding to the client-supplied HTTP request headers. The presence or absence of these variables should correspond with the presence or absence of the appropriate HTTP header in the request.
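The fields above could be collected in a record along the following lines. The concrete field types, the <code>Method</code> constructors and the helper <code>mkEnvironment</code> are assumptions for illustration only. The helper shows how a request target is split into <code>scriptName</code>, <code>pathInfo</code> and <code>queryString</code>, assuming the mount point is a prefix of the request path.

```haskell
import qualified Data.ByteString.Char8 as B

-- Assumed enumeration of request methods.
data Method = OPTIONS | GET | HEAD | POST | PUT | DELETE | TRACE | CONNECT
  deriving (Eq, Show)

type Headers = [(B.ByteString, B.ByteString)]

-- Hypothetical record holding the fields described above.
data Environment = Environment
  { requestMethod   :: Method
  , scriptName      :: B.ByteString
  , pathInfo        :: B.ByteString
  , queryString     :: Maybe B.ByteString  -- Nothing when there is no "?"
  , requestProtocol :: B.ByteString
  , headers         :: Headers
  } deriving (Eq, Show)

-- Build an environment from a method, the application's mount point and
-- the request target. Assumes the mount point is a prefix of the path.
mkEnvironment :: Method -> B.ByteString -> B.ByteString -> Environment
mkEnvironment method mount target = Environment
  { requestMethod   = method
  , scriptName      = mount
  , pathInfo        = B.drop (B.length mount) path
  , queryString     = if B.null rest then Nothing else Just (B.drop 1 rest)
  , requestProtocol = B.pack "HTTP/1.1"
  , headers         = []
  }
  where
    (path, rest) = B.break (== '?') target

-- Pattern matching on the method to decide how to handle a request.
isSafe :: Environment -> Bool
isSafe env = case requestMethod env of
  GET  -> True
  HEAD -> True
  _    -> False
```

With a mount point of <code>/app</code> and a target of <code>/app/items?x=1</code>, this yields a <code>scriptName</code> of <code>/app</code>, a <code>pathInfo</code> of <code>/items</code> and a <code>queryString</code> of <code>Just "x=1"</code>.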

If the application does not provide a Content-Length header, a server may choose one of two approaches to handle it. The first and simplest one is to close the client connection after the response has been sent.

The other option is only possible if both the server and the client support HTTP/1.1 "chunked encoding". In this case the server may use chunked encoding to send each chunk and compute the length of each chunk as it does so. By doing so the server may keep the connection alive if it chooses to do so. The server must respect RFC 2616 when doing this.
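As an illustration of the second approach, the framing used by HTTP/1.1 chunked encoding can be sketched as below: each chunk is sent as its length in hexadecimal, a CRLF, the data and another CRLF, and the body ends with a zero-length chunk. The function names are hypothetical, and the sketch omits trailers and chunk extensions. Empty chunks are skipped because a zero-length chunk would terminate the body prematurely.

```haskell
import qualified Data.ByteString.Char8 as B
import Numeric (showHex)

crlf :: B.ByteString
crlf = B.pack "\r\n"

-- Frame one non-empty body chunk: hex length, CRLF, data, CRLF.
encodeChunk :: B.ByteString -> B.ByteString
encodeChunk chunk =
  B.concat [B.pack (showHex (B.length chunk) ""), crlf, chunk, crlf]

-- The zero-length chunk that terminates a chunked body (no trailers).
lastChunk :: B.ByteString
lastChunk = B.pack "0\r\n\r\n"

-- Encode a whole body given as a list of chunks, dropping empty ones.
encodeBody :: [B.ByteString] -> B.ByteString
encodeBody chunks =
  B.concat (map encodeChunk (filter (not . B.null) chunks) ++ [lastChunk])
```

In a server, <code>encodeChunk</code> would be applied inside the step function passed to the application's enumerator, so that each chunk is framed and sent as soon as it is produced.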