More: Systems Programming with PLT Scheme

Matthew Flatt

In contrast to the impression that Quick: An Introduction to PLT Scheme with Pictures may give, PLT Scheme is
not just another pretty face. Underneath the graphical facade of
DrScheme lies a sophisticated toolbox for managing threads and
processes, which is the subject of this tutorial.

Specifically, we show how to build a secure, multi-threaded,
servlet-extensible, continuation-based web server. We use much more of
the language than in Quick: An Introduction to PLT Scheme with Pictures, and we expect you to click on syntax or
function names that you don’t recognize (which will take you to the
relevant documentation). Beware that the last couple of sections
present material that is normally considered difficult. If you’re
still new to Scheme and have relatively little programming experience,
you may want to skip to Guide: PLT Scheme.

To get into the spirit of this tutorial, we suggest that you set
DrScheme aside for a moment, and switch to raw mzscheme in a
terminal. You’ll also need a text editor, such as emacs or
vi. Finally, you’ll need a web client, perhaps lynx or
firefox.

Of course, if you’re already spoiled, you can keep using
DrScheme.

1 Ready...

If you’re using a plain terminal, if you have GNU Readline installed
on your system, and if you’d like Readline support in mzscheme,
then evaluate (require readline). If you also evaluate
(install-readline!), then your "~/.mzschemerc" is
updated to load Readline whenever you start mzscheme for
interactive evaluation.

4 “Hello World” Server

The server accepts TCP connections through a listener, which
we create with tcp-listen. To make interactive development
easier, we supply #t as the third argument to
tcp-listen, which lets us re-use the port number without
waiting on TCP timeouts.
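As a rough sketch of what such a server could look like, consider the definitions below. The names serve, accept-and-handle, and handle, the backlog of 5 pending connections, and the exact reply text are assumptions made for illustration; only tcp-listen and its #t third argument come from the description above.

```scheme
#lang scheme
;; Minimal "Hello World" server sketch (names and reply text are assumed).

(define (serve port-no)
  ;; 5  = maximum number of pending connections (an arbitrary choice)
  ;; #t = allow the port number to be re-used right away
  (define listener (tcp-listen port-no 5 #t))
  (define (loop)
    (accept-and-handle listener)
    (loop))
  (loop))

(define (accept-and-handle listener)
  ;; Block until a client connects, then serve it and close both streams
  (define-values (in out) (tcp-accept listener))
  (handle in out)
  (close-input-port in)
  (close-output-port out))

(define (handle in out)
  ;; Discard the request header (everything up to the first blank line)
  (regexp-match #rx"(\r\n|^)\r\n" in)
  ;; Send a fixed reply
  (display "HTTP/1.0 200 Okay\r\n" out)
  (display "Server: k\r\nContent-Type: text/html\r\n\r\n" out)
  (display "<html><body>Hello, world!</body></html>" out))
```

With these definitions, evaluating (serve 8080) starts the loop and does not return.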

Now point your browser to http://localhost:8080 (assuming that
you used 8080 as the port number, and that the browser is
running on the same machine) to receive a friendly greeting from your
web server.

5 Server Thread

Before we can make the web server respond in more interesting ways, we
need to get a Scheme prompt back. Typing Ctl-C in your terminal window
interrupts the server loop (in DrScheme, click the Stop button once
instead):

> (serve 8080)
^Cuser break
>

Unfortunately, we cannot now re-start the server with the same port
number:

> (serve 8080)
tcp-listen: listen on 8080 failed (address already in use)

The problem is that the listener that we created with serve
is still listening on the original port number.

To avoid this problem, let’s put the listener loop in its own thread,
and have serve return immediately. Furthermore, we’ll have
serve return a function that can be used to shut down the
server thread and TCP listener:
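One way to arrange this is sketched below. The helper definitions are repeated only so that the example stands alone, and their names and reply text are assumptions; the part that matters is that serve starts the loop in a thread and returns a shutdown procedure.

```scheme
#lang scheme
;; Sketch: run the listener loop in its own thread (helper names assumed).

(define (serve port-no)
  (define listener (tcp-listen port-no 5 #t))
  (define (loop)
    (accept-and-handle listener)
    (loop))
  ;; Start the loop in a new thread so that `serve` returns immediately
  (define t (thread loop))
  ;; The result is a procedure that stops the thread and frees the port
  (lambda ()
    (kill-thread t)
    (tcp-close listener)))

(define (accept-and-handle listener)
  (define-values (in out) (tcp-accept listener))
  (handle in out)
  (close-input-port in)
  (close-output-port out))

(define (handle in out)
  ;; Skip the request header, then send a fixed reply
  (regexp-match #rx"(\r\n|^)\r\n" in)
  (display "HTTP/1.0 200 Okay\r\n\r\n" out)
  (display "<html><body>Hello, world!</body></html>" out))
```

A session would then look like (define stop (serve 8080)), later followed by (stop), after which the same port number can be served again.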

With this change, our server can now handle multiple connections at
once. The handler is so fast that this fact will be difficult to
detect, however, so try inserting (sleep (random 10)) before
the handle call above. If you make multiple connections from
the web browser at roughly the same time, some will return soon, and
some will take up to 10 seconds. The random delays will be independent
of the order in which you started the connections.
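The change that this passage relies on, giving every connection a thread of its own, might be sketched as follows. This is one plausible form under assumed names, with the experimental delay already inserted; handle is a stand-in that sends a fixed reply.

```scheme
#lang scheme
;; Sketch: one thread per connection, with an artificial random delay.

(define (accept-and-handle listener)
  (define-values (in out) (tcp-accept listener))
  ;; Handle the connection in a fresh thread, so that the listener
  ;; loop can go straight back to accepting the next client
  (thread
   (lambda ()
     (sleep (random 10)) ; delay of 0 to 9 seconds, for experimentation only
     (handle in out)
     (close-input-port in)
     (close-output-port out))))

(define (handle in out)
  ;; Skip the request header, then send a fixed reply
  (regexp-match #rx"(\r\n|^)\r\n" in)
  (display "HTTP/1.0 200 Okay\r\n\r\n" out)
  (display "<html><body>Hello, world!</body></html>" out))
```

Remove the sleep call once you have seen the connections complete out of order.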

7 Terminating Connections

A malicious client could connect to our web server and not send the
HTTP header, in which case a connection thread will idle forever,
waiting for the end of the header. To avoid this possibility, we’d
like to implement a timeout for each connection thread.

One way to implement the timeout is to create a second thread that
waits for 10 seconds, and then kills the thread that calls
handle. Threads are lightweight enough in Scheme that this
watcher-thread strategy works well:
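A sketch of the watcher-thread idea, under the same assumed names as before: the connection thread is remembered in t, and a second thread kills it after 10 seconds. Note that nothing here closes the streams when the kill happens.

```scheme
#lang scheme
;; Sketch: a watcher thread that kills a slow connection thread.

(define (accept-and-handle listener)
  (define-values (in out) (tcp-accept listener))
  ;; Connection thread
  (define t (thread
             (lambda ()
               (handle in out)
               (close-input-port in)
               (close-output-port out))))
  ;; Watcher thread: wait 10 seconds, then kill the connection thread.
  ;; Killing a thread that has already finished has no effect.
  (thread (lambda ()
            (sleep 10)
            (kill-thread t))))

(define (handle in out)
  ;; Skip the request header, then send a fixed reply
  (regexp-match #rx"(\r\n|^)\r\n" in)
  (display "HTTP/1.0 200 Okay\r\n\r\n" out)
  (display "<html><body>Hello, world!</body></html>" out))
```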

This first attempt isn’t quite right, because when the thread is
killed, its in and out streams remain open. We
could add code to the watcher thread to close the streams as well as
kill the thread, but Scheme offers a more general shutdown mechanism:
custodians. A custodian is a kind of container for all
resources other than memory, and it supports a
custodian-shutdown-all operation that terminates and closes
all resources within the container, whether they’re threads, streams,
or other kinds of limited resources.

Whenever a thread or stream is created, it is placed into the current
custodian as determined by the current-custodian
parameter. To place everything about a connection into a custodian, we
parameterize all the resource creations to go into a new
custodian:
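In sketch form, and again with assumed helper names, the custodian-based version could look like this. The watcher now shuts down the whole custodian instead of killing a single thread.

```scheme
#lang scheme
;; Sketch: put everything that belongs to a connection into one custodian.

(define (accept-and-handle listener)
  (define cust (make-custodian))
  ;; The streams and the thread created here are owned by `cust`
  (parameterize ([current-custodian cust])
    (define-values (in out) (tcp-accept listener))
    (thread (lambda ()
              (handle in out)
              (close-input-port in)
              (close-output-port out))))
  ;; Watcher thread (owned by the enclosing custodian, not by `cust`):
  ;; after 10 seconds, terminate and close whatever `cust` still holds
  (thread (lambda ()
            (sleep 10)
            (custodian-shutdown-all cust))))

(define (handle in out)
  ;; Skip the request header, then send a fixed reply
  (regexp-match #rx"(\r\n|^)\r\n" in)
  (display "HTTP/1.0 200 Okay\r\n\r\n" out)
  (display "<html><body>Hello, world!</body></html>" out))
```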

With this implementation, in, out, and the thread
that calls handle all belong to cust. In addition,
if we later change handle so that it, say, opens a file, then
the file handles will also belong to cust, so they will be
reliably closed when cust is shut down.

In fact, it’s a good idea to change serve so that it uses a
custodian, too:

That way, the main-cust created in serve not only
owns the TCP listener and the main server thread, it also owns every
custodian created for a connection. Consequently, the revised shutdown
procedure for the server immediately terminates all active connections,
in addition to the main server loop.
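The following sketch puts the two custodian levels together. The name main-cust comes from the text above; the remaining names and the reply text are assumptions carried over from the earlier sketches.

```scheme
#lang scheme
;; Sketch: a server-wide custodian that owns every connection custodian.

(define (serve port-no)
  (define main-cust (make-custodian))
  ;; The listener and the server thread are owned by `main-cust`;
  ;; custodians created while that thread runs are its children
  (parameterize ([current-custodian main-cust])
    (define listener (tcp-listen port-no 5 #t))
    (define (loop)
      (accept-and-handle listener)
      (loop))
    (thread loop))
  ;; Shutdown procedure: closes the listener, stops the loop, and
  ;; terminates all connections that are still active
  (lambda ()
    (custodian-shutdown-all main-cust)))

(define (accept-and-handle listener)
  (define cust (make-custodian))
  (parameterize ([current-custodian cust])
    (define-values (in out) (tcp-accept listener))
    (thread (lambda ()
              (handle in out)
              (close-input-port in)
              (close-output-port out))))
  ;; Watcher thread: give each connection at most 10 seconds
  (thread (lambda ()
            (sleep 10)
            (custodian-shutdown-all cust))))

(define (handle in out)
  ;; Skip the request header, then send a fixed reply
  (regexp-match #rx"(\r\n|^)\r\n" in)
  (display "HTTP/1.0 200 Okay\r\n\r\n" out)
  (display "<html><body>Hello, world!</body></html>" out))
```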

8 Dispatching

It’s finally time to generalize our server’s “Hello, World!”
response to something more useful. Let’s adjust the server so that we
can plug in dispatching functions to handle requests to different
URLs.

To parse the incoming URL and to more easily format HTML output, we’ll
require two extra libraries:

With the new require import and new handle,
dispatch, and dispatch-table definitions, our
“Hello World!” server has turned into an error server. You don’t have
to stop the server to try it out. After modifying "serve.ss"
with the new pieces, evaluate (enter! "serve.ss") and then
try again to connect to the server. The web browser should show an
“Unknown page” error in red.
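To make the pieces named above concrete, here is a sketch of how they might fit together. It assumes that the two extra libraries are xml (for xexpr->string) and net/url (for string->url), and that handlers are stored in a mutable hash table keyed by the first path element; the exact regular expressions and page contents are illustrative.

```scheme
#lang scheme
(require xml net/url)

(define (handle in out)
  (define req
    ;; Match the first line to extract the requested path
    (regexp-match #rx"^GET (.+) HTTP/[0-9]+\\.[0-9]+"
                  (read-line in)))
  (when req
    ;; Discard the rest of the header (up to the blank line)
    (regexp-match #rx"(\r\n|^)\r\n" in)
    ;; Dispatch on the path, then send the resulting page
    (let ([xexpr (dispatch (list-ref req 1))])
      (display "HTTP/1.0 200 Okay\r\n" out)
      (display "Server: k\r\nContent-Type: text/html\r\n\r\n" out)
      (display (xexpr->string xexpr) out))))

(define (dispatch str-path)
  ;; Parse the request as a URL and split its path into strings
  (define url (string->url str-path))
  (define path (map path/param-path (url-path url)))
  ;; Look up a handler by the first path element
  (define h (hash-ref dispatch-table (car path) #f))
  (if h
      ;; A handler receives the query as an association list
      (h (url-query url))
      ;; No handler registered: produce an error page
      `(html (head (title "Error"))
             (body
              (font ((color "red"))
                    "Unknown page: "
                    ,str-path)))))

;; Maps path strings to handler procedures; empty to begin with
(define dispatch-table (make-hash))
```

Because the table starts out empty, every request falls through to the error page.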

After adding these lines and evaluating (enter! "serve.ss"),
opening http://localhost:8081/hello should produce the old
greeting.
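Registering a handler for the "hello" path might look like the sketch below. The table is redefined here only to keep the example self-contained; in "serve.ss" you would add just the hash-set! call to the existing dispatch-table. The page content is an assumption.

```scheme
#lang scheme
;; Stand-in for the server's dispatch table (assumed to be a mutable hash)
(define dispatch-table (make-hash))

;; Register a handler for /hello; it ignores the query and
;; returns the greeting as an X-expression
(hash-set! dispatch-table "hello"
           (lambda (query)
             `(html (body "Hello, World!"))))
```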

9 Servlets and Sessions

Using the query argument that is passed to a handler by
dispatch, a handler can respond to values that a user
supplies through a form.

The following helper function constructs an HTML form. The
label argument is a string to show the user. The
next-url argument is a destination for the form results. The
hidden argument is a value to propagate through the form as a
hidden field. When the user responds, the "number" field in
the form holds the user’s value:
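A helper along these lines could be written as shown in the sketch below, together with two small handlers that use it. The argument names and the "number" field follow the description above; the page title, the "hidden" field name, and the handler names many and reply are assumptions.

```scheme
#lang scheme
;; Sketch of a form-building helper; pages are X-expressions.

(define (build-request-page label next-url hidden)
  `(html
    (head (title "Enter a Number"))
    (body ([bgcolor "white"])
          (form ([action ,next-url] [method "get"])
                ,label
                (input ([type "text"] [name "number"] [value ""]))
                (input ([type "hidden"] [name "hidden"] [value ,hidden]))
                (input ([type "submit"] [name "enter"] [value "Enter"]))))))

;; Ask for a count, and send the form results to /reply
(define (many query)
  (build-request-page "Number of greetings:" "/reply" ""))

;; `query` is an association list such as ((number . "3"))
(define (reply query)
  (define n (string->number (cdr (assq 'number query))))
  `(html (body ,@(for/list ([i (in-range n)])
                   " hello"))))
```

To try handlers like these in the server, they would also need entries in the dispatch table under "many" and "reply".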

As usual, once you have added these to your program, update with
(enter! "serve.ss"), and then visit
http://localhost:8081/many. Provide a number, and you’ll receive
a new page with that many “hello”s.

10 Limiting Memory Use

With our latest "many" servlet, we seem to have a new
problem: a malicious client could request so many “hello”s that the
server runs out of memory. Actually, a malicious client could also
supply an HTTP request whose first line is arbitrarily long.

The solution to this class of problems is to limit the memory use of a
connection. Inside accept-and-handle, after the definition of
cust, add the line
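The limit can be expressed with custodian-limit-memory, as in this sketch. The custodian is created here only so that the example stands alone, and the call may raise an exception on builds that lack memory accounting.

```scheme
#lang scheme
;; Stand-in for the per-connection custodian in accept-and-handle
(define cust (make-custodian))

;; Shut down `cust` if the memory charged to it exceeds 50 MB
(custodian-limit-memory cust (* 50 1024 1024))
```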

We’re assuming that 50 MB should be plenty for any
servlet. Garbage-collector overhead means that the actual memory use
of the system can be some small multiple of 50 MB. An important
guarantee, however, is that different connections will not be charged
for each other’s memory use, so one misbehaving connection will not
interfere with a different one.

So, with the new line above, and assuming that you have a couple of
hundred megabytes available for the mzscheme process to use,
you shouldn’t be able to crash the web server by requesting a
ridiculously large number of “hello”s.

Given the "many" example, it’s a small step to a web server
that accepts arbitrary Scheme code to execute on the server. In that
case, there are many additional security issues besides limiting
processor time and memory consumption. The
scheme/sandbox library provides support for managing
all those other issues.

11 Continuations

As a systems example, the problem of implementing a web server exposes
many system and security issues where a programming language can
help. The web-server example also leads to a classic, advanced Scheme
topic: continuations. In fact, this facet of a web server
needs delimited continuations, which PLT Scheme provides.

The problem solved by continuations is related to servlet sessions and
user input, where a computation spans multiple client connections
[Queinnec00]. Often, client-side computation (as in AJAX) is the
right solution to the problem, but many problems are best solved with
a mixture of techniques (e.g., to take advantage of the browser’s
“back” button).

As the multi-connection computation becomes more complex, propagating
arguments through query becomes increasingly tedious. For
example, we can implement a servlet that takes two numbers to add by
using the hidden field in the form to remember the first number:
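A sketch of such a servlet follows. The handler names sum, one, and two, the labels, and the compact form helper are assumptions; the point is that the first number travels through the second form as a hidden field.

```scheme
#lang scheme
;; Sketch: adding two numbers by threading state through a hidden field.

(define dispatch-table (make-hash))

;; Compact form helper, repeated so that the example stands alone
(define (build-request-page label next-url hidden)
  `(html (body (form ([action ,next-url] [method "get"])
                     ,label
                     (input ([type "text"] [name "number"] [value ""]))
                     (input ([type "hidden"] [name "hidden"] [value ,hidden]))
                     (input ([type "submit"] [name "enter"] [value "Enter"]))))))

;; Step 1: ask for the first number
(define (sum query)
  (build-request-page "First number:" "/one" ""))

;; Step 2: ask for the second number, carrying the first one along
(define (one query)
  (build-request-page "Second number:"
                      "/two"
                      (cdr (assq 'number query))))

;; Step 3: both numbers are now available in the query
(define (two query)
  (let ([n (string->number (cdr (assq 'hidden query)))]
        [m (string->number (cdr (assq 'number query)))])
    `(html (body "The sum is " ,(number->string (+ m n))))))

(hash-set! dispatch-table "sum" sum)
(hash-set! dispatch-table "one" one)
(hash-set! dispatch-table "two" two)
```

Every additional step would need another handler and another hidden value, which is the tedium referred to above.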

The problem is that get-number needs to send an HTML response
back for the current connection, and then it must obtain a response
through a new connection. That is, somehow it needs to convert the
page generated by build-request-page into a query
result:

Continuations let us implement a send/suspend operation that
does exactly that. The send/suspend procedure
generates a URL that represents the current connection’s computation,
capturing it as a continuation. It passes the generated URL to a
procedure that creates the query page; this query page is used as the
result of the current connection, and the surrounding computation
(i.e., the continuation) is aborted. Finally, send/suspend
arranges for a request to the generated URL (in a new connection) to
restore the aborted computation.

Specifically, we need prompt and abort from
scheme/control. We use prompt to mark the
place where a servlet is started, so that we can abort a computation
to that point. Change handle by wrapping a prompt
around the call to dispatch:
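The sketch below shows the wrapped call in context. To keep it self-contained, dispatch is replaced by a hypothetical stand-in that always aborts, which demonstrates that the aborted value becomes the result of the prompt form; the shape of handle is assumed from the dispatching section.

```scheme
#lang scheme
(require scheme/control xml)

;; Hypothetical stand-in for the real dispatcher: it aborts with a
;; page instead of returning normally
(define (dispatch str-path)
  (abort `(html (body "Aborted while handling " ,str-path)))
  '(html (body "never reached")))

(define (handle in out)
  (define req
    (regexp-match #rx"^GET (.+) HTTP/[0-9]+\\.[0-9]+"
                  (read-line in)))
  (when req
    ;; Discard the rest of the header (up to the blank line)
    (regexp-match #rx"(\r\n|^)\r\n" in)
    ;; `prompt` marks the point that a later `abort` escapes to;
    ;; the aborted value is used as the page to send
    (let ([xexpr (prompt (dispatch (list-ref req 1)))])
      (display "HTTP/1.0 200 Okay\r\n" out)
      (display "Server: k\r\nContent-Type: text/html\r\n\r\n" out)
      (display (xexpr->string xexpr) out))))
```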

Now, we can implement send/suspend. We use call/cc
in the guise of let/cc, which captures the current
computation up to an enclosing prompt and binds that
computation to an identifier, k in this case:
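Here is a sketch of send/suspend together with a get-number and sum2 that use it. The form helper is a compact stand-in, and generating tags from a counter is an assumption made to keep the example deterministic; any scheme that produces unique URLs would do.

```scheme
#lang scheme
(require scheme/control)

(define dispatch-table (make-hash))
(define next-id 0) ; counter used to build unique tags (an assumption)

;; Compact form helper, repeated so that the example stands alone
(define (build-request-page label next-url hidden)
  `(html (body (form ([action ,next-url] [method "get"])
                     ,label
                     (input ([type "text"] [name "number"] [value ""]))
                     (input ([type "hidden"] [name "hidden"] [value ,hidden]))
                     (input ([type "submit"] [name "enter"] [value "Enter"]))))))

(define (send/suspend mk-page)
  (let/cc k
    ;; Store the captured computation in the dispatch table under a new tag
    (set! next-id (add1 next-id))
    (define tag (format "k~a" next-id))
    (hash-set! dispatch-table tag k)
    ;; Abort to the enclosing prompt; the page is the prompt's result
    (abort (mk-page (string-append "/" tag)))))

(define (get-number label)
  (define query
    (send/suspend
     ;; Receives the URL that stands for the rest of the computation
     (lambda (k-url)
       (build-request-page label k-url ""))))
  ;; Reached later, when a request for k-url supplies the query
  (string->number (cdr (assq 'number query))))

(define (sum2 query)
  (define m (get-number "First number:"))
  (define n (get-number "Second number:"))
  `(html (body "The sum is " ,(number->string (+ m n)))))

(hash-set! dispatch-table "sum2" sum2)
```

Without a server, the flow can be simulated by hand: (prompt (sum2 '())) returns the first form, and applying the continuation stored under each generated tag inside a new prompt plays the role of the next connection.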

When the user submits the form, the handler associated with the form’s
URL is the old computation, stored as a continuation in the dispatch
table. Calling the continuation (like a function) restores the old
computation, passing the query argument back to that
computation.

In summary, the new pieces are: (require scheme/control),
adding prompt inside handle, the definitions of
send/suspend, get-number, and sum2, and
(hash-set! dispatch-table "sum2" sum2). Once you have
the server updated, visit http://localhost:8081/sum2.

Some of this material is based on relatively recent research, and more
information can be found in papers written by the authors of PLT
Scheme, including papers on MrEd [Flatt99], memory accounting
[Wick04], kill-safe abstractions [Flatt04], and
delimited continuations [Flatt07].

Bibliography

[Flatt99]

Matthew Flatt, Robert Bruce Findler, Shriram Krishnamurthi, and Matthias Felleisen, “Programming Languages as Operating Systems (or Revenge of the Son of the Lisp Machine),” International Conference on Functional Programming, 1999. http://www.ccs.neu.edu/scheme/pubs/icfp99-ffkf.pdf