In recent years, the exponential growth of the
Internet has been widely attributed to the usefulness of the
World-Wide Web (WWW). This is a distributed hypermedia
information resource based around the HyperText Markup Language
(HTML), which is delivered by widespread implementations of the
HyperText Transfer Protocol (HTTP). HTTP is an extensible
protocol for the transfer of generic data types, but is specially
optimised for the transfer of HTML. To quote the HTTP/1.0
specification [Draft1.0]:

"The
Hypertext Transfer Protocol (HTTP) is an
application-level protocol with the lightness and speed
necessary for distributed, collaborative, hypermedia
information systems. It is a generic, stateless,
object-oriented protocol which can be used for many
tasks, such as name servers and distributed object
management systems, through extension of its request
methods (commands). A feature of HTTP is the typing and
negotiation of data representation, allowing systems to
be built independently of the data being transferred."

The WWW is regularly used by millions of users
worldwide, and the resulting traffic is beginning to put such a
strain on the Internet's infrastructure that truly interactive
use of the network risks becoming an oxymoron. Although the main
cause of this is network speed, we will look at how the
inefficiencies of HTTP, particularly in conjunction with TCP,
have contributed to the problem. We also briefly examine other
issues concerning HTTP in its current incarnation and as an
evolving standard.

HTTP originated as a very simple protocol,
HTTP/0.9, developed to reduce the inefficiencies of the FTP
protocol. The goal was fast request-response interaction without
requiring state at the server. The protocol was later extended to
include a MIME-style wrapper (to convey the content type and
encoding of the returned document) and a basic authentication
mechanism. This extended protocol became HTTP/1.0, which is now
in very widespread use.

The HTTP model is extremely simple. The client
(user) establishes a connection to the remote server, then issues
a request. The server then processes the request, returns a
response, and closes the connection.
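
To make the model concrete, the whole exchange can be
sketched in a few lines of Python using raw sockets; the host
name and path are placeholders invented for the example.

    import socket

    # Open a fresh TCP connection for this single request.
    sock = socket.create_connection(("example.org", 80))  # placeholder host

    # Issue one request; the HTTP/1.0 model allows one object
    # per connection.
    sock.sendall(b"GET /index.html HTTP/1.0\r\n"
                 b"Host: example.org\r\n"
                 b"\r\n")

    # Read until the server closes the connection, which under
    # this model is what marks the end of the returned object.
    response = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        response += chunk
    sock.close()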

The simplicity of HTTP has been a major factor
in its rapid adoption, but this very simplicity has become its
main drawback.

One main problem with HTTP/1.0 is that
it opens a connection (usually TCP) for each request for
data and closes that connection as soon as the data object has
been delivered. TCP uses a slow-start mechanism to avoid
congestion, gradually increasing throughput to match the
available bandwidth. As a result, most HTTP transactions operate
at a fraction of the available bandwidth (a figure of around 10%
was quoted). This was discussed by Spero at the 31st IETF meeting
[31Minutes]. Figures quoted in Spero's paper on HTTP performance
over TCP [SperoAnal] are 530 ms for an open/get/close cycle on a
single data object, but only 120 ms if the connection is already
open. This is a substantial difference, and since HTTP/1.0 as it
stands cannot ask for any more than one object per request, every
retrieval pays the full cost. With small transfers the protocol
spends more time waiting for connections to be set up and torn
down than on the actual transfer of the data.

The above problem is particularly relevant for
small data objects: a study of 200,000 HTTP retrievals,
referenced in Padmanabhan and Mogul's paper [PadMog] on improving
HTTP latency, found that the mean size of the data objects
transported was 13,767 bytes, with a median of only 1,946 bytes
(excluding zero-length transfers).
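
A back-of-envelope sketch in Python shows why such small
transfers underuse the link. The round-trip time and segment
size below are assumed, illustrative values; only the object
size is taken from the study above.

    # Illustrative arithmetic, not a measurement.
    rtt = 0.070          # assumed round-trip time in seconds
    mss = 536            # assumed TCP maximum segment size in bytes
    object_size = 1946   # median object size from the study cited above

    elapsed = rtt        # connection setup (TCP three-way handshake)
    elapsed += rtt / 2   # the GET request travelling to the server

    # Slow start: the congestion window begins at one segment and
    # doubles every round trip, so a small object trickles out.
    cwnd, sent = 1, 0
    while sent < object_size:
        elapsed += rtt   # roughly one round trip per window of data
        sent += cwnd * mss
        cwnd *= 2

    print("approx. time: %.0f ms" % (elapsed * 1000))
    print("effective rate: %.1f kB/s" % (object_size / elapsed / 1000.0))

Under these assumptions the transfer never leaves slow start,
so the connection is closed long before TCP reaches the
bandwidth actually available.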

HTTP/1.0 offers only Basic access
authentication. This means that passwords and private details are
sent as plain text and so are vulnerable to snooping on the
network.
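
The weakness is easy to demonstrate: a Basic credential is
merely the base64 encoding of "user:password", and base64 is an
encoding, not encryption. The account below is invented for the
example.

    import base64

    # What a client sends in its Authorization header under Basic auth.
    credentials = base64.b64encode(b"alice:secret")   # invented account
    header = b"Authorization: Basic " + credentials
    print(header.decode())   # Authorization: Basic YWxpY2U6c2VjcmV0

    # Anyone snooping the network can reverse the encoding trivially.
    print(base64.b64decode(credentials))              # b'alice:secret'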

Scalability is also a serious problem with this
single-request-per-connection paradigm. One aspect stems from the
widespread use of TCP, which requires the server to maintain
connection state for four minutes after a connection is closed
(TCP's TIME_WAIT period); on a busy server this can amount to
thousands of control blocks, as discussed in Spero's paper
[SperoAnal]. Another scalability problem is that clients, to
overcome the one-data-item-per-connection limit, use multiple
threads with a connection per thread, thus increasing the load
seen by the already overloaded server. Netscape Navigator and
Microsoft Internet Explorer now do this multi-threading as
standard.

The HTTP Working Group was set up to develop
extensions to the existing HTTP/1.0. Its findings and proposals
were embodied in HTTP/1.1, which was published as an Internet
Draft before the Stockholm IETF meeting in July 1995. According
to Dave Kristol, the ultimate aim of the extended protocol was to
provide a small yet relatively versatile system for security,
payment information, packetizing and compression [32Minutes]. A
separate group was formed to deal with security and payment
information.

The extensions focused on the main problems of
HTTP/1.0 as outlined above. They introduced the concept of a
session method: a persistent connection that is terminated only
by agreement between the client and server, signalled using the
Connection header field. One of the key improvements was that
multiple transactions could now take place within a single
connection. It also allowed session-long negotiation of Accept-*
headers, authentication, and privacy extensions.
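
A minimal sketch of such a session, assuming an HTTP/1.1
server for which persistence is the default behaviour; the host
and paths are placeholders.

    import socket

    # One TCP connection carries several transactions in turn.
    sock = socket.create_connection(("example.org", 80))  # placeholder host

    sock.sendall(b"GET /one.html HTTP/1.1\r\n"
                 b"Host: example.org\r\n\r\n")       # connection stays open
    # ... read the first response, framed by its Content-Length ...

    sock.sendall(b"GET /two.html HTTP/1.1\r\n"
                 b"Host: example.org\r\n"
                 b"Connection: close\r\n\r\n")       # agree to close now
    # ... read the second response; the server then closes ...
    sock.close()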

According to the HTTP/1.1 Internet Draft [Draft1.1], persistent HTTP connections, allowing multiple
transactions, have a number of advantages:

Opening and closing fewer TCP connections
saves CPU time, as well as the memory used for TCP
protocol control blocks.

HTTP requests and
responses can be pipelined during one connection.
Pipelining allows a client to make multiple requests
without waiting for each response (a sketch follows
this list). The lockstep acknowledgement pattern
imposed by the combination of HTTP/1.0 and TCP was
far too restrictive and, as Spero observed, used only
around ten per cent of the available bandwidth.

Network
congestion is reduced, both by cutting the number of
packets caused by TCP opens, and by allowing TCP
sufficient time to determine the congestion state of
the network.

The HTTP protocol
can evolve more easily, since errors can be
reported without the penalty of closing the TCP
connection. Clients using future versions of HTTP
could try a new feature and, when communicating
with an older server, retry with old semantics after
an error is reported.
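
The pipelining sketch promised above, again in Python with a
placeholder host and paths: both requests are written before any
response is read, so the second is not delayed by the first
round trip.

    import socket

    sock = socket.create_connection(("example.org", 80))  # placeholder host

    # Send two requests back to back, without waiting for a response.
    sock.sendall(b"GET /first.html HTTP/1.1\r\nHost: example.org\r\n\r\n"
                 b"GET /second.html HTTP/1.1\r\nHost: example.org\r\n"
                 b"Connection: close\r\n\r\n")

    # The server answers in order; both responses arrive on the same
    # connection, and the Connection: close on the last request lets
    # the client simply read to end-of-stream.
    response = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        response += chunk
    sock.close()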

Another possible extension discussed was better
access authentication. The Digest Authentication Scheme
[DraftDigest] was devised so that a cryptographic checksum,
rather than the password itself, is sent across the Internet.
This was not, however, included in the final draft of
HTTP/1.1.
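
As a sketch of the digest idea (field names and details varied
between drafts), the client proves knowledge of the password by
hashing it together with a server-supplied nonce; the realm,
nonce and account below are invented values.

    import hashlib

    def digest_response(user, realm, password, nonce, method, uri):
        # MD5 checksum in the style of the early digest drafts.
        ha1 = hashlib.md5(("%s:%s:%s" % (user, realm, password)).encode()).hexdigest()
        ha2 = hashlib.md5(("%s:%s" % (method, uri)).encode()).hexdigest()
        return hashlib.md5(("%s:%s:%s" % (ha1, nonce, ha2)).encode()).hexdigest()

    # The password never crosses the network; only this checksum does.
    print(digest_response("alice", "example.org", "secret",
                          "dcd98b7102dd2f0e", "GET", "/private.html"))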

As it stands, HTTP/1.1 has only just become a
proposed standard and may be published as a Request For Comments
(RFC) if accepted.

Within the Working Group on HTTP under the
IETF, several subgroups have been set up to deal with particular
aspects of improving on previous protocols, and at the time of
writing these groups submit their findings and drafts under the
title of HTTP/1.2. Submissions such as the Internet Draft on
Transparent Content Negotiation in HTTP [TransCont] and the
Protocol Extension Protocol (PEP) [PEP] are examples of such
extensions. There are several different working groups, and all
have Internet Drafts associated with them. These drafts are
heavily influenced by HTTPng, which has for some time been the
goal of the Working Group.

HTTP - Next Generation (HTTPng) is just that: the
next generation in the HTTP series. It is outlined by Simon Spero
in his progress report [HTTPng]. The report describes the current
state of the protocol and is written primarily in comparison to
HTTP/1.0 as it stood during the development of HTTP/1.1; some of
its comments, on persistent connections for instance, are
therefore now obsolete, but more recent commentary on HTTPng is
hard to find.

HTTPng is a protocol based not only on
persistent connections but also on multiple data streams within
the same connection. This allows different data items to be
downloaded concurrently, so that if one stream stalls for a
period, its share of the connection is used by another. The
protocol uses a special session-layer protocol, the Session
Control Protocol (SCP) [SCP], to manage these streams of data.
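
To illustrate the idea only (this is a toy framing, not the
actual SCP wire format), interleaved chunks can be tagged with a
stream identifier and a length, and demultiplexed at the far
end:

    import struct

    def frame(stream_id, payload):
        # 2-byte stream id + 4-byte length, then the chunk itself.
        return struct.pack("!HI", stream_id, len(payload)) + payload

    def deframe(buffer):
        streams = {}
        while buffer:
            stream_id, length = struct.unpack("!HI", buffer[:6])
            streams[stream_id] = streams.get(stream_id, b"") + buffer[6:6 + length]
            buffer = buffer[6 + length:]
        return streams

    # Chunks of two objects share one connection; if stream 1 stalls,
    # frames from stream 2 still make use of the link.
    wire = frame(1, b"<html>...") + frame(2, b"GIF89a...") + frame(1, b"</html>")
    print(deframe(wire))   # {1: b'<html>...</html>', 2: b'GIF89a...'}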

In HTTP/1.x, all the data is encoded and decoded
using a variant of MIME. This is easy for humans to read but
wasteful and complicated for computers. To avoid this, HTTPng
describes and encodes the request message differently, using a
simplified form of ASN.1 (Abstract Syntax Notation One) with PER
(Packed Encoding Rules). This scheme allows efficient, compact
parsers to be generated automatically, whilst remaining simple
enough for hand-crafted parsers to be built easily. One result is
the bit vectors used in the HTTPng header to specify body content
instead of lengthy character strings. Each bit of the vector
corresponds to one of the most commonly transferred data types,
making the transport of those types efficient, while the scheme
remains extensible so that other types can still be transferred.
This is in direct contrast to HTTP/1.x, which must send with each
request a list of the data types it can accept, each type in the
list being a descriptive character string.
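
A Python sketch of the bit-vector idea; the bit assignments
here are invented purely for the example.

    # One bit per common data type, instead of a character string each.
    HTML, PLAIN_TEXT, GIF, JPEG = 1 << 0, 1 << 1, 1 << 2, 1 << 3

    accepted = HTML | GIF | JPEG    # fits in a single byte on the wire
    # ... versus the HTTP/1.x equivalent sent with every request:
    # "Accept: text/html, image/gif, image/jpeg"

    def accepts(vector, kind):
        return bool(vector & kind)

    print(accepts(accepted, GIF))         # True
    print(accepts(accepted, PLAIN_TEXT))  # False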

At the moment, because HTTPng is so different
from HTTP/1.x, it is likely to be implemented first between
servers and proxy servers, where its efficiency would be of
greatest importance. Such servers could also determine which
items are needed, pre-fetch them using HTTPng, and then feed them
to the client using, say, HTTP/1.1.

With the many improvements outlined above, the
WWW, and thus the Internet, should hopefully become not only much
more useful but also far less congested. It is interesting to
note that even while new versions of the protocol are being
designed, very few actual implementations use even a large part
of the HTTP/1.0 specification. This is partly due to the
complexity of, and the inadequate standard definitions in, the
material provided to developers, but also due to the speed of the
Web's growth, which has forced implementations to be rushed.

In practice, it has been found that the vast
majority of transfers involve a very small subset of data types.
While HTTPng tackles this by assigning short bit codes to popular
types, others have investigated solutions even simpler than HTTP.
One of these is Sun's WebNFS, which has been outlined in an RFC
and which views the Internet as a huge public file system,
leaving type checking purely to the client. This may prove to be
a viable alternative for bulk traffic.

Finally, HTTP is not in itself immature; most
implementations of it, however, are. New tools that demand firm
standards could make all the difference.