The HTTP protocol follows the robustness principle
as described in RFC1122,
which states "Be liberal in what you accept, and conservative in
what you send". As a result of this principle, HTTP clients will
compensate for and recover from incorrect or misconfigured responses, or
responses that are uncacheable.

As a website is scaled up to face greater and greater traffic loads,
suboptimal or misconfigured applications or server configurations can
threaten both the stability and scalability of the website, as well as
the hosting costs associated with it. A website can also scale up to face
greater configuration complexity, and it can be increasingly difficult to
detect and keep track of suboptimally configured URL spaces on a given
server.

Eventually a point is reached where the principle "conservative in
what you send" needs to be enforced by the server administrator.

The mod_policy module provides a set of filters
which can be applied to a server, allowing key features of the HTTP
protocol to be explicitly tested, and non-compliant responses logged as
warnings, or rejected outright as an error. Each filter can be applied
separately, allowing the administrator to pick and choose which policies
should be enforced depending on the circumstances of their environment.

The filters might be placed in testing and staging environments for
the benefit of application and website developers, or may be applied
to production servers to protect infrastructure from systems outside
the administrator's direct control.

In the above example, an Apache httpd server has been placed between
the application server and the internet at large, and configured to cache
responses from the application server. The mod_policy
filters have been added to enforce support for cacheable content and
conditional requests, ensuring that both mod_cache and
public caches on the internet are fully able to cache content created
by the RESTful application server efficiently.

In the above simpler example, a static server serving highly cacheable
content has a set of policies applied to ensure that the server configuration
conforms to a minimum level of compliance.

This policy will be rejected if the server does not correctly respond
to a conditional request with the appropriate status code.

Conditional requests form the mechanism by which an HTTP cache makes
stale content fresh again, and particularly for content with short freshness
lifetimes, lack of support for conditional requests can add avoidable load
to the server.

Most specifically, the existence of any of the following headers in the
request makes the request conditional:

If-Match

If the provided ETag in the If-Match header does not match
the ETag of the response, the server should return
412 Precondition Failed. Full details of how to handle an
If-Match header can be found in
RFC2616 section 14.24.

If-None-Match

If the provided ETag in the If-None-Match header matches
the ETag of the response, the server should return either
304 Not Modified for GET/HEAD requests, or
412 Precondition Failed for other methods. Full details of how
to handle an If-None-Match header can be found in
RFC2616 section 14.26.

If-Modified-Since

If the provided date in the If-Modified-Since header is
the same as or newer than the Last-Modified header of the response
(that is, the content has not been modified since that date), the server
should return 304 Not Modified. Full details of how to handle an
If-Modified-Since header can be found in
RFC2616 section 14.25.

If-Unmodified-Since

If the provided date in the If-Unmodified-Since header is
older than the Last-Modified header of the response (that is, the
content has been modified since that date), the server
should return 412 Precondition Failed. Full details of how to
handle an If-Unmodified-Since header can be found in
RFC2616 section 14.28.

If-Range

If the provided ETag or date in the If-Range header matches
the ETag or Last-Modified of the response, and a valid Range
is present, the server should return
206 Partial Content. Full details of how to handle an
If-Range header can be found in
RFC2616 section 14.27.

If the response is detected to have been successful (a 2xx response),
but was conditional and one of the responses above was expected instead,
this policy will be rejected. Responses that indicate a redirect or a
failure of some kind (3xx, 4xx, 5xx) will be ignored by this policy.
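
The checks listed above can be sketched in a few lines of Python. This is only an illustration of which status code each conditional header calls for, not the code mod_policy runs. The function name expected_status and its helpers are hypothetical, entity-tag lists are split naively on commas, If-Range is left out, and a resource is treated as unmodified when its Last-Modified date is not later than the date the client supplied.

```python
from email.utils import parsedate_to_datetime


def _etags(value):
    """Split an entity-tag list (simplified: assumes no commas inside a tag)."""
    return [tag.strip() for tag in value.split(",") if tag.strip()]


def _opaque(tag):
    """Drop the W/ weakness prefix, if any."""
    return tag[2:] if tag.startswith("W/") else tag


def expected_status(method, headers, etag, last_modified):
    """Hypothetical helper: return 304 or 412 when the conditional headers
    call for it, or None when a normal 2xx response is appropriate.

    `last_modified` is a timezone-aware datetime; `headers` uses canonical
    header names as keys.
    """
    if "If-Match" in headers:
        wanted = _etags(headers["If-Match"])
        # Strong comparison: the current tag must be strong and identical.
        strong = not etag.startswith("W/") and etag in wanted
        if "*" not in wanted and not strong:
            return 412
    elif "If-Unmodified-Since" in headers:
        if last_modified > parsedate_to_datetime(headers["If-Unmodified-Since"]):
            return 412

    if "If-None-Match" in headers:
        wanted = _etags(headers["If-None-Match"])
        # Weak comparison: ignore the W/ prefix on both sides.
        if "*" in wanted or _opaque(etag) in {_opaque(t) for t in wanted}:
            return 304 if method in ("GET", "HEAD") else 412
    elif "If-Modified-Since" in headers and method in ("GET", "HEAD"):
        if last_modified <= parsedate_to_datetime(headers["If-Modified-Since"]):
            return 304

    return None
```

A response that came back as 2xx while this sketch returns 304 or 412 is the kind of response the policy is described as rejecting.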

When the Content-Length header is present, the size of
the body is declared at the start of the response. If this information
is missing, an HTTP cache might choose to ignore the response, as it
does not know in advance whether the response will fit within the
cache's defined limits.

HTTP/1.1 defines the Transfer-Encoding header as an
alternative to Content-Length, allowing the end of the
response to be indicated to the client without the client having to
know the length beforehand. However, when HTTP/1.0 requests are
processed, and no Content-Length is specified, the only
mechanism available to the server to indicate the end of the response
is to drop the connection. In an environment containing load
balancers, this can cause the keepalive mechanism to be bypassed.

If the response is detected to have been successful (a 2xx response),
and has a response body (this excludes 204 No Content), and
the Content-Length header is missing, this policy will be
rejected. Responses that indicate a redirect or a failure of some kind
(3xx, 4xx, 5xx) will be ignored by this policy.
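
As a rough illustration, the rule in the preceding paragraph reduces to a small predicate. The name violates_length_policy is hypothetical and the sketch only mirrors the description here; it is not taken from the module's source.

```python
def violates_length_policy(status, headers):
    """Hypothetical check: True when a successful response that carries a
    body has no Content-Length header."""
    if not 200 <= status < 300:
        return False  # redirects and failures are ignored by the policy
    if status == 204:
        return False  # 204 No Content has no body to measure
    # Header names are case-insensitive, so compare in lower case.
    return not any(name.lower() == "content-length" for name in headers)
```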

It should be noted that some modules, such as
mod_proxy, add their own Content-Length
header should the response be small enough for it to have been possible
to read the response lacking such a header in one go. This may cause
small responses to pass this policy, while larger responses may
fail for the same URL.

When the Content-Length header is present, the size of
the body is declared at the start of the response. HTTP/1.1 defines the
Transfer-Encoding header as an alternative to
Content-Length, allowing the end of the response to be
indicated to the client without the client having to know the length
beforehand. In the absence of these two mechanisms, the only way for
a server to indicate the end of the response is to drop the connection.
In an environment containing load balancers, this can cause the keepalive
mechanism to be bypassed.

Most specifically, we follow these rules:

IF

we have not marked this connection as errored;

and

the client isn't expecting 100-continue;

and

the response status does not require a close;

and

the response body has a defined length due to the status code
being 304 or 204, the request method being HEAD, already having defined
Content-Length or Transfer-Encoding: chunked, or the request version
being HTTP/1.1 and thus capable of being set as chunked

THEN

we support keepalive.

The server may choose to turn off keepalive for
various reasons, such as an imminent shutdown, or a Connection: close from
the client, or an HTTP/1.0 client request with a response with no
Content-Length, but for our purposes we only care that
keepalive was possible from the application, not that keepalive actually
took place.
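
The IF/THEN rules above can be written as a single boolean expression. The sketch below is an assumption-laden paraphrase rather than the server's actual logic: the function name, its parameters, and the (major, minor) tuple used for the HTTP version are all hypothetical.

```python
def keepalive_possible(conn_errored, expects_100_continue,
                       status_requires_close, status, method,
                       response_headers, http_version):
    """Hypothetical paraphrase of the rules: could this response have been
    sent on a connection that stays open afterwards?"""
    headers = {k.lower(): v for k, v in response_headers.items()}
    # The end of the body is knowable without closing the connection.
    defined_length = (
        status in (204, 304)
        or method == "HEAD"
        or "content-length" in headers
        or headers.get("transfer-encoding", "").lower() == "chunked"
        or http_version >= (1, 1)  # chunked encoding could still be applied
    )
    return (not conn_errored
            and not expects_100_continue
            and not status_requires_close
            and defined_length)
```

For example, an HTTP/1.0 request answered with a 200 and no Content-Length gives False, while the same response to an HTTP/1.1 request gives True.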

It should also be noted that the Apache httpd server includes a filter
that adds chunked encoding to responses without an explicit content
length. This policy catches those cases where this filter is bypassed or
not in effect.

This policy will be rejected if the server response does not have
an explicit freshness lifetime at least as long
as the server defined limit, or if the freshness lifetime is
calculated based on a heuristic.

During the freshness lifetime, a cache does not need to contact the
origin server at all; it can simply pass the cached content back to the
client as is.

When the freshness lifetime is reached, the cache should contact the
origin server in an effort to check whether the content is still fresh,
and if not, replace the content.

When the freshness lifetime is too short, it can result in excessive
load on the server. In addition, should an outage occur that lasts as
long as or longer than the freshness lifetime, all cached content will
become stale, which could cause a thundering herd of traffic when the
server or network returns.
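
To make "explicit freshness lifetime" concrete, the following sketch derives it from the response headers the way a shared cache conventionally would: s-maxage first, then max-age, then Expires minus Date. The order of precedence, the function names and the use of canonical header names as dictionary keys are assumptions made for illustration and may differ from what the module does.

```python
import re
from email.utils import parsedate_to_datetime


def explicit_freshness_lifetime(headers):
    """Hypothetical helper: explicit lifetime in seconds, or None when a
    cache would have to fall back to a heuristic."""
    cache_control = headers.get("Cache-Control", "")
    for directive in ("s-maxage", "max-age"):
        pattern = r'(?:^|[,\s])' + re.escape(directive) + r'\s*=\s*"?(\d+)'
        match = re.search(pattern, cache_control, re.IGNORECASE)
        if match:
            return int(match.group(1))
    if "Expires" in headers and "Date" in headers:
        try:
            delta = (parsedate_to_datetime(headers["Expires"])
                     - parsedate_to_datetime(headers["Date"]))
        except (TypeError, ValueError):
            return 0  # an unparseable date is treated as already expired
        return max(0, int(delta.total_seconds()))
    return None


def violates_maxage_policy(headers, minimum_seconds):
    """True when the lifetime is heuristic or shorter than the minimum."""
    lifetime = explicit_freshness_lifetime(headers)
    return lifetime is None or lifetime < minimum_seconds
```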

Most specifically, should any of the following header combinations
exist in the response headers, the response will be rejected:

Cache-Control: no-cache

Cache-Control: no-store

Cache-Control: private

Pragma: no-cache

When unexpected, uncacheable content may produce unacceptable levels
of server load, or may incur significant cost. When this policy is enabled,
all server defined uncacheable content will be rejected.
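
A minimal sketch of the header test described above, assuming a dictionary of response headers keyed by canonical name. The function name declares_uncacheable is hypothetical, and the parsing is deliberately simple (directives are split on commas and any argument after "=" is ignored).

```python
UNCACHEABLE = {"no-cache", "no-store", "private"}


def declares_uncacheable(headers):
    """Hypothetical check: does the response declare itself uncacheable?"""
    cache_control = headers.get("Cache-Control", "")
    # Keep only the directive name, e.g. 'private="Set-Cookie"' -> 'private'.
    directives = {part.split("=", 1)[0].strip().lower()
                  for part in cache_control.split(",")}
    if directives & UNCACHEABLE:
        return True
    pragma = {part.strip().lower()
              for part in headers.get("Pragma", "").split(",")}
    return "no-cache" in pragma
```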

In addition to being checked for presence, the ETag and Last-Modified
headers are checked for valid syntax.

An ETag that is not surrounded with quotes, optionally declared
"weak" by prefixing the quoted value with "W/", will cause the policy to
be rejected. A Last-Modified that cannot be parsed as a valid date
will cause the policy to be rejected.
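
The two syntax checks might look roughly like this. The regular expression is a simplification of the entity-tag grammar (a quoted string with an optional W/ prefix), and both function names are hypothetical.

```python
import re
from email.utils import parsedate_to_datetime

# Optional weakness prefix followed by a double-quoted opaque value.
ETAG_RE = re.compile(r'(?:W/)?"[^"]*"')


def valid_etag(value):
    """True for values such as "abc" or W/"abc" (quotes included)."""
    return ETAG_RE.fullmatch(value.strip()) is not None


def valid_http_date(value):
    """True when the value parses as an RFC 5322 style date."""
    try:
        parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return False
    return True
```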

Some client provided headers, such as User-Agent,
can contain thousands or millions of combinations of values over a period
of time, and if the response is declared cacheable, a cache might attempt
to cache each of these responses separately, filling up the cache and
crowding out other entries in the cache. In this scenario, if so
configured, the policy will reject the response.
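
As an illustration of the check, the sketch below compares the names listed in a response's Vary header against a set of headers the administrator does not want responses to vary on. The function name and the forbidden-list parameter are hypothetical stand-ins for whatever the server is configured with.

```python
def vary_violations(headers, forbidden):
    """Hypothetical check: return the forbidden header names (lower case,
    sorted) that appear in the response's Vary header."""
    varied = {name.strip().lower()
              for name in headers.get("Vary", "").split(",")
              if name.strip()}
    return sorted(varied & {name.lower() for name in forbidden})
```

A non-empty result, for instance when Vary lists User-Agent and User-Agent is forbidden, corresponds to a response the policy would reject.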

This policy will be rejected if the client request was made with a
version number lower than the version of HTTP specified.

This policy is typically used with RESTful applications where
control over the type of client is desired. This policy can be used
alongside the POLICY_KEEPALIVE filter to ensure that
HTTP/1.0 clients don't cause keepalive connections to be dropped.
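
The version comparison itself is simple once the protocol string is parsed into numbers. A minimal sketch, with hypothetical function names and assuming versions are written in the usual HTTP/major.minor form:

```python
def parse_http_version(text):
    """Turn a string such as 'HTTP/1.1' into the tuple (1, 1)."""
    name, _, number = text.partition("/")
    if name != "HTTP":
        raise ValueError("not an HTTP version: %r" % text)
    major, _, minor = number.partition(".")
    return int(major), int(minor)


def violates_version_policy(request_version, minimum_version):
    """True when the client's version is lower than the configured minimum."""
    return (parse_http_version(request_version)
            < parse_http_version(minimum_version))
```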
