HTTP Cache

The nature of rich web applications means that they're dynamic. No matter
how efficient your application, each request will always contain more overhead
than serving a static file.

And for most Web applications, that's fine. Symfony2 is lightning fast, and
unless you're doing some serious heavy-lifting, each request will come back
quickly without putting too much stress on your server.

But as your site grows, that overhead can become a problem. The processing
that's normally performed on every request should be done only once. This
is exactly what caching aims to accomplish.

The most effective way to improve the performance of an application is to cache
the full output of a page and then bypass the application entirely on each
subsequent request. Of course, this isn't always possible for highly dynamic
websites, or is it? In this chapter, you'll see how the Symfony2 cache
system works and why this is the best possible approach.

The Symfony2 cache system is different because it relies on the simplicity
and power of the HTTP cache as defined in the HTTP specification.
Instead of reinventing a caching methodology, Symfony2 embraces the standard
that defines basic communication on the Web. Once you understand the fundamental
HTTP validation and expiration caching models, you'll be ready to master
the Symfony2 cache system.

For the purposes of learning how to cache with Symfony2, the
subject is covered in four steps:

Step 1: A gateway cache, or reverse proxy, is
an independent layer that sits in front of your application. The reverse
proxy caches responses as they're returned from your application and answers
requests with cached responses before they hit your application. Symfony2
provides its own reverse proxy, but any reverse proxy can be used.

Step 2: HTTP cache headers are used
to communicate with the gateway cache and any other caches between your
application and the client. Symfony2 provides sensible defaults and a
powerful interface for interacting with the cache headers.

Step 3: HTTP expiration and validation
are the two models used for determining whether cached content is fresh
(can be reused from the cache) or stale (should be regenerated by the
application).

Step 4: Edge Side Includes (ESI) allow the HTTP
cache to be used to cache page fragments (even nested fragments) independently.
With ESI, you can even cache an entire page for 60 minutes, but an embedded
sidebar for only 5 minutes.

Since caching with HTTP isn't unique to Symfony, many articles already exist
on the topic. If you're new to HTTP caching, Ryan
Tomayko's article Things Caches Do is highly recommended. Another in-depth resource is Mark
Nottingham's Cache Tutorial.

When caching with HTTP, the cache is separated from your application entirely
and sits between your application and the client making the request.

The job of the cache is to accept requests from the client and pass them
back to your application. The cache will also receive responses back from
your application and forward them on to the client. The cache is the "middle-man"
of the request-response communication between the client and your application.

Along the way, the cache will store each response that is deemed "cacheable"
(see Introduction to HTTP Caching). If the same resource is requested again,
the cache sends the cached response to the client, ignoring your application
entirely.
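The "middle-man" behavior described above can be sketched in a few lines of plain PHP. This is only an illustration under simplifying assumptions: the TinyGatewayCache class, its handle() method and the array-based response format are hypothetical and are not part of Symfony2, and the cache key is just the URI.

```php
<?php
// Illustrative sketch only: a tiny in-memory "gateway cache" keyed by URI.
// TinyGatewayCache and its methods are hypothetical, not part of Symfony2.

final class TinyGatewayCache
{
    /** @var array Cached entries: uri => array('body' => ..., 'expiresAt' => ...) */
    private $store = array();

    /** @var callable The backend application: function ($uri) returning an array */
    private $application;

    public function __construct(callable $application)
    {
        $this->application = $application;
    }

    public function handle($uri, $now)
    {
        // Serve from the cache while the stored entry is still fresh.
        if (isset($this->store[$uri]) && $this->store[$uri]['expiresAt'] > $now) {
            return $this->store[$uri]['body'];
        }

        // Otherwise forward the request to the application.
        $response = call_user_func($this->application, $uri);

        // Store only the responses the application marked as cacheable.
        if ($response['ttl'] > 0) {
            $this->store[$uri] = array(
                'body' => $response['body'],
                'expiresAt' => $now + $response['ttl'],
            );
        }

        return $response['body'];
    }
}

$calls = 0;
$cache = new TinyGatewayCache(function ($uri) use (&$calls) {
    $calls++;

    return array('body' => "page for $uri", 'ttl' => 60);
});

$cache->handle('/about', 1000); // miss: the application is called
$cache->handle('/about', 1030); // hit: served from the cache
$cache->handle('/about', 1061); // stale: the application is called again
```

In this sketch the application is only reached twice for three requests; a real gateway cache applies the same idea but decides what is cacheable from the HTTP headers of the response.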

But a gateway cache isn't the only type of cache. In fact, the HTTP cache
headers sent by your application are consumed and interpreted by up to three
different types of caches:

Browser caches: Every browser comes with its own local cache that is
mainly useful for when you hit "back" or for images and other assets.
The browser cache is a private cache as cached resources aren't shared
with anyone else;

Proxy caches: A proxy is a shared cache as many people can be behind a
single one. It's usually installed by large corporations and ISPs to reduce
latency and network traffic;

Gateway caches: Like a proxy, it's also a shared cache but on the server
side. Installed by network administrators, it makes websites more scalable,
reliable and performant.

Tip

Gateway caches are sometimes referred to as reverse proxy caches,
surrogate caches, or even HTTP accelerators.

Note

The significance of private versus shared caches will become more
obvious when caching responses containing content that is
specific to exactly one user (e.g. account information) is discussed.

Each response from your application will likely go through one or both of
the first two cache types. These caches are outside of your control but follow
the HTTP cache directions set in the response.

Symfony2 comes with a reverse proxy (also called a gateway cache) written
in PHP. Enable it and cacheable responses from your application will start
to be cached right away. Installing it is just as easy. Each new Symfony2
application comes with a pre-configured caching kernel (AppCache) that
wraps the default one (AppKernel). The caching kernel is the reverse
proxy.

To enable caching, modify the code of a front controller to use the caching
kernel:

```php
// web/app.php
require_once __DIR__.'/../app/bootstrap.php.cache';
require_once __DIR__.'/../app/AppKernel.php';
require_once __DIR__.'/../app/AppCache.php';

use Symfony\Component\HttpFoundation\Request;

$kernel = new AppKernel('prod', false);
$kernel->loadClassCache();
// wrap the default AppKernel with the AppCache one
$kernel = new AppCache($kernel);
$kernel->handle(Request::createFromGlobals())->send();
```

The caching kernel will immediately act as a reverse proxy - caching responses
from your application and returning them to the client.

Tip

The cache kernel has a special getLog() method that returns a string
representation of what happened in the cache layer. In the development
environment, use it to debug and validate your cache strategy:

```php
error_log($kernel->getLog());
```

The AppCache object has a sensible default configuration, but it can be
finely tuned via a set of options you can set by overriding the
getOptions()
method:
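As a rough sketch of what such an override can look like, the class below returns the main options with the default values listed in this chapter. It assumes the standard app/AppCache.php file of a Symfony2 application and the HttpCache base class shipped with the FrameworkBundle; the explicit debug value is only an example.

```php
<?php
// app/AppCache.php
// Sketch only: the values below mirror the defaults described in this chapter.

use Symfony\Bundle\FrameworkBundle\HttpCache\HttpCache;

class AppCache extends HttpCache
{
    protected function getOptions()
    {
        return array(
            'debug'                  => false,
            'default_ttl'            => 0,
            'private_headers'        => array('Authorization', 'Cookie'),
            'allow_reload'           => false,
            'allow_revalidate'       => false,
            'stale_while_revalidate' => 2,
            'stale_if_error'         => 60,
        );
    }
}
```

Only the options you want to change need to be tuned; the others can keep the values shown here.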

Unless overridden in getOptions(), the debug option is automatically set
to the debug value of the wrapped AppKernel.

Here is a list of the main options:

default_ttl: The number of seconds that a cache entry should be
considered fresh when no explicit freshness information is provided in a
response. Explicit Cache-Control or Expires headers override this
value (default: 0);

private_headers: Set of request headers that trigger "private"
Cache-Control behavior on responses that don't explicitly state whether
the response is public or private via a Cache-Control directive.
(default: Authorization and Cookie);

allow_reload: Specifies whether the client can force a cache reload by
including a Cache-Control "no-cache" directive in the request. Set it to
true for compliance with RFC 2616 (default: false);

allow_revalidate: Specifies whether the client can force a cache
revalidate by including a Cache-Control "max-age=0" directive in the
request. Set it to true for compliance with RFC 2616 (default: false);

stale_while_revalidate: Specifies the default number of seconds (the
granularity is the second as the Response TTL precision is a second) during
which the cache can immediately return a stale response while it revalidates
it in the background (default: 2); this setting is overridden by the
stale-while-revalidate HTTP Cache-Control extension (see RFC 5861);

stale_if_error: Specifies the default number of seconds (the granularity
is the second) during which the cache can serve a stale response when an
error is encountered (default: 60). This setting is overridden by the
stale-if-error HTTP Cache-Control extension (see RFC 5861).

If debug is true, Symfony2 automatically adds an X-Symfony-Cache
header to the response containing useful information about cache hits and
misses.

Changing from one Reverse Proxy to Another

The Symfony2 reverse proxy is a great tool to use when developing your
website or when you deploy your website to a shared host where you cannot
install anything beyond PHP code. But being written in PHP, it cannot
be as fast as a proxy written in C. That's why it is highly recommended you
use Varnish or Squid on your production servers if possible. The good
news is that the switch from one proxy server to another is easy and
transparent as no code modification is needed in your application. Start
easy with the Symfony2 reverse proxy and upgrade later to Varnish when
your traffic increases.

For more information on using Varnish with Symfony2, see the
How to use Varnish cookbook chapter.

Note

The performance of the Symfony2 reverse proxy is independent of the
complexity of the application. That's because the application kernel is
only booted when the request needs to be forwarded to it.

To take advantage of the available cache layers, your application must be
able to communicate which responses are cacheable and the rules that govern
when/how that cache should become stale. This is done by setting HTTP cache
headers on the response.

Tip

Keep in mind that "HTTP" is nothing more than the language (a simple text
language) that web clients (e.g. browsers) and web servers use to communicate
with each other. HTTP caching is the part of that language that allows clients
and servers to exchange information related to caching.

HTTP specifies four response cache headers that are looked at here:

Cache-Control

Expires

ETag

Last-Modified

The most important and versatile header is the Cache-Control header,
which is actually a collection of various cache information.

The Cache-Control header is unique in that it contains not one, but various
pieces of information about the cacheability of a response. Each piece of
information is separated by a comma:

Cache-Control: private, max-age=0, must-revalidate

Cache-Control: max-age=3600, must-revalidate
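To make the comma-separated structure concrete, here is a small sketch in plain PHP that splits a Cache-Control value into its directives. The parseCacheControl() function is a hypothetical helper written for this illustration, not a Symfony2 API, and it deliberately ignores edge cases such as commas inside quoted values.

```php
<?php
// Illustrative sketch: parseCacheControl() is a hypothetical helper,
// not a Symfony2 function.

/**
 * Splits a Cache-Control header value into an array of directives.
 * Directives without a value (e.g. "private") are mapped to true.
 */
function parseCacheControl($headerValue)
{
    $directives = array();

    foreach (explode(',', $headerValue) as $part) {
        $part = trim($part);
        if ('' === $part) {
            continue;
        }

        // "max-age=0" becomes name "max-age" and value "0"
        $pieces = explode('=', $part, 2);
        $name = strtolower(trim($pieces[0]));
        $directives[$name] = isset($pieces[1]) ? trim($pieces[1], " \"") : true;
    }

    return $directives;
}

$directives = parseCacheControl('private, max-age=0, must-revalidate');
// $directives holds three entries: private, max-age and must-revalidate
```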

Symfony provides an abstraction around the Cache-Control header to make
its creation more manageable:

```php
// ...

use Symfony\Component\HttpFoundation\Response;

$response = new Response();

// mark the response as either public or private
$response->setPublic();
$response->setPrivate();

// set the private or shared max age
$response->setMaxAge(600);
$response->setSharedMaxAge(600);

// set a custom Cache-Control directive
$response->headers->addCacheControlDirective('must-revalidate', true);
```

Both gateway and proxy caches are considered "shared" caches as the cached
content is shared by more than one user. If a user-specific response were
ever mistakenly stored by a shared cache, it might be returned later to any
number of different users. Imagine if your account information were cached
and then returned to every subsequent user who asked for their account page!

To handle this situation, every response may be set to be public or private:

public: Indicates that the response may be cached by both private and
shared caches;

private: Indicates that all or part of the response message is intended
for a single user and must not be cached by a shared cache.

Symfony conservatively defaults each response to be private. To take advantage
of shared caches (like the Symfony2 reverse proxy), the response will need
to be explicitly set as public.

HTTP caching only works for "safe" HTTP methods (like GET and HEAD). Being
safe means that you never change the application's state on the server when
serving the request (you can of course log information, cache data, etc).
This has two very reasonable consequences:

You should never change the state of your application when responding
to a GET or HEAD request. Even if you don't use a gateway cache, the presence
of proxy caches means that any GET or HEAD request may or may not actually
hit your server;

Don't expect PUT, POST or DELETE methods to cache. These methods are meant
to be used when mutating the state of your application (e.g. deleting a
blog post). Caching them would prevent certain requests from hitting and
mutating your application.

HTTP 1.1 allows caching anything by default unless there is an explicit
Cache-Control header. In practice, most caches do nothing when requests
have a cookie, an authorization header, use a non-safe method (i.e. PUT, POST,
DELETE), or when responses have a redirect status code.

Symfony2 automatically sets a sensible and conservative Cache-Control
header when none is set by the developer by following these rules:

If no cache header is defined (Cache-Control, Expires, ETag
or Last-Modified), Cache-Control is set to no-cache, meaning
that the response will not be cached;

If Cache-Control is empty (but one of the other cache headers is present),
its value is set to private, must-revalidate;

But if at least one Cache-Control directive is set, and no public or
private directive has been explicitly added, Symfony2 adds the
private directive automatically (except when s-maxage is set).
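The three rules above can be summarized as a small decision function. The sketch below is written in plain PHP for illustration only: defaultCacheControl() is a hypothetical helper that follows the rules as stated in this chapter, and it is not Symfony2's actual implementation.

```php
<?php
// Illustrative sketch of the three rules above. defaultCacheControl() is a
// hypothetical helper; it is not Symfony2's actual implementation.

/**
 * @param array $headers Response headers set by the developer, keyed by name;
 *                       'Cache-Control' holds the raw directive string, if any.
 *
 * @return string The Cache-Control value that would be sent
 */
function defaultCacheControl(array $headers)
{
    $cacheControl = isset($headers['Cache-Control']) ? trim($headers['Cache-Control']) : '';
    $hasOtherCacheHeader = isset($headers['Expires'])
        || isset($headers['ETag'])
        || isset($headers['Last-Modified']);

    // Rule 1: no cache header at all.
    if ('' === $cacheControl && !$hasOtherCacheHeader) {
        return 'no-cache';
    }

    // Rule 2: Cache-Control is empty but another cache header is present.
    if ('' === $cacheControl) {
        return 'private, must-revalidate';
    }

    // Rule 3: add "private" unless visibility is explicit or s-maxage is set.
    $names = array();
    foreach (explode(',', $cacheControl) as $directive) {
        $pieces = explode('=', $directive, 2);
        $names[] = strtolower(trim($pieces[0]));
    }
    if (array_intersect(array('public', 'private', 's-maxage'), $names)) {
        return $cacheControl;
    }

    return $cacheControl.', private';
}

$value = defaultCacheControl(array('Cache-Control' => 'max-age=600'));
// "private" is appended because no visibility directive was given
```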

With the expiration model, you simply specify how long a response should
be considered "fresh" by including a Cache-Control and/or an Expires
header. Caches that understand expiration will not make the same request
until the cached version reaches its expiration time and becomes "stale";

When pages are really dynamic (i.e. their representation changes often),
the validation model is often necessary. With this model, the
cache stores the response, but asks the server on each request whether
or not the cached response is still valid. The application uses a unique
response identifier (the ETag header) and/or a timestamp (the Last-Modified
header) to check if the page has changed since being cached.

The goal of both models is to never generate the same response twice by relying
on a cache to store and return "fresh" responses.

Reading the HTTP Specification

The HTTP specification defines a simple but powerful language in which
clients and servers can communicate. As a web developer, the request-response
model of the specification dominates your work. Unfortunately, the actual
specification document - RFC 2616 - can be difficult to read.

There is an ongoing effort (HTTP Bis) to rewrite RFC 2616. It does
not describe a new version of HTTP, but mostly clarifies the original HTTP
specification. The organization is also improved as the specification
is split into seven parts; everything related to HTTP caching can be
found in two dedicated parts (P4 - Conditional Requests and P6 -
Caching: Browser and intermediary caches).

As a web developer, you are strongly urged to read the specification. Its
clarity and power - even more than ten years after its creation - are
invaluable. Don't be put off by the appearance of the spec - its contents
are much more beautiful than its cover.

The expiration model is the more efficient and straightforward of the two
caching models and should be used whenever possible. When a response is cached
with an expiration, the cache will store the response and return it directly
without hitting the application until it expires.

The expiration model can be accomplished using one of two, nearly identical,
HTTP headers: Expires or Cache-Control.

According to the HTTP specification, "the Expires header field gives
the date/time after which the response is considered stale." The Expires
header can be set with the setExpires() Response method. It takes a
DateTime instance as an argument:
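A minimal sketch of such a call is shown below. It assumes the Symfony2 HttpFoundation component is available through your autoloader, and the ten-minute lifetime is an arbitrary example value.

```php
<?php
// Sketch only: assumes the Symfony2 HttpFoundation component is autoloaded.

use Symfony\Component\HttpFoundation\Response;

$response = new Response();

// Consider the response stale ten minutes from now (example value)
$date = new \DateTime();
$date->modify('+600 seconds');

$response->setExpires($date);
```

The resulting Expires header carries an HTTP date expressed in GMT.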

The setExpires() method automatically converts the date to the GMT
timezone as required by the specification.

Note that in HTTP versions before 1.1 the origin server wasn't required to
send the Date header. Consequently the cache (e.g. the browser) might
need to rely on its local clock to evaluate the Expires header, making
the lifetime calculation vulnerable to clock skew. Another limitation
of the Expires header is that the specification states that "HTTP/1.1
servers should not send Expires dates more than one year in the future."

Because of the Expires header limitations, most of the time, you should
use the Cache-Control header instead. Recall that the Cache-Control
header is used to specify many different cache directives. For expiration,
there are two directives, max-age and s-maxage. The first one is
used by all caches, whereas the second one is only taken into account by
shared caches:

```php
// Sets the number of seconds after which the response
// should no longer be considered fresh
$response->setMaxAge(600);

// Same as above but only for shared caches
$response->setSharedMaxAge(600);
```

With both calls above, the Cache-Control header would take on a format such
as Cache-Control: max-age=600, s-maxage=600 (it may have additional directives).

When a resource needs to be updated as soon as a change is made to the underlying
data, the expiration model falls short. With the expiration model, the application
won't be asked to return the updated response until the cache finally becomes
stale.

The validation model addresses this issue. Under this model, the cache continues
to store responses. The difference is that, for each request, the cache asks
the application whether or not the cached response is still valid. If the
cache is still valid, your application should return a 304 status code
and no content. This tells the cache that it's ok to return the cached response.

Under this model, you mainly save bandwidth as the representation is not
sent twice to the same client (a 304 response is sent instead). But if you
design your application carefully, you might be able to get the bare minimum
data needed to send a 304 response and save CPU also (see below for an implementation
example).

Tip

The 304 status code means "Not Modified". It's important because with
this status code the response does not contain the actual content being
requested. Instead, the response is simply a light-weight set of directions that
tell the cache that it should use its stored version.

As with expiration, there are two different HTTP headers that can be used
to implement the validation model: ETag and Last-Modified.

The ETag header is a string header (called the "entity-tag") that uniquely
identifies one representation of the target resource. It's entirely generated
and set by your application so that you can tell, for example, if the /about
resource that's stored by the cache is up-to-date with what your application
would return. An ETag is like a fingerprint and is used to quickly compare
if two different versions of a resource are equivalent. Like fingerprints,
each ETag must be unique across all representations of the same resource.

To see a simple implementation, generate the ETag as the md5 of the content:

```php
public function indexAction()
{
    $response = $this->render('MyBundle:Main:index.html.twig');
    $response->setETag(md5($response->getContent()));
    $response->setPublic(); // make sure the response is public/cacheable
    $response->isNotModified($this->getRequest());

    return $response;
}
```

The isNotModified()
method compares the ETag sent with the Request with the one set
on the Response. If the two match, the method automatically sets the
Response status code to 304.

This algorithm is simple enough and very generic, but you need to create the
whole Response before being able to compute the ETag, which is sub-optimal.
In other words, it saves on bandwidth, but not CPU cycles.

The Last-Modified header is the second form of validation. According
to the HTTP specification, "The Last-Modified header field indicates
the date and time at which the origin server believes the representation
was last modified." In other words, the application decides whether the
cached content is still current by checking whether the underlying data has
changed since the response was cached.

For instance, you can use the latest update date of all the objects needed to
compute the resource representation as the value of the Last-Modified
header:

```php
public function showAction($articleSlug)
{
    // ...

    $articleDate = new \DateTime($article->getUpdatedAt());
    $authorDate = new \DateTime($author->getUpdatedAt());

    $date = $authorDate > $articleDate ? $authorDate : $articleDate;

    $response->setLastModified($date);
    // Set response as public. Otherwise it will be private by default.
    $response->setPublic();

    if ($response->isNotModified($this->getRequest())) {
        return $response;
    }

    // ... do more work to populate the response with the full content

    return $response;
}
```

The isNotModified()
method compares the If-Modified-Since header sent by the request with
the Last-Modified header set on the response. If they are equivalent,
the Response will be set to a 304 status code.

Note

The If-Modified-Since request header equals the Last-Modified
header of the last response sent to the client for the particular resource.
This is how the client and server communicate with each other and decide
whether or not the resource has been updated since it was cached.
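The exchange of validators can be illustrated with a short sketch in plain PHP. The isStillValid() function below is a hypothetical helper written for this explanation; it is not how Symfony2 implements isNotModified(), and it only handles a single ETag value or a single date.

```php
<?php
// Illustrative sketch of a conditional GET check; isStillValid() is a
// hypothetical helper and not the Symfony2 implementation of isNotModified().

/**
 * Returns true when the copy held by the client or cache is still valid,
 * i.e. when a 304 response can be sent instead of the full content.
 *
 * @param array          $requestHeaders Request headers keyed by name
 * @param string|null    $currentETag    ETag of the current representation, if any
 * @param \DateTime|null $lastModified   Last modification date, if known
 */
function isStillValid(array $requestHeaders, $currentETag = null, $lastModified = null)
{
    // ETag validation: the client sends back the ETag it has stored.
    if (null !== $currentETag && isset($requestHeaders['If-None-Match'])) {
        return $requestHeaders['If-None-Match'] === $currentETag;
    }

    // Last-Modified validation: the client sends back the date it received.
    if (null !== $lastModified && isset($requestHeaders['If-Modified-Since'])) {
        $since = strtotime($requestHeaders['If-Modified-Since']);

        return false !== $since && $lastModified->getTimestamp() <= $since;
    }

    // No validator to compare: the full response must be generated.
    return false;
}

$lastModified = new DateTime('2011-03-01 16:00:00', new DateTimeZone('UTC'));

// The client's copy dates from the last modification: it can be reused (304).
$unchanged = isStillValid(
    array('If-Modified-Since' => 'Tue, 01 Mar 2011 16:00:00 GMT'),
    null,
    $lastModified
);

// The client's copy is older than the last modification: send the full page.
$changed = isStillValid(
    array('If-Modified-Since' => 'Mon, 28 Feb 2011 09:00:00 GMT'),
    null,
    $lastModified
);
```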

The main goal of any caching strategy is to lighten the load on the application.
Put another way, the less you do in your application to return a 304 response,
the better. The Response::isNotModified() method does exactly that by
exposing a simple and efficient pattern:

```php
use Symfony\Component\HttpFoundation\Response;

public function showAction($articleSlug)
{
    // Get the minimum information to compute
    // the ETag or the Last-Modified value
    // (based on the Request, data is retrieved from
    // a database or a key-value store for instance)
    $article = ...;

    // create a Response with an ETag and/or a Last-Modified header
    $response = new Response();
    $response->setETag($article->computeETag());
    $response->setLastModified($article->getPublishedAt());

    // Set response as public. Otherwise it will be private by default.
    $response->setPublic();

    // Check that the Response is not modified for the given Request
    if ($response->isNotModified($this->getRequest())) {
        // return the 304 Response immediately
        return $response;
    } else {
        // do more work here - like retrieving more data
        $comments = ...;

        // or render a template with the $response you've already started
        return $this->render(
            'MyBundle:MyController:article.html.twig',
            array('article' => $article, 'comments' => $comments),
            $response
        );
    }
}
```

When the Response is not modified, the isNotModified() method automatically sets
the response status code to 304, removes the content, and removes some
headers that must not be present for 304 responses (see
setNotModified()).

So far, it's been assumed that each URI has exactly one representation of the
target resource. By default, HTTP caching is done by using the URI of the
resource as the cache key. If two people request the same URI of a cacheable
resource, the second person will receive the cached version.

Sometimes this isn't enough and different versions of the same URI need to
be cached based on one or more request header values. For instance, if you
compress pages when the client supports it, any given URI has two representations:
one when the client supports compression, and one when it does not. This
determination is done by the value of the Accept-Encoding request header.

In this case, you need the cache to store both a compressed and uncompressed
version of the response for the particular URI and return them based on the
request's Accept-Encoding value. This is done by using the Vary response
header, which is a comma-separated list of different headers whose values
trigger a different representation of the requested resource:

Vary: Accept-Encoding, User-Agent

Tip

This particular Vary header would cache different versions of each
resource based on the URI and the value of the Accept-Encoding and
User-Agent request header.
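One way to picture the effect of Vary is to think of the cache key as the URI plus the values of the listed request headers. The sketch below, in plain PHP, assumes exactly that simplified model; varyCacheKey() is a hypothetical helper and real caches may store variants differently.

```php
<?php
// Illustrative sketch: varyCacheKey() is a hypothetical helper showing how a
// cache could derive its storage key from the URI and the Vary header.

function varyCacheKey($uri, array $requestHeaders, $varyHeader)
{
    $parts = array($uri);

    foreach (explode(',', $varyHeader) as $name) {
        $name = strtolower(trim($name));
        if ('' === $name) {
            continue;
        }

        // Header names are case-insensitive, so normalize before the lookup.
        $value = '';
        foreach ($requestHeaders as $headerName => $headerValue) {
            if (strtolower($headerName) === $name) {
                $value = $headerValue;
            }
        }
        $parts[] = $name.'='.$value;
    }

    return implode('|', $parts);
}

$gzipKey = varyCacheKey('/about', array('Accept-Encoding' => 'gzip'), 'Accept-Encoding');
$plainKey = varyCacheKey('/about', array(), 'Accept-Encoding');
// The two keys differ, so both representations can be stored side by side.
```

Note that varying on a header with many possible values, such as User-Agent, multiplies the number of stored variants in this model.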

The Response object offers a clean interface for managing the Vary
header:

```php
// set one vary header
$response->setVary('Accept-Encoding');

// set multiple vary headers
$response->setVary(array('Accept-Encoding', 'User-Agent'));
```

The setVary() method takes a header name or an array of header names for
which the response varies.

You can of course use both validation and expiration within the same Response.
As expiration wins over validation, you can easily benefit from the best of
both worlds. In other words, by using both expiration and validation, you
can instruct the cache to serve the cached content, while checking back
at some interval (the expiration) to verify that the content is still valid.

Gateway caches are a great way to make your website perform better. But they
have one limitation: they can only cache whole pages. If you can't cache
whole pages or if some parts of a page are more dynamic than others, you are
out of luck. Fortunately, Symfony2 provides a solution for these cases, based on a
technology called ESI, or Edge Side Includes. Akamai wrote this specification
almost 10 years ago, and it allows specific parts of a page to have a different
caching strategy than the main page.

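The ESI specification describes tags you can embed in your pages to communicate
with the gateway cache. Only one tag is implemented in Symfony2, include,
as this is the only useful one outside of the Akamai context. It is written as
<esi:include src="http://..." />, where the src attribute gives the URL of the
fragment to embed.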

Notice from the example that each ESI tag has a fully-qualified URL.
An ESI tag represents a page fragment that can be fetched via the given
URL.

When a request is handled, the gateway cache fetches the entire page from
its cache or requests it from the backend application. If the response contains
one or more ESI tags, these are processed in the same way. In other words,
the gateway cache either retrieves the included page fragment from its cache
or requests the page fragment from the backend application again. When all
the ESI tags have been resolved, the gateway cache merges each into the main
page and sends the final content to the client.
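The merging step can be sketched in plain PHP as a search-and-replace over the page. This is only an illustration: resolveEsiIncludes() is a hypothetical helper, the example.com URL is made up, the callback stands in for "look in the cache or ask the backend", and nested fragments are not handled.

```php
<?php
// Illustrative sketch: resolveEsiIncludes() is a hypothetical helper, not the
// Symfony2 reverse proxy implementation.

/**
 * Replaces each <esi:include src="..." /> tag with the fragment returned
 * by $fetchFragment, which stands in for "cache lookup or backend request".
 */
function resolveEsiIncludes($page, $fetchFragment)
{
    return preg_replace_callback(
        '#<esi:include\s+src="([^"]+)"[^>]*/>#',
        function (array $matches) use ($fetchFragment) {
            // $matches[1] is the URL found in the src attribute
            return call_user_func($fetchFragment, $matches[1]);
        },
        $page
    );
}

$page = '<h1>Home</h1><esi:include src="http://example.com/esi/news" />';

$html = resolveEsiIncludes($page, function ($url) {
    // A real gateway cache would serve this from its store when fresh.
    return '<ul><li>Fragment from '.$url.'</li></ul>';
});
```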

All of this happens transparently at the gateway cache level (i.e. outside
of your application). As you'll see, if you choose to take advantage of ESI
tags, Symfony2 makes the process of including them almost effortless.

Now, suppose you have a page that is relatively static, except for a news
ticker at the bottom of the content. With ESI, you can cache the news ticker
independent of the rest of the page.

```php
public function indexAction()
{
    $response = $this->render('MyBundle:MyController:index.html.twig');
    // set the shared max age - which also marks the response as public
    $response->setSharedMaxAge(600);

    return $response;
}
```

In this example, the full-page cache has a lifetime of ten minutes.
Next, include the news ticker in the template by embedding an action.
This is done via the render helper (see Embedding Controllers
for more details).

As the embedded content comes from another page (or controller for that
matter), Symfony2 uses the standard render helper to configure ESI tags:

Since Symfony 2.0.20/2.1.5, the Twig render tag takes an absolute URL
instead of a controller logical path. This fixes an important security
issue (CVE-2012-6431) reported on the official blog. If your application
uses an older version of Symfony or still uses the previous render tag
syntax, you should upgrade as soon as possible.

The render tag takes the absolute URL to the embedded action. This means
that you need to define a new route to the controller that you're embedding:

Unless you want this URL to be accessible to the outside world, you
should use Symfony's firewall to secure it (by allowing access to your
reverse proxy's IP range). See the Securing by IP
section of the Security Chapter for more information
on how to do this.

Tip

The best practice is to mount all your ESI URLs on a single prefix (e.g.
/esi) of your choice. This has two main advantages. First, it eases
the management of ESI URLs as you can easily identify the routes used for ESI.
Second, it eases security management since securing all URLs starting
with the same prefix is easier than securing each individual URL. See
the above note for more details on securing ESI URLs.

By setting standalone to true in the render Twig tag, you tell
Symfony2 that the action should be rendered as an ESI tag. You might be
wondering why you would want to use a helper instead of just writing the ESI tag
yourself. That's because using a helper makes your application work even if
there is no gateway cache installed.

When standalone is false (the default), Symfony2 merges the included page
content within the main one before sending the response to the client. But
when standalone is true, and if Symfony2 detects that it's talking
to a gateway cache that supports ESI, it generates an ESI include tag. But
if there is no gateway cache or if it does not support ESI, Symfony2 will
just merge the included page content within the main one as it would have
done were standalone set to false.

Note

Symfony2 detects if a gateway cache supports ESI via another Akamai
specification that is supported out of the box by the Symfony2 reverse
proxy.

The embedded action can now specify its own caching rules, entirely independent
of the master page.

```php
public function newsAction($max)
{
    // ...

    $response->setSharedMaxAge(60);
}
```

With ESI, the full page cache will be valid for 600 seconds, but the news
component cache will only last for 60 seconds.

One great advantage of this caching strategy is that you can make your
application as dynamic as needed and at the same time, hit the application as
little as possible.

Note

Once you start using ESI, remember to always use the s-maxage
directive instead of max-age. As the browser only ever receives the
aggregated resource, it is not aware of the sub-components, and so it will
obey the max-age directive and cache the entire page. And you don't
want that.

The render helper supports two other useful options:

alt: used as the alt attribute on the ESI tag, which allows you
to specify an alternative URL to be used if the src cannot be found;

ignore_errors: if set to true, an onerror attribute will be added
to the ESI with a value of continue indicating that, in the event of
a failure, the gateway cache will simply remove the ESI tag silently.

"There are only two hard things in Computer Science: cache invalidation
and naming things." --Phil Karlton

You should never need to invalidate cached data because invalidation is already
taken into account natively in the HTTP cache models. If you use validation,
you never need to invalidate anything by definition; and if you use expiration
and need to invalidate a resource, it means that you set the expires date
too far away in the future.

Note

Since invalidation is a topic specific to each type of reverse proxy,
if you don't worry about invalidation, you can switch between reverse
proxies without changing anything in your application code.

Actually, all reverse proxies provide ways to purge cached data, but you
should avoid them as much as possible. The most standard way is to purge the
cache for a given URL by requesting it with the special PURGE HTTP method.

Here is how you can configure the Symfony2 reverse proxy to support the
PURGE HTTP method:
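A sketch along the following lines may serve as a starting point. It assumes the standard app/AppCache.php file, and that the HttpCache base class exposes an invalidate() method and a store with a purge() method; check these against the Symfony2 version you are running. The status messages are arbitrary examples.

```php
<?php
// app/AppCache.php
// Sketch only: verify the method signatures against your Symfony2 version.

use Symfony\Bundle\FrameworkBundle\HttpCache\HttpCache;
use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\HttpFoundation\Response;

class AppCache extends HttpCache
{
    protected function invalidate(Request $request, $catch = false)
    {
        // Let the parent handle every method other than PURGE
        if ('PURGE' !== $request->getMethod()) {
            return parent::invalidate($request, $catch);
        }

        // Remove the cached entry for the requested URL, if there is one
        $response = new Response();
        if (!$this->getStore()->purge($request->getUri())) {
            $response->setStatusCode(404, 'Not purged');
        } else {
            $response->setStatusCode(200, 'Purged');
        }

        return $response;
    }
}
```

If you enable something like this, the PURGE method should be protected (for instance by restricting it to trusted IP addresses), otherwise anyone could purge your cached data.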

Symfony2 was designed to follow the proven rules of the road: HTTP. Caching
is no exception. Mastering the Symfony2 cache system means becoming familiar
with the HTTP cache models and using them effectively. This means that, instead
of relying only on Symfony2 documentation and code examples, you have access
to a world of knowledge related to HTTP caching and gateway caches such as
Varnish.