Enabling Client-Side Caching of Generated Content in ASP.NET

Abstract: This article describes how to support client-side caching of dynamically generated content in ASP.NET and Delphi, and lists some of the lesser-known problems with ASP.NET’s HttpCachePolicy class.

Introduction

Web sites are slow. Well, usually slower than native applications, anyway. It’s not that web applications are badly written, or that browsers are stupid – it’s simply that browsers and web servers have to perform many tasks just to show a single page. For example:

The browser has to parse the URL, extract the server name, resolve its address, and connect to it.

The browser then sends an HTTP request and waits for a response.

The server parses the request, and tries to fulfill it.

The server then sends an HTTP response to the browser.

The browser parses the response headers and extracts the content.

Finally, the browser renders the content. If the page has embedded content (such as images or style sheets), the steps are repeated for every embedded element.

The third and fourth steps usually take the longest. The server has to load or generate the content, then transfer it to the browser over the Internet. For static sites, that transfer is the slowest part, so browsers try to cache the data locally. This is a little more difficult for web applications.

Web applications usually work by generating content based on information stored in databases. In fact, almost every element you see on this page was retrieved from a database. Because the content is generated dynamically, the web server doesn’t know whether it is going to change, so it can’t give the browser enough information to cache the results. Providing that information becomes the responsibility of the application.

How Client-Side Caching Works

For caching to work, browsers and web servers follow a set of rules laid out in the HTTP 1.1 specification. The rules can be quite complicated, but the basic process is simple:

When the server returns content that can be cached, it provides additional information, such as when the content was last modified, when it expires, and who is allowed to cache it.

When the browser asks for content, it tells the server which version of the content it currently has in its cache.

If the server determines the content has not changed, it sets the HTTP status code to 304 (“Not Modified”) and returns no content. Otherwise, the request is treated as a normal request for new content, and the server tries to fulfill it.

The caching information is conveyed in HTTP request and response headers. For static content, most web servers provide the caching response headers and process the caching request headers automatically.

Client-Side Caching in ASP.NET

ASP.NET allows pages to control client-side caching (which it calls “output caching”) using several methods:

By using the outputCache and outputCacheSettings elements (represented in code by the OutputCacheSection and OutputCacheSettingsSection classes) in web.config and machine.config files.

By setting the @ OutputCache directive in ASP.NET pages and user controls.

By accessing the HttpCachePolicy class in code, through the Cache property of the current HttpResponse.

Supporting Client-Side Caching in Code

To support client-side caching for dynamic content, an ASP.NET web application has to do two things:

Content that can be cached needs to be decorated with the appropriate response headers.

The application must process the request headers and determine whether cached content has changed since it was last cached.

Because caching headers use a timestamp, the application must keep a timestamp of the last modification to the content.

Setting Response Headers

Last Modification Timestamp

The SetLastModified method of HttpCachePolicy adds the Last-Modified HTTP header to the response, and correctly encodes the specified timestamp. The browser will pass the timestamp back to the server the next time it tries to retrieve the content.

Response.Cache.SetLastModified(ModifyDate);

Entity Tag

Some clients, however, do not send conditional requests unless the ETag header is also specified. The ETag header contains a unique “entity tag.” Clients may use the entity tag instead of or in addition to the last modification date, so applications should provide both.

The entity tag must identify the exact version of the content. One method of generating such a unique identifier is to use the content name and last modification date. The following code generates an entity tag based on those parameters:
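A minimal sketch of such a function is shown here. The class and method names (EntityTags.Create) are hypothetical, the timestamp is assumed to be in UTC, and hashing the name and timestamp is a choice made for this sketch, not something the specification requires.

```csharp
using System;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;

public static class EntityTags
{
    // Hypothetical helper: builds a quoted entity tag from a resource name
    // and its last modification time (assumed to be UTC).
    public static string Create(string resourceName, DateTime lastModifiedUtc)
    {
        // The tick count distinguishes versions; the name distinguishes resources.
        string raw = resourceName + "|" +
                     lastModifiedUtc.Ticks.ToString(CultureInfo.InvariantCulture);

        // Hashing is an assumption of this sketch: it keeps the tag short and
        // free of characters that are not allowed inside a quoted string.
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(raw));
            string hex = BitConverter.ToString(hash).Replace("-", "");

            // Entity tags are quoted strings, so the quotes are part of the value.
            return "\"" + hex + "\"";
        }
    }
}
```

Because the tag depends only on the name and the timestamp, it can be computed without loading the content itself.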

The double quotes added by the function are expected by the HTTP 1.1 specification for entity tags.

The SetETag method of the HttpCachePolicy class adds the ETag header to the response. However, there are a few restrictions on the use of this method:

The SetETag method can only be called once. Calling it a second time will raise an exception.

The method does not add the double quotes expected by HTTP 1.1. If you use a different algorithm to create the tag, make sure you include the double quotes.

HttpCachePolicy will not add the header if the “cacheability” is set to Private, which is the default value. Unless you call the SetCacheability method with a different value, you’ll have to add the header by calling HttpResponse’s AppendHeader method.

As far as I can tell, that last restriction isn’t documented, and may be a bug.
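One way to work around the last restriction is sketched here. The helper name (ETagHeader.Apply) and its isPublic parameter are hypothetical, and the code relies on the Private-cacheability behavior reported above, not on documented behavior.

```csharp
using System.Web;

public static class ETagHeader
{
    // Hypothetical helper: emits the ETag header for either cacheability.
    // quotedTag must already include the surrounding double quotes.
    public static void Apply(HttpResponse response, string quotedTag, bool isPublic)
    {
        if (isPublic)
        {
            // With Public cacheability the cache policy writes the header itself.
            // SetETag may only be called once per response.
            response.Cache.SetCacheability(HttpCacheability.Public);
            response.Cache.SetETag(quotedTag);
        }
        else
        {
            // With the default (Private) cacheability, add the header directly.
            response.AppendHeader("ETag", quotedTag);
        }
    }
}
```

Only one of the two branches runs for a given response, so the header is never written twice.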

Expiration

You can specify cache expiration using the SetExpires, SetMaxAge, and SetSlidingExpiration methods. Most browsers will not send a request for cached content until it expires. This means the server will not get the opportunity to check whether the content has been updated, so only set an expiration for content you know is not going to change within the specified timeframe.
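As an illustration, the following sketch sets an absolute expiration and a matching maximum age. The helper name (Expiration.ExpireAfter) and the lifetime parameter are hypothetical; pick a lifetime only after deciding how long the content can safely go unchecked.

```csharp
using System;
using System.Web;

public static class Expiration
{
    // Hypothetical helper: marks the response as fresh for the given lifetime.
    public static void ExpireAfter(HttpResponse response, TimeSpan lifetime)
    {
        // Absolute expiration time for the cached copy.
        response.Cache.SetExpires(DateTime.Now.Add(lifetime));

        // Relative lifetime, counted from the time of the response.
        response.Cache.SetMaxAge(lifetime);
    }
}
```

A call such as Expiration.ExpireAfter(Response, TimeSpan.FromMinutes(5)) means browsers may not contact the server about this content for up to five minutes.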

Processing Requests for Cached Content

If the content is cached, browsers send what the HTTP 1.1 specification calls “conditional GET” requests. A conditional GET is a GET request that includes an If-Modified-Since, If-Unmodified-Since, If-Match, If-None-Match, or If-Range header field, but not the Range header field (in that case it is considered a “partial GET” request).

The If-Modified-Since request header corresponds to the Last-Modified response header, and contains the same value. Similarly, the If-None-Match request header contains the value passed in the ETag response header.

According to the HTTP 1.1 specification, browsers must pass the If-Match or If-None-Match headers for cache-conditional requests if the server provided an entity tag, but not all browsers do. The specification also states that servers receiving requests containing both an entity tag and a last modification date must process both, and may only return a 304 status code if all conditions match.

The following code determines whether content has been updated since it was last cached:
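A minimal sketch of such a check is shown here. The names (ConditionalGet.IsModified and its parameters) are hypothetical. The sketch assumes the browser echoes a single entity tag and an RFC 1123 date, and it does not handle tag lists, weak tags, or the "*" wildcard.

```csharp
using System;
using System.Globalization;

public static class ConditionalGet
{
    // Returns true when the content must be sent again, false when a
    // 304 response is appropriate. Header values may be null if absent.
    public static bool IsModified(string ifModifiedSince, string ifNoneMatch,
                                  DateTime lastModifiedUtc, string currentETag)
    {
        bool hasDate = !string.IsNullOrEmpty(ifModifiedSince);
        bool hasTag = !string.IsNullOrEmpty(ifNoneMatch);

        // No validators at all: this is an ordinary GET request.
        if (!hasDate && !hasTag) return true;

        // When a tag is supplied, it has to match the current version exactly.
        if (hasTag && !string.Equals(ifNoneMatch.Trim(), currentETag, StringComparison.Ordinal))
            return true;

        if (hasDate)
        {
            DateTime since;
            bool parsed = DateTime.TryParseExact(
                ifModifiedSince.Trim(), "R", CultureInfo.InvariantCulture,
                DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal,
                out since);

            // An unreadable date is treated as "no valid cache".
            if (!parsed) return true;

            // HTTP dates carry whole seconds only, so require a full second.
            if (lastModifiedUtc - since >= TimeSpan.FromSeconds(1)) return true;
        }

        // Every supplied condition matched: the cached copy is still valid.
        return false;
    }
}
```

In a page or handler, the header values would come from Request.Headers["If-Modified-Since"] and Request.Headers["If-None-Match"], which return null when a header is absent.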

Because all that’s needed to determine whether the content has been updated is the resource name and the last modification date, it’s possible to optimize applications that retrieve content from a database by only checking the required fields and not retrieving the actual content.

Notice the additional check for time differences larger than one second:

The DateTime type has a higher resolution than the HTTP date/time format, so we lose values smaller than one second when setting the Last-Modified header. Making sure the content is at least one second newer than the cache avoids the problem of always assuming the cache is invalid whenever the last modification timestamp does not fall on a whole second.
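The loss of precision can be demonstrated with a short sketch. The names (HttpDatePrecision, RoundTrip, IsNewer) are hypothetical, and timestamps are assumed to be in UTC.

```csharp
using System;
using System.Globalization;

public static class HttpDatePrecision
{
    // Formats a UTC timestamp using the RFC 1123 pattern used for HTTP date
    // headers, then parses it back, as if the browser had echoed the value.
    public static DateTime RoundTrip(DateTime utc)
    {
        string header = utc.ToString("R", CultureInfo.InvariantCulture);
        return DateTime.ParseExact(
            header, "R", CultureInfo.InvariantCulture,
            DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal);
    }

    // The check described in the text: only a difference of at least
    // one full second counts as a change.
    public static bool IsNewer(DateTime lastModifiedUtc, DateTime cachedUtc)
    {
        return lastModifiedUtc - cachedUtc >= TimeSpan.FromSeconds(1);
    }
}
```

For a stored timestamp with a fractional part, such as 10:30:00.250, a plain comparison against the round-tripped value reports the content as newer even though nothing changed, while IsNewer reports it as unchanged.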

Returning Responses for Cached Content

If the application determines the cache is still valid, it may return a 304 HTTP status code:
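A minimal sketch of such a response is shown here. The helper name (NotModifiedResponse.Send) is hypothetical, and the caller is assumed to stop processing afterwards without writing any content.

```csharp
using System.Web;

public static class NotModifiedResponse
{
    // Hypothetical helper: tells the browser its cached copy is still valid.
    public static void Send(HttpResponse response)
    {
        response.StatusCode = 304;
        response.StatusDescription = "Not Modified";

        // A 304 response carries no body; the caller must not write any
        // content after this point (for example, by returning immediately).
    }
}
```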

There’s still one problem with the code, and it’s not obvious without checking the HTTP response at run-time: the Connection field is automatically added to the header, with its value set to close. This causes the browser to close the connection, which means the browser will have to open a new connection for future requests, which may affect performance.

The reason the field is added is that the content is empty, and because the browser doesn’t know what content to expect, it waits for additional data. The problem can be prevented by explicitly adding the Content-Length header. Since we’re not returning any data, we set the field value to 0:
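A sketch of the adjusted response follows. The helper name (NotModifiedWithLength.Send) is hypothetical; the header is added through HttpResponse’s AppendHeader method.

```csharp
using System.Web;

public static class NotModifiedWithLength
{
    // Hypothetical helper: returns 304 with an explicit, empty content length.
    public static void Send(HttpResponse response)
    {
        response.StatusCode = 304;

        // Stating the length explicitly tells the browser no data will follow.
        response.AppendHeader("Content-Length", "0");
    }
}
```

Inspecting the response headers at run-time is the simplest way to confirm that the Connection field is no longer set to close.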