Client-Server Interactions and the max-age Attribute with SharePoint BLOB Caching

This post discusses how client-side caching and the max-age attribute work with SharePoint BLOB caching. Client-server request/response interactions are covered, and some max-age watch-outs are also detailed.

I first presented (in some organized capacity) on SharePoint’s platform caching capabilities at SharePoint Saturday Ozarks in June of 2010, and since that time I’ve consistently received a growing number of questions on the topic of SharePoint BLOB caching. When I start talking about BLOB caching itself, the area that seems to draw the greatest number of questions and “really?!?!” responses is the use of the max-age attribute and how it can profoundly impact client-server interactions.

I’d been promising a number of people (including Todd Klindt and Becky Bertram) that I would write a post about the topic sometime soon, and recently decided that I had lollygagged around long enough.

Before I go too far, though, I should probably explain why the max-age attribute is so special … and even before I do that, we need to agree on what “caching” is and does.

Caching 101

Why does SharePoint supply caching mechanisms? Better yet, why does any application or hardware device employ caching? Generally speaking, caching is utilized to improve performance by taking frequently accessed data and placing it in a state or location that facilitates faster access. Faster access is commonly achieved through one or both of the following mechanisms:

By placing the data that is to be accessed on a faster storage medium; for example, taking frequently accessed data from a hard drive and placing it into memory.

By placing the data that is to be accessed closer to the point of usage; for example, offloading files from a server that is halfway around the world to one that is local to the point of consumption to reduce round-trip latency and bandwidth concerns. For Internet traffic, this scenario can be addressed with edge caching through a content delivery network such as that which is offered by Akamai’s EdgePlatform.

Oftentimes, data that is cached is expensive to fetch or computationally calculate. Take the digits in pi (3.1415926535 …) for example. Computing pi to 100 decimals requires a series of mathematical operations, and those operations take time. If the digits of pi are regularly requested or used by an application, it is probably better to compute those digits once and cache the sequence in memory than to calculate it on-demand each time the value is needed.
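To make the pi example concrete, here is a minimal Python sketch of "compute once, then serve from memory." The function names (`pi_digits`, `_arctan_inv`) and the choice of Machin's formula are my own assumptions for illustration; any expensive computation could stand in for it.

```python
from functools import lru_cache


def _arctan_inv(x: int, unity: int) -> int:
    """Return arctan(1/x) scaled by `unity`, using the Taylor series."""
    total = term = unity // x
    x_squared = x * x
    divisor = 3
    sign = -1
    while term:
        term //= x_squared
        total += sign * (term // divisor)
        divisor += 2
        sign = -sign
    return total


@lru_cache(maxsize=None)  # the in-memory cache: one entry per requested precision
def pi_digits(decimals: int) -> str:
    """Compute pi truncated to `decimals` decimal places (Machin's formula)."""
    guard = 10  # extra digits to absorb integer-division error
    unity = 10 ** (decimals + guard)
    pi_scaled = 4 * (4 * _arctan_inv(5, unity) - _arctan_inv(239, unity))
    digits = str(pi_scaled // 10 ** guard)
    return digits[0] + "." + digits[1:]


first = pi_digits(100)   # does the arithmetic
second = pi_digits(100)  # returned straight from the cache
```

The second call never touches the series computation; `pi_digits.cache_info()` reports it as a cache hit.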

Caching usually improves performance and scalability, and these ultimately tend to translate into a better user experience.

SharePoint and caching

Through its publishing infrastructure, SharePoint provides a number of different platform caching capabilities that can work wonders to improve performance and scalability. Note that yes, I did say “publishing infrastructure” – sorry, I’m not talking about Windows SharePoint Services 3 or SharePoint Foundation 2010 here.

Each of these caching mechanisms and options works to improve performance within a SharePoint farm by using a combination of the two mechanisms I described earlier. Object caching stores frequently accessed property, query, and navigational data in memory on WFEs. Basic BLOB caching copies images, CSS, and similar resource data from content databases to the file system of WFEs. Page output caching piggybacks on ASP.NET page caching and holds SharePoint pages (which are expensive to render) in memory and serves them back to users. The Office Web Applications Cache stores the output of Word documents and PowerPoint presentations (which is expensive to render in web-accessible form) in a special site collection for subsequent re-use.

Public-facing SharePoint

Each of the aforementioned caching mechanisms yields some form of performance improvement within the SharePoint farm by reducing load or processing burden, and that’s all well and good … but do any of them improve performance outside of the SharePoint farm?

What do I even mean by “outside of the SharePoint farm?” Well, consider a SharePoint farm that serves up content to external consumers – a standard/typical Internet presence web site. Most of us in the SharePoint universe have seen (or held up) the Hawaiian Airlines and Ferrari websites as examples of what SharePoint can do in a public-facing capacity. These are exactly the type of sites I am focused on when I ask about what caching can do outside of the SharePoint farm.

For companies that host public-facing SharePoint sites, there is almost always a desire to reduce load and traffic into the web front-ends (WFEs) that serve up those sites. These companies are concerned with many of the same performance issues that concern SharePoint intranet sites, but public-facing sites have one additional concern that intranet sites typically don’t: Internet bandwidth.

Even though Internet bandwidth is much easier to come by these days than it used to be, it’s not unlimited. In the age of gigabit Ethernet to the desktop, most intranet users don’t think about (nor do they have to concern themselves with) the actual bandwidth to their SharePoint sites. I can tell you from experience that such is not the case when serving up SharePoint sites to the general public.

So … for all the platform caching options that SharePoint has, is there anything it can actually do to assist with the Internet bandwidth issue?

Enter BLOB caching and the max-age attribute

As it turns out, the answer to that question is “yes” … and of course, it centers around BLOB caching and the max-age attribute specifically. Let’s start by looking at the <BlobCache /> element that is present in every SharePoint Server 2010 web.config file.

BLOB caching disabled

This is the default <BlobCache /> element that is present in all starting SharePoint Server 2010 web.config files, and astute readers will notice that the enabled attribute has a value of false. In this configuration, BLOB caching is turned off and every request for BLOB resources follows a particular sequence of steps. The first request in a browser session looks like this:

In this series of steps:

1. A request for a BLOB resource is made to a WFE
2. The WFE fetches the BLOB resource from the appropriate content database
3. The BLOB is returned to the WFE
4. The WFE returns an HTTP 200 status code and the BLOB to the requester

Here’s a section of the actual HTTP response from server (step #4 above):

You’ll notice that I highlighted the Cache-Control header line. This line gives the requesting browser guidance on what it should and shouldn’t do with regard to caching the BLOB resource (typically an image, CSS file, etc.) it has requested. This particular combination basically tells the browser that it’s okay to cache the resource for the current user, but the resource shouldn’t be shared with other users or outside the current session.
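For readers who want to pick a Cache-Control line apart themselves, the following Python sketch splits one into its directives. The header value used here, `private, max-age=0`, is an assumption on my part: it matches what readers quote in the comments at the end of this post, but your own responses may differ. The function name `parse_cache_control` is likewise just illustrative.

```python
def parse_cache_control(header: str) -> dict:
    """Split a Cache-Control header value into a {directive: value} dict.

    Directives without a value (e.g. "private") map to True;
    numeric values (e.g. max-age=0) are converted to int.
    """
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, separator, value = part.partition("=")
        name = name.strip().lower()
        value = value.strip()
        if not separator:
            directives[name] = True
        elif value.isdigit():
            directives[name] = int(value)
        else:
            directives[name] = value.strip('"')
    return directives


# Assumed header value for the "BLOB caching disabled" case.
blob_cache_disabled = parse_cache_control("private, max-age=0")
```

With that input, the result has a `private` directive and a `max-age` of zero: the browser may keep a copy for this user, but the copy is considered stale immediately and has to be checked with the server before re-use.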

Since the browser knows that it’s okay to privately cache the requested resource, subsequent requests for the resource by the same user (and within the same browser session) follow a different pattern:

When the browser makes subsequent requests like this for the resource, the HTTP response (in step #2) looks different than it did on the first request:

A request is made and a response is returned, but the HTTP 304 status code indicates that the requested resource wasn’t updated on the server; as a result, the browser can re-use its cached copy. Being able to re-use the cached copy is certainly an improvement over re-fetching it, but again: the cached copy is only used for the duration of the browser session – and only for the user who originally fetched it. The requester also has to contact the WFE to determine that the cached copy is still valid, so there’s the overhead of an additional round-trip to the WFE for each requested resource anytime a page is refreshed or re-rendered.
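The 200-then-304 exchange can be sketched in a few lines of Python. This is a toy stand-in for the WFE, not SharePoint code: `handle_request`, the timestamp, and the payload are all hypothetical, and I am assuming the browser revalidates with an If-Modified-Since header (ETag-based revalidation with If-None-Match works along the same lines).

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional, Tuple

# Hypothetical resource state on the server.
BLOB = b"...image bytes..."
LAST_MODIFIED = datetime(2011, 1, 10, 8, 0, 0, tzinfo=timezone.utc)


def handle_request(if_modified_since: Optional[str]) -> Tuple[int, dict, bytes]:
    """Return (status, headers, body) for a request to the resource."""
    headers = {
        "Last-Modified": format_datetime(LAST_MODIFIED, usegmt=True),
        "Cache-Control": "private, max-age=0",  # assumed value, see lead-in
    }
    if if_modified_since is not None:
        client_copy_time = parsedate_to_datetime(if_modified_since)
        if LAST_MODIFIED <= client_copy_time:
            # Nothing changed: no payload, the browser re-uses its copy.
            return 304, headers, b""
    return 200, headers, BLOB


# First request of the session: full payload comes back.
status_1, headers_1, body_1 = handle_request(None)

# Later request: the browser sends back the Last-Modified value it was given.
status_2, headers_2, body_2 = handle_request(headers_1["Last-Modified"])
```

The second response carries no body, which saves bandwidth, but the round trip to the server still happens for every resource.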

BLOB caching enabled

Even if you’re not a SharePoint administrator and generally don’t poke around web.config files, you can probably guess at how BLOB caching is enabled after reading the previous section. That’s right: it’s enabled by setting the enabled attribute to true as follows:

When BLOB caching is enabled in this fashion, the request pattern for BLOB resources changes quite a bit. The first request during a browser session looks like this:

In this series of steps:

1. A request for a BLOB resource is made to a WFE
2. The WFE returns the BLOB resource from a file system cache

The gray arrow that is shown indicates that at some point, an initial fetch of the BLOB resource is needed to populate the BLOB cache in the file system of the WFE. After that point, the resource is served directly from the WFE so that subsequent requests are handled locally for the duration of the browser session.

As you might imagine based on the interaction patterns described thus far, simply enabling the BLOB cache can work wonders to reduce the load on your SQL Servers (where content databases are housed) and reduce back-end network traffic. Where things get really interesting, though, is on the client side of the equation (that is, the Requester’s machine) once a resource has been fetched.

What about the max-age attribute?

You probably noticed that a max-age attribute wasn’t specified in the default (previous) <BlobCache /> element. That’s because max-age is actually an optional attribute. It can be added to the <BlobCache /> element in the following fashion:

Before explaining exactly what the max-age attribute does, I think it’s important to first address what it doesn’t do and dispel a misconception that I’ve seen a number of times. The max-age attribute has nothing to do with how long items stay within the BLOB cache on the WFE’s file system. max-age is not an expiration period or window of viability for content on the WFE. The server-side BLOB cache isn’t like other caches in that items expire out of it. New assets will replace old ones via a maintenance thread that regularly checks associated site collections for changes, but there’s no regular removal of BLOB items from the WFE’s file system BLOB cache simply because of age. max-age has nothing to do with server-side operations.

So, what does the max-age attribute actually do then? Answer: it controls information that is sent to requesters for purposes of specifying how BLOB items should be cached by the requester. In short: max-age controls client-side cacheability.
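As a quick way to see what a max-age setting amounts to, this Python sketch reads the attribute from a <BlobCache /> element and converts it to hours. Only `enabled="true"` and `max-age="43200"` come from this post; the `location`, `path`, and `maxSize` values below are placeholders I made up for the example, so check your own web.config rather than copying them.

```python
import xml.etree.ElementTree as ET
from datetime import timedelta

# Illustrative fragment only: location, path, and maxSize are placeholder values.
WEB_CONFIG_FRAGMENT = r"""
<BlobCache location="C:\BlobCache\14"
           path="\.(gif|jpg|jpeg|png|css|js)$"
           maxSize="10"
           enabled="true"
           max-age="43200" />
""".strip()

element = ET.fromstring(WEB_CONFIG_FRAGMENT)

blob_cache_enabled = element.get("enabled") == "true"
max_age_seconds = int(element.get("max-age"))
max_age_window = timedelta(seconds=max_age_seconds)
max_age_hours = max_age_window.total_seconds() / 3600
```

Here `max_age_hours` works out to 12, which is the client-side caching window that gets advertised to browsers.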

The effect of the max-age attribute

max-age values are specified in seconds; in the case above, 43200 seconds translates into 12 hours. When a max-age value is specified for BLOB caching, something magical happens with BLOB requests that are made from client browsers. After a BLOB cache resource is initially fetched by a requester according to the previous “BLOB caching enabled” series of steps, subsequent requests for the fetched resource look like this for a period of time equal to the max-age:

You might be saying, “hey, wait a minute … there’s only one step there. The request doesn’t even go to the WFE?” That’s right: the request doesn’t go to the WFE. It gets served directly from local browser cache – assuming such a cache is in use, of course, which it typically is.

Why does this happen? Let’s take a look at the HTTP response that is sent back with the payload on the initial resource request when BLOB caching is enabled:

The Cache-Control header line in this case differs quite a bit from the one that was specified when BLOB caching was disabled. First, the use of public instead of private tells the receiving browser or application that the response payload can be cached and made available across users and sessions. The response header max-age attribute maps directly to the value specified in the web.config, and in this case it basically indicates that the payload is valid for 12 hours (43,200 seconds) in the cache. During that 12-hour window, any request for the payload/resource will be served directly from the cache without a trip to the SharePoint WFE.
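The browser's decision boils down to a freshness check: is the cached copy younger than max-age? The Python sketch below models just that rule. `CachedResponse` and `served_from_local_cache` are hypothetical names, times are plain seconds for simplicity, and real browsers apply additional heuristics that are ignored here.

```python
from dataclasses import dataclass


@dataclass
class CachedResponse:
    fetched_at: float  # when the browser stored the response, in seconds
    max_age: int       # value of the max-age directive, in seconds


def served_from_local_cache(entry: CachedResponse, now: float) -> bool:
    """True if the cached copy is still fresh and no request goes to the WFE."""
    age = now - entry.fetched_at
    return age < entry.max_age


HOUR = 3600
image = CachedResponse(fetched_at=0, max_age=43200)  # public, max-age=43200

after_six_hours = served_from_local_cache(image, now=6 * HOUR)        # fresh
after_thirteen_hours = served_from_local_cache(image, now=13 * HOUR)  # stale

# With max-age=0 the copy is stale immediately, so the browser must revalidate.
never_fresh = served_from_local_cache(CachedResponse(0, 0), now=0)
```

Inside the 12-hour window the check passes and the server never hears about the request; once the window closes, the browser goes back to the WFE.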

Implications that come with max-age

On the plus side, serving resources directly out of the client-side cache for a period of time can dramatically reduce requests and overall traffic to WFEs. This can be a tremendous bandwidth saver, especially when you consider that assets which are BLOB cached tend to be larger in nature – images, media files, etc. At the same time, serving resources directly out of the cache is much quicker than round-tripping to a WFE – even if the round trip involves nothing more than an HTTP 304 response to say that a cached resource may be used instead of being retrieved.

While serving items directly out of the cache can yield significant benefits, I’ve seen a few organizations get bitten by BLOB caching and excessive max-age periods. This is particularly true when BLOB caching and long max-age periods are employed in environments where images and other BLOB cached resources are regularly replaced and changed-out. Let me illustrate with an example.

Suppose a site collection that hosts collaboration activities for a graphic design group is being served through a Web application zone where BLOB caching is enabled and a max-age period of 43,200 seconds (12 hours) is specified. One of the designers who uses the site collection arrives in the morning, launches her browser, and starts doing some work in the site collection. Most of the scripts, CSS, images, and other BLOB assets that are retrieved will be cached by the user’s browser for the rest of the work day. No additional fetches for such assets will take place.

In this particular scenario, caching is probably a bad thing. Users trying to collaborate on images and other similar (BLOB) content are probably going to be disrupted by the effect of BLOB caching. The max-age value (duration) in use would either need to be dialed back significantly or BLOB caching would have to be turned off entirely.

What you don’t see can hurt you

There’s one more very important point I want to make when it comes to BLOB caching and the use of the max-age attribute: the default <BlobCache /> element doesn’t come with a max-age attribute value, but that doesn’t mean that there isn’t one in-use. If you fail to specify a max-age attribute value, you end up with the default of 86,400 seconds – 24 hours.
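The distinction between "no max-age specified" and "max-age of zero" is easy to blur, so here is a small Python sketch of the rule as described above. The function and constant names are my own; the 86,400-second default is the value observed in this post.

```python
from typing import Optional

DEFAULT_MAX_AGE_SECONDS = 86_400  # 24 hours, applied when max-age is omitted


def effective_max_age(configured: Optional[int]) -> int:
    """Return the client-side caching window, in seconds.

    `configured` is None when the max-age attribute is absent from the
    <BlobCache /> element. An explicit 0 is a real setting and is kept;
    writing `configured or DEFAULT_MAX_AGE_SECONDS` would wrongly turn
    an explicit 0 into 24 hours.
    """
    if configured is None:
        return DEFAULT_MAX_AGE_SECONDS
    return configured


omitted = effective_max_age(None)      # attribute left out of web.config
explicit_zero = effective_max_age(0)   # attribute present, set to 0
twelve_hours = effective_max_age(43200)
```

Leaving the attribute out gives you 24 hours of client-side caching; if you want none, you have to say so with an explicit zero.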

This wasn’t always the case! In some recent exploratory work I was doing with Fiddler, I was quite surprised to discover client-side caching taking place where previously it hadn’t. When I first started playing around with BLOB caching shortly after MOSS 2007 was released, omitting the max-age attribute in the <BlobCache /> element meant that a max-age value of zero (0) was used. This had the effect of caching BLOB resources in the file system cache on WFEs without those resources getting cached in public, cross-session form on the client-side. To achieve extended client-side caching, a max-age value had to be explicitly assigned.

Somewhere along the line, this behavior was changed. I’m not sure where it happened, and attempts to dig back through older VM images (for HTTP response comparisons) didn’t give me a read on when Microsoft made the change. If I had to guess, though, it probably happened somewhere around service pack 1 (SP1). That’s strictly a guess, though. I had long been in the habit of explicitly including a max-age value – even if it was zero – so it wasn’t until I was playing with the BLOB caching defaults in a SharePoint 2010 environment that I noticed the 24-hour client-side caching behavior by default. I then backtracked to verify that the behavior was present in both SharePoint 2007 and SharePoint 2010, and it affected both authenticated and anonymous users. It wasn’t a fluke.

So watch out: if you don’t specify a max-age value, you’ll get 24-hour client-side caching by default! If users complain of images that “won’t update” and stale BLOB-based content, look closely at max-age effects.

An alternate viewpoint on the topic

As I was finishing up this post, I decided that it would probably be a good idea to see if anyone else had written on this topic. My search quickly turned up Chris O’Brien’s “Optimization, BLOB caching and HTTP 304s” post which was written last year. It’s an excellent read, highly informative, and covers a number of items I didn’t go into.

Throughout this post, I took the viewpoint of a SharePoint administrator who is seeking to control WFE load and Internet bandwidth consumption. Chris’ post, on the other hand, was written primarily with developer and end-user concerns in mind. I wasn’t aware of some of the concerns that Chris points out, and I learned quite a few things while reading his write-up. I highly recommend checking out his post if you have a moment.

Author: Sean McDonough

I am the Chief Technology Officer for Bitstream Foundry LLC, a SharePoint solutions, services, and consulting company headquartered in Cincinnati, Ohio. My professional development background goes back to the COM and pre-COM days - as well as SharePoint (since 2004) - and I've spent a tremendous amount of time both in the plumbing (as an IT Pro) and APIs (as a developer) associated with SharePoint and SharePoint Online. In addition, Microsoft awarded me an MVP (most valuable professional) in 2016 for the Office Servers and Services category.

Thanks for the comment. I don’t have a whole lot of direct experience with EBS providers (SharePoint 2007) and RBS providers (SharePoint 2010), but there are a number of vendors operating in the space now. I know that AvePoint, Quest, and StoragePoint operate in this space, so you might check them out. I believe the StoragePoint folks have been at it longer than just about anyone else, and I know them – and their product – better. They are the ones I’d start with if I had to implement an EBS/RBS solution on my own. Again, though: I’m not the expert in this area. Personally, I’d try them all side-by-side and pick the one that works for me.

I have multiple web apps running on our company’s home page. Blob caching is enabled on all of them since our site is heavily dependent on images. One site in particular always resets itself to “private, max-age=0” overnight even though it is enabled in the web.config. I can tell it is reset by using Curl to view the response headers. When I reset the app pool it reverts back to the correct web.config setting which is “public, max-age=259200”. I am not sure what is causing this overnight reset which counters the web.config setting or how to troubleshoot it. Any ideas? Thanks – Finch

Sounds like a very strange situation, John. I can think of a few things to check and verify.

You’ve probably already verified this, but first make sure that there aren’t any variations in the web.config settings for the BlobCache element on each of your WFEs. It probably seems silly, but problems that pop up intermittently in load-balanced farms are oftentimes due to differences between the WFEs.

In my experience, the one time I’ve seen the BLOB cache “break” (apparently work and then stop working) is when web gardens are in use. A web garden is a multi-process IIS application pool, and they’re great for scalability and isolation … but they’re horrible for the BLOB cache. The internals of the BLOB cache (check out my post here: https://sharepointinterface.com/2009/06/18/we-drift-deeper-into-the-sound-as-the-flush-comes/) require that each worker process (W3WP) exclusively own a BLOB cache folder. With web gardens, this doesn’t happen … and it mucks up the process royally. If you happen to be using web gardens, turn them off.

If you have networking equipment between your machine and the actual WFEs (e.g., a caching firewall or reverse proxy), the cache directives can get modified, rewritten, or dropped altogether. Verify that this isn’t happening to you.

Another point worth mentioning: BLOB caching only works for assets that are associated with list items — not “loose files.” At the same time, BLOB caching is not going to apply for assets that are served directly from the SharePoint Root (typically the _LAYOUTS folder and its subfolders). If you’re looking at some images and seeing them cached while other items are not, make sure they’re not being served from different locations (content DB vs. _LAYOUTS, for example).

Great article! Thanks a lot for sharing your investigations.
Do you know any way to change the cache-control data on Foundation 2010?
We’re using it for a public facing website and “cache-control: private,max-age=0” is really affecting the performance a lot. Any 3rd party tools / IIS tweak / whatever to edit the cache-control settings?

Thanks, Guillaume! Unfortunately, I don’t know of any simple way to address your situation with Foundation 2010. 3rd party products that control caching may exist, but I don’t actually know of any. I’m guessing the market probably isn’t that big given that most folks who build publicly facing SharePoint sites leverage SharePoint’s publishing features – which, of course, require the use of SharePoint Server.

Max-age headers are tied to BLOB caching, and BLOB caching is controlled through SharePoint Server’s PublishingHttpModule (which I described in more detail here: https://sharepointinterface.com/2009/06/18/we-drift-deeper-into-the-sound-as-the-flush-comes/). The PublishingHttpModule only comes with SharePoint Server, so obviously SharePoint Foundation isn’t going to give you any help. You’d be reinventing the wheel a bit, but there’s nothing stopping you from writing an HttpModule yourself that evaluates incoming requests and applies the appropriate HTTP cache control headers to carry out client-side caching of specific assets. In terms of actual coding effort, it wouldn’t be at all prohibitive.
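To show the shape of that idea without claiming anything about the ASP.NET API, here is the same technique sketched in Python as WSGI middleware. This is an analogy only: it is not an HttpModule and won't run inside SharePoint. The class name, the extension list, and the 43200-second default are all assumptions chosen for illustration.

```python
import re

# Hypothetical list of "static" extensions that should be cached client-side.
STATIC_ASSET = re.compile(r"\.(gif|jpg|jpeg|png|css|js)$", re.IGNORECASE)


class CacheControlMiddleware:
    """Wrap a WSGI app and rewrite Cache-Control for static assets."""

    def __init__(self, app, max_age=43200):
        self.app = app
        self.max_age = max_age

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")

        def patched_start_response(status, headers, exc_info=None):
            # Only touch successful responses for matching file types.
            if STATIC_ASSET.search(path) and status.startswith("200"):
                headers = [(k, v) for k, v in headers
                           if k.lower() != "cache-control"]
                headers.append(
                    ("Cache-Control", f"public, max-age={self.max_age}"))
            return start_response(status, headers, exc_info)

        return self.app(environ, patched_start_response)
```

An HttpModule would do the equivalent work in a request pipeline event: look at the requested path, decide whether the asset is safe to cache, and set the cache headers on the response before it goes out.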

Besides cache control headers, you might see a more consistent across-the-board improvement by simply controlling how much JavaScript goes to the client browsers (I’m assuming they’re anonymous). Core.js, for example, is hundreds of kilobytes of scripting that really doesn’t need to go to client browsers in anonymous usage scenarios. Chris O’Brien over in the U.K. wrote a nice post (with code sample) on trimming down JavaScript requests, such as core.js, with some in-line code: http://www.sharepointnutsandbolts.com/2011/01/eliminating-large-js-files-to-optimize.html

I haven’t played with setting the headers directly (I’ve relied on BLOB caching for that), but the ones you specified should generate similar results to those produced by the BLOB cache’s max-age attribute setting. The technique in the article describes doing them on a page-by-page basis, and I imagine that would get tedious pretty quickly … but it would be a good way to prove-out whether or not the technique will work for you. If it does, I recommend incorporating the header additions into an HttpModule for broader (and easier) use.

If you’re going to set the cache control headers for specific pages rather than normally static assets like images, JavaScript, etc., be careful. Some pages in SharePoint are truly static, but just as many aren’t and require some rendering change (or consideration for user, query string parameter, etc.) in a postback scenario. If you’re going to start caching entire pages (as is typically accomplished through page output caching in MOSS or SharePoint Server 2010), you’re going to need to test, test, and then test some more to make sure that you aren’t inadvertently caching a page with content that is dynamic and should be re-rendered/changed on a postback.