Early findings: Mobile browser cache persistence and behaviour

If you’ve spent any time in the mobile space, you know that the browser cache is a critical limiting factor for performance. I often talk about:

how unreliable the mobile browser cache is,

how poorly it persists data, and

how poorly it observes basic caching headers.

You may also know that one of the key benefits that front-end optimization (FEO) can bring to the mobile world is a more useful browser cache. Our team, led by Hooman Beheshti and Jarrod Connolly, has been working for years to perfect mobile acceleration, and they’ve spent a huge amount of time digging into the browser cache conundrum. Hooman recently presented some very interesting findings at Velocity, which I want to expand on today.

Background:

Much has been written about how poor mobile caching is. For a thorough overview, read more here, here, here, here, and — my favourite link — here. But for our purposes today, here’s the Reader’s Digest version:

1. Mobile cache sizes are far too small.

The stock browser in Android (2.x) has a cache size limit of around 5.7MB. The iOS cache is much larger (>50MB).

2. When we think about mobile cache sizes, we need to take into account a bunch of variables.

We need to consider the total cache size (which is a shared cache for all our browsing), individual object size limits for caching, and the fact that HTML pages may be treated slightly differently.

But we had more questions about mobile caches, so Hooman set out to get some answers. Below is what he found.

Question 1: How do non-stock browsers cache on Android?

Further to this question, Hooman wanted to see if anything has changed with Android ICS (Ice Cream Sandwich, version 4.0 of Android). His methodology was pretty similar to what others have done to determine mobile cache sizes (see the links above).

Here’s what came out of it:

On Gingerbread (Android 2.3.x), Firefox and Opera both had cache sizes that were slightly larger than the stock browser. They both came out at around 10MB.

The stock ICS browser had an HTTP cache that was much bigger than its predecessor. Hooman’s tests showed a limit of 25MB.

He was also going to test Chrome on ICS (Chrome is not available for Gingerbread), but then found this great article by Tony Gentilcore of Google that outlines how the Chrome cache is large and persistent. Better yet, the cache size is dynamic and based on the amount of disk space available, which is great news.

To put things into perspective, it’s good to compare mobile cache sizes to those of desktop browsers, which are usually in the hundreds of MB:

This article from IE tells us that in IE6/7/8 the cache size was 1/32 of the disk, with a default cap of 50MB. In IE9 it is 1/256 of the disk (not a bad thing, since disks are much bigger these days), with a default cap of 250MB. Cache size is user-configurable in IE.

Spot checking our own Chrome and Firefox browsers showed a cache size of >300MB on Chrome and >700MB on Firefox.
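
To make the desktop numbers concrete, here is a small sketch of the "fraction of disk, with a cap" rule as described above. The function name and the 500 GB disk are assumptions made for illustration; the real IE logic may have more rules than this.

```python
def default_cache_cap_mb(disk_mb: float, disk_fraction: float, cap_mb: float) -> float:
    """Default cache size: a fraction of the disk, limited by a fixed cap."""
    return min(disk_mb * disk_fraction, cap_mb)


DISK_MB = 500 * 1024  # hypothetical 500 GB disk, not a figure from the tests

ie8_mb = default_cache_cap_mb(DISK_MB, 1 / 32, 50)     # IE6/7/8 rule
ie9_mb = default_cache_cap_mb(DISK_MB, 1 / 256, 250)   # IE9 rule
android_2x_mb = 5.7                                    # stock Android 2.x limit quoted earlier

print(f"IE6/7/8 default: {ie8_mb:.0f} MB")
print(f"IE9 default:     {ie9_mb:.0f} MB")
print(f"IE9 / Android 2.x stock: {ie9_mb / android_2x_mb:.0f}x")
```

On any reasonably modern disk the cap, not the fraction, is what decides the default, so the IE9 default alone is dozens of times the size of the Android 2.x stock cache.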

Conclusion: Mobile caches are lacking when it comes to how much content they can hold.

This is especially true when you consider that an HTTP object cache is a shared resource for a browser. If you do even minimal browsing on your smartphone, you’re bound to fill up the cache rather quickly. But things are getting better. ICS is a good example of that. And even though progress may not be occurring at the pace we’d like, any progress is good. This also highlights the value of using localStorage as an object cache, which mobile acceleration solutions like Strangeloop’s Mobile Optimizer automatically enable a site to do.

Question 2: How do mobile browser caches behave with various user actions (e.g., closing the browser, locking the phone, shutting down the phone)?

This is an extremely important question that’s never been satisfactorily answered. There is very little data on the persistence of the cache and how (and if) objects are fetched from cache. Here’s a breakdown of Hooman’s suite of experiments in this area.

Methodology

Given the hundreds of permutations a test like this could involve, we tried to keep things simple. We put an Android Nexus S (version 2.3.7) and an older iPhone 3GS (iOS version 5.1.1) through the following paces:

1. We used a single page with ~2MB of images on it to prime the cache.

2. In priming the cache, we used four different sets of cache control headers (including none at all) on the images to see their effect on caching:

No cache headers at all

A Last-Modified header only

A Cache-Control: max-age header, together with a Last-Modified header

A Cache-Control: max-age header only

3. Then, after priming the cache, we performed some user actions and revisited the page.

4. Through network traces, we determined if, when, and how the browser went back to the server to fetch any of the images.
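
The four priming variants from step 2 can be sketched as a function that returns the response headers a test server would attach to each image. This is only an illustration: the variant names, the function name, and the max-age value are assumptions, since the exact values used in the tests aren't given here.

```python
from email.utils import formatdate
from typing import Dict


def priming_headers(variant: str, last_modified_epoch: float, max_age_s: int) -> Dict[str, str]:
    """Return the caching-related response headers for one priming variant.

    variant is one of the hypothetical labels below; max_age_s is an
    illustrative lifetime in seconds, not the value used in the tests.
    """
    last_modified = formatdate(last_modified_epoch, usegmt=True)  # HTTP-date format
    cache_control = f"max-age={max_age_s}"
    variants = {
        "none": {},
        "last-modified": {"Last-Modified": last_modified},
        "last-modified+max-age": {
            "Last-Modified": last_modified,
            "Cache-Control": cache_control,
        },
        "max-age": {"Cache-Control": cache_control},
    }
    return variants[variant]


for name in ("none", "last-modified", "last-modified+max-age", "max-age"):
    print(name, priming_headers(name, 0, 3600))
```

Keeping the variants in one table makes it easy to confirm that the only thing changing between runs is the pair of headers under test.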

Results

The table below shows all the permutations and how each phone behaved in various circumstances.

It’s not an easy table to read, so here’s what’s happening:

In all the green cells (“All from cache”), all the images were fetched from the cache. No requests for those images went over the network to the server.

In all the orange cells (“None from cache”), nothing was fetched from the cache. All the images were re-fetched from the server.

In the blue cells, the requests were made to the server, but with some relevant caching headers in the requests; the cell lists those headers. If you want to know exactly what these headers mean, it’s best to consult RFC 2616.

| User action | Device | Primed with no cache headers | Primed with Last-Modified | Primed with Last-Modified and max-age | Primed with max-age |
|---|---|---|---|---|---|
| New window or tab | Android | All from cache | All from cache | All from cache | All from cache |
| New window or tab | iOS | None from cache | All from cache | All from cache | All from cache |
| Exit browser, lock phone, unlock phone, open browser | Android | All from cache | All from cache | All from cache | All from cache |
| Exit browser, lock phone, unlock phone, open browser | iOS | None from cache | All from cache | All from cache | All from cache |
| Kill browser, launch browser | Android | All from cache | All from cache | All from cache | All from cache |
| Kill browser, launch browser | iOS | None from cache | All from cache | All from cache | All from cache |
| Reboot phone | Android | All from cache | All from cache | All from cache | All from cache |
| Reboot phone | iOS | None from cache | All from cache | All from cache | All from cache |
| After page loads, click URL and hit “Enter” | Android | All from cache | If-Modified-Since and Cache-Control:max-age=0 | If-Modified-Since and Cache-Control:max-age=0 | All from cache |
| After page loads, click URL and hit “Enter” | iOS | None from cache | All from cache | All from cache | All from cache |
| Refresh after page loads | Android | Cache-Control:no-cache | Cache-Control:no-cache | Cache-Control:no-cache | Cache-Control:no-cache |
| Refresh after page loads | iOS | Cache-Control:max-age=0 | If-Modified-Since and Cache-Control:max-age=0 | If-Modified-Since and Cache-Control:max-age=0 | Cache-Control:max-age=0 |
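
The three kinds of cell can be told apart mechanically from a network trace. The sketch below is a hypothetical helper, not the tooling used for these tests: it assumes the trace has already been reduced to one entry per image, either None (no request seen) or a dictionary of that request's headers.

```python
from typing import Dict, List, Optional


def classify_image(request_headers: Optional[Dict[str, str]]) -> str:
    """Classify one image on the revisit from what the trace shows."""
    if request_headers is None:
        return "cache"  # no request on the wire: served from cache
    lowered = {name.lower(): value for name, value in request_headers.items()}
    parts = []
    if "if-modified-since" in lowered:
        parts.append("If-Modified-Since")
    if "cache-control" in lowered:
        parts.append("Cache-Control:" + lowered["cache-control"].replace(" ", ""))
    # A request with no caching-related headers is a plain re-fetch.
    return " and ".join(parts) if parts else "plain"


def classify_cell(per_image: List[Optional[Dict[str, str]]]) -> str:
    """Collapse the per-image results into one table cell label."""
    labels = {classify_image(headers) for headers in per_image}
    if len(labels) != 1:
        return "Mixed"  # images disagreed (or the list was empty)
    label = labels.pop()
    return {"cache": "All from cache", "plain": "None from cache"}.get(label, label)


print(classify_cell([None, None, None]))
print(classify_cell([{"If-Modified-Since": "Fri, 01 Jun 2012 00:00:00 GMT",
                      "Cache-Control": "max-age=0"}]))
```

The "Mixed" label never appears in the table above; it is there so that a run where images behave differently is flagged rather than silently forced into one of the three categories.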

Observations and conclusions

Most of the cells show behavior that is expected from a well-behaved browser. But there are a few areas Hooman highlighted as particularly interesting:

Primed with No Cache Headers

It’s interesting that Android chose to cache the images even though they had no caching headers on them. That is particularly aggressive, though not necessarily a bad thing, especially if this behavior is specific to images. It does, however, make it that much more likely that the cache will fill up quickly. iOS, by contrast, doesn’t cache any images if they don’t have any caching headers at all.

After page loads, click URL and hit “Enter”

This whole row was interesting because, to Android, this is very much like a refresh of the page. Where the original images went into the cache with a Last-Modified validator, the browser sends If-Modified-Since with Cache-Control: max-age=0, which means it wants to validate all the way to the origin server. (Intermediate caches can’t validate on their own because of the max-age=0 directive.) This is true even if there was a max-age put on the image when it originally went into the cache. Images primed with no headers or with max-age only, which carry no Last-Modified validator, were served straight from the cache.
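
For context, this is roughly what the origin server does with such a validating request. It is a minimal sketch of the standard If-Modified-Since comparison, with a hypothetical function name, and it ignores other validators such as ETags.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def status_for_conditional_get(if_modified_since: str, last_modified: datetime) -> int:
    """Return 304 if the resource is unchanged since the client's copy, else 200."""
    since = parsedate_to_datetime(if_modified_since)
    # HTTP dates have one-second resolution, so drop sub-second precision.
    unchanged = last_modified.replace(microsecond=0) <= since
    return 304 if unchanged else 200


image_modified = datetime(2012, 6, 1, tzinfo=timezone.utc)  # illustrative date
client_header = format_datetime(image_modified, usegmt=True)

print(status_for_conditional_get(client_header, image_modified))
print(status_for_conditional_get(client_header, datetime(2012, 6, 2, tzinfo=timezone.utc)))
```

A 304 response carries no body, so the image bytes still come from the cache. The cost of this row on Android is one round trip per image, which still matters on a high-latency mobile network.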

Primed with Last-Modified

When the images went into the cache with just a Last-Modified validator, caching was rather aggressive from both Android and iOS. This is actually probably a good thing. It’s also possible that this behavior would be different if we had revisited the page much later than when the images originally went into the cache.
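
One possible explanation is heuristic freshness. RFC 2616 (section 13.2.4) lets a cache that has only a Last-Modified time treat the response as fresh for some fraction of the interval since that time, and names 10% as a typical setting. The sketch below applies that rule. We don't know which fraction, if any, Android or iOS actually use, so the constant is an assumption, and the age calculation is simplified.

```python
from datetime import datetime, timedelta, timezone

HEURISTIC_FRACTION = 0.10  # "typical setting" from RFC 2616; real browsers may differ


def heuristic_freshness(fetched_at: datetime, last_modified: datetime,
                        fraction: float = HEURISTIC_FRACTION) -> timedelta:
    """Freshness lifetime inferred from how old the resource was when fetched."""
    return (fetched_at - last_modified) * fraction


def is_fresh(now: datetime, fetched_at: datetime, last_modified: datetime) -> bool:
    """Simplified check: time since the fetch vs. the heuristic lifetime."""
    return (now - fetched_at) < heuristic_freshness(fetched_at, last_modified)


fetched = datetime(2012, 7, 1, tzinfo=timezone.utc)   # illustrative dates only
modified = fetched - timedelta(days=30)               # image last changed 30 days earlier

print(heuristic_freshness(fetched, modified))
print(is_fresh(fetched + timedelta(minutes=5), fetched, modified))
print(is_fresh(fetched + timedelta(days=4), fetched, modified))
```

Under this assumption, an image that was 30 days old when fetched stays fresh for about three days, so a revisit a few minutes later would come from cache while a revisit several days later would trigger a request.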

A few caveats

As Hooman points out, even though this is all very telling, it’s important to consider that we didn’t test the hundreds of permutations that could have included how full the cache was and combinations of different content types (images, scripts, CSS, HTML, etc). So these observations apply just to images. It’s possible that some of the behaviours could have been different if other file types were introduced into the mix.

Also, while some research out there suggests that the iOS cache isn’t persistent (it reportedly gets wiped after the device shuts down), we didn’t find this in our research. However, the fact that we used a small number of images could explain this discrepancy.

Final thoughts: Mobile browser vendors need to offer more visibility into their products

This is all great info and we’ll continue to test and get insight into how mobile browsers work. This is not only important for the performance community, but also vital to our own product development here at Strangeloop.

But we’d also like to reiterate what others have said, which is to ask browser vendors to provide more visibility into, and documentation of, what their browsers are doing (kudos to Chrome and Tony for starting to walk that path). As mobile browsing is poised to rival and overtake desktop browsing in the near future, the more we know, the better we can design our web applications and the products that help optimize them.

For a greater understanding of mobile performance challenges, I encourage you to download Strangeloop’s mobile optimization whitepaper (not a product shill, I promise).