New Cache Plans

We have decided to rewrite our HTTP disk cache.

People

The design team will be responsible for coming up with a design for the new disk cache. The design should be thorough and well-documented. Once the design team is satisfied with an initial design document, the implementation team will start implementing.

Design team:

Michal Novotny

Taras Glek

Steve Workman

Honza Bambas

Nick Hurley

Brian Bondy

Doug Turner

Patrick McManus

Implementation team:

Honza Bambas

Michal Novotny

Primary Design Goals

This section documents issues that need to be addressed in the new cache's design.

Version API for the cache so we can update easily.

All APIs should be async. No main-thread locking or i/o at all.
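As a rough illustration of the async goal, the sketch below keeps a blocking file read off the calling thread by handing it to a worker thread. It is written in Python purely for illustration; `read_entry` and `_read_blocking` are hypothetical names, not part of any planned API.

```python
import asyncio


def _read_blocking(path: str) -> bytes:
    # Ordinary blocking file read; only ever called on a worker thread.
    with open(path, "rb") as f:
        return f.read()


async def read_entry(path: str) -> bytes:
    """Read a cache file without blocking the thread that runs the event loop."""
    # asyncio.to_thread (Python 3.9+) runs the blocking call in a worker thread.
    return await asyncio.to_thread(_read_blocking, path)
```

The caller only ever awaits a result, so no disk i/o happens on the thread driving the event loop.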

A crash or abnormal program termination should not invalidate the entire cache.
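One common way to approach this goal is to write each entry to a temporary file and rename it into place, so that a crash leaves at most a stray temporary file rather than a half-written entry. The sketch below assumes one file per entry; `write_entry_atomically` and the `.tmp` suffix are hypothetical, and this is not a statement of how the new cache will work.

```python
import os
import tempfile


def write_entry_atomically(cache_dir: str, name: str, data: bytes) -> str:
    """Write data to cache_dir/name so readers see the old or the new file, never a partial one."""
    final_path = os.path.join(cache_dir, name)
    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before publishing the entry
        os.replace(tmp_path, final_path)  # atomic on POSIX; replaces any existing entry
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path
```

With this pattern, recovery after a crash only needs to delete leftover `.tmp` files; completed entries remain valid.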

Support gzip compression. Metadata should record whether a file is gzip'd, so the cache can choose at runtime, on a per-file basis, whether to write compressed or uncompressed data. Pass through files that arrive gzip'd from the network.
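A minimal sketch of the per-entry gzip flag, assuming metadata is stored alongside each entry as JSON. The `EntryMeta` record and the `store` and `load` helpers are hypothetical and only illustrate the three cases: compress on write, store uncompressed, and pass through data that was already gzip'd on the network.

```python
import gzip
import json
from dataclasses import asdict, dataclass


@dataclass
class EntryMeta:
    key: str
    gzipped: bool  # True when the stored body is gzip-compressed


def store(key: str, body: bytes, *, compress: bool, already_gzipped: bool = False):
    """Return (metadata_json, stored_bytes) for one cache entry."""
    if already_gzipped:
        # Pass-through: the body arrived gzip'd from the network, so store it as-is.
        stored, gzipped = body, True
    elif compress:
        stored, gzipped = gzip.compress(body), True
    else:
        stored, gzipped = body, False
    meta = json.dumps(asdict(EntryMeta(key=key, gzipped=gzipped)))
    return meta, stored


def load(meta_json: str, stored: bytes) -> bytes:
    """Return the uncompressed body, consulting the metadata flag."""
    meta = EntryMeta(**json.loads(meta_json))
    return gzip.decompress(stored) if meta.gzipped else stored
```

Because the flag travels with each entry, the decision to compress can change from one write to the next without affecting entries already on disk.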

Make use of fallocate.
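For illustration, preallocating space for an entry might look like the sketch below. It uses Python's `os.posix_fallocate`, which wraps the POSIX call rather than the Linux-specific `fallocate(2)`, and falls back to `os.ftruncate` where preallocation is unavailable. The `preallocate` helper is a hypothetical name.

```python
import os


def preallocate(path: str, size: int) -> None:
    """Reserve `size` bytes for the file at `path` before writing to it."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        if hasattr(os, "posix_fallocate"):
            try:
                os.posix_fallocate(fd, 0, size)  # ask the filesystem to reserve the blocks
                return
            except OSError:
                pass  # not supported on this filesystem; fall through
        # Fallback: sets the file length but may not reserve disk blocks.
        os.ftruncate(fd, size)
    finally:
        os.close(fd)
```

Reserving the full length up front when the content length is known gives the filesystem a chance to allocate contiguous blocks and reduce fragmentation.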

Minimize API surface, especially for APIs exposed to JS/extensions. All exposed APIs should have a clear, safe use case.

Consider eliminating memory cache.

Competing ideas:

Temporal layout so that sub-resources are together.

Don't over-optimize on-disk storage, use one file per entry and let OS optimize.

Separate services for HTTP and offline cache? Find a way to make these use cases work well without over-complicating code.

Browser should behave properly with disk cache entirely disabled.

Allow for effectively racing cache against network, so as to not wait serially.
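The racing idea can be sketched as starting the cache lookup and the network fetch together and taking whichever produces data first. In the illustrative Python below, `race` is a hypothetical helper, and the convention that a lookup returns `None` on a miss is an assumption made for the example.

```python
import asyncio


async def race(cache_lookup, network_fetch):
    """Start both lookups at once; return (source, body) from the first that yields data."""
    tasks = {
        asyncio.create_task(cache_lookup()): "cache",
        asyncio.create_task(network_fetch()): "network",
    }
    pending = set(tasks)
    try:
        while pending:
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED
            )
            for task in done:
                # Skip failures and misses (None) and keep waiting on the other side.
                if task.exception() is None and task.result() is not None:
                    return tasks[task], task.result()
        return None, None
    finally:
        # Whichever side lost the race is no longer needed.
        for task in pending:
            task.cancel()
```

A slow disk no longer delays the response: if the network answers first, the cache lookup is cancelled, and a cache miss simply leaves the network fetch running.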

Use this same cache for more general metadata-like data, e.g. caching hosts for DNS prewarms, appcache namespaces together with their other data and versioning, and any useful host-specific data that we now gather in memory and throw away after restart (SPDY preference, TLS tolerance, successful pipeline tests, etc.).

Success Metrics

This section documents the ways in which we'll determine whether or not the new cache design is a success.

Should not be possible to trigger main-thread i/o.

Create telemetry for with-cache and without-cache. Cache should be faster than no cache for both the top 50% and the low 50%.

API

This section documents the APIs for interacting with the new disk cache.