What are the Benefits of a Larger Cache?

Bigger is better, right? So a 512KB L2 cache must be better than a 256KB one
- after all, AMD wouldn't spend 17 million transistors for no gain. Although
a larger cache is generally beneficial, the real question is how beneficial
and in what situations. To answer that question, we should start with a quick
lesson in caches and what makes them so useful.

Think of a cache as a bridge between two entities - a slower and a faster one.
In this case, the cache we are talking about is part of a multilevel cache system
and it helps to bridge the gap between the CPU and main memory.

It's no surprise that main memory runs significantly slower than today's CPUs.
Not only does memory run at much lower clock speeds (e.g. 200MHz for DDR400),
but main memory is also physically located very far away from the processor.
Our multi-gigahertz CPUs have to waste well over 100 clock cycles to retrieve
data from main memory, as their requests must cross over slow front-side buses,
through an external memory controller, to the memory and back. Making this trip
can wreak havoc on performance, especially for CPUs with very long pipelines,
as these pipelines generally sit idle while the data needed to populate them
is fetched from main memory.
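
To put the "well over 100 clock cycles" figure in perspective, here is a minimal sketch of the arithmetic. The 2.0GHz clock and 60ns round-trip latency are hypothetical values chosen for illustration, not measurements of any particular system, and stall_cycles is a helper name made up for this example.

```python
def stall_cycles(cpu_clock_ghz, memory_latency_ns):
    """Return how many CPU clock cycles elapse during one memory round trip."""
    # A clock of f GHz ticks f times per nanosecond, so cycles = ns * GHz.
    return memory_latency_ns * cpu_clock_ghz


# Hypothetical figures chosen for illustration, not measurements.
cycles = stall_cycles(cpu_clock_ghz=2.0, memory_latency_ns=60.0)
print(f"{cycles:.0f} cycles spent waiting per trip to main memory")
```

The faster the CPU clock, the more cycles the same trip costs, which is why the gap matters more with every speed bump.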

The idea behind a processor's caches is that you store important data in these
high speed memories (now located on the processor's die itself), so that most
of the time, your CPU doesn't have to make the long trip to main memory. Caches
are split into multiple levels because the larger a cache is, the longer it
takes to fetch data from it. As a result, pairing one small but very low latency
cache with a larger, somewhat higher latency cache (still significantly quicker
than main memory) provides the best balance of performance in today's
microprocessors. These two caches are the Level 1 (L1) and Level 2 (L2) caches
you hear about all the time.
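
The tradeoff described above can be sketched with the usual average access time calculation: every access pays the L1 latency, L1 misses additionally pay the L2 latency, and only L2 misses pay for the trip to main memory. All of the latencies and miss rates below are made-up illustrative values, not figures for any AMD processor, and average_access_cycles is a hypothetical helper.

```python
def average_access_cycles(l1_hit, l1_miss_rate, memory,
                          l2_hit=None, l2_miss_rate=None):
    """Average cycles per memory access for a one- or two-level cache."""
    if l2_hit is None:
        # No L2: every L1 miss goes all the way to main memory.
        return l1_hit + l1_miss_rate * memory
    # With an L2: an L1 miss pays the L2 latency, and only the fraction
    # of those that also miss in L2 pays the main memory latency.
    l1_miss_penalty = l2_hit + l2_miss_rate * memory
    return l1_hit + l1_miss_rate * l1_miss_penalty


# All latencies (in cycles) and miss rates are illustrative assumptions.
without_l2 = average_access_cycles(l1_hit=3, l1_miss_rate=0.05, memory=120)
with_l2 = average_access_cycles(l1_hit=3, l1_miss_rate=0.05, memory=120,
                                l2_hit=20, l2_miss_rate=0.2)
print(f"L1 only: {without_l2:.1f} cycles, L1 + L2: {with_l2:.1f} cycles")
```

Under these assumptions, the second level cuts the average cost per access noticeably even though it is much slower than the L1.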

Caches work based on two major principles - spatial and temporal locality.
These two principles are simple: spatial locality states that if you access
a piece of data, the data around it is likely to be accessed soon, and temporal
locality states that if you access a piece of data, chances are that you'll
access that same piece of data again soon. In practice, this means that
frequently accessed data is kept in cache, as well as the data adjacent to it
in memory. Since caches are relatively small (rightfully so, as it would be
cost and performance prohibitive to have main memory-sized caches), the
algorithms they use to make sure that the right information remains in the
cache can be even more critical to performance than the sheer size of the cache.
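
As a rough illustration of the two principles, the sketch below labels each access in a list of byte addresses as temporal reuse (the exact address was seen before), spatial reuse (a neighbor in the same cache line was seen before), or cold (neither). The 64-byte line size is an assumption, locality_profile is a made-up helper, and the model deliberately ignores cache capacity and replacement.

```python
def locality_profile(addresses, line_size=64):
    """Count temporal, spatial, and cold accesses in a byte-address trace."""
    seen_addresses = set()
    seen_lines = set()
    counts = {"temporal": 0, "spatial": 0, "cold": 0}
    for addr in addresses:
        line = addr // line_size  # which cache line this byte falls in
        if addr in seen_addresses:
            counts["temporal"] += 1  # same data touched again
        elif line in seen_lines:
            counts["spatial"] += 1   # a neighbor already pulled in this line
        else:
            counts["cold"] += 1      # first touch of this line
        seen_addresses.add(addr)
        seen_lines.add(line)
    return counts


# Walking through 256 contiguous bytes, 4 bytes at a time.
print(locality_profile(range(0, 256, 4)))
# Revisiting the same four values 16 times each.
print(locality_profile([0, 64, 128, 192] * 16))
```

The contiguous walk is almost entirely spatial reuse, while the repeated visits are almost entirely temporal reuse; real programs mix the two.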

With Barton, AMD left the L1 the same as before, but increased the L2 cache
size by 256KB, for a total of 512KB. AMD didn't change any of the other
specifications of the cache (e.g. it is still a 16-way set associative L2
cache). Luckily, AMD increased the cache size without sacrificing access time,
but where will the added L2 cache help?
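
For a sense of what doubling the capacity at the same associativity implies, here is a minimal sketch of the geometry. The 16 ways and the 256KB and 512KB sizes come from the text above; the 64-byte line size is an assumption, and both function names are hypothetical.

```python
def cache_geometry(size_bytes, ways, line_size=64):
    """Return (total lines, number of sets) for a set associative cache."""
    lines = size_bytes // line_size
    sets = lines // ways
    return lines, sets


def set_index(address, sets, line_size=64):
    """Return the set that a byte address maps to."""
    return (address // line_size) % sets


KB = 1024
print(cache_geometry(256 * KB, ways=16))  # (4096, 256)
print(cache_geometry(512 * KB, ways=16))  # (8192, 512)
```

With the number of ways held constant, the extra capacity shows up as twice as many sets, so fewer addresses compete for each group of 16 lines.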

Let's look at those two principles we mentioned before, spatial and temporal
locality. If an application's usage pattern does not abide by either one of
these principles, then no matter how much cache you add, performance will not
improve. So what are some examples of applications that are and are not
cache-friendly?

For starters, let's talk about things that don't abide by the principle of
temporal locality - mainly multimedia applications and, more specifically,
encoding applications. If you think about how encoding works, the source data
is essentially never reused; it is read once, encoded as it streams through,
and then never touched again. At the other end of the spectrum, we have things
like office applications that happily abide by the principle of temporal
locality. In these sorts of applications, you are often re-using data,
performing very similar tasks on it over and over again and thus making great
use of larger caches.
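
The contrast between those two kinds of workloads can be sketched with a toy simulation. Everything here is a simplifying assumption: the cache is modeled as fully associative with least-recently-used replacement (not the actual 16-way design), lines are 64 bytes, each line is touched once per visit to isolate temporal locality, and the 384KB working set is a made-up figure chosen to fall between the two cache sizes.

```python
from collections import OrderedDict

LINE_SIZE = 64  # bytes per cache line (assumed)
KB = 1024


def lru_hit_rate(line_numbers, capacity_bytes):
    """Hit rate of a fully associative LRU cache over a trace of line numbers."""
    max_lines = capacity_bytes // LINE_SIZE
    cache = OrderedDict()  # least recently used line first
    hits = 0
    total = 0
    for line in line_numbers:
        total += 1
        if line in cache:
            hits += 1
            cache.move_to_end(line)  # mark as most recently used
        else:
            if len(cache) >= max_lines:
                cache.popitem(last=False)  # evict the least recently used
            cache[line] = True
    return hits / total if total else 0.0


working_set_lines = 384 * KB // LINE_SIZE
passes = 4
# Encoder-like: every line is touched exactly once and never again.
streaming = range(working_set_lines * passes)
# Office-like: the same 384KB of data is revisited on every pass.
reuse = [line for _ in range(passes) for line in range(working_set_lines)]

for name, trace in (("streaming", streaming), ("reuse", reuse)):
    for size_kb in (256, 512):
        rate = lru_hit_rate(trace, size_kb * KB)
        print(f"{name:9s} {size_kb}KB L2: {rate:.2f} hit rate")
```

In this toy model the streaming trace never hits at either size, while the reuse trace goes from no hits at 256KB to a 0.75 hit rate at 512KB, because the working set only fits in the larger cache. The all-or-nothing jump is an artifact of strict LRU and a perfectly cyclic access pattern; real applications see more gradual gains.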

The principle of spatial locality applies to a much wider range of applications,
including multimedia encoding applications, because their data is generally
stored contiguously in main memory and is thus very cache-friendly. Spatial
locality is why you will see some improvement from larger caches even in
applications that don't exhibit much temporal locality.