Protocol overhead is larger due to inspection of all replicated cache tags.

Several messages on the critical path are needed to retrieve the up-to-date copy from another cache.

Advantages/Disadvantages

Shared Cache:
Advantages

No coherence protocol needed at the shared-cache level

Lower communication latency

Coherence protocol overhead is smaller; no replication of tags as with private caches

For processors with overlapping working sets:

One processor may prefetch data for the other

Smaller cache size needed

Better usage of loaded cache lines before eviction (spatial locality)

Less congestion on the limited memory connection

Dynamic sharing of cache space: if one processor needs less space, the other can use more

Avoidance of false sharing

Disadvantages:

Multiple CPUs impose higher requirements, e.g. higher bandwidth

The shared cache is larger than the individual private caches, which induces higher latency.

More complex design

One CPU can evict data of another CPU

Implementation of shared caches

Centralized vs distributed (tiled)

Centralized: designed as a central cache

Distributed: designed as a tiled cache

Centralized shared cache

Centralized LLC controller

L2 is typically organized in banks, and the L2 cache controller functionality is replicated across the banks.

Although the cache is banked and the controller is replicated across the banks to avoid going through a central controller, the design is called centralized because the LLC occupies a contiguous area on the chip.

The interconnect from the cache banks to the on-chip memory controller is simplified by a centralized design.

Distributed (tiled) shared cache

A tiled architecture gives manufacturers more flexibility.

With many cores, this design is favorable for thermal and power-density reasons.

Uniform vs non-uniform Cache Architecture

Same access latency independent of the bank vs. different access latencies.

Uniform Cache Architecture (UCA)

All accesses take the same time (are equally slow).

A banked UCA cache design often adopts an H-tree topology for the interconnect fabric connecting the banks to the cache controller.

A relatively simple network in which requests are pipelined from the cache controller to the banks.