System Performance Enhancements

Memory Placement Optimization (MPO) enables operating systems to allocate
memory local to the core where the threads or processes are being executed.
The sun4v architecture runs on a virtualized hardware environment. The MPO
feature for sun4v platforms provides the standard accessors that the sun4v
layer needs to supply locality information to the generic MPO framework. This
feature is effective on platforms that have multiple sockets with differences
in memory access latency. On such platforms, the MPO feature improves
application performance by enabling the OS to allocate memory on the node
where the application's threads are running.
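
The placement decision described above can be illustrated with a toy model.
This is only a sketch under stated assumptions, not the Solaris MPO
implementation: the two-node latency table, the per-node free page counts,
and the function name allocate_page are all hypothetical.

```python
# Toy model of locality-aware page allocation (hypothetical, not Solaris code).
# LATENCY_NS[i][j] is an assumed access latency, in nanoseconds, from a CPU on
# node i to memory on node j. Local access is cheaper than remote access.
LATENCY_NS = [
    [100, 150],
    [150, 100],
]


def allocate_page(cpu_node, free_pages):
    """Pick the node with free memory that is cheapest to reach from cpu_node.

    free_pages holds the free page count of each node and is updated in place.
    Returns the chosen node index, or None if no node has free memory.
    """
    candidates = [node for node, free in enumerate(free_pages) if free > 0]
    if not candidates:
        return None
    best = min(candidates, key=lambda node: LATENCY_NS[cpu_node][node])
    free_pages[best] -= 1
    return best


if __name__ == "__main__":
    free = [1, 2]
    print(allocate_page(0, free))  # node 0 still has a free page
    print(allocate_page(0, free))  # node 0 is exhausted, so node 1 is used
```

The model prefers the local node and falls back to the next-closest node
only when local memory is exhausted. Supplying the latency relationships that
such a decision depends on is the role of the locality information mentioned
above.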

SPARC: Shared Contexts Support

The context mechanism, which the Memory Management Unit (MMU) hardware uses
to distinguish between uses of the same virtual address in different process
address spaces, introduces some inefficiencies when shared memory is used.
The data at a particular shared memory address might be identical in
different processes, but the context number associated with each process is
different. Therefore, the MMU hardware cannot recognize a match. As a result,
mappings are unnecessarily evicted from the MMU translation cache, the
Translation Lookaside Buffer (TLB), only to be replaced by identical mappings
with a different context number.

The Niagara 2 system has an additional shared context, a hardware feature
that can be used to prevent this inefficiency in handling shared memory.
When the TLB is searched for a mapping, a match on either the private or the
shared context results in a TLB hit. The current software support for shared
context activates the feature for processes that use Dynamic Intimate Shared
Memory (DISM). In this case, the process text segment and the DISM segments
that are mapped at the same virtual address with the same permissions for
each process use the shared context.
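
The effect of the shared context can be illustrated with a small simulation.
This is a simplified sketch, not a model of the actual Niagara 2 TLB: the
fully associative layout, the LRU replacement policy, the four-entry
capacity, and the context numbers 1, 2, and 100 are all assumptions chosen
for illustration.

```python
from collections import OrderedDict


class ToyTLB:
    """Small fully associative TLB with LRU replacement, tagged by context."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # (context, virtual page) -> physical page
        self.hits = 0
        self.misses = 0

    def lookup(self, vpage, private_ctx, shared_ctx=None):
        # A match on either the private or the shared context is a hit.
        for ctx in (private_ctx, shared_ctx):
            if ctx is not None and (ctx, vpage) in self.entries:
                self.entries.move_to_end((ctx, vpage))
                self.hits += 1
                return True
        self.misses += 1
        return False

    def insert(self, ctx, vpage, ppage):
        key = (ctx, vpage)
        if key not in self.entries and len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used
        self.entries[key] = ppage
        self.entries.move_to_end(key)


def run(use_shared_context):
    """Two processes alternately touch the same four shared pages."""
    tlb = ToyTLB(capacity=4)
    shared_ctx = 100 if use_shared_context else None  # hypothetical number
    for _ in range(3):
        for private_ctx in (1, 2):
            for vpage in range(4):
                if not tlb.lookup(vpage, private_ctx, shared_ctx):
                    tag = shared_ctx if use_shared_context else private_ctx
                    tlb.insert(tag, vpage, ppage=0x1000 + vpage)
    return tlb.hits, tlb.misses


if __name__ == "__main__":
    print("private contexts only (hits, misses):", run(False))
    print("with shared context   (hits, misses):", run(True))
```

With private contexts only, each process evicts the other's entries even
though the mappings are identical, so every access in this toy workload
misses. With the shared context, the four mappings are loaded once and every
later access from either process hits.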

x86: CPUID-Based Cache Hierarchy Awareness

Modern Intel processors provide an interface for discovering information
about the cache hierarchy of the processor through the CPUID instruction.
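
As an illustration of the kind of information this interface exposes, the
following sketch decodes one subleaf of the deterministic cache parameters
leaf (CPUID leaf 4), using the bit layout documented by Intel. The text above
does not say which CPUID leaf the OS uses, so the choice of leaf 4 is an
assumption. The function name and the sample register values are
hypothetical, and reading the registers from the processor is outside the
scope of this sketch.

```python
CACHE_TYPES = {1: "data", 2: "instruction", 3: "unified"}


def decode_cpuid_leaf4(eax, ebx, ecx):
    """Decode the EAX, EBX, and ECX values of one CPUID leaf 4 subleaf.

    Returns a dict describing the cache, or None when the cache type field
    is 0, which marks the end of the subleaf list.
    """
    cache_type = eax & 0x1F                  # EAX[4:0]: cache type
    if cache_type == 0:
        return None
    level = (eax >> 5) & 0x7                 # EAX[7:5]: cache level
    line_size = (ebx & 0xFFF) + 1            # EBX[11:0]: line size - 1
    partitions = ((ebx >> 12) & 0x3FF) + 1   # EBX[21:12]: partitions - 1
    ways = ((ebx >> 22) & 0x3FF) + 1         # EBX[31:22]: associativity - 1
    sets = ecx + 1                           # ECX: number of sets - 1
    return {
        "level": level,
        "type": CACHE_TYPES.get(cache_type, "unknown"),
        "line_size": line_size,
        "partitions": partitions,
        "ways": ways,
        "sets": sets,
        "size_bytes": ways * partitions * line_size * sets,
    }


if __name__ == "__main__":
    # Hypothetical register values describing a 32 KB, 8-way L1 data cache.
    print(decode_cpuid_leaf4(0x1C004121, 0x01C0003F, 0x0000003F))
```

An OS would typically iterate over subleaves 0, 1, 2, and so on until the
cache type field reads 0, collecting one description per cache level.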