ACE’ing the verification of a cache coherent system using UVM

UVM-based ACE verification IP

One example of a suite of verification components that provides a
complete UVM-based verification solution for the ACE protocol is the
Synopsys VIP for AMBA AXI.
The AXI ACE VIP provides a system environment component with a
configurable number of ACE master and slave agents, a system monitor and
an interconnect component, as illustrated in Figure 7. The VIP
leverages most of the functionality mentioned in the previous sections
as well as the UVM resource mechanism to provide the configurability and
the sophisticated stimulus generation requirements in the ACE context.

Figure 7: AXI ACE system environment

The master agent generates constrained random ACE coherent
transactions, and responds to the ACE snoop transactions concurrently.
It also allocates cache lines and performs cache state transitions to
the various cache states based on the transactions it sends and receives
by using a built-in cache model. The user has back-door access to the
cache model through built-in application programming interfaces (APIs)
to allocate, de-allocate or query the cache lines. The slave agent
responds to read/write requests and models the memory for the system. It
also supports ACE-Lite requirements through simple configuration
parameters. The interconnect environment component receives coherent
transactions from the initiating master, and generates appropriate snoop
transactions to the other masters based on domain information. It then
responds to the coherent transactions based on responses received
through snoop transactions.

The master and slave agents each instantiate a port monitor, which
remains available when the agents are configured in passive mode. These
monitors perform port-level transaction checks, signal stability checks
and checks on the sequencing between ACE coherent and snoop
transactions. Another key component
of the ACE solution is the system monitor, which performs system-level
checks, coherency checks and data integrity checks. As certain checks
are dependent on design behavior, the system monitor also provides hooks
to implement design-specific checks. The built-in coverage supports ACE
coherent and snoop transaction coverage. The cache state transition
coverage helps to validate whether the master’s cache has transitioned
through all the legal cache states. The coverage can be used in
conjunction with the ACE verification planner to track the verification
progress.

Solving cache coherency challenges

Stimulus generation

The aforementioned components are complemented by a library of
configurable ACE sequences that, woven together, form virtual sequences
to further
aid in scenario creation at the block, cluster or system level across
various masters and the interconnect. Additionally, the UVM sequence
library enables the user to control the different permutations by which
atomic and hierarchical sequences can be stitched together to create the
complex scenarios depicted earlier.

Creating custom rules for
the sequence library would help not only to streamline multiple
sequences in different simulations but also to avoid redundancy and move
progressively toward convergence of all interesting system-level
scenarios. Again, in such scenarios, the sequences have to be aware of
the functional configuration to enable reconfiguration based on the
system-level requirements.
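
As a rough illustration of a custom selection rule, the sketch below overrides select_sequence() of uvm_sequence_library and switches the library to UVM_SEQ_LIB_USER mode so that the override is used. The class names demo_item and demo_no_repeat_seq_lib are hypothetical and not part of any VIP, and the rule itself (never run the same sequence twice in a row) is only one assumed example of "avoiding redundancy".

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical transaction type standing in for the VIP's ACE transaction.
class demo_item extends uvm_sequence_item;
  `uvm_object_utils(demo_item)
  function new(string name = "demo_item");
    super.new(name);
  endfunction
endclass

// Sequence library with a user-defined selection rule (assumed example).
class demo_no_repeat_seq_lib extends uvm_sequence_library #(demo_item);
  `uvm_object_utils(demo_no_repeat_seq_lib)
  `uvm_sequence_library_utils(demo_no_repeat_seq_lib)

  // Index returned by the previous call; -1 means "nothing selected yet".
  protected int last_index = -1;

  function new(string name = "demo_no_repeat_seq_lib");
    super.new(name);
    init_sequence_library();
    // USER mode makes the library call select_sequence() for each pick.
    selection_mode = UVM_SEQ_LIB_USER;
  endfunction

  // Must return an index between 0 and max, inclusive.
  virtual function int unsigned select_sequence(int unsigned max);
    int unsigned pick;
    if (max == 0) return 0; // only one sequence registered
    do begin
      pick = $urandom_range(max, 0);
    end while (int'(pick) == last_index);
    last_index = int'(pick);
    return pick;
  endfunction
endclass
```

Keeping the rule inside the library class means that individual sequences stay unaware of how they are scheduled, so the same sequences can be reused under a different rule in another simulation.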

Creating configurable sequences

There might be specific requirements when the sequences’ constraints or
properties depend on the values in the configuration object. The UVM
resource mechanism is used in the ACE sequences to bring in
configurability, as shown in Figure 8.

Figure 8: Configurable sequence

Though the hierarchical UVM configuration mechanism is designed
around components, a non-component object can access a configuration
field through a component handle. In the case of sequences,
‘m_sequencer’ is the handle to the sequencer that executes the sequence;
it is a built-in member of the uvm_sequence class. A configuration
parameter can be accessed in a hierarchical context through the
‘m_sequencer’ handle as shown below:

    uvm_config_db#(int)::get(m_sequencer, "", "item_count", item_count);

The corresponding ‘set’ of the parameter is as follows:

    uvm_config_db#(int)::set(this, "env.agent.seqr", "item_count", 20);
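
To put the two calls in context, here is a minimal sketch of a sequence that reads "item_count" through its m_sequencer handle and uses it to decide how many items to send. The demo_item and configurable_seq classes and the fallback value of 5 are assumptions made for illustration; they are not the VIP's own classes.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical transaction type standing in for the VIP's ACE transaction.
class demo_item extends uvm_sequence_item;
  rand bit [31:0] addr;
  `uvm_object_utils(demo_item)
  function new(string name = "demo_item");
    super.new(name);
  endfunction
endclass

// Sequence whose length is taken from the configuration database.
class configurable_seq extends uvm_sequence #(demo_item);
  `uvm_object_utils(configurable_seq)

  int item_count = 5; // assumed fallback when nothing was set

  function new(string name = "configurable_seq");
    super.new(name);
  endfunction

  virtual task body();
    // Look up "item_count" in the scope of the sequencer running this sequence.
    if (!uvm_config_db#(int)::get(m_sequencer, "", "item_count", item_count))
      `uvm_info(get_type_name(), "item_count not configured, using default", UVM_LOW)

    repeat (item_count) begin
      demo_item item = demo_item::type_id::create("item");
      start_item(item);
      if (!item.randomize())
        `uvm_error(get_type_name(), "randomization failed")
      finish_item(item);
    end
  endtask
endclass
```

Because the lookup happens in body(), each run of the sequence picks up whatever value is in the database at that moment, which is what lets the same sequence adapt to different sequencers.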

Therefore,
when parameters change in a dynamic environment, the ACE sequences can
reconfigure themselves to meet the generation requirements at that point
in time. Thus, for different master and slave components that may
support all or a subset of the ACE, ACE-Lite, AXI4 or AXI3 protocols and
work with different bus widths or clock frequencies, the sequences can
be reconfigured to work with each of their associated sequencers.

Hierarchical sequence stitching and sequence libraries

The
functionalities supported by the protocol range from those that can be
mapped to atomic transactions to those that run into hundreds of lines
of testbench code. The sequence collection has a rich set of
functionality; there are sequences to initiate all the possible coherent
transactions. These include sequences that do not cause a snoop of any
cached masters, sequences that must cause a snoop of the cached masters
that can hold a copy of the cache line, sequences that must cause a
snoop of any of the cached masters that can hold a copy of the cache
line, and more.

Given
the functionality that UVM provides, it is much more convenient to
stitch together low-level, proven or validated scenarios to create more
complex ones. This is how the ACE higher level and virtual sequences
are built up. Let’s take a look at how custom user scenarios can be
built using the sequence collection.

In this example, all the cache line states associated with a ReadClean
transaction need to be tested. This requires cache line initialization
followed by cache line invalidation, then a basic ReadClean.

A cache line initialization sequence initializes the cache line states
of a master's cache and its peers' caches to a set of random but valid
states. This ensures that all the different cache line state transitions
for a coherent transaction initiated by a master are verified. A cache
line invalidation sequence invalidates cache lines of a master. This may
be required for non-speculative load transactions. A basic ReadClean
sequence initiates ReadClean transactions over a given set of
addresses. The basic steps are:

Address selection - Choose the set of addresses on which to test the sequence (user configurable).

Cache line initialization - Bring cache line states to random but valid states for all masters.

Cache line invalidation - A master may need to invalidate its cache
line before initiating a load transaction, unless the load is speculative.
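
The steps above can be sketched as one parent sequence that selects the addresses and then starts the lower-level sequences in order on the same sequencer. Everything named demo_* is hypothetical: a single placeholder class stands in for the three real sequences (initialization, invalidation, basic ReadClean), and the 64-byte cache line alignment is an assumption.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Placeholder for a lower-level sequence that works on a list of addresses.
// A real collection would provide a distinct class per step.
class demo_addr_seq extends uvm_sequence #(uvm_sequence_item);
  bit [31:0] addrs[$];
  `uvm_object_utils(demo_addr_seq)
  function new(string name = "demo_addr_seq");
    super.new(name);
  endfunction
  virtual task body();
    foreach (addrs[i])
      `uvm_info(get_name(), $sformatf("processing addr 0x%08h", addrs[i]), UVM_MEDIUM)
  endtask
endclass

// Parent sequence stitching the steps together.
class demo_readclean_scenario_seq extends uvm_sequence #(uvm_sequence_item);
  rand int unsigned num_addrs = 4;
  constraint c_num_addrs { num_addrs inside {[1:8]}; }

  `uvm_object_utils(demo_readclean_scenario_seq)
  function new(string name = "demo_readclean_scenario_seq");
    super.new(name);
  endfunction

  virtual task body();
    bit [31:0] addrs[$];
    string steps[3] = '{"cache_line_init", "cache_line_invalidate", "basic_readclean"};

    // Address selection: random addresses aligned to an assumed 64-byte line.
    repeat (num_addrs) addrs.push_back($urandom() & 32'hFFFF_FFC0);

    // Run each step in order on the sequencer that runs this sequence.
    foreach (steps[i]) begin
      demo_addr_seq sub = demo_addr_seq::type_id::create(steps[i]);
      sub.addrs = addrs;
      sub.start(m_sequencer, this);
    end
  endtask
endclass
```

Passing the same address list to every step is the point of the nesting: the ReadClean step then operates on exactly the lines that were initialized and invalidated before it.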

A complete verification scenario (like that shown in Figure 7) can
be mimicked using the nested sequences as explained in Figure 4. With
the hierarchical approach, it becomes relatively easy to model any
scenario generation requirements regardless of how complicated they are.
The same approach when combined with the virtual sequences helps to
leverage this functionality across multiple interfaces and is highly
relevant in the system context. For example, there are multiple virtual
sequences that are part of the library and perform a combination of
different sequential coherent transactions from different masters to the
same slave.

// Write into M0’s local cache. Data is now dirty in local cache
M0 initiating MAKEUNIQUE to addr1

// Write data into memory. Data is now clean in local cache. Data in cache matches data in memory
M1 initiating WRITECLEAN to addr1
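
A virtual sequence producing this kind of ordering across two masters might look like the sketch below. The transaction class, the single-transaction sequence, the virtual sequencer and the address value are all hypothetical stand-ins, not the VIP's classes; the messages mirror the two log lines quoted here.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

typedef enum {MAKEUNIQUE, WRITECLEAN} demo_xact_e;

// Hypothetical coherent transaction.
class demo_ace_item extends uvm_sequence_item;
  demo_xact_e xact_type;
  bit [31:0]  addr;
  `uvm_object_utils(demo_ace_item)
  function new(string name = "demo_ace_item");
    super.new(name);
  endfunction
endclass

// Sends one transaction of a given type to a given address.
class demo_single_xact_seq extends uvm_sequence #(demo_ace_item);
  demo_xact_e xact_type;
  bit [31:0]  addr;
  `uvm_object_utils(demo_single_xact_seq)
  function new(string name = "demo_single_xact_seq");
    super.new(name);
  endfunction
  virtual task body();
    demo_ace_item item = demo_ace_item::type_id::create("item");
    start_item(item);
    item.xact_type = xact_type;
    item.addr      = addr;
    finish_item(item);
  endtask
endclass

// Virtual sequencer holding handles to the per-master sequencers.
class demo_virtual_sequencer extends uvm_sequencer;
  uvm_sequencer #(demo_ace_item) m0_seqr;
  uvm_sequencer #(demo_ace_item) m1_seqr;
  `uvm_component_utils(demo_virtual_sequencer)
  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction
endclass

// Virtual sequence: M0 MAKEUNIQUE, then M1 WRITECLEAN, both to addr1.
class demo_makeunique_writeclean_vseq extends uvm_sequence;
  bit [31:0] addr1 = 32'h0000_1000; // assumed address
  `uvm_object_utils(demo_makeunique_writeclean_vseq)
  `uvm_declare_p_sequencer(demo_virtual_sequencer)
  function new(string name = "demo_makeunique_writeclean_vseq");
    super.new(name);
  endfunction
  virtual task body();
    demo_single_xact_seq mu = demo_single_xact_seq::type_id::create("mu");
    demo_single_xact_seq wc = demo_single_xact_seq::type_id::create("wc");

    mu.xact_type = MAKEUNIQUE;
    mu.addr      = addr1;
    `uvm_info(get_type_name(), "M0 initiating MAKEUNIQUE to addr1", UVM_LOW)
    mu.start(p_sequencer.m0_seqr, this);

    wc.xact_type = WRITECLEAN;
    wc.addr      = addr1;
    `uvm_info(get_type_name(), "M1 initiating WRITECLEAN to addr1", UVM_LOW)
    wc.start(p_sequencer.m1_seqr, this);
  endtask
endclass
```

The environment is expected to assign m0_seqr and m1_seqr before the virtual sequence is started on the virtual sequencer.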

Apart from building explicit virtual sequences, the uvm_sequence_library
can be used to achieve the same result: sequences are registered with
the sequence library on a per-requirement basis for a specified
instance of the sequencer. Thus, sequences modeling functionalities such
as overlapping store operations to verify the interconnect behavior for
concurrent transactions, or those exercising multiple initiating
masters attempting simultaneous shareable store operations to the same
cache line, can easily be made part of the sequence library or
collection. The end user can then readily leverage this library.
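
As a hedged sketch of that usage, the code below defines a library type, adds two sequences to one library instance with add_sequence(), and runs the library on a given sequencer. The sequence and item classes are hypothetical placeholders that only log a message, and the helper task run_demo_library is an assumed name; the selection mode and counts are arbitrary illustrative values.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical transaction type.
class demo_item extends uvm_sequence_item;
  `uvm_object_utils(demo_item)
  function new(string name = "demo_item");
    super.new(name);
  endfunction
endclass

// Placeholder for a sequence modeling overlapping store operations.
class demo_overlapping_store_seq extends uvm_sequence #(demo_item);
  `uvm_object_utils(demo_overlapping_store_seq)
  function new(string name = "demo_overlapping_store_seq");
    super.new(name);
  endfunction
  virtual task body();
    `uvm_info(get_type_name(), "overlapping stores (placeholder)", UVM_LOW)
  endtask
endclass

// Placeholder for simultaneous shareable stores to the same cache line.
class demo_shared_store_seq extends uvm_sequence #(demo_item);
  `uvm_object_utils(demo_shared_store_seq)
  function new(string name = "demo_shared_store_seq");
    super.new(name);
  endfunction
  virtual task body();
    `uvm_info(get_type_name(), "shareable stores to one line (placeholder)", UVM_LOW)
  endtask
endclass

// The library type itself.
class demo_ace_seq_lib extends uvm_sequence_library #(demo_item);
  `uvm_object_utils(demo_ace_seq_lib)
  `uvm_sequence_library_utils(demo_ace_seq_lib)
  function new(string name = "demo_ace_seq_lib");
    super.new(name);
    init_sequence_library();
  endfunction
endclass

// Build a library instance for one sequencer and run it.
task automatic run_demo_library(uvm_sequencer #(demo_item) seqr);
  demo_ace_seq_lib lib = demo_ace_seq_lib::type_id::create("lib");
  // Instance-specific registration: only this library object is affected.
  lib.add_sequence(demo_overlapping_store_seq::get_type());
  lib.add_sequence(demo_shared_store_seq::get_type());
  lib.selection_mode   = UVM_SEQ_LIB_RANDC; // cycle through before repeating
  lib.min_random_count = 4;
  lib.max_random_count = 8;
  if (!lib.randomize())
    `uvm_error("SEQLIB", "library randomization failed")
  lib.start(seqr);
endtask
```

Sequences that every instance should carry can instead be registered type-wide with the `uvm_add_to_seq_lib macro inside the sequence class.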

Using the AXI interconnect and system level checks

The
system monitor observes transactions across the ports of a single
interconnect and performs checks between the transactions of these
ports. It does not perform port-level checks, which are accomplished by
the checkers of each master/slave agent connected to a port. In ACE, the
system monitor correlates coherent transactions and the corresponding
snoop transactions to perform checks. The checks in the system monitor
are geared toward checking the proper working of an interconnect DUT.

The system monitor requires transaction-level inputs from the master
and slave ports that are connected to the interconnect. By
transaction-level inputs, we mean transactions created by port-level
monitors as a result of signal-level activity; the system monitor does
not require signal-level inputs. To obtain these inputs, the system
monitor could instantiate port-level monitors itself. However, UVM
makes it easy to connect components: all transactions from the
port-level monitors of each of the agents can be provided to the system
monitor via transaction-level modeling (TLM) connections, thereby
eliminating the need to instantiate these port-level monitors in the
system monitor. Figure 10 describes two examples for system-level
checks.

Figure 10: System checks

Thus, by leveraging the UVM capabilities together with coherency
knowledge, the system check provides robustness to verification of the
device under test (DUT).
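
The TLM hookup described in this section can be sketched as follows. All demo_* classes are hypothetical, and the single check shown (a snoop must target an address for which some coherent transaction has been observed) is a deliberately simplified assumption; it is not one of the checks in Figure 10 and ignores transaction completion.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical transaction published by the port monitors.
class demo_xact extends uvm_sequence_item;
  bit [31:0] addr;
  `uvm_object_utils(demo_xact)
  function new(string name = "demo_xact");
    super.new(name);
  endfunction
endclass

// Distinct imp types so one component can implement two write methods.
`uvm_analysis_imp_decl(_coherent)
`uvm_analysis_imp_decl(_snoop)

// Port-level monitor: publishes transactions, no signal sampling shown.
class demo_port_monitor extends uvm_monitor;
  uvm_analysis_port #(demo_xact) coherent_ap;
  uvm_analysis_port #(demo_xact) snoop_ap;
  `uvm_component_utils(demo_port_monitor)
  function new(string name, uvm_component parent);
    super.new(name, parent);
    coherent_ap = new("coherent_ap", this);
    snoop_ap    = new("snoop_ap", this);
  endfunction
endclass

// System monitor: consumes transactions only, never signals.
class demo_system_monitor extends uvm_component;
  uvm_analysis_imp_coherent #(demo_xact, demo_system_monitor) coherent_export;
  uvm_analysis_imp_snoop    #(demo_xact, demo_system_monitor) snoop_export;
  protected bit seen_coherent[bit [31:0]]; // addresses with a coherent xact
  `uvm_component_utils(demo_system_monitor)
  function new(string name, uvm_component parent);
    super.new(name, parent);
    coherent_export = new("coherent_export", this);
    snoop_export    = new("snoop_export", this);
  endfunction
  function void write_coherent(demo_xact t);
    seen_coherent[t.addr] = 1;
  endfunction
  function void write_snoop(demo_xact t);
    if (!seen_coherent.exists(t.addr))
      `uvm_error("SYSMON", $sformatf("snoop to 0x%08h with no coherent transaction", t.addr))
  endfunction
endclass

// Environment wiring every port monitor to the one system monitor.
class demo_env extends uvm_env;
  demo_port_monitor   port_mon[2];
  demo_system_monitor sys_mon;
  `uvm_component_utils(demo_env)
  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction
  virtual function void build_phase(uvm_phase phase);
    super.build_phase(phase);
    sys_mon = demo_system_monitor::type_id::create("sys_mon", this);
    foreach (port_mon[i])
      port_mon[i] = demo_port_monitor::type_id::create($sformatf("port_mon_%0d", i), this);
  endfunction
  virtual function void connect_phase(uvm_phase phase);
    super.connect_phase(phase);
    foreach (port_mon[i]) begin
      port_mon[i].coherent_ap.connect(sys_mon.coherent_export);
      port_mon[i].snoop_ap.connect(sys_mon.snoop_export);
    end
  endfunction
endclass
```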

Distributed phasing

Finally, given the widespread use of such coherent systems in handheld
devices, it is imperative to devise a mechanism for a power-aware
verification setup. Also, as mentioned earlier, different components
might support a different subset of the protocol. Some of the components
might be power aware and would model components in power domains. Such
components need phase-aware sequences that execute in user-defined
phases. Some of these might go to a powered-down phase in the middle of
simulation and, on ‘waking up’, would have to catch up with the other
phases. Again, the UVM hierarchical phasing schemes and configurable
sequences can be leveraged to help the user model the different power
state transitions for the system.

UVM allows new domains to be created and components to be grouped into
different domains that execute their phases independently of each
other. The default domain is the ‘uvm’ domain, which contains the
default runtime phases; see Figure 11.

Figure 11: Distributed phase synchronization

New phases can be inserted into the domains created. The components in
a specific user-defined domain can be made to sync with the other
domain at the end of run_phase. So, as shown in Figure 12, even if an
ACE component is powered down, it alone can be made to rewind back to an
earlier phase, wake up and then get in phase with the other components
running the default runtime phases.
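
A minimal sketch of this idea, under stated assumptions, is given below: one component is moved into its own domain with set_domain(), and on a simulated power-down it jumps from main_phase back to reset_phase without affecting components in the default domain. The class names, the delays and the single-rewind flag are hypothetical; a real testbench would trigger the jump from a power-control event rather than a fixed delay.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical power-aware component.
class demo_power_aware_comp extends uvm_component;
  bit rewound; // ensures the power-down rewind happens only once
  `uvm_component_utils(demo_power_aware_comp)
  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  virtual task reset_phase(uvm_phase phase);
    phase.raise_objection(this);
    `uvm_info(get_name(), "reset / wake-up", UVM_LOW)
    #10;
    phase.drop_objection(this);
  endtask

  virtual task main_phase(uvm_phase phase);
    phase.raise_objection(this);
    #50; // assumed activity before the power-down request
    if (!rewound) begin
      rewound = 1;
      `uvm_info(get_name(), "power down: rewinding to reset_phase", UVM_LOW)
      // 'phase' is this component's domain-specific node, so only
      // components in the same domain are rewound.
      phase.jump(uvm_reset_phase::get());
    end
    phase.drop_objection(this);
  endtask
endclass

// Environment placing the component in a user-defined domain.
class demo_power_env extends uvm_env;
  demo_power_aware_comp ace_comp;
  uvm_domain            power_domain;
  `uvm_component_utils(demo_power_env)
  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction
  virtual function void build_phase(uvm_phase phase);
    super.build_phase(phase);
    ace_comp     = demo_power_aware_comp::type_id::create("ace_comp", this);
    power_domain = new("power_domain");
    // Gives ace_comp its own copy of the runtime phase schedule.
    ace_comp.set_domain(power_domain);
  endfunction
endclass
```

No explicit sync() call is made here; the sketch relies on the two domains meeting again at the end of run_phase, as described above. Tighter alignment on specific phases would use uvm_domain's sync() method.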