Monday, November 23, 2015

Accessing Multiple Registers at Once

As I already mentioned, I gave a talk at DVCon Europe this year on how to implement burst accesses to memories modeled using UVM_REG. The motivation for that was that the register package can already handle bus protocols that don't support burst operation, but it requires more user guidance for protocols that do support it. A question that came up afterwards on the conference floor was what the best way to handle burst accesses to multiple registers might be. I tried to sketch out an answer on a piece of paper, but it was rather late in the day and I couldn't really gather my thoughts. I also have a difficult time expressing myself verbally when talking about abstract concepts. Talking is much more difficult than writing, because you don't get the chance to go back and iterate over certain aspects of the topic. Talking about coding problems is also particularly difficult to do when not in front of a computer. People who've worked with me know that I always like to have an editor window open and sketch out some pseudo-code when discussing something in more detail.

The handling of register bursts is a question that comes up from time to time, in places like the Accellera forum or StackOverflow. Since the person who asked me the question is also a reader of the blog, I thought it would be worth making a post out of it.

A solution would need to take two factors into account. First, it would need to be pragmatic, i.e. do the job with the least amount of code necessary. If you had asked me this question a little while back, I would have stopped here. In the meantime, I've been toying with the idea of using the register abstraction layer as a means of achieving reuse (both lateral and vertical) of sequences. Most probably you'll be seeing more posts on the topic. The second factor I would thus consider important is portability, i.e. being able to take sequences from one project and use them in another.

As an example, let's take a simple design that has four registers, located at consecutive addresses:
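A minimal sketch of what such a register block could look like, assuming 32-bit registers at byte offsets 0x0 to 0xC. The names `example_reg` and `example_reg_block`, the field layout, and the reset values are placeholders chosen for illustration, not taken from the original example:

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical 32-bit register with a single read/write field.
class example_reg extends uvm_reg;
  `uvm_object_utils(example_reg)

  rand uvm_reg_field data;

  function new(string name = "example_reg");
    super.new(name, 32, UVM_NO_COVERAGE);
  endfunction

  virtual function void build();
    data = uvm_reg_field::type_id::create("data");
    // parent, size, lsb, access, volatile, reset, has_reset, is_rand,
    // individually_accessible
    data.configure(this, 32, 0, "RW", 0, 32'h0, 1, 1, 0);
  endfunction
endclass

// Hypothetical block holding four registers at consecutive addresses.
class example_reg_block extends uvm_reg_block;
  `uvm_object_utils(example_reg_block)

  rand example_reg regs[4];

  function new(string name = "example_reg_block");
    super.new(name, UVM_NO_COVERAGE);
  endfunction

  virtual function void build();
    // Base address 0, 4-byte wide bus, byte addressing.
    default_map = create_map("default_map", 'h0, 4, UVM_LITTLE_ENDIAN);
    foreach (regs[i]) begin
      regs[i] = example_reg::type_id::create($sformatf("regs[%0d]", i));
      regs[i].configure(this);
      regs[i].build();
      default_map.add_reg(regs[i], 4 * i, "RW");
    end
    lock_model();
  endfunction
endclass
```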

Let's assume that the DUT has an AHB interface that supports burst accesses. This means that it's possible to access all four registers using a single AHB transaction. Converting from register accesses to bus items is usually done using a register adapter:
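As an illustration, a single-access adapter might look roughly like the sketch below. The `ahb_item` class is a stand-in for whatever sequence item the AHB UVC really provides; its fields (`addr`, `write`, `data`) are assumptions, and a `data` array with one element is used here to model a SINGLE transfer:

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical AHB sequence item; a real UVC will define its own.
class ahb_item extends uvm_sequence_item;
  `uvm_object_utils(ahb_item)
  rand bit [31:0] addr;
  rand bit        write;
  rand bit [31:0] data[];  // one element per beat; size 1 models a SINGLE
  function new(string name = "ahb_item");
    super.new(name);
  endfunction
endclass

class ahb_reg_adapter extends uvm_reg_adapter;
  `uvm_object_utils(ahb_reg_adapter)

  function new(string name = "ahb_reg_adapter");
    super.new(name);
  endfunction

  // One register access in, exactly one single-beat bus item out.
  virtual function uvm_sequence_item reg2bus(const ref uvm_reg_bus_op rw);
    ahb_item item = ahb_item::type_id::create("item");
    item.addr    = rw.addr;
    item.write   = (rw.kind == UVM_WRITE);
    item.data    = new[1];
    item.data[0] = rw.data;
    return item;
  endfunction

  virtual function void bus2reg(uvm_sequence_item bus_item,
                                ref uvm_reg_bus_op rw);
    ahb_item item;
    if (!$cast(item, bus_item))
      `uvm_fatal("CASTFAIL", "Expected an ahb_item")
    rw.kind   = item.write ? UVM_WRITE : UVM_READ;
    rw.addr   = item.addr;
    rw.data   = (item.data.size() > 0) ? item.data[0] : '0;
    rw.status = UVM_IS_OK;
  endfunction
endclass
```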

I'll assume everybody is familiar with how an adapter works. If not, the UVM User Guide is a good resource to get you up to speed on how the register model is integrated. This adapter can only handle accessing one register at a time. We need some way of telling it that we actually want to access more registers.

As seen in the links above, people will recommend using the optional extension argument of the read(...) and write(...) tasks to instruct the adapter that the access it's converting is actually a burst to more registers. The use model would be to have a class containing information about whether a register access is a burst:
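Such an extension class could be sketched as follows. The class name `burst_extension` and the upper bound in the constraint are assumptions made for this example; only the `num_regs` field comes from the text:

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical extension object passed through the 'extension' argument.
class burst_extension extends uvm_object;
  `uvm_object_utils(burst_extension)

  // Number of consecutive registers covered by the access.
  rand int unsigned num_regs;

  // Assumed bound: the example design only has four registers.
  constraint legal_num_regs { num_regs inside {[1:4]}; }

  function new(string name = "burst_extension");
    super.new(name);
  endfunction
endclass
```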

If num_regs is 1, then the access is a normal one, otherwise it's a burst. It's also a good idea to make the field of the extension random to allow for more generic sequences. When wanting to write all four registers at a time, we could set the values that we want our registers to take, construct an object of this class, set num_regs to 4 and pass it to the update(...) task:
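A sketch of such a sequence is shown here, under a few assumptions: the extension class is called `burst_extension`, the sequence gets handles to the four registers from whoever starts it, and the written values are arbitrary:

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical extension carrying the burst length (1 means a normal access).
class burst_extension extends uvm_object;
  `uvm_object_utils(burst_extension)
  rand int unsigned num_regs = 1;
  function new(string name = "burst_extension");
    super.new(name);
  endfunction
endclass

class write_all_regs_sequence extends uvm_sequence;
  `uvm_object_utils(write_all_regs_sequence)

  // The four registers in address order; assigned before starting.
  uvm_reg regs[4];

  function new(string name = "write_all_regs_sequence");
    super.new(name);
  endfunction

  virtual task body();
    uvm_status_e    status;
    burst_extension ext = burst_extension::type_id::create("ext");

    // Set the desired values in the model; nothing reaches the DUT yet.
    foreach (regs[i])
      regs[i].set('h11 * (i + 1));  // arbitrary example values

    // Mark the access to the first register as a burst over all four.
    // Note that update() only issues a write if the desired value
    // differs from the mirrored one.
    ext.num_regs = 4;
    regs[0].update(status, .parent(this), .extension(ext));
  endtask
endclass
```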

The vanilla register adapter doesn't know anything about the extension we passed. We'll need a sub-class that can interpret the extra information and use it to generate a burst. If we don't pass an extension or pass an unsuitable extension, then we can just generate a SINGLE AHB transaction as before:
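One way such a sub-class could be written is sketched below. The `ahb_item` and `burst_extension` classes are hypothetical stand-ins, and looking up the neighboring registers through a `map` handle set by the environment is an assumption of this sketch, not necessarily how the original example collects the data:

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical AHB item; data.size() is the number of beats.
class ahb_item extends uvm_sequence_item;
  `uvm_object_utils(ahb_item)
  rand bit [31:0] addr;
  rand bit        write;
  rand bit [31:0] data[];
  function new(string name = "ahb_item");
    super.new(name);
  endfunction
endclass

// Hypothetical extension carrying the burst length.
class burst_extension extends uvm_object;
  `uvm_object_utils(burst_extension)
  rand int unsigned num_regs = 1;
  function new(string name = "burst_extension");
    super.new(name);
  endfunction
endclass

class ahb_burst_reg_adapter extends uvm_reg_adapter;
  `uvm_object_utils(ahb_burst_reg_adapter)

  uvm_reg_map map;  // assumption: set by the environment for lookups

  function new(string name = "ahb_burst_reg_adapter");
    super.new(name);
  endfunction

  virtual function uvm_sequence_item reg2bus(const ref uvm_reg_bus_op rw);
    ahb_item        bus_item = ahb_item::type_id::create("bus_item");
    uvm_reg_item    reg_item = get_item();
    burst_extension ext;
    int unsigned    num_regs = 1;

    // No extension, or an unrelated one: fall back to a single transfer.
    if (reg_item != null && reg_item.extension != null &&
        $cast(ext, reg_item.extension) && ext.num_regs >= 1)
      num_regs = ext.num_regs;

    bus_item.addr    = rw.addr;
    bus_item.write   = (rw.kind == UVM_WRITE);
    bus_item.data    = new[num_regs];
    bus_item.data[0] = rw.data;

    // Remaining beats take the desired values of the following registers
    // (only meaningful for writes; assumes 4-byte register spacing).
    for (int i = 1; i < num_regs; i++) begin
      uvm_reg rg = map.get_reg_by_offset(rw.addr + 4 * i);
      if (rg != null)
        bus_item.data[i] = rg.get();
    end
    return bus_item;
  endfunction

  // Only the first beat is reported back for the register that was accessed.
  virtual function void bus2reg(uvm_sequence_item bus_item,
                                ref uvm_reg_bus_op rw);
    ahb_item item;
    if (!$cast(item, bus_item))
      `uvm_fatal("CASTFAIL", "Expected an ahb_item")
    rw.kind   = item.write ? UVM_WRITE : UVM_READ;
    rw.addr   = item.addr;
    rw.data   = (item.data.size() > 0) ? item.data[0] : '0;
    rw.status = UVM_IS_OK;
  endfunction
endclass
```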

This is the tried and true way of doing it. It's also pretty easy to implement. The problem with it, though, is that it's rather coupled with the verification environment. Let's assume that we get a second variant of our DUT that is a bit more bare-bones and only has an APB interface. Ideally, we'd want to be able to run the same sequences (or a subset thereof) in this second verification environment. Accessing single registers isn't a problem, as these would be handled by the vanilla APB register adapter (code not shown for brevity). When starting the burst access sequence (the one with the extension), we'd still like to see all four registers getting accessed, albeit via four different APB transfers. This means we'd need to have a register adapter that can start four transactions in one go:

The reg2bus(...) method is a function, so it can't block. It can also only return one bus transaction. That would be the one corresponding to the register we called write(...) on. If we'd like to access the other three registers as well, one would optimistically think that the other accesses could be forked out. This could get us in a world of trouble with race conditions, because the order in which the accesses would get processed isn't defined. It also doesn't work as expected, because the update(...) task returns before all accesses are finished. For writes this might not be such a big issue, but for reads this would be fatal, since we wouldn't be able to rely on the values stored in the registers to be up-to-date. I didn't really investigate how to improve on this, since the whole idea seems silly. A register adapter isn't meant for this kind of operation. It can only start one bus transaction based on one register access, not more. This was all fine and dandy when that transaction could be a burst (as for AHB), but it falls apart when we need to translate sequences that try to access all four registers at once. This means we can't reliably run the sequences that use the extension mechanism in the APB verification environment, at least not while having them go through an adapter. They could still be reused if we employed a different means of translating from register accesses, using a register sequencer layered on the APB sequencer that would run a translation sequence (more on this later).

The main takeaway point, though, is that while using the extension is easy to set up for the initial DUT (the one with AHB), it becomes trickier to port it to any subsequent variants of the design that use different bus protocols, particularly so if the protocols don't intrinsically support burst accesses. Even for other protocols that do support burst accesses (e.g. AXI), we'd still need to create a sub-class of the corresponding register adapter that can extract the information contained in the extension.

The problem stems from the fact that we're trying to shoehorn an unsuitable abstraction. Calls to uvm_reg::read(...)/write(...) ultimately end up creating an abstract register access, of type uvm_reg_item. Such a register item (which is a sub-class of uvm_sequence_item) can model anything from a small access that takes one bus cycle, to a very big access that takes multiple bus cycles (also called a burst). We're trying to model an access to four registers as an access to one of the registers that includes some side information to say if it's actually a burst or not.

A better idea might be to not go the way of using an extension. Instead, we could create a register item "by hand", fill it up with the appropriate information and send it out to be processed:
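A sketch of a sequence that builds the register item itself follows. It is meant to be started on a register sequencer of type `uvm_sequencer #(uvm_reg_item)`; the `first_reg` handle and the written values are assumptions made for illustration:

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

class reg_burst_sequence extends uvm_sequence #(uvm_reg_item);
  `uvm_object_utils(reg_burst_sequence)

  // First register of the burst; assigned before starting the sequence.
  uvm_reg first_reg;

  function new(string name = "reg_burst_sequence");
    super.new(name);
  endfunction

  virtual task body();
    uvm_reg_item item = uvm_reg_item::type_id::create("item");

    // The access targets a register and starts at 'first_reg'.
    item.element_kind = UVM_REG;
    item.element      = first_reg;

    // Explicitly a burst, with one value per register to be written.
    item.kind  = UVM_BURST_WRITE;
    item.value = new[4];
    foreach (item.value[i])
      item.value[i] = 'h100 + i;  // arbitrary example values

    item.parent = this;

    start_item(item);
    finish_item(item);
  endtask
endclass
```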

Instead of starting a register item indirectly via a call to uvm_reg::write(...), we create one ourselves. We explicitly state that this is a burst access, by setting the kind field appropriately. The (misleadingly named) value field is actually an array that contains one element per burst transfer. Since we want to write to four registers, we set its size to 4 and its elements to the desired values of the registers.

This is one piece of the puzzle. Now we need to translate this uvm_reg_item to the bus transaction that the DUT needs to see. Trying to send this access through a register adapter might work for the AHB DUT, because the AHB adapter can start a single AHB transaction that is capable of representing the entire register item. Trying to send it through the APB adapter will lead to the same problem that we had before, namely that we can't start multiple APB transactions based on it.

The UVM User Guide shows us how to implement a different translation scheme, more sophisticated than the register adapter. As briefly mentioned above, it involves layering. As described in section 5.9.2.3 of the User Guide (UVM 1.1), we can have a register sequencer that serves as a landing pad for uvm_reg_items. A translation sequence running on the bus sequencer would get items from this register sequencer and could convert them to bus transactions.

Our translation sequence extends the built-in uvm_reg_sequence, which already provides some facilities to perform translation (albeit based on a register adapter, which is the very thing we're trying to avoid). By overriding the do_reg_item(...) task, which gets called for each item that gets started on the register sequencer, we can implement our own scheme that generates one AHB transaction based on the contents of the uvm_reg_item to be converted. When creating this sequence, we need to specify the instance of the register sequencer and afterwards start it on the bus sequencer:
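A rough sketch of such a translation sequence, assuming a hypothetical `ahb_item` whose `data` array has one element per beat and assuming the register item's `element` is the first register of the burst:

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical AHB item; data.size() is the number of beats.
class ahb_item extends uvm_sequence_item;
  `uvm_object_utils(ahb_item)
  rand bit [31:0] addr;
  rand bit        write;
  rand bit [31:0] data[];
  function new(string name = "ahb_item");
    super.new(name);
  endfunction
endclass

class ahb_translation_sequence
    extends uvm_reg_sequence #(uvm_sequence #(ahb_item));
  `uvm_object_utils(ahb_translation_sequence)

  function new(string name = "ahb_translation_sequence");
    super.new(name);
  endfunction

  // Called by the inherited body() for every item on the register sequencer.
  virtual task do_reg_item(uvm_reg_item rw);
    uvm_reg  rg;
    ahb_item item = ahb_item::type_id::create("item");

    if (!$cast(rg, rw.element))
      `uvm_fatal("CASTFAIL", "Expected the element to be a register")

    start_item(item);
    item.addr  = rg.get_address(rw.map);
    item.write = (rw.kind inside {UVM_WRITE, UVM_BURST_WRITE});
    item.data  = new[rw.value.size()];
    foreach (rw.value[i])
      item.data[i] = rw.value[i];
    finish_item(item);

    // For reads, hand the data back to whoever started the register item
    // (assumes the driver fills 'data' in the request object).
    if (!item.write)
      foreach (rw.value[i])
        rw.value[i] = item.data[i];
    rw.status = UVM_IS_OK;
  endtask
endclass
```

In the environment, the sequence's `reg_seqr` field would be pointed at the register sequencer before calling `start(...)` with the AHB sequencer as the argument.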

Now you might ask what the advantage is when doing it this way, as opposed to using the extension argument. Clearly we could save ourselves the trouble of creating our own uvm_reg_item in the register burst sequence (which takes up quite a bit of code, but even that could be encapsulated in a task) and just pass an extension to a call to write(...)/read(...) as we did before. The downside to this, though, would be that we would need a translation sequence that can extract the extension, which would create an unnecessary dependency. If we were more diligent in creating our register item, we could even save ourselves the trouble of having to start a translation sequence for APB. If we filled in a few more of its fields (like local_map and some others), the register package itself could handle splitting a burst into multiple transfers and run each of those through a register adapter. I didn't look too much into this, though... The reason for that is that I see this idea of creating our own uvm_reg_item for a register burst as a stepping stone for the next idea.

We could conceptually think of our burst access that covers multiple registers as a memory burst starting at a certain offset (in our case the offset of the first register) that is of a certain size (in our case 4). The uvm_mem class provides, aside from the write(...) and read(...) tasks, the burst_write(...) and burst_read(...) tasks which trigger bursts. We could shadow the registers with a dummy memory, that we would only use to start bursts. The register package would handle the heavy lifting of creating a uvm_reg_item based on our desired access.

Since our register model is probably generated from a specification, we don't want to touch that code. Instead, we can instantiate the shadow memory inside a sub-class and make sure that we instantiate this class in our verification environment instead of the original one:
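A sketch of this sub-classing idea is shown below. `generated_reg_block` is only a stub standing in for the generated code, and the sketch assumes that `lock_model()` is called by the environment after `build()`; if the generated `build()` locks the model itself, the memory can no longer be added afterwards and a different hook would be needed:

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Stub standing in for the generated register block (hypothetical).
class generated_reg_block extends uvm_reg_block;
  `uvm_object_utils(generated_reg_block)

  function new(string name = "generated_reg_block");
    super.new(name, UVM_NO_COVERAGE);
  endfunction

  virtual function void build();
    default_map = create_map("default_map", 'h0, 4, UVM_LITTLE_ENDIAN);
    // The four generated registers would be created and mapped here.
  endfunction
endclass

// Sub-class used in the verification environment instead of the original.
class shadowed_reg_block extends generated_reg_block;
  `uvm_object_utils(shadowed_reg_block)

  uvm_mem shadow_mem;

  function new(string name = "shadowed_reg_block");
    super.new(name);
  endfunction

  virtual function void build();
    super.build();

    // Four 32-bit locations covering the same addresses as the registers.
    shadow_mem = new("shadow_mem", 4, 32);
    shadow_mem.configure(this);
    default_map.add_mem(shadow_mem, 'h0, "RW");
  endfunction
endclass
```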

We'll get warnings that the memory and the registers overlap, but these can be silenced.

We could call burst_write(...) on this memory with the appropriate arguments to trigger a burst that accesses all four registers. Since we have quite a few arguments to pass, this could get tedious, so we can define a helper task:
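One possible shape for such a helper is sketched here, as a task on a hypothetical `reg_shadow_mem` sub-class of `uvm_mem`. The `regs` queue, which is expected to hold the shadowed registers in address order, is an assumption of this sketch:

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical shadow memory that knows which registers it overlaps.
class reg_shadow_mem extends uvm_mem;
  // Shadowed registers in address order; filled in by the register block.
  uvm_reg regs[$];

  function new(string name = "reg_shadow_mem",
               longint unsigned size = 4,
               int unsigned n_bits = 32);
    super.new(name, size, n_bits);
  endfunction

  // Write the desired values of all shadowed registers in one burst.
  task update_regs(output uvm_status_e status,
                   input uvm_sequence_base parent = null);
    uvm_reg_data_t values[] = new[regs.size()];

    // Collect the desired value of each register.
    foreach (regs[i])
      values[i] = regs[i].get();

    // The burst starts at offset 0, i.e. at the first shadowed register.
    burst_write(status, 0, values, .parent(parent));
  endtask
endclass
```

A sequence would then call `set(...)` on the registers it cares about and finish with a single call to `update_regs(...)`.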

The update_regs(...) task is similar to burst_write(...), but it doesn't require us to pass an offset or the data values to be written. These are computed based on the desired values of the registers that the memory shadows. A similar task could be defined to read all the registers.

Integrating this sequence is even more straightforward than before. For APB, we don't even need the register sequencer; the register adapter will suffice. For AHB, we could either have a register sequencer layered on the AHB sequencer (as in the previous section) or we could use a custom frontdoor sequence (as described in my DVCon Europe paper).

I've omitted a lot of the infrastructure code to keep the post focused. You can download the full example from SourceForge.

I don't consider the third approach, using a shadow memory, to be much more complicated than the first one, where we were using the extension argument. Sure, it requires a bit more code to declare the shadow memory, especially the convenience tasks, but even that could be abstracted and made reusable. Layering the shadow memory on bus protocols that don't support bursts is effortless (assuming that a register adapter is already available with the protocol UVC), because UVM_REG already contains a lot of code to handle this. It's only for bus protocols that support burst operation that we need to make sure that register/memory bursts get converted properly. A good UVC for such a protocol will also provide infrastructure for this, in the form of a translation sequence.

When using a custom extension argument to implement such register bursts, the translation scheme always has to be tailored to support this, by extending the generic register adapter or translation sequence to extract the information stored in the extension. It's also rather unintuitive that simpler protocols (that don't support burst operation) cause more headaches. Using the extension argument in this way might also interfere with other uses for it, where a user needs to pass in other side information (such as protection levels) to be translated.

The decision of which scheme to use in a certain verification environment depends on whether portability (due to lateral or vertical reuse) is important.

If you have any other approaches to handling register bursts, I'd love to hear them in the comments section below.

As I understand from your env., you do not use an adapter if you use translation.

1- How does the register model get updated when you write to a register of the DUT? When an adapter is used, this is done by the predictor.
2- Is it still possible to use an extension in this translation case if I want to pass different items than the predefined reg_items?

1. The examples only focus on how to generate stimulus, so there's no code in there for handling model updates. Conceptually, prediction is done on a bus cycle level. This means that whenever you see a transfer, you want to predict what its effect was. An adapter as described in the UVM user guide is enough for this. How you handle prediction is totally independent from how you handle stimulus generation.

How do I handle registers which need to be accessed through an I2C slave, which is the DUT? I.e. to read the registers of other blocks in the SoC, the I2C VIP writes the address into the config registers of the I2C slave DUT; the DUT then accesses the internal registers, gets the data and places it into a data register. This data is sent back to the I2C VIP when it tries to read the data.

Since the register adapter's reg2bus and bus2reg are functions, we cannot implement this sequence of accesses in them.

A shadow memory allows you to start bursts via the 'burst_write/read(...)' tasks. This will trigger one translation 'step' for each complete burst that you want to start. In that translation step you can decide whether you want to start multiple sequence items (for a protocol that doesn't support bursts, like APB) or a single one (for a protocol that does support bursts, like AHB).

Hi Tudor, how can we extract the "extension" from the translation sequence? Inside the adapter's reg2bus function, we can use get_item to return a uvm_reg_item and cast it. In the translation case, how do we do it?

About

I am a Verification Engineer at Infineon Technologies, where I get the chance to work with both e and SystemVerilog.
I started the Verification Gentleman blog to store solutions to small (and big) problems I've faced in my day to day work. I want to share them with the community in the hope that they may be useful to someone else.