DDR4: Not just a speed bump

As standards evolve, a generational change typically means a higher data rate plus minor adjustments that reflect design experience with the previous specification. That is not the case with DDR4. When DDR4 made its debut, it added more than 20 new features, twice the number of specification changes introduced in any previous DDR generational change. Substantial changes were needed to keep DDR ahead of market demands for lower-power designs. Many of the changes in previous specification updates were aimed at making the technology go faster or wider; the move to DDR4 incorporated changes to improve both speed and width. Let’s take a look at a few of the more significant changes in DDR4.

First, important reliability and manufacturing improvements were made to facilitate testing. For example, a write data cyclic redundancy check (CRC) was added to provide real-time error detection on the DDR4 data bus. Because the CRC covers writes, it also helps isolate faults: if you see a data error and the write CRC is clean, then odds are it’s a read error. The CRC computation/validation across the data bus will “enable error detection capability for data transfers which are especially beneficial during write operations and in non-ECC memory applications” [1].
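As a rough illustration of how a write CRC catches a corrupted transfer, here is a minimal sketch. It assumes the CRC-8 polynomial x^8 + x^2 + x + 1 that is commonly cited for DDR4; the byte-wise, MSB-first processing, the zero initial value and the helper names (crc8, write_is_clean) are simplifications chosen for this example and do not reproduce the bit mapping defined in the specification.

```python
def crc8(data: bytes, poly: int = 0x07, init: int = 0x00) -> int:
    """Bitwise CRC-8, MSB first. poly=0x07 encodes x^8 + x^2 + x + 1."""
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def write_is_clean(payload: bytes, received_crc: int) -> bool:
    """Receiver-side check: recompute the CRC and compare with the one sent."""
    return crc8(payload) == received_crc


# Hypothetical 8-byte write burst (illustrative values only).
burst = bytes([0xDE, 0xAD, 0xBE, 0xEF, 0x01, 0x23, 0x45, 0x67])
checksum = crc8(burst)

# Flip a single bit to imitate corruption on the data bus.
corrupted = bytes([burst[0] ^ 0x04]) + burst[1:]

print(write_is_clean(burst, checksum))      # intact transfer
print(write_is_clean(corrupted, checksum))  # corrupted transfer
```

In this sketch the sender appends the checksum and the receiver recomputes it; any single-bit flip changes the result, which is what lets the receiving side flag the write in real time.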

When the DRAM detects a CRC error, it signals it with a short low pulse on ALERT_n. In addition, it sets the CRC error flag (MR5[3]) to 1 and the CRC error status (MPR3[7]) to 1. MR5[3] must be reset to 0 before the CRC error is cleared. However, since CRC and C/A parity errors produce the same indication, ALERT_n LOW, how can you be sure which error you actually have? Thankfully, JEDEC suggested a way around the problem: designers identify the error by how long ALERT_n stays low. If it’s low for six to 10 clock cycles, you have a CRC error, and if it’s low for 48 to 144 clock cycles, you have a C/A parity error. The C/A parity feature provides an inexpensive way to verify command and address bus integrity over a link.
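The pulse-length rule above is easy to express in code. The following is a minimal sketch that uses only the two cycle ranges quoted in the text; the function name classify_alert and the "unknown" fallback for pulse lengths outside both ranges are assumptions made for this example.

```python
# Cycle ranges taken from the text; Python ranges exclude the upper bound.
CRC_PULSE_CYCLES = range(6, 11)       # 6 to 10 clock cycles
PARITY_PULSE_CYCLES = range(48, 145)  # 48 to 144 clock cycles


def classify_alert(low_cycles: int) -> str:
    """Classify an ALERT_n low pulse by its length in clock cycles."""
    if low_cycles in CRC_PULSE_CYCLES:
        return "CRC error"
    if low_cycles in PARITY_PULSE_CYCLES:
        return "C/A parity error"
    return "unknown"


for cycles in (8, 96, 20):
    print(cycles, classify_alert(cycles))
```

Because the two ranges do not overlap, a measured pulse length maps to at most one error type; anything in between is left unclassified here rather than guessed.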

Perhaps one of the biggest challenges for DDR4 is probing at convenient DIMM/SODIMM locations. Because the BGA footprint and the DIMM and SODIMM sockets differ from DDR3 to DDR4, the new generation requires all-new probing. Meanwhile, with increased data rates and decreased voltages, the data-valid window is shrinking, which makes threshold settings critical for both state mode and timing mode measurements. Timing mode, the most basic mode of a logic analyzer, tells you when events happened. It samples from a clock that is internal to the logic analyzer. Timing mode is therefore asynchronous to the DDR4 system and provides only limited insight into system signal flow.

State analysis is synchronous, meaning that the sampling clock comes from your device under test (DUT). The purpose of state analysis is to examine what happened. You do this by tracing the values on the bus. In this way, you can watch the code flow and track functional problems much more easily. Seeing the DDR4 commands, address and data as your DUT sees them relative to the system clock is critical to obtaining an accurate view of DDR4 activity. State mode is typically used in software debug to make sure your memory controller and DRAM are behaving correctly. It's also used for hardware debug and hardware/software integration, where it may be unclear where errors are occurring. With the clock coming from your DUT, it's important to have accurate synchronous sampling so that when your system starts its activity, you capture what occurs at the correct time.
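The difference between the two sampling modes can be illustrated with a toy model. This is only a sketch: the bus is reduced to a list of (time, value) transitions, and the command names, timestamps and function names are hypothetical values chosen for the example, not a model of any real analyzer.

```python
def bus_value(t_ns, transitions):
    """Value on the bus at time t_ns, given time-sorted (time, value) pairs."""
    value = None
    for when, new_value in transitions:
        if when <= t_ns:
            value = new_value
        else:
            break
    return value


def timing_mode(transitions, period_ns, stop_ns):
    """Asynchronous capture: sample on the analyzer's own fixed-period clock."""
    return [(t, bus_value(t, transitions)) for t in range(0, stop_ns, period_ns)]


def state_mode(transitions, dut_clock_edges_ns):
    """Synchronous capture: sample only on clock edges supplied by the DUT."""
    return [bus_value(t, transitions) for t in dut_clock_edges_ns]


# Hypothetical command bus activity and DUT clock edges (times in ns).
commands = [(0, "ACT"), (10, "WR"), (20, "RD")]
dut_edges = [5, 15, 25]

print(timing_mode(commands, period_ns=4, stop_ns=24))
print(state_mode(commands, dut_edges))
```

The timing capture answers when the bus changed, at the resolution of the analyzer clock, while the state capture returns one value per DUT clock edge, which is the sequence the DUT itself acted on.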

To accurately capture data on a bus in state mode, the logic analyzer's setup/hold time must fit within the data-valid window, and the location of that window relative to the bus clock differs between types of buses (see figure 1). At DDR4 speeds, sampling inside the data-valid window becomes trickier, because the data-valid windows available to the logic analyzer are shrinking. Using state mode allows you to quickly identify abnormalities and still make it home for dinner.
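The fit requirement can be checked with simple arithmetic. The sketch below assumes all times are measured in picoseconds from the same bus clock edge; the function names and the numbers are hypothetical and are not taken from any analyzer or DDR4 data sheet.

```python
def sample_window_margins(dvw_start_ps, dvw_end_ps, sample_ps, setup_ps, hold_ps):
    """Return (setup_margin, hold_margin) in ps for a given sample point.

    The analyzer needs stable data from sample_ps - setup_ps
    to sample_ps + hold_ps. A negative margin means that side
    of the setup/hold window falls outside the data-valid window.
    """
    setup_margin = (sample_ps - setup_ps) - dvw_start_ps
    hold_margin = dvw_end_ps - (sample_ps + hold_ps)
    return setup_margin, hold_margin


def fits(dvw_start_ps, dvw_end_ps, sample_ps, setup_ps, hold_ps):
    """True when the whole setup/hold window lies inside the data-valid window."""
    margins = sample_window_margins(
        dvw_start_ps, dvw_end_ps, sample_ps, setup_ps, hold_ps
    )
    return min(margins) >= 0


# Hypothetical wide window: data valid from 100 ps to 400 ps after the clock edge.
print(sample_window_margins(100, 400, sample_ps=250, setup_ps=80, hold_ps=80))

# Hypothetical narrower window: same analyzer, data valid from 200 ps to 350 ps.
print(fits(200, 350, sample_ps=250, setup_ps=80, hold_ps=80))
```

With the wide window both margins are positive; with the narrower one the setup side falls outside the valid region, so the sample point would have to move or the capture becomes unreliable.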