Optimizing PCIe SSD performance

Server-side solid state drive (SSD) deployment in the enterprise market is growing by 55% per year and is expected to reach nearly 10 million units annually over the next several years, according to Gartner Research. In applications such as transactional processing, data search, and data mining, the higher performance of SSDs offers tangible, real-world benefits. At the same time, the cost of NAND flash has been falling steadily and the number of companies serving this market has been growing, so differentiating on price alone is becoming more difficult; value is the primary mechanism by which storage solution providers win and sustain their business.

Server and storage system vendors add value for their customers, in the form of performance, scalability, and reliability, through their software and hardware architectures. For these complex systems to perform at their peak, however, two of the largest contributors to that value, the SSD controller and the interconnect, need to be matched and tuned. At the most fundamental level, component vendors need to share a common vision of how a system should be designed and deployed. One of the most important decisions is the interconnect used to move data from storage to host. PCI Express (PCIe) is rapidly becoming a key interconnect in enterprise SSD storage, and is expected to account for more than one-third of the server SSD market by 2015, according to Gartner.

Let’s look at why PCIe-based SSD storage systems (see figure 1) are being deployed, and why it is important for interconnect and flash controller suppliers to work together to achieve the best results.

Figure 1: PCI Express-based SSD architectures like this one offer performance benefits for applications such as transactional processing, data search, and data mining.

One key factor governing system performance is latency. A common misperception is that data and storage are the same; they are quite different. The only data actually being processed by an application at any given time is the active dataset in host memory. All other data is just storage, whether the medium is an SSD or a hard disk drive (HDD). With modern servers offering powerful multi-core CPUs and plentiful DRAM, a common bottleneck is the latency of moving data from the persistent storage device to system memory (see figure 2). When the processor has to wait for the data it needs, cycles are wasted and cannot be recovered.

Figure 2: The transfer of data from storage to the active dataset introduces latencies that can compromise system performance.
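
To make the cost of that wait concrete, here is a back-of-the-envelope sketch. The numbers are illustrative assumptions, not measurements: a 3 GHz core, roughly 5 ms for an HDD access to reach host memory, and roughly 50 microseconds for an SSD access.

```c
/* Rough estimate of the CPU cycles a core forfeits while blocked on one
 * synchronous storage access.  All latency figures are assumptions,
 * chosen only to illustrate the scale of the gap. */
#include <stdio.h>

int main(void)
{
    const double cpu_hz      = 3.0e9;    /* assumed 3 GHz core        */
    const double hdd_latency = 5.0e-3;   /* assumed ~5 ms HDD access  */
    const double ssd_latency = 50.0e-6;  /* assumed ~50 us SSD access */

    printf("Cycles idle per HDD access: %.0f\n", cpu_hz * hdd_latency);
    printf("Cycles idle per SSD access: %.0f\n", cpu_hz * ssd_latency);
    return 0;
}
```

At these assumed figures, a blocked core gives up roughly 15 million cycles per HDD access versus about 150,000 per SSD access, which is why storage latency rather than raw CPU speed is so often the limiting factor.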

SSDs offer roughly 100 times lower latency than electromechanical hard drives when data is transferred from the storage medium to host memory: typically tens of microseconds, compared with many milliseconds for disk drives. This substantially reduced latency in the storage medium needs to be matched with an appropriate interconnect, and PCIe, which offers latency on the order of 150 ns per switch hop, emerges as the optimal pathway. PCIe's latency advantage is even greater because it connects directly to the host. If the SSD controller vendor works with the switch provider to put a PCIe connection directly onto the controller, no other interconnect mechanism is needed; any alternative would require an additional bridge, which increases latency and reduces efficiency.
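
The same kind of arithmetic shows why avoiding a bridge matters. The sketch below uses the roughly 150 ns per switch hop cited above, together with assumed values for the media latency and for the overhead of an extra protocol bridge; the bridge figure is a placeholder, not a measured number.

```c
/* Illustrative latency budget for one storage access: media latency plus
 * interconnect overhead.  The switch-hop number follows the ~150 ns figure
 * in the text; the media and bridge numbers are assumptions. */
#include <stdio.h>

int main(void)
{
    const double media_ns    = 50000.0;  /* assumed ~50 us NAND + controller latency   */
    const double pcie_hop_ns = 150.0;    /* ~150 ns per PCIe switch hop                */
    const double bridge_ns   = 2000.0;   /* assumed ~2 us for an extra protocol bridge */
    const int    hops        = 1;        /* one switch hop between host and SSD        */

    double direct  = media_ns + hops * pcie_hop_ns;
    double bridged = direct + bridge_ns;

    printf("Direct PCIe path: %.2f us\n", direct / 1000.0);
    printf("Bridged path:     %.2f us\n", bridged / 1000.0);
    return 0;
}
```

With media latency already down to tens of microseconds, every fixed microsecond an extra bridge adds is a visible fraction of the total, and that fraction only grows as the media gets faster.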

PCIe latency is critical, but so is software latency. Fully leveraging the speed of flash and other memories is going to require very aggressive efforts to reduce or eliminate software interactions. A group at UCSD has been doing work in that direction by moving some file system functions into hardware. The paper describing their work is here: http://cseweb.ucsd.edu/users/swanson/papers/Asplos2012MonetaD.pdf
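
The UCSD work moves some file-system functions into hardware; as a far more modest, purely software illustration of trimming the stack, the Linux-specific sketch below opens a block device with O_DIRECT so the page cache is bypassed and one layer of copying drops out of the path. The device path is an example, and this is a conventional technique, not the Moneta-D design itself.

```c
/* Minimal Linux example: read the first 4 KiB of a block device with
 * O_DIRECT, bypassing the page cache.  This trims one software layer
 * from the I/O path; it is a conventional technique, unrelated to the
 * hardware offload described in the Moneta-D paper. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    const char  *dev   = "/dev/nvme0n1";  /* example device path (assumption) */
    const size_t block = 4096;            /* O_DIRECT needs aligned buffers and offsets */
    void        *buf;

    if (posix_memalign(&buf, block, block) != 0) {
        perror("posix_memalign");
        return 1;
    }

    int fd = open(dev, O_RDONLY | O_DIRECT);
    if (fd < 0) {
        perror("open");
        free(buf);
        return 1;
    }

    ssize_t n = pread(fd, buf, block, 0);
    if (n < 0)
        perror("pread");
    else
        printf("Read %zd bytes from %s with the page cache bypassed\n", n, dev);

    close(fd);
    free(buf);
    return 0;
}
```

Even this only removes the page-cache copy; the system-call, block-layer, and driver overheads remain, and it is that remaining software cost the UCSD group attacks by moving functions into hardware.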