EMC's Project Lightning has a storage array managing flash cache in servers networked to the storage array. Dell is thinking along similar lines. This is supposed to provide better storage service to the servers. Really? How?
An enterprise infrastructure architect working in the insurance area got in touch to give me a use case …

COMMENTS

Yes, but where?

Assuming that you can create a distributed coherent cache (which EMC and NetApp have been claiming is impossible for the last ten years), then where would you put the SSD cache?

On the motherboard? How would the local cache software communicate back to the remote array, and how often would the cache update? (EMC updates their flash cache once per day.) The OS or hypervisor, e.g. VMware, would most likely need a kernel driver to use the cache.

On the CNA / HBA? Making it part of the storage infrastructure would require support in the driver. And at what price would this highly custom piece of silicon come, bathed in Unicorn Tears and individually blessed by a virginal Tech priest as it left the factory? I'd expect it to be orders of magnitude more expensive than the Fusion-io product. Fusion-io is a goodish flash drive built to use the PCI-E bus in certain computers - but an entirely custom CNA with flash and a handy CPU/software is quite different.

Mr Leon has some valid points, but only *some*

He is talking specifically about NetApp:

“This limits the number of VMs you can provision on 31xx and 60xx filers to around 300-500 before the CPU in the filers get really hot” – that’s factually incorrect: http://www.vmware.com/files/pdf/VMware-View-50kSeatDeployment-WP-EN.pdf (50,000 VMs on ten FAS3170 clusters => 50,000 /20 = 2,500 VMs per storage controller)
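The per-controller figure in that rebuttal can be checked with a few lines of Python. This is only a sketch of the arithmetic quoted above; the value of two controllers per cluster is an assumption inferred from the "/20" in the comment, and the function name is made up for illustration.

```python
# Figures come from the comment above. CONTROLLERS_PER_CLUSTER = 2 is an
# assumption implied by the "/20" in that arithmetic, not a stated fact.
TOTAL_VMS = 50_000
CLUSTERS = 10
CONTROLLERS_PER_CLUSTER = 2


def vms_per_controller(total_vms, clusters, controllers_per_cluster):
    """Average number of VMs served by each storage controller."""
    controllers = clusters * controllers_per_cluster
    return total_vms // controllers


print(vms_per_controller(TOTAL_VMS, CLUSTERS, CONTROLLERS_PER_CLUSTER))
```

With those inputs it prints 2500, well above the 300-500 range claimed in the quote.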

“and limits the performance of the VMs themselves due the 5-10ms latency of a typical storage request incurred by the network stack” – it varies depending on what networking gear is used, and in most mass-deployment scenarios (VDI for typical office workers) it is irrelevant.

That being said:

- NetApp has their own server-side caching project: http://www.theregister.co.uk/2011/06/17/netapp_project_mercury/

It may work, but it's just a stopgap

There are certainly advantages to server-side SSD caching, the biggest of which is that it reduces load on storage arrays that are these days taxed far beyond what they were originally designed for, but in the long run I think we'll see server-side SSD caching as nothing but a complex stopgap making up for deficiencies in current array designs.

If you look at "why" it's claimed server-side cache is necessary, it basically boils down to:

- The array can't handle all the IO load from the servers, particularly when flash is used with advanced features like dedupe

- The reduction in latency from a local flash cache

The first is a clear indication that current array designs aren't going to scale to cloud workloads and to all (or mostly) solid-state levels of performance. Scale-out architectures are going to be required to deliver the controller performance needed to really benefit from flash.

The second is based on the assumption that the network or network stack itself is responsible for the 5-10ms of latency that he's reporting. The reality is that a 10G or FC storage network and network stack will introduce well under 1ms of latency - the bulk of the latency is coming from the controller and the media. Fix the controller issues and put in all-SSD media, and suddenly network storage doesn't seem so "slow". Architectures designed for SSD like TMS, Violin, and SolidFire have proven this. Local flash, particularly PCI-attached, will still have lower latency, but that microsecond-level performance is really only needed for a small number of applications.
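To put rough numbers on that split, here is a minimal Python sketch. It uses only the figures in this thread: a total of 5-10ms per request, and 1ms as a generous upper bound for the network, since the claim is "well under 1ms". The function name is hypothetical, and the result is a lower bound on the controller-plus-media share, not a measurement.

```python
def non_network_share(total_ms, network_ms):
    """Fraction of a request's latency that is NOT spent in the network."""
    if total_ms <= 0 or not 0 <= network_ms <= total_ms:
        raise ValueError("need 0 <= network_ms <= total_ms and total_ms > 0")
    return (total_ms - network_ms) / total_ms


# 1.0 ms is an assumed upper bound for the network, so each printed share
# is a floor for what the controller and media contribute.
for total_ms in (5.0, 10.0):
    share = non_network_share(total_ms, network_ms=1.0)
    print(f"{total_ms:.0f} ms total: at least {share:.0%} from controller and media")
```

Even with the network charged a full 1ms, at least 80% of a 5ms request and 90% of a 10ms request is left for the controller and the media to explain.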

EMC and NetApp have huge investments in their current architectures, and are going to try every trick they can to keep them relevant as flash becomes more and more dominant in primary storage, but eventually architectures designed for flash from the start will win out.