Storage Protocol Comparison – A vSphere Perspective

On many occasions I’ve been asked for an opinion on the best storage protocol to use with vSphere. And my response is normally something along the lines of ‘VMware supports many storage protocols, with no real preference given to any one protocol over another’. To which the reply is usually ‘well, that doesn’t really help me make a decision on which protocol to choose, does it?’

And that is true – my response doesn’t really help customers decide which protocol to choose. To that end, I’ve decided to put together a storage protocol comparison on this topic. It looks at the protocols purely from a vSphere perspective; I’ve deliberately avoided performance, for two reasons:

1. We have another team in VMware who already does this sort of thing.

2. Storage protocol performance can be very different depending on the storage array vendor, so it doesn’t make sense to compare iSCSI & NFS from one vendor when another vendor might have a much better implementation of one of the protocols.

If you are interested in performance, there are links to a few performance comparison docs included at the end of the post.

Hope you find it useful.

vSphere Storage Protocol Comparison Guide

iSCSI

NFS

Fibre Channel

FCoE

Description

iSCSI presents block devices to an ESXi host. Rather than accessing blocks from a local disk, the I/O operations are carried out over a network using a block access protocol. In the case of iSCSI, remote blocks are accessed by encapsulating SCSI commands & data into TCP/IP packets. Support for iSCSI was introduced in ESX 3.0 back in 2006.
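
To give a feel for what this looks like in practice, here is a minimal sketch of bringing up software iSCSI from the ESXi shell. The adapter name (vmhba33) and target address are placeholders for your environment, and the command namespaces assume ESXi 5.x:

    # Enable the software iSCSI adapter
    esxcli iscsi software set --enabled=true
    # Find the name of the software iSCSI adapter (e.g. vmhba33)
    esxcli iscsi adapter list
    # Add a dynamic (send targets) discovery address for the array
    esxcli iscsi adapter discovery sendtarget add --adapter=vmhba33 --address=192.168.1.20:3260
    # Rescan so that any LUNs presented to this initiator show up
    esxcli storage core adapter rescan --adapter=vmhba33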

NFS (Network File System) presents file devices over a network to an ESXi host for mounting. The NFS server/array makes its local filesystems available to ESXi hosts. The ESXi hosts access the metadata and files on the NFS array/server using an RPC-based protocol.

VMware currently implements NFS version 3 over TCP/IP. VMware introduced support for NFS in ESX 3.0 back in 2006.
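
As a point of comparison, mounting an NFS export as a datastore is a one-liner from the ESXi shell. The server name, export path and datastore name below are placeholders:

    # Mount an NFS v3 export as a datastore
    esxcli storage nfs add --host=nas01.example.com --share=/vol/vm_datastore1 --volume-name=nfs-ds1
    # Verify the mount
    esxcli storage nfs list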

Fibre Channel presents block devices like iSCSI does. Again, the I/O operations are carried out over a network using a block access protocol. In FC, remote blocks are accessed by encapsulating SCSI commands & data into Fibre Channel frames.

One tends to see FC deployed in the majority of mission-critical environments.

FC has been the only one of these 4 protocols supported on ESX since the beginning.

This protocol typically affects a host’s CPU the least, as the HBAs (required for FC) handle most of the processing (encapsulation of SCSI data into FC frames).

Fibre Channel over Ethernet (FCoE) also presents block devices, with I/O operations carried out over a network using a block access protocol. In this protocol, SCSI commands and data are encapsulated into Ethernet frames. FCoE has many of the same characteristics as FC, except that the transport is Ethernet.

VMware introduced support for HW FCoE in vSphere 4.x & SW FCoE in vSphere 5.0, back in 2011.

This protocol requires 10Gb Ethernet.

The point to note with FCoE is that there is no IP encapsulation of the data like there is with NFS & iSCSI, which reduces some of the overhead/latency. FCoE is SCSI over Ethernet, not IP.

This protocol also requires jumbo frames since FC payloads are 2.2K in size and cannot be fragmented.
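
For completeness, here is a rough sketch of enabling software FCoE from the ESXi shell, assuming vSphere 5.0 or later, a supported 10GbE NIC (vmnic4 is a placeholder) and a DCB-capable switch. The MTU change covers the jumbo frame requirement mentioned above:

    # Raise the MTU on the vSwitch carrying the FCoE uplink so FC payloads are not fragmented
    esxcli network vswitch standard set --vswitch-name=vSwitch1 --mtu=9000
    # Activate software FCoE on the 10GbE NIC
    esxcli fcoe nic discover --nic-name=vmnic4
    # Confirm that an FCoE adapter has been created
    esxcli fcoe adapter list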

iSCSI

NFS

Fibre Channel

FCoE

Load Balancing

VMware’s Pluggable Storage Architecture (PSA) provides a Round-Robin Path Selection Policy which will distribute load across multiple paths to an iSCSI target. Better distribution of load with PSP_RR is achieved when multiple LUNs are accessed concurrently (a configuration sketch follows this row).

There is no load balancing per se on the current implementation of NFS as there is only a single session. Aggregate bandwidth can be configured by creating multiple paths to the NAS array, and accessing some datastores via one path, and other datastores via another.

VMware’s Pluggable Storage Architecture (PSA) provides a Round-Robin Path Selection Policy which will distribute load across multiple paths to an FC target. Better distribution of load with PSP_RR is achieved when multiple LUNs are accessed concurrently.

VMware’s Pluggable Storage Architecture (PSA) provides a Round-Robin Path Selection Policy which will distribute load across multiple paths to an FCoE target. Better distribution of load with PSP_RR is achieved when multiple LUNs are accessed concurrently.
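
As a quick illustration of the block-protocol entries above, this is how a single device can be switched to Round Robin from the ESXi shell. The naa identifier is a placeholder for a real device ID:

    # Show the current path selection policy for a device
    esxcli storage nmp device list --device=naa.60060160455025001234567890abcdef
    # Change that device to the Round Robin path selection policy
    esxcli storage nmp device set --device=naa.60060160455025001234567890abcdef --psp=VMW_PSP_RR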

Resilience

iSCSI: VMware’s PSA implements failover via its Storage Array Type Plugin (SATP) for all supported iSCSI arrays. The preferred way to achieve this for SW iSCSI is with iSCSI port binding implemented, but it can also be achieved by adding multiple targets on different subnets mapped to the iSCSI initiator (a port-binding sketch follows this row).

NFS: NIC teaming can be configured so that if one interface fails, another can take its place. However, this relies on a network failure being detected and may not be able to handle error conditions occurring on the NFS array/server side.

Fibre Channel: VMware’s PSA implements failover via its Storage Array Type Plugin (SATP) for all supported FC arrays.

FCoE: VMware’s PSA implements failover via its Storage Array Type Plugin (SATP) for all supported FCoE arrays.
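
Here is a minimal sketch of the iSCSI port binding mentioned above, assuming a software iSCSI adapter named vmhba33 and two VMkernel ports (vmk1 and vmk2) already configured on the iSCSI network:

    # Bind two VMkernel ports to the software iSCSI adapter so the PSA sees multiple paths
    esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk1
    esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk2
    # Confirm the bindings
    esxcli iscsi networkportal list --adapter=vmhba33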

Maximum Datastore Size

iSCSI: 64TB

NFS: Theoretical size is much larger than 64TB, but requires the NAS vendor to support it.

Fibre Channel: 64TB

FCoE: 64TB

Maximum number of devices

iSCSI: 256

NFS: Default 8, maximum 256

Fibre Channel: 256

FCoE: 256

Protocol direct to VM

iSCSI: Yes, via an in-guest iSCSI initiator.

NFS: Yes, via an in-guest NFS client.

Fibre Channel: No, but FC devices can be mapped directly to the VM with NPIV. The LUN must first be presented to the host, then mapped to the VM as an RDM, and the hardware must support NPIV (switch and HBA).

FCoE: No

Storage vMotion Support

iSCSI: Yes

NFS: Yes

Fibre Channel: Yes

FCoE: Yes

Storage DRS Support

iSCSI: Yes

NFS: Yes

Fibre Channel: Yes

FCoE: Yes

Storage I/O Control Support

iSCSI: Yes, since vSphere 4.1

NFS: Yes, since vSphere 5.0

Fibre Channel: Yes, since vSphere 4.1

FCoE: Yes, since vSphere 4.1

Virtualized MSCS Support

No. VMware does not support MSCS nodes built on VMs residing on iSCSI storage. However, the use of software iSCSI initiators within guest operating systems configured with MSCS, in any configuration supported by Microsoft, is transparent to ESXi hosts and there is no need for explicit support statements from VMware.

No. VMware does not support MSCS nodes built on VMs residing on NFS storage.

Yes, VMware supports MSCS nodes built on VMs residing on FC storage.

No. VMware does not support MSCS nodes built on VMs residing on FCoE storage.

iSCSI

NFS

Fibre Channel

FCoE

Ease of configuration

Medium – Setting up the iSCSI initiator requires some smarts; you simply need the FQDN or IP address of the target. Some configuration for initiator mappings and LUN presentation is needed on the array side. Once the target is discovered through a scan of the SAN, LUNs are available for datastores or RDMs (a rescan sketch follows this row).

Easy – You just need the IP address or FQDN of the target, and the mount point. Datastores immediately appear once the host has been granted access from the NFS array/server side.

Difficult – Involves zoning at the FC switch level, and LUN masking at the array level once the zoning is complete. More complex to configure than IP Storage. Once the target is discovered through a scan of the SAN, LUNs are available for datastores or RDMs.

Difficult – Involves zoning at the FCoE switch level, and LUN masking at the array level once the zoning is complete. More complex to configure than IP Storage. Once the target is discovered through a scan of the SAN, LUNs are available for datastores or RDMs.
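
Whichever block protocol is used, the ‘scan of the SAN’ step referred to above comes down to a rescan on the host. A quick sketch from the ESXi shell:

    # Rescan all storage adapters for new devices and VMFS volumes
    esxcli storage core adapter rescan --all
    # List the block devices the host can now see
    esxcli storage core device list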

Advantages

No additional hardware necessary – can use already existing networking hardware components and iSCSI driver from VMware, so cheap to implement.

Well known and well understood protocol. Quite mature at this stage.

Admins with network skills should be able to implement.

Can be troubleshot with generic network tools, such as Wireshark (see the capture sketch after this row).

No additional hardware necessary – can use already existing networking hardware components, so cheap to implement.

Using DCBX (the Data Center Bridging Exchange protocol), FCoE has been made lossless even though it runs over Ethernet. DCBX does other things, like enabling different traffic classes to run on the same network, but that is beyond the scope of this discussion.
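
For example, iSCSI or NFS traffic on a VMkernel port can be captured directly on the host and opened in Wireshark afterwards. A small sketch using tcpdump-uw, which ships with ESXi; the vmk1 interface is a placeholder:

    # Capture iSCSI traffic (TCP port 3260) on the storage VMkernel interface to a pcap file
    tcpdump-uw -i vmk1 -s 1514 -w /tmp/iscsi.pcap tcp port 3260
    # For NFS, capture port 2049 instead
    tcpdump-uw -i vmk1 -s 1514 -w /tmp/nfs.pcap port 2049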

Disadvantages

iSCSI: Inability to route when iSCSI port binding is implemented.

Possible security issues, as there is no built-in encryption, so care must be taken to isolate traffic (e.g. VLANs).

SW iSCSI can cause additional CPU overhead on the ESX host.

TCP can introduce latency for iSCSI.

Since there is only a single session per connection, configuring for maximum bandwidth across multiple paths needs some care and attention.

NFS: No PSA multipathing.

Same security concerns as iSCSI, since everything is transferred in clear text, so care must be taken to isolate traffic (e.g. VLANs).

VMware’s NFS implementation is still version 3, which does not have the multipathing or security features of NFS v4 or NFS v4.1.

NFS can cause additional CPU overhead on the ESX host.

TCP can introduce latency for NFS.

Fibre Channel: Still only runs at 8Gb, which is slower than some other networks (16Gb HBAs are throttled to run at 8Gb in vSphere 5.0).

FCoE: Requires a 10Gb lossless network infrastructure, which can be expensive.

Cannot route between initiator and targets using native IP routing – instead it has to use protocols such as FIP (FCoE Initialization Protocol).

Could prove complex to troubleshoot/isolate issues with network and storage traffic using the same pipe.

Note 1 – I've deliberately skipped AoE (ATA-over-Ethernet), as we have not yet seen significant take-up of this protocol at this time. Should this protocol gain more exposure, I’ll revisit this article.

Note 2 – As I mentioned earlier, I’ve deliberately avoided getting into a performance comparison. This has been covered in other papers. Here are some VMware white papers which cover storage performance comparisons:

Get notified of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage

About the Author

Cormac Hogan is a Senior Staff Engineer in the Office of the CTO in the Storage and Availability Business Unit (SABU) at VMware. He has been with VMware since April 2005 and has previously held roles in VMware’s Technical Marketing and Technical Support organizations. He has written a number of storage-related white papers and has given numerous presentations on storage best practices and vSphere storage features. He is also the co-author of the “Essential Virtual SAN” book published by VMware Press.

Ok, that makes more sense, as they are also the ones that are discussed and covered the most.
I routinely encounter people who are surprised to hear that there is such a thing as shared and switched SAS from vendors including Dell, HP, IBM, NetApp, Oracle and others, along with support on the VMware HCL site (assuming that has not recently changed, of course).
Granted, most of those and other vendors, along with their followers, tend to focus on what others are talking about or what is already known (e.g. iSCSI, FC, FCoE, NFS). All of the above supported interfaces have their place, benefits, caveats, supporters and detractors. The trick is figuring out which is best for your specific environment.
Oh, fwiw, I like them all when used where they make the most sense (or if you are a vendor the most dollars ;)…
Cheers gs

Thanks for the clarifications Erik. The “ease of configuration” is always a matter of conjecture. I’m sure readers will find your post useful in that regard.

Doug B

February 27th, 2012

I believe I have a correction in the FC/Protocol direct to VM section:
“FC devices can be mapped directly to the VM with NPIV. This still requires RDM mapping to the VM first…” should indicate that the RDM mapping to the HOST is required first.

Hello Doug,
You are correct. The LUN must first be mapped to the host, then the RDM must be mapped to the VM before NPIV can be used. I should have clarified that in the posting.
Cormac

Iñigo

February 28th, 2012

Nice comparison table, thanks.
Regarding FCoE support, can you confirm that “ESXi Boot from SAN” and “Virtualized MSCS” are not supported with any FCoE option? With CNA-based FCoE I was expecting the same feature support as native FC.
And within FCoE, are there differences in supported features between CNA-based and software-based FCoE?
Thanks

Hi Iñigo,
Thanks for commenting and good catch. ‘Boot from SAN’ is available with HW FCoE, but not SW FCoE. I will fix that entry.
Unfortunately the MSCS restriction is in place, and this is clearly called out in the MSCS configuration guide.

It’s been a while since you wrote this, but it’s such a great post, I’m sure people will still be coming here for years.

John

March 3rd, 2012

FDQN for iSCSI setup should be FQDN (fully-qualified-domain-name)

forbsy

April 29th, 2012

For the VAAI information, there are a number of new constructs for NFS:
NFS – Extended Stats
NFS – Space reservation (provision eager-zeroed thick VMDKs)
Also, Protocol direct to VM. I guess it depends on the storage. NetApp has software called Windows SnapDrive. This allows FC storage to be mapped directly to the VM – without NPIV.
Nice article!

Thanks for the comment Ian.
Yes, I did omit the extended stats for NFS, just because it is not that visible, and I’m not sure how many partners have implemented it. Good catch though.
The NFS space reservation is referred to as preallocate space above.
Cormac

Very useful article….. nice compilation of each of the features and disadvantages of storage protocol….. an indeed a good comparison……will definitely benefit ones who always are in dilemma to choose among these protocols..
keep up the good work


I’ve been in this market for a while and I’ve seen some discussions from the past that seem even funny nowadays, like the Ethernet versus Token Ring performance discussions. Glad to see history repeating itself now. In a few words: iSCSI will win (as Ethernet won in the past) just because it will deliver more throughput for fewer dollars. 🙂

Rolf Bartels

September 30th, 2013

Can a single VMFS datastore be presented to 2 separate hosts, one using FC and another using iSCSI, as long as the storage supports both protocols?