StarWind Virtual SAN: Can bruins get along with rocket science?

Target

Preamble 🙂 StarWind has a legion of weird-looking guys and hot East European girls who adore writing long emails. To make things worse, they can wake you up in the middle of the night (Screw the time difference, clocks are for kids!) with a call just to chat about their tech and how cool they are. So… Think twice before providing them with your real email! Sticking with some throwaway junk one sounds like a much better idea in general. If you want to give their stuff a spin, of course.

Status

Unknown. Expedition needed.

Mission

We know from TV: Russian bears can ride a bicycle in a circus and play an accordion to avoid being tased & beaten to death. Some drunk punk told us bears are good at rocket science as well. Rocket science = high-performance storage, hyperconvergence, and all that jazz. To either confirm or deny this statement, we decided to give StarWind Virtual SAN scalability and performance a try. Run it, test it and compare it to the big guy, Microsoft with Storage Spaces Direct (S2D), who’s our current favorite racehorse, and to a bunch of also-rans (VMware vSAN, DellEMC ScaleIO & Nutanix) as well. So… Does StarWind suck or blow?

The brief

Let’s start with some history, as we know you love it. Back in 1961, the Russians kicked U.S. butt in the space race by sending the first human to the Moon… Space! 🙂 These days they are trying to kick the democratic system’s butt with their steel-toe military boot. Again.

StarWind Software takes its roots from Rocket Division Software. They brought iSCSI protocol support to the then-anemic Windows Server platform back in 2003, when the iSCSI spec was still in draft, and it took Microsoft maybe another five years to kind of catch up. Microsoft bought stinking String Bean Software for a piece of junk they had been told was an iSCSI target, and wrote their own iSCSI initiator, which is still buggy as hell and slow as molasses. When we discussed Microsoft Storage Spaces and the plain fact that Microsoft doesn’t understand storage, we really meant it. OK, back in 2008 StarWind was the first to cluster a pair of Hyper-V servers without SAS JBODs, external SANs or whatever, just pure Software-Defined Storage running inside the hypervisor and… no stinking “controller virtual machines”! Nutanix can go take a hike: they rolled out their kludge pretending to do a similar “NoSAN” thing somewhere around 2012. A day late and a dollar short. StarWind & Intel partnered in 2010 to deliver 1M IOPS with iSCSI, and at that time people were talking about tens of hundreds at best. These days StarWind keeps releasing weird stuff like an NVMe over Fabrics target for Windows, iSER, which is iSCSI over RDMA, and virtual tapes with an Amazon AWS cloud back end (Who’s a customer for that?!?). The question is: Can they amaze with their performance?

Now let’s take a closer look at what we actually have on the table in front of us. StarWind Virtual SAN (VSAN) was born to make you happy… Nope! It was born to turn your rusty servers into a high-performing SAN. StarWind calls their software “hardware-agnostic”, and that is supposed to mean it can be deployed on any piece of junk… hardware that fits the bill. Remember installing Linux on a dead badger? We know you do! 🙂 Now StarWind wants us to believe we can install StarWind on a dead bear 🙂 There are no hardware compatibility lists, love it or hate it. If this bloody thing can run Windows Server, it can run StarWind as well. Nice try!

Another quite remarkable fact about StarWind is that you can get it for free. These crazy vodka-drinking maniacs provide a full-featured version for SMBs and hobos… ROBOs completely free of charge. Restriction-less and frictionless. Production allowed. Bells & whistles! Though, you still have to love PowerShell (Damn you, Microsoft! PowerShell everywhere!). Remember, Microsoft had this “CLI only” thing to manage Windows Server & Hyper-V before the Project Honolulu rollout? Same story here. CLI once and forever. No support. Only for the brave. GUI and guaranteed support are for paying customers only. OK, enough of dithyrambs and pointless chatter, let’s move closer to our topic, which is performance, performance, and again performance. Plus, scalability!

We run this bloody mix on top of an all-NVMe datastore that should deliver awesome numbers. In theory. As usual, we expect to get a cluster that has its overall I/O performance increased every time we spawn an extra VM. However, as it happens quite often, what we expect is not what we get…

2. Deploy StarWind VSAN on each host. Next, we enable the Hyper-V role everywhere, as we run hyperconverged, and configure MPIO (Multipath I/O), as StarWind does iSCSI (and iSER…) and we need MPIO for block tech. For now, we’ll do TCP & iSCSI, leaving RDMA & iSER for the second part of the story.

4. Connect StarWind virtual device on Host #1 to the one on Host #2 with iSCSI (two 127.0.0.1 sessions and two iSCSI sessions with the Host #2). Round Robin service policy is used by default. Format this resulting raw block device as NTFS volume and assign a mount point drive letter to it.

5. Create a Hyper-V VM with a “system” virtual disk somewhere on the local Windows host partition and create one extra “data” 256GB VHDX for tests, this “data” one will be placed on top of the StarWind virtual device.

7. Test StarWind virtual device performance with one VM running in the cluster.

8. Clone this test VM and pin it to another Hyper-V host. Check the overall performance and clone the VM again… and again! For each newborn VM, we create a new StarWind virtual device and assign a new dedicated “data” VHDX to it.
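To picture what the Round Robin policy from step 4 does with the four sessions, here is a deliberately simplified Python sketch. It is not how the real MPIO stack is implemented; the path labels and the `dispatch` helper are hypothetical, and we assume pure round robin with equally healthy paths.

```python
from collections import Counter
from itertools import cycle

# Hypothetical labels for the four sessions from step 4:
# two loopback sessions and two sessions to the partner host.
PATHS = ["127.0.0.1 #1", "127.0.0.1 #2", "Host #2 #1", "Host #2 #2"]

def dispatch(num_ios, paths):
    """Hand out I/Os to paths in strict rotation and count them per path."""
    counts = Counter()
    rotation = cycle(paths)
    for _ in range(num_ios):
        counts[next(rotation)] += 1
    return counts

counts = dispatch(1000, PATHS)
local = sum(n for path, n in counts.items() if path.startswith("127.0.0.1"))
print(counts)
print("share served over loopback:", local / 1000)
```

Under these assumptions every path gets the same number of I/Os, so half of the requests travel over the network to the partner host.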

Before we start, we want to make sure that our Intel SSD DC P3700 Series 2 TB performs as it is said to. We do trust the vendor (Intel), but who knows how our disks are doing? Wear level, dead blocks etc. Now look at what our vendor says in its datasheet:

In this table, Intel claims its NVMe can reach a massive 460K IOPS under a 4k random read pattern with 4 workers and Queue Depth=32.

Mini-conclusion

Checking network

After Windows Server 2016 installation and basic initial configuration, we installed the WinOF-2 (1.90.19216.0) driver for Mellanox ConnectX-4 NICs and set up the networks. Then, we checked the networking bandwidth between two Windows hosts (i.e., Host #1 and Host#2) with iperf for TCP and nd_send_bw for RDMA (nd_send_bw is included in Mellanox Firmware Tools).

Let’s look at RDMA network bandwidth first (OK, we don’t do RDMA with StarWind VSAN within our current test, but that’s just to make sure network is healthy and properly configured):

NOTE: Even though we used only fixed virtual disks today, we completely filled them with random data with dd.exe before each test. It’s a good idea to do so whenever you create a VM virtual disk or adjust its size.

So, here are dd.exe launching parameters:

dd.exe bs=1M if=/dev/random of=\\?\Device\Harddisk1\DR1 --progress

Picking test utility launching parameters

Now, let’s decide on the optimal number of threads and outstanding I/O. For that purpose, we created a VM and pinned it to the Host #1. Next, we measured StarWind virtual device performance with 4K random read pattern and varying number of threads and outstanding I/O. At some point, the performance hit the ceiling and saturated. That was the point with the optimal number of outstanding I/O and threads that we were looking for.
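The hunt for the saturation point can be sketched in a few lines of Python. Treat it as an illustration only: the `find_saturation` helper, the 2% gain threshold and the sample numbers are our assumptions for the sketch, not measurements from this test.

```python
def find_saturation(results, min_gain=0.02):
    """Return the first (outstanding_io, iops) point after which the next
    step up in outstanding I/O improves IOPS by less than min_gain."""
    ordered = sorted(results.items())
    for (qd, iops), (_, next_iops) in zip(ordered, ordered[1:]):
        if next_iops < iops * (1 + min_gain):
            return qd, iops
    # Never flattened out: report the last measured point.
    return ordered[-1]

# Made-up sweep for a single thread count: outstanding I/O -> IOPS.
sample = {1: 9000, 2: 18000, 4: 35000, 8: 66000,
          16: 110000, 32: 150000, 64: 170000, 128: 171000}

print(find_saturation(sample))
```

Repeat the same check for every thread count and keep the combination with the highest IOPS at the lowest outstanding I/O.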

Here are DiskSPD launching parameters under threads=1 and Outstanding I/O=1,2,4,8,16,32,64,128

Mini-conclusion

So, let’s interpret the numbers. Hyper-V virtual disk maximum performance on the StarWind virtual device lies between 220300 and 220500 IOPS. The latency, in turn, is between 1.12 and 1.16 ms. In our setup, we can reach such performance only with 4 or 8 I/O threads. Remember the VM has a 4-core vCPU? That’s why we used the following launch parameters for the test utilities: threads=4 and Outstanding I/O=64.
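A quick sanity check on these numbers with Little’s Law: I/Os in flight = IOPS × latency. The sketch below assumes all threads × outstanding I/O requests stay in flight for the whole run; the `expected_latency_ms` helper is ours, not part of DiskSPD or Fio.

```python
def expected_latency_ms(threads, outstanding_io, iops):
    """Little's Law: average latency = I/Os in flight / throughput."""
    in_flight = threads * outstanding_io
    return in_flight / iops * 1000.0

# threads=4, Outstanding I/O=64, roughly 220.4K IOPS as reported above
print(round(expected_latency_ms(4, 64, 220_400), 2))
```

With 256 I/Os in flight at roughly 220.4K IOPS, the expected latency lands at about 1.16 ms, which matches the upper end of the measured range.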

Accomplishing orbit task

Setting up tools

Now that we’ve got here, let’s set up the tools and carry out the measurements. As usual, we used DiskSPD v2.17 and Fio v3.5 today. We ran our tests under the following patterns:

Everything is ready and set up. So, let’s jump right to the real measurements now! Just like during S2D testing, we measured Hyper-V 256 GB “data” virtual disk performance under the mentioned range of patterns. For that purpose, we started with one VM, cloned it to another host and measured the overall cluster performance. Since S2D has surprised us (unlike Nutanix CE and VMware vSAN), forget that 12-VM warm-up! So, we’ll keep on doing this until the total performance either hits the ceiling and reaches the saturation point or hits the ground. Note that each VM has its own StarWind virtual device that holds the “data” virtual disk designated for our performance tests. Go!

See, the performance keeps on growing proportionally to the number of VMs in the cluster. But, we decided to stop right after 20 VMs. Why? Just look at what we got under other patterns:

Well, enough is enough. Let’s see what happens to CPU. If you want to build a hyper-converged environment you should keep an eye on your CPU utilization and here’s why: You have to leave some spare CPU cycles to these poor little things called “production VMs”, right? Here’s how CPU behaves and what StarWind leaves to number crunching:

Measuring single Intel SSD DC P3700 2TB performance

Now, let’s take a look at how a single NVMe drive performs under the same test patterns in a Windows Server 2016 environment. It will give us kind of a reference number set to judge whether running StarWind VSAN on an 8 NVMe datastore makes or breaks the Hyper-V cluster’s day.

At this point, let’s make some assumptions to avoid making today’s measurement too complex:

1. All 8 NVMes in the datastore are available for reading. Thus, the overall datastore reading performance should be 8x of just one Intel SSD DC P3700 2TB value (Well, it’s all assuming we don’t have network or CPU or whatever other bottlenecks of course).

2. All NVMe drives are available for writing. Yet, there’s a small thing about VSAN: replication. This means each block, after being written, gets replicated to the partner disk (Actually both writes happen in parallel, but who cares…). In other words, you get a “network” RAID1, as these NVMes “sit” on different physical hosts with a network in between. So, the overall performance will be ((IOPS-Write-one-NVMe)*N)/2. N here stands for the number of NVMe drives involved in writing (8 for our setup).

3. In case of 8k random 70%read/30%write pattern, at best, performance will be (IOPS-Read-one-NVMe*N*0.7)+((IOPS-Write-one-NVMe*N*0.3)/2). Again, N is the number of NVMe drives utilized for the pattern (8 in our case).
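The three assumptions above translate directly into a tiny Python helper. It is a sketch under those same assumptions (no network, CPU or other bottlenecks); the function names are ours, and the 100K per-drive figures in the example are placeholders, not measured values. Only the 460K read number comes from the Intel datasheet quoted earlier.

```python
def read_ceiling(read_iops_one, n=8):
    """Assumption 1: all N drives serve reads."""
    return read_iops_one * n

def write_ceiling(write_iops_one, n=8):
    """Assumption 2: every write lands twice ("network" RAID1)."""
    return write_iops_one * n / 2

def mixed_ceiling(read_iops_one, write_iops_one, n=8, read_share=0.7):
    """Assumption 3: 70% read / 30% write mix, writes halved by replication."""
    reads = read_iops_one * n * read_share
    writes = (write_iops_one * n * (1 - read_share)) / 2
    return reads + writes

# 460K is the datasheet 4k random read number; 100_000 is a placeholder.
print(read_ceiling(460_000))
print(write_ceiling(100_000))
print(mixed_ceiling(100_000, 100_000))
```

Plug in the single-drive numbers measured on your own hardware to get the reference values for each pattern.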

Deorbit and landing

Since we’re pretty much done with all that jazz, let’s present our mission report now.

First, let’s look at how StarWind VSAN performance depends on the number of VMs running in the cluster. Just a tiny reminder: Under all of those test patterns, each Hyper-V VM used an individual virtual disk (VHDX) placed on top of its own StarWind virtual device, so there’s no dogfight for LUN ownership! Think of this strategy as “poor-man’s vVols” if you know what we mean. Overall combined StarWind performance grows nice and smooth until 4 VMs are running concurrently in the cluster. The saturation point, with close to top (80th percentile, FYI) performance numbers, comes very soon! Things change once the 5th VM gets on board… Under 4k blocks and mixed loads, performance growth slows down but keeps rising steadily until the 20th VM gets deployed.

Wait, why did we stop our measurements after the 20th VM, though everything was looking promising? Two reasons actually… Well, first of all, this boozing bear has already knocked out anything else reviewed in our lab before. Kidding… But, second, let’s be honest, it also overwhelmed the CPU cores on all the hosts, rendering the hyperconverged scenario pretty much useless beyond this point: There are no CPU cycles left for production VMs, and we didn’t intend to test a storage-focused installation! But what else did we expect from ancient iSCSI crawling all over the neighborhood on the back of that “guest from the 80s”, TCP? Anyway, even with close to 100% CPU usage, StarWind performed more or less like S2D with its bloody SMB3, SMB Direct, RDMA etc. technology. It might show different colors with iSER, but let’s leave that for our second home run. Oh, BTW, just look at the StarWind per-CPU utilization chart! Unlike S2D, StarWind does proper load balancing. Another reason to stop further tests was the insignificant performance gain with every new VM added. Indeed, performance growth under the 4k random read pattern became less intense. Well, let’s just look at these numbers… Initially, while running from 1 through 9 virtual machines in the cluster, the average performance gain was approx. 120K IOPS per VM. Next, from 10 through 20 VMs, we got only 46K IOPS / VM. The highest overall performance (1.7M IOPS for number fanatics) was observed while 20 VMs were running in the cluster.

Hold on, what about performance & growing number of VMs measured with other I/O patterns? Well, once the 4th or 5th VM gets engaged, there’s no significant performance growth with any of these “other” patterns either, so we can make another mini-conclusion here: StarWind needs only one running VM per cluster host to grab close to maximum IOPS, making it super-efficient in terms of resource utilization. Our current performance leader (ex-leader?) Storage Spaces Direct (S2D) needs way more than that… OK, back to StarWind. Here we observed true saturation. Maximum performance under the 64k random write pattern didn’t go higher than 140K IOPS. Under 64k random read loads, StarWind VSAN exhibited 310K IOPS. As for 1M sequential reads, performance reached 18K IOPS, aka a whopping 18.5 GB per second. Holy moly, what a bandwidth!
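For those who want to turn IOPS into bandwidth themselves, it is plain multiplication: throughput = IOPS × block size. The helper below is our own sketch; it assumes binary block sizes (64k = 65,536 bytes, 1M = 1,048,576 bytes) and decimal gigabytes, and it works from the rounded IOPS figures quoted above.

```python
KIB = 1024
MIB = 1024 * KIB

def throughput_gb_per_s(iops, block_bytes):
    """Bandwidth in decimal GB/s for a given IOPS rate and block size."""
    return iops * block_bytes / 1e9

for label, iops, block in [("64k random write", 140_000, 64 * KIB),
                           ("64k random read", 310_000, 64 * KIB),
                           ("1M sequential read", 18_000, MIB)]:
    print(f"{label}: {throughput_gb_per_s(iops, block):.1f} GB/s")
```

With the rounded 18K IOPS the 1M pattern comes out in the 18 to 19 GB/s ballpark, depending on how you count megabytes and gigabytes, which is in line with the 18.5 GB per second reported above.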

What does it mean? Once again, you do not need to run a legion of VMs within your cluster to get damn good performance. SQL Server, Oracle, and whatever scientific number-crunching app you can’t run all in-memory because of cost will simply love StarWind and the virtual storage it manages. You can squeeze out all the performance your underlying storage can provide while flying as low as one VM per physical host. Looks good for a “mom and dad’s shop” IT environment, right? That’s actually what these Russians say their SDS was intended for. Idiots… They created a nuclear bomb in their garage and went off blast fishing with it!

Bottom line. The table below provides everything you need to know about this vodka-powered dragster. In the “Comparing performance” column, we list the percentage of the theoretical values we could reach.

Well, we aren’t exactly upset with the numbers we see. StarWind has posted the biggest numbers in the room so far. Yes, this Russian bear does surprisingly well with rocket science! Another nice thing about it worth mentioning: These guys prove iSCSI has balls. Look at our S2D testing again: StarWind performs generally better with iSCSI / TCP than S2D does over SMB3 / RDMA. So, weapons are nothing, tactics make a whole lot of difference.

Next time we’ll see whether StarWind is capable of making another ground-shaking home run with RDMA enabled. Stay tuned, we aren’t dead yet 🙂