Latency Distribution and Latency Percentile

Latency Distribution

To demonstrate the performance differences across these RAID configurations, I will be using an improved version of the Latency Distribution / Latency Percentile testing first introduced in our original 950 Pro review. I recommend briefly reviewing that page for an idea of how this testing works and its benefits in demonstrating large differences in latency (HDDs on the same chart as SSDs!). While the charts follow a similar layout, they are now plotted at a higher resolution (600 points per line over the six-decade logarithmic scale), and a number of behind-the-scenes changes have been made to help clean up the presentation of the data.

The above chart is a bit busy and hard to read, but I've included it for proper perspective on where the following charts are derived from. It shows where each IO falls with respect to the time it took to be serviced (latency). The ideal result here would be a narrow peak as far to the left as possible, meaning that all IOs are serviced very quickly, with no stragglers causing unwanted delays. Let's quickly move on to the Latency Percentile translation of the above data, as well as the Percentile plots for a RAID of two and three 950 Pros:

Latency Percentile

You can click any of these graphs for a larger view!

With Latency Percentile, we see a 'neater' representation of the Latency Distribution data, in that we get a profile that slopes from 0% to 100% of the total IOs serviced by the SSD / array being tested. The ideal here is as steep a slope as possible, with that slope shifted as far to the left as possible. If the slope tapers off, that means a percentage of IOs are taking longer to be serviced, and where that percentage falls toward the right along the horizontal axis corresponds to the (higher) latency those remaining IOs took to complete. You might have seen enterprise SSDs quote '99% latency' specs, meaning that 99% of the IOs fall at or below a specified latency - this is the exact type of chart from which such results would be derived.
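
To make the mechanics of these charts concrete, here is a minimal Python sketch of how a Latency Percentile curve can be derived from raw per-IO latencies. The sample data and axis bounds are assumptions for illustration - this is not the actual capture tooling used to produce the charts above:

```python
# Minimal sketch: build a Latency Percentile curve from per-IO latencies.
# Synthetic (lognormal) samples stand in for real captured IO latencies.
import numpy as np

def latency_percentile(latencies_us, points=600):
    """Return (log-spaced latency axis in us, cumulative percent of IOs)."""
    lat = np.sort(np.asarray(latencies_us))
    axis = np.logspace(-1, 5, points)          # six-decade log scale
    pct = np.searchsorted(lat, axis, side='right') / lat.size * 100.0
    return axis, pct

# Example: pull out the '99% latency' figure that enterprise specs quote
samples = np.random.lognormal(mean=3.0, sigma=0.5, size=100_000)  # fake IOs
axis, pct = latency_percentile(samples)
p99 = axis[np.searchsorted(pct, 99.0)]
print(f"99% of IOs completed in <= {p99:.1f} us")
```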

As you’ll note in the legend of the above charts, IOPS does scale up as drives are added. Random write IOPS ramps nicely from 81k with a single drive, to 174k with a second, and finally to just over 300k with the third 950 Pro added to the array. One of the main points I wanted to make in this article is that the reason an SSD ‘feels faster’ is not just the sheer IOPS increase – it is also due in large part to how the decrease in load seen by each individual SSD results in a major drop in overall IO latency.

To better demonstrate this, here is what it looks like when we test at a constant queue depth (QD=16) with varying numbers of 950 Pros:

You’ll note that the step decreases in latency percentile look similar to the percentile shifts seen at the lower queue depths of our tests with fewer SSDs. As an additional data point, the 90th percentile point of that last chart showing the RAID spread at QD=16 shows latency dropping to 1/6 that of the single SSD. Yes, you are reading that correctly: the single SSD takes *six times as long* to service IOs at its 90th percentile.

In simple terms, SSDs respond faster at lower ‘loads’, and running multiple SSDs in a RAID divides that load across them. You get the performance boost and IOPS scaling of a RAID, but you *also* get an overall reduction in latency for a given IO load. This effect is compounded further when you consider that the system has to do something with the IOs it is requesting, and a given application / CPU will likely not be able to request the multiplied maximum IOPS of the RAID, meaning the array as a whole will run at a lower queue depth than a single SSD would. Since the queue is divided across the array, each SSD will be running at a queue depth even lower than its fraction of the array. An example to clarify: an application that reached QD=16 on a single SSD might only reach QD=9 on the array, driving the individual SSDs of a triple-SSD RAID down to QD=3 (where the straight math would have given you QD=5.3).
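
To spell out that queue depth math (using the example figures above, with the caveat that real queues never divide this evenly):

```python
# Back-of-envelope QD math from the example above.
def per_drive_qd(array_qd, drives):
    return array_qd / drives

single_ssd_qd = 16         # app driving a single SSD
observed_array_qd = 9      # same app often can't fill the array's queue
naive = per_drive_qd(single_ssd_qd, 3)       # 5.3 if the app held QD=16
likely = per_drive_qd(observed_array_qd, 3)  # 3.0 at the observed QD=9
print(f"naive per-drive QD: {naive:.1f}, likely per-drive QD: {likely:.1f}")
```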

Putting all of this together: running at an effectively lower queue depth (from the SSDs' perspective) results in faster overall response from the array. Looking at the individual SSD latency results, running at those lower queue depths makes an even larger difference in IO consistency. The end result of this is that a RAID of SSDs gives you a much greater chance of IOs being serviced as rapidly as possible, which accounts for that 'snappier' feeling experienced by veterans of SSD RAID.

Reads see the same type of effect, though it is less defined:

For the set of NVMe SSDs we were testing here, the spread was not as wide as it was on writes, but realize that we are on a logarithmic scale, and linear differences become less apparent as you move to the right. There was still a 20% reduction in the 90th percentile latency across the spread at QD=16 in reads.

One additional point that those with a keen eye will have noted is that in both cases (writes and reads, though less noticeable on reads due to the log scale), shifting from a single SSD to a RAID adds a ~6μs delay to *all* IO requests. This is the overhead cost of Intel's RAID implementation, and it represents things like the time taken to translate IO addresses to the array. This added delay does have an impact at very low queue depths, but it is almost immediately outweighed by the increased 'acceleration' of the array as the queue depth climbs just a single point.
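
A toy model helps show why that fixed overhead washes out so quickly. The latency-vs-QD function below is an assumed placeholder (roughly linear growth), not measured 950 Pro data; only the ~6μs overhead figure comes from the charts:

```python
# Toy model: fixed RAID overhead vs. the latency saved by splitting the queue.
def single_drive_latency_us(qd):
    # Assumed shape: service latency grows roughly linearly with queue depth
    return 20.0 + 8.0 * qd

RAID_OVERHEAD_US = 6.0  # approximate fixed cost observed for Intel's RAID layer

for qd in (1, 2, 4, 8, 16):
    single = single_drive_latency_us(qd)
    triple = single_drive_latency_us(qd / 3) + RAID_OVERHEAD_US
    print(f"QD={qd:2d}: single {single:5.1f} us, triple RAID {triple:5.1f} us")
```

In this toy model the array is a touch slower at QD=1, and by QD=2 the split queue has already overtaken the single drive - the same behavior described above.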

Horses for courses. This motherboard was never meant to target the market sector that can do everything they need with a $400 system.
Many of us want or need significantly more power and are prepared to pay for it.

I'm not sure, since I can't find proper Xeon boards with enough channels. I've only been able to connect one 512GB 950 Pro to my system, along with 25 256GB 850 Pro drives on as many dedicated SAS lanes as I could manage (not nearly enough). That's hosting the storage for my workstation, which is currently 44 Xeon cores with an additional 32 in transit now. The system has a little more than 800GB of RAM and uses dual 40Gb/sec InfiniBand as a host bus and a 10Gb/sec internet uplink.

Does this count as a power user? I didn't add any GPUs since I didn't have any need for them, but I'm considering adding a front-end device which hosts GPUs as well.

I am using this configuration as a part-time data center for hosting labs for courses, but normally, I just use it for programming and compiling code and experimenting. I suspect I'll be up to around 2TB RAM and 200+ Xeon cores before 2017. My goal is to do it in a single rack with absolute resiliency. I have 4U sucked up with 52 3.5" hard drives though. They're big and ugly, but 400TB of SSD is still too expensive.

Oh that's certainly power user, but a different type of power user. Depending on the IOPS capabilities of the RAID cards you are using, this triple M.2 setup might be able to beat 25 SATA SSDs in some performance metrics.

This board is a high-end model. It looks like the main feature is a PLX chip which converts the 16 PCIe lanes from the CPU out to 32 PCIe lanes. This allows all 4 x16 slots to operate at x8 with 4 video cards installed. You still only get x16 bandwidth to the CPU though.

Well, PLX isn't that great; it's a stopgap until the Skylake-E parts are out, and I wouldn't suggest it. Without it you would have 3 of these at 3.0 x4, so you would have x4 available for your video card. With the fast switching from the PLX you could do x16 and have x4 left over, but in every instance I have dealt with PLX (mostly Z87 and Z97 boards), it is not really even close to true 3.0 x16 performance. It only shows any returns when you have more than 4 cards installed. Personal opinion from experience: stay away from it. If you need the lanes, go Haswell-E.

This of course was for shiz and giggles, to see what they could push it to, but if this were a "real" build I would only use 2 950 Pros and the GPU at x8 with a non-PLX board (if any exist with 2 M.2 slots).

The difference between x8 and x16 is CURRENTLY negligible for gaming; that might change with DX12.

Reads will see the same type of boost as with 2x 512's. Writes will see the same effect / proportion scaling up from one to two 256's, but since the 256GB model has lower write performance to start with, two 256s will not beat two 512s.

Even with the slower write speed of the 256GB model, a pair of 256s will still beat a single 512 in all but low QD (1-2) latency. Everything else will be better - higher sequential writes (~1.5x) and reads (2x), higher random performance at moderate QD, etc.

I was thinking a single 512 would be able to distribute the load similarly to how 2x256 would in RAID 0. I think that is true when only considering the memory chips. So where does the extra performance of the 2x256 come from?

The SSD controller has a fixed number of channels; the 512GB model just has twice the amount of flash attached to each channel. I believe Intel SSD controllers use 18 channels; I am not sure how many the Samsung controller uses. They wouldn't want to set up the controller to use half the number of channels on the 256GB model, since that would be effectively half the performance. You are not distributing across individual flash dies, you are distributing across the channels of the controller. Twice the number of flash dies doesn't mean twice the performance, but doubling the number of channels can double the bandwidth, if there is no bottleneck elsewhere.
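
A rough sketch of that channel argument (channel count and per-channel throughput here are made-up illustrative numbers, not Samsung controller specifics):

```python
# Once every channel has a die to keep busy, extra dies per channel add
# capacity but little bandwidth; more channels is what adds bandwidth.
def ssd_bandwidth_mb_s(total_dies, channels, per_channel_mb_s=400):
    active_channels = min(total_dies, channels)
    return active_channels * per_channel_mb_s

CHANNELS = 8  # assumed
print(ssd_bandwidth_mb_s(total_dies=8, channels=CHANNELS))      # "256GB-class": 3200
print(ssd_bandwidth_mb_s(total_dies=16, channels=CHANNELS))     # "512GB-class": 3200
print(ssd_bandwidth_mb_s(total_dies=16, channels=CHANNELS * 2)) # 2x channels: 6400
```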

RAID-1 is certainly doable for a pair of drives, and RAID-5 would be the more efficient choice for three (we talk about that part on page 3).

The effect of the reduced latency is faster response when there is a lot going on in the system (heavy loading, multiple apps hitting the array simultaneously). There is no existing consistent test that actually launches applications simultaneously, so the closest we can come is the testing we are conducting here. Readers will have to decide, based on their particular demand on their storage and how high they fill the queue (this can be monitored in Windows), whether the reduction in latency is of benefit to them.

I do have an extension of this testing that will also evaluate as the SSD is filled and TRIMmed, but for now the setup was random access to an 8GB span of a full SSD / array.

For someone building a computer that'll mainly be used for gaming plus the usual everyday use scenarios, would the addition of a 950 Pro provide a noticeably faster experience compared to a SATA SSD such as the 850 EVO or Pro?

Other than the increased performance in synthetic benchmarks, you won't see any discernible, tangible differences. I bought the 512GB 950 Pro NVMe drive to replace my 500GB 850 EVO as my boot drive, and Windows 10 and all my applications load just as fast. Games such as BF4, SWBF, WoWS with tons of mods, and anything in my Steam library load about 0.2~0.8 seconds faster with the 950 Pro.

Is it worth the extra cost? YMMV, but for me it was not.

I eventually put that 950 Pro to the test against a 480GB Seagate 600 in my web and database server and found it to be worth it there, with much lower latency on DB queries and the ability to handle more concurrent connections.

I think what Allyn and Ryan need to say outright, rather than assuming readers will figure it out, is that we are up against the law of diminishing returns. Meaning, as SSDs get faster and faster with different NAND types, controllers, and protocols like NVMe, we as consumers will see less and less benefit. So what if you can shave a few fractions of a second off loading your OS or an application? I am waiting to see what XPoint has to offer, since it is orders of magnitude faster than current SSD technology. Perhaps it will usher in a new performance benchmark. Or be a victim of diminishing returns...

If you compare the read/write figures for a SATA-connected SSD against an M.2 SSD (Samsung Pro/Evo in particular), either using an M.2 slot or a PCIe slot with an adapter, you will find the PCIe NVMe SSDs are more than three times faster overall.

I have my OS on one Samsung EVO M.2 and another two EVOs as well - two M.2s are on my motherboard and another is on a PCIe adapter card. ALL of the M.2s, regardless of connectivity, perform at almost exactly the same level. I also have one SSD connected via a SATA port for further storage.

I play games on my computer and ALL of them are stored on my M.2s. If I had known how much faster PCIe M.2s were over SATA SSDs, I would never have bought a SATA SSD in the first place.

If you want to speed up your current computer without spending a lot of money on a new CPU, MOBO, and RAM, then invest in a PCIe M.2. Connect it either directly to the M.2 slot on the MOBO, or if your current MOBO does not have an M.2 slot, buy an adapter for a few pounds and connect it via a spare PCIe slot. Load your OS onto the M.2 and it will feel like you have supercharged your PC; boot from off to Windows will take about five seconds. Use the spare capacity on the M.2 for games and apps (Office, for example). EVERYTHING you do on your computer will be SO much faster.

I would never go back to physical hard drives or even SATA SSDs. Just make sure that any M.2 you buy is PCIe, because SATA M.2s are no faster than SATA SSDs, although they are still 3-4 times faster than physical HDs.

A very easy way to differentiate between PCIe and SATA M.2s is that PCIe M.2s have only one notch at the connector end of the stick, while SATA M.2s have two. Maybe a long-winded answer, but M.2s (PCIe) are much, much faster than SATA-connected SSDs.

Joking aside, it seems kinda weird to be able to save physical space on a board this big, which would most likely be put in a roomy case. Maybe it is because such an expensive MB is meant for high-end enthusiasts looking for options on builds and/or mods?
Otherwise, very interesting review - great write-up!

Perhaps this is a silly question, but are the log scales for the graphs on page 4 labeled accurately? The scales jump from nano-scale (1e-9) to milli-scale (1e-3), but shouldn't the micro-scale (1e-6) be included in-between? If true, this would make the latency time reduction of running 3 drives in RAID only 1 order of magnitude instead of the 2-3 orders shown above.

A common misconception. RAID-1 typically does not read data from both drives and then compare to see if they are the same; it divides reads across both drives and uses the sector CRCs to ensure data integrity, and only then will it switch to reading the other drive for the bad sector(s).

RAID-1 typically reads back data in 'performance' mode, meaning it stripes reads across the drives as if they were in RAID-0. No error checking happens here, but you can tell RST to 'Verify' the array, which will scrub both drives front to back and compare data.
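
For illustration, here is a tiny sketch of that 'performance mode' read splitting. A simple round-robin policy is assumed; RST's actual selection logic isn't documented at this level:

```python
# RAID-1 holds identical data on both drives, so independent read requests
# can be serviced by different mirrors concurrently (RAID-0-like read scaling).
def assign_raid1_reads(request_lbas, num_mirrors=2):
    """Map each read request to a mirror, round-robin (assumed policy)."""
    return [(lba, i % num_mirrors) for i, lba in enumerate(request_lbas)]

reads = [1040, 2096, 880, 7712, 3008]
print(assign_raid1_reads(reads))  # alternates requests across both mirrors
```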

Those are some pretty impressive IOPS numbers. This seems an ideal setup for a high-core-count, write-intensive OLTP database system.

I'm wondering about the DMI bottleneck though. I understand why putting the SSDs behind the chipset allows for UEFI-level RAID configuration. However, say you don't want to use Intel RST, but instead rely on Linux's MD-RAID or Solaris' ZFS; then it would be better to have the M.2s wired directly to the CPU, no? Then again, the question becomes *where* you're going to get data to and from at a sufficient pace to keep that SSD array busy on a consumer-level system like Z170.

Interesting article. Thank you very much for taking the time to document and share your findings.

Remember we were only writing randomly to an 8GB span of sequentially filled SSDs here. OLTP would randomly write to a much larger span of the SSD (if not all of it), so to get good sustained random write performance you will need enterprise SSDs which can better handle sustained workloads to 100% of the volume.

This totally does *not* apply when a queue is involved. For example, the OCZ R4 hit very high IOPS, but used SandForce SSD controllers in a RAID to get there, so individual IO latency was far higher than what we are seeing here.

Allyn: You put in a ton of work on this. Thank you for sharing! My wife has been griping that her computer is slow, and I always get the "good" stuff for myself, which is absolutely true, lol. So I was looking at all the latest technology and was really wondering about the RAID 5 aspect with the 3 M.2 connectors. You answered the questions I had. I have been using RAID 5 exclusively for many years, and my wife's old computer (~8 yrs) has a 1.5TB RAID 5 C: drive, which always has to rebuild if the system locks up, which can take a day or more. RAID 5 still works while rebuilding, of course, but slows down considerably. So I am a little paranoid about using RAID 5 for the C: drive. I use a single SSD C: drive and a 3TB RAID 5 D: on my own computer. Your comments about loading the system using a GPT external USB drive are crucial. I am obviously rusty on the latest BIOS settings terminology, but I have built my own computers for the last 20 years, one every 5 years with the latest stuff, so there is always a learning curve since I do it so seldom and technology changes.
Your article helps a lot. Thank you!

Great write-up! It was hard to get through some of the technical details, but Allyn promised the next page was going to be amazing. I was expecting a free computer offer or something. For real though, amazing details. Really excited about my next build!

Wow, amazing storage review!
I'm trying to decide whether to go 850 or 950 mSATA, single 250GB. Three-way RAID 950s is a different universe of performance. Love the new latency visuals. Ryan, you should let Allyn keep this setup (make that a Patreon threshold).

Perhaps slightly out of context for this article, but can anyone comment on how this config would affect an SLI installation? I believe that 3 M.2s and multiple graphics cards will take up more PCIe lanes than are available in the Skylake architecture.

So basically your SLI cards would be forced to run slower? Which would have priority for the PCIe lanes, or is it all multiplexed somehow?

This uses PCIe from the chipset, so you will lose all of the SATA ports off the chipset to do this; it takes HSIO lanes 15 to 26 from the chipset. The lower lanes are still available for USB, network, other controllers, and probably the last PCIe slot. The graphics cards would be running off the CPU PCIe lanes connected to the x16 slots. I don't know if this board supports 3-way CrossFire by using an x4 from the chipset; that would run into bandwidth limitations due to the link between the CPU and the chipset. It isn't really relevant anyway - what would you be running at home to stress this storage set-up and your graphics system at the same time?

Actually, it is only 20 PCIe lanes from the chipset. HSIO lanes 1 to 6 are USB3-only, while some of the 20 PCIe lanes can be switched to SATA. Using three x4 M.2 drives takes 12 lanes, leaving 8 lanes for other controllers or slots.
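
The lane budget works out as follows (a simple sketch based on the counts given above):

```python
# Z170 HSIO lane budget for triple M.2, per the counts in this thread.
CHIPSET_PCIE_LANES = 20          # HSIO lanes 1-6 are USB3-only
m2_drives, lanes_per_m2 = 3, 4

used = m2_drives * lanes_per_m2
print(f"{used} lanes for M.2 RAID, {CHIPSET_PCIE_LANES - used} lanes left over")
```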

"The end result result of this is a RAID of SSDs gives you a much greater chance of IOs being serviced as rapidly as possible, which accounts for that 'snappier' feeling experienced by veterans of SSD RAID."

While interesting as an exercise (I know, only 3 ports), I find it nothing short of sacrilege to suggest a RAID-5 setup on flash-based drives. If it were a drive-pool kind of setup, fine, but it's not. NVMe does great in RAID-1 or 10. There is no point wearing out NAND with unnecessary parity writes (just like classic SSDs). Basically, if you value your SSDs, all parity-based RAID levels are out the window. Even in enterprise environments SSD parity arrays are rarely encountered - and that's with SSDs which cost 10-30x more than consumer-grade drives. Simply put, 1 or 10 is much more convenient and easier and faster to recover. Time is money.

Modern SSDs will run for years just fine under heavy RAID-5 use. Plus, with RAID the real issue with a failure is its effect on cost over time (because there is no data loss with a single drive failure). And RAID-5 is extremely cost effective, because you can get to, say, 1TB of data with only three 500GB drives and still have parity. That compares favorably on cost with using two 1TB drives in RAID-1. The slight reduction in life from the parity writing is nowhere close to enough to offset that cost savings. Therefore, RAID-5 should be a recommended configuration where expensive SSD cost is a chief concern.
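
The cost comparison is easy to spell out. The prices below are placeholder assumptions; substitute current street prices for a real comparison:

```python
# Cost per usable TB: RAID-5 (3x 500GB) vs RAID-1 (2x 1TB). Assumed prices.
def usable_tb_raid5(drive_tb, n):   # one drive's worth of capacity to parity
    return drive_tb * (n - 1)

def usable_tb_raid1(drive_tb, n):   # half the capacity to mirroring
    return drive_tb * n / 2

price_500gb, price_1tb = 150.0, 300.0   # hypothetical prices

raid5 = 3 * price_500gb / usable_tb_raid5(0.5, 3)
raid1 = 2 * price_1tb / usable_tb_raid1(1.0, 2)
print(f"RAID-5 (3x 500GB): ${raid5:.0f} per usable TB")   # $450/TB
print(f"RAID-1 (2x 1TB):   ${raid1:.0f} per usable TB")   # $600/TB
```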

The biggest downside is that this board only supports 3 M.2 drives - the minimum (and least cost-effective per byte) number of drives needed for RAID-5. This also means no expansion capability, which is possible with SATA SSDs in RAID-5.

Boot of a 'clean' fresh install is essentially the same (or in some cases it takes a second or two longer, due to differences in how some BIOSes initialize NVMe devices during boot). Where the speed difference would be more visible is on a 'well used' OS that has accumulated a lot of other apps / startup processes / cruft over time. The additional SSDs would keep latency lower under the increased load seen during that boot. Still, we are talking a few seconds, and that only happens while booting, which is a rare event (and why we don't focus on that aspect).

Why can't M.2 slots be at right angles to the motherboard? This would save space and allow better airflow. I could possibly see myself getting two 120GB M.2s in RAID 0 rather than a single 240GB. It would be interesting to see results for Windows software RAID also, which has been flawless in my system (Win7).

The ASRock Z170 OC Formula and Extreme7 both also have triple M.2 and are around half the price. They've been out for some time now. You should check them out to see whether their RAID implementation is the same or worse.

The RAID implementation is in the Z170 chipset, so it should be exactly the same. It does have some hardware acceleration, but it doesn't seem to have hardware parity calculations. It would be cool if they could make a PCIe x16 RAID card that could handle 4 of these SSDs. No home user needs such a thing though.

This specific board is probably really expensive since it has a PLX chip to convert the x16 PCIe connection from the CPU out to x32. This allows for 4-way SLI with x8 PCIe to all 4 slots.

Each individual thread (program) that hits the storage adds *at least* one to the QD figure. Apps can individually ask for multiple sectors at the same time, or can 'ask ahead', which builds the queue. A simple Windows file copy can run at QD=4 with nothing else going on. QD can spike past 64 on a powerful multi-core system during boot, where dozens of apps and services are simultaneously launching. Note: SATA devices can't exceed QD=32, so if the OS climbs higher, the queue backs up into the OS itself, and no additional benefit will come from a SATA SSD (since it can't see further ahead than the next 32 requests).
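
A small sketch of that SATA queue cap (the NCQ limit of 32 is real; the simple queue split shown is just the model described above):

```python
# SATA NCQ allows at most 32 outstanding commands; anything beyond that
# waits in the OS queue where the drive can't see or reorder around it.
SATA_NCQ_LIMIT = 32

def visible_to_drive(os_qd, device_limit=SATA_NCQ_LIMIT):
    on_device = min(os_qd, device_limit)
    backed_up = max(0, os_qd - device_limit)
    return on_device, backed_up

for qd in (4, 32, 64):
    dev, os_q = visible_to_drive(qd)
    print(f"OS QD={qd:2d}: drive sees {dev}, {os_q} requests stuck in the OS")
```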

Great job as usual, Allyn - excellent info and methodology. I have one of the 950 Pros and had to use an Asus Hyper x4 riser card on my X99 because the native M.2 slots are worthless 10Gb/s ports, and that's when it dawned on me that I wouldn't be able to do RAID with M.2 because the rest of my PCIe slots are occupied.

What we REALLY need are PCIe x16 riser cards that can support up to FOUR M.2 2280 cards for either RAID or JBOD. The most important thing is that it consolidates space and slots, which is a problem now due to the way M.2 is routed with HSIO, since the drives need to work individually or in tandem.

Do you have any info on this from industry OEMs? An x16 riser card that could take 4x M.2 would be awesome!

There certainly is a lot of engineering elegance to be had with four M.2 @ x4 PCIe 3.0 lanes = x16 PCIe 3.0 lanes. However, a PLX-type chip is required, because PCI-Express does not generally allow multiple discrete devices in a single PCIe expansion slot.

HP and Dell have already developed same, but the HP version requires an HP workstation. For photos and discussion, Google "Cheap NVMe performance from HP".

We published a WANT AD for same several months ago, and one storage expert confirmed that h/w RAID controllers are "works in progress", but he was limited by an NDA and couldn't say much more. To locate our WANT AD:

As such, the upstream bandwidth of a single NVMe M.2 connector is exactly the same as the upstream bandwidth of the DMI 3.0 link. It should be very interesting when Optane (Intel 3D XPoint) non-volatile memory becomes available in the M.2 form factor: that development should create lots of pressure to increase the upstream bandwidth to satisfy that extra demand.

At the moment, barring any major changes in Intel's latest chipsets, RST and RSTe will only work DOWNSTREAM of the DMI 3.0 link: RST does NOT work with the x16 lanes controlled directly by any Intel CPU, as far as I know.

Allyn, if you're reading this, could you possibly confirm or update any of the above, please? I would like to refine my understanding of these issues, so as not to mislead anyone else.

If Intel's RSTe only works downstream of the DMI 3.0 link, it seems that a 12 Gb/s SAS controller with 8 x 12G ports should exceed the DMI ceiling, e.g. by configuring a RAID-0 with 12G SAS SSDs such as Toshiba PX04SL SSDs:

Yes, a PCIe x8 RAID card can exceed DMI 3.0 (4-lane) bandwidth, but you are adding a bunch of latency and a lower cap on the ultimate IOPS the RAID controller can handle. Intel RST (SATA) actually beats most add-in RAID cards as far as IOPS scalability goes. It would also not communicate with the host via NVMe, so there would be the same sort of IO overhead seen with SATA. It's basically the long / expensive way to reach those high figures.
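
For reference, the raw bandwidth ceilings being compared (theoretical link maximums, not benchmark results):

```python
# PCIe 3.0 moves ~985 MB/s per lane after 128b/130b encoding; DMI 3.0 is
# electrically equivalent to a PCIe 3.0 x4 link.
PCIE3_MB_S_PER_LANE = 985

dmi3_ceiling = 4 * PCIE3_MB_S_PER_LANE      # chipset uplink: ~3.9 GB/s
x8_raid_card = 8 * PCIE3_MB_S_PER_LANE      # add-in card slot: ~7.9 GB/s
print(f"DMI 3.0 ceiling:  {dmi3_ceiling} MB/s")
print(f"PCIe 3.0 x8 card: {x8_raid_card} MB/s")
```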

So, to answer a quick question: with three M.2 SSDs installed, will my graphics card run at x8?
Also, to clarify, will I have two SATA ports available for any optical drive, HDD, or SSD I add further on?

I plan to build a high-end gaming computer for iRacing.com use. It seems to me that 3 M.2s in RAID 0 are not going to increase my FPS, though it may help with load times from the SSDs to memory for tracks and cars.

Also, isn't RAID 0 a bit unstable? If there is any sort of memory error on either SSD, won't that lock up the OS?

Even two SSDs in RAID 0 don't appear to add a significant advantage for my system.

Could this be used for a VOD server? I was thinking of 3 PCIe SSD cards with 1.5TB of total space (RAID 5), so it would need about 2TB of PCIe SSD storage... And it would also need 2 gigabit Ethernet cards...

I have a question for you guys. I bought three 950 Pro M.2 SSDs. I want the best performance in my PC, so which motherboard and processor should I buy? What is the best option for me? I plan to install one or two graphics cards in SLI.
Your article is a few months old, so maybe there are better products available.

I set about building a PC for video production, and the top priority was drive speed in order to capture up to raw 4K. In my search I found this great review, and it convinced me to go with this motherboard and two 950 SSDs. I put all my faith in RAID 0 because I back up regularly and archive to RAID 1. My downfall was that I also used the RAID 0 as a boot drive. I am now re-installing Windows for the 3rd time, but I learned my lesson this time and am using another SATA drive for my boot device.

The problem I discovered is that the BIOS will decide to reset CSM to enabled, which in turn disables Intel RST and breaks the RAID.

The first time was my doing, when I installed a video card that did not support UEFI. The second time was after not using the PC for a week. After booting up, my RAID was marked as failed. Checking the BIOS, I found it had again reset CSM to enabled, for what reason I do not know. The only option was to delete the RAID and create a new volume. This does not recover the drive contents, but I found a utility that let me recover the partitions and get my data; it still was not bootable until a re-install, though.

Does anyone know of a way to lock down that CSM so it will not change on its own?

Do you also do video editing? Timeline scrubbing and rendering can really use lots of I/O in both cache and source disks...

What do you think of the idea of configuring your system so that you're booting from a regular SATA SSD, pointing your NLE software cache to a RAID 0 of two M.2 950s, employing the third M.2 950 to hold all of the project source files (raw video, audio, and media), and finally having two HDDs in RAID 0 to catch the transcoded video files?

Or... this is more straightforward: one M.2 for cache, one for source, and one for the target.

I do video editing also. After my multiple RAID failures, I reconfigured my system to use one SATA drive as the boot drive, two M.2s in RAID 0 for capturing and temporary working space, and another SATA drive to move completed work to. I lost trust in the M.2 drives as RAID 0 (more so in the BIOS), so nothing important stays on those drives before shutting down.

I did discover something interesting while working with different configurations. When I first installed the two M.2 drives, I put them in two adjacent slots. This defeated a majority of my SATA ports. I did some research about the shared hardware on the motherboard, then moved the drives to the outside M.2 slots. This gave me use of faster SATA ports.

I'm trying to find the best-value high-power setup for 3D content creation. I'm thinking the i7-5820K is the best value, and I want application startup/speed as fast as possible, so it sounds like a couple of 512GB 950 Pros in RAID 0 would be the best option (also considering a RAM disk). I'll be using a GTX 1080 as soon as the price settles down :).

Is this motherboard the best option for 2x 950 Pros in RAID 0? Am I gaining significant performance in application startup/speed with such a setup, or is a single 950 Pro adequate?

The ASRock Z170 OC Formula has 3 M.2 slots and has everything except 4-way SLI, and it will even have 4 SATA3 ports left over after all the M.2s are populated, since it has 2 ASMedia controllers (2 more than this board). If you want WiFi, it also has a slot for laptop WiFi cards (neither board comes with WiFi by default). The biggest factor is that it costs $200~250, half of what this costs, and it even supports TridentZ 4300MHz RAM, which this doesn't, so I don't see the benefit of this board over ASRock's.

Thanks!! I'll check out the ASRock. Sounds like exactly what I'm after.

Do you know if setting up two 512GB Samsung 950 Pros in RAID 0 is pretty straightforward with the ASRock? I'm not very knowledgeable about PCIe lanes and all that, so I'm hoping I can just plug in the drives, configure the BIOS, and I'm done!

I used your guide to do a triple M.2 RAID 0 stripe on my Gigabyte Z170X SOC Force motherboard, as the boot drive. It worked great - super fast - until I updated the BIOS; now it tells me "reboot and select proper boot device or insert boot media in selected boot device and press a key". I committed the ultimate sin and do not have a backup of my drive. I don't want to lose my info. Is there any way to fix this by re-doing the RAID 0?

Need your help.
Thank you.
PS: you might want to do a video on this, as I'm sure others with a RAID 0 setup have made the same mistake.

MSI claims double performance, up to 64Gbps, with M.2. I tried RAID 0 with Intel RST and I get the same results as you, maxing out at about 25.6Gbps. Any idea what MSI is talking about, or how to go beyond 32Gbps?

Maybe I should try running them both as single drives simultaneously, or do software RAID.

The configuration in this article is not the fastest RAID setup possible; the CPU has too few PCIe lanes. You need a 40-lane CPU (the enthusiast line).
While I cannot boot from it, I have three Samsung 951 AHCI drives in Windows software RAID 0 and I get 6.6GB/s sequential read speed. My system boots from a fourth M.2 drive in the motherboard's M.2 slot. All my software and games are installed on the RAID volume, and all have very short load times.

Why use a $400 motherboard to test this when HP and Dell deliver a method for splitting the x16 slot into four of the x4 links that these 950 Pros actually use: the Dell 4x M.2 and the HP Z Turbo Quad Pro boards (it will not allow me to place links).

And guess what: they also solve the thermal throttling issue at the same time.

Has anyone attempted running two RAID 0 arrays (1x SATA & 1x M.2 NVMe) using the Z170 chipset? My current setup has a 960 EVO 256GB for the OS and a WD Black 1TB HDD for storage. I'd love to double my storage, but if I could do so while adding increased performance, that would just be icing on the cake. Unfortunately, this is my first PC with an SSD and I'm a little green when it comes to RAID arrays, but if anyone has any input it would be greatly appreciated. I'm planning on starting this sometime in May, so hopefully I will be able to figure it out by then. :-)