Posted
by
kdawson
on Tuesday January 27, 2009 @04:28PM
from the that's-no-hard-drive dept.

theraindog writes "Intel's X25-E Extreme SSD is easily the fastest flash drive on the market, and contrary to what one might expect, it actually delivers compelling value if you're looking at performance per dollar rather than gigabytes. That, combined with a rackmount-friendly 2.5" form factor and low power consumption, makes the drive particularly appealing for enterprise RAID. So just how fast are four of them in a striped array hanging off a hardware RAID controller? The Tech Report finds out, with mixed but at times staggeringly impressive results."

Of course, they ran all their tests in Windows. I wonder how much of the result in some of the tests (like program installation) is really due to how fast NTFS can handle lots of little files and not due to the drives they were testing.

It would have been nice to see some quick tests under Linux with ext3 / XFS / reiser / ext4 / btrfs / flavor_of_the_month just to see if that was really the drives or a vastly sub-optimal access pattern.

Windows filesystems don't even let you tune for an optimal access pattern. At least with ext2/3 you can optimise for RAID stripe and stride in a way that works regardless of the underlying RAID implementation, and significantly reduces the number of disks involved in reading/writing metadata.
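As a concrete sketch of that tuning (the geometry numbers here are hypothetical; the flags are the standard ext2/3 extended options):

```python
# Hypothetical geometry: 4-disk RAID0, 64 KiB chunk per disk, 4 KiB ext3 blocks.
chunk_kib = 64
block_kib = 4
data_disks = 4

stride = chunk_kib // block_kib        # filesystem blocks per RAID chunk
stripe_width = stride * data_disks     # filesystem blocks per full stripe

# ext2/3 can then lay out block groups and metadata to respect the stripe:
print(f"mkfs.ext3 -E stride={stride},stripe-width={stripe_width} /dev/md0")
```

With these numbers it comes out to stride=16, stripe-width=64; the same arithmetic applies whether the stripe comes from md, LVM, or a hardware card.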

What, you shove thousands of dollars worth of I/O into a system, and run it through the paces with a CPU that sucked in 2005? I'm not surprised at all that most tests showed very little improvement with the RAID.

What I want to know is if the RAID controller had a battery backup unit installed so write caching could be enabled. There is no BBU shown in the article's picture of the controller.

I recently built a new Exchange server with 6 X25-Ms (we couldn't get the 64GB X25-Es when we ordered it) hooked to a 3ware 9650 in three separate RAID1 arrays. Turning on write caching switches the whole feel of the system from disappointingly sluggish to there-is-no-way-these-tire-marks-were-made-by-a-'64-Buick-Skylark-convertible.

RAID5 has terrible random write performance, because every small write triggers extra reads and writes across the array to keep parity up to date. It's VERY easy to saturate traditional disks' random-write capabilities with RAID5/6. So it's rightly avoided like the plague for heavily hit databases.

I'm not certain how much of that performance hit is due to the latencies of spinning disks, so I feel it would be an interesting test to also see RAID5 database performance on these SSDs.

Also, RAID1 (or 10, to be fairer when comparing with RAID5) in a highly saturated environment...

RAID5's write performance is so awful because it requires so much reading to do a write.

You have to read before you can write, because the parity has to be recalculated: a small write either reads the old data and old parity (read-modify-write) or reads the rest of the stripe from the other drives. Note that it's not the calculation that's slow, it's getting the data for it. Either way, that's multiple operations to do a simple write.

A write on RAID1 requires writing to all the drives, but only writing. It's a single operation.
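Counting the I/Os makes the contrast concrete. A rough sketch ("rmw" is the classic read-modify-write path; "reconstruct" reads the rest of the stripe instead):

```python
def raid5_small_write_ios(n_disks, method="rmw"):
    """I/O operations for one small (sub-stripe) RAID5 write."""
    if method == "rmw":
        # Read old data + old parity, write new data + new parity.
        return 4
    # "reconstruct": read the other n-2 data chunks, write data + parity.
    return (n_disks - 2) + 2

def raid1_write_ios(n_mirrors):
    # One write per mirror, issued in parallel; no reads needed.
    return n_mirrors

print(raid5_small_write_ios(4))                  # 4
print(raid5_small_write_ios(8, "reconstruct"))   # 8
print(raid1_write_ios(2))                        # 2
```

So a single small RAID5 write costs at least four disk operations, while a RAID1 write costs one per mirror and they can proceed in parallel.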

RAID1 is definitely faster (or as fast) for seek-heavy, high-concurrency loads, because each drive can be pulling up a different piece of data simultaneously.

If you set up your RAID block size and your filesystem block size appropriately, you won't have to read-before-write, at least not very often. Setting up RAIDFrame on NetBSD with a 4-drive RAID-5, performance was dismal because every write was a partial write (3 data disks meant that it was impossible for the FS block size to match or be an even multiple of the RAID stripe size). Going to 3 drives or 5 drives, performance increased about 8-10x.
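The NetBSD anecdote comes down to divisibility: with 3 data disks the full stripe is 3 × chunk, which no power-of-two filesystem block size can line up with. A quick illustrative check (16 KiB chunks assumed):

```python
def full_stripe_aligns(data_disks, chunk_kib, fs_block_kib):
    # Writes avoid read-modify-write only when FS blocks and full stripes
    # tile each other evenly (one divides the other).
    stripe_kib = data_disks * chunk_kib
    return stripe_kib % fs_block_kib == 0 or fs_block_kib % stripe_kib == 0

# 4-drive RAID-5 -> 3 data disks: 48 KiB stripes, nothing power-of-two fits.
print(full_stripe_aligns(3, 16, 64))   # False
# 5-drive RAID-5 -> 4 data disks: 64 KiB blocks cover a full 64 KiB stripe.
print(full_stripe_aligns(4, 16, 64))   # True
```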

Because four drives in a RAID-10 are three times as reliable as the same four drives in a RAID-5. Arrays of large drives are more vulnerable to drive failures during reconstruction than arrays of small drives, and RAID-5 is much more vulnerable to a double drive failure than RAID-10 [miracleas.com]. In RAID-5, you lose data if any two drives fail. In RAID-10, you lose data only if the drives that fail are from the same mirrored pair, and there's only a 1 out of 3 chance that two randomly selected drives will be from the same pair.
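That 1-in-3 figure can be checked by enumerating the double-failure cases for a hypothetical 4-drive layout with mirrored pairs (0,1) and (2,3):

```python
from itertools import combinations

# RAID-5 loses data on ANY two failures; RAID-10 only when both
# failures land in the same mirrored pair.
pairs = [{0, 1}, {2, 3}]
double_failures = list(combinations(range(4), 2))
fatal = [c for c in double_failures if set(c) in pairs]

print(len(fatal), "/", len(double_failures))   # 2 / 6 -> a 1-in-3 chance
```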

Good controllers let you set these behaviors, as do good implementations of software RAID. For instance, on Solaris with SVM you can set a RAID 1 to read only from the primary, round-robin alternation, or (my favorite) read from whichever drive has a head in position closest to the requested block. For random-read-biased applications the final option wins hands down on latency, for sequential streaming reads the round-robin seems to be the best option, and for absolute hardware reliability the "read from primary" option is the safest.

RAID 0 is not redundant, these aren't really 'disks' any more, and they could just as well be independent disks rather than inexpensive ones. Sorry, I know you were trying to be funny, but I felt you could have more fully reduced the issue.

Dude, 4 of these drives can keep up with my 110-spindle FC SAN segment for IOPS. Here's a hint: 110 drives plus SAN controllers is about two orders of magnitude more expensive than 4 SSDs and a RAID card. If you need IOPS (say, for the log directory on a DB server), these drives are hard to beat. The applications may be niche, but they certainly DO exist.
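Rough arithmetic with assumed 2009-era numbers (the per-spindle IOPS, the X25-E's spec-sheet random-write figure, and all the dollar amounts are illustrative assumptions, not quotes):

```python
# All figures are illustrative assumptions, not vendor quotes.
spindle_iops = 180                 # ~15K RPM FC drive, random I/O
san_drives = 110
san_cost = 250_000                 # drives + controllers + shelves

ssd_iops = 3_300                   # X25-E sustained random-write spec figure
ssd_count = 4
ssd_cost = ssd_count * 700 + 800   # four drives plus a RAID card

san_total = san_drives * spindle_iops
ssd_total = ssd_count * ssd_iops
print(f"SAN:  {san_total} IOPS for ${san_cost}")
print(f"SSDs: {ssd_total} IOPS for ${ssd_cost}")
print(f"cost ratio ~{san_cost / ssd_cost:.0f}x")
```

Even with the SSDs held to their much lower write figure (random reads are an order of magnitude higher still), the cost ratio lands in the neighborhood of two orders of magnitude.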

That's right. Marketing switched "Inexpensive" for "Independent" years after the term was coined, because they couldn't convince people to buy their non-Inexpensive disks for RAID use as easily under the old meaning.

Yes, well, that, or maybe it's just that the notion of "expensive" disks is gone. These days, you pay a tiny amount per GB, which usually goes down with increasing size. Oh sure, you may pay a bit more at the top, but it's not much.

It used to be that you could get huge drives. I'm not just talking about the fact that they would store like 20+MB, but they were also physically huge. I used to have one that was 5 platters and two 3.5" slots high (though my memory is fuzzy). These suckers were EXPENSIVE.

Indeed, telling us to ignore the extra minute in the X25-E RAID0 boot times compared to the other setups is highly disingenuous. RAID setups are slower to boot because you have to load the RAID BIOS first; if you really care about fast booting it's something you need to be aware of. There were also CPU-bound cases where the RAID0 setup performed slightly worse than the single disk, an obvious sign of a performance hit from the RAID card.

Or LSI; we get half the IOPS from our LSI-based HP P400 compared to the ICH in our HP workstations when using the X25-Es. Reports on the web lead me to believe that there are NO hardware RAID cards that can keep up with these beasts, which is a shame because I can't see using them without battery-backed write cache. I'm going to look into the big-boy P800 and possibly the new P410, but right now I'm kind of underwhelmed by the fact that $5-600 RAID cards get beaten badly by the 'free' ICH controllers on the motherboard.

Doom levels???? Office tasks??? Okay folks, I can only see a few groups using this kind of setup. Not one database test? I mean a real database like Postgres, DB2, Oracle, or even MySQL. Doom 3... yeah, those are some benchmarks.

Intel's X25-E Extreme SSD is easily the fastest flash drive on the market, and contrary to what one might expect, it actually delivers compelling value if you're looking at performance per dollar rather than gigabytes

I hope someone got a healthy commission from Intel for writing that...

Let me get this straight. Is it possible to do any kind of article on a commercial product without it being "astroturfing" of some form or another? Or is it only the negative articles that can be done? I just want to know the SlashDweeb rules.

Is it possible to do any kind of article on a commercial product without it being "astroturfing" of some form or another?

Yes, it is. They didn't need to write it as

Intel's X25-E Extreme SSD is easily the fastest flash drive on the market, and contrary to what one might expect, it actually delivers compelling value if you're looking at performance per dollar rather than gigabytes

There was no need for that, either. I rather doubt that someone is forcing you to read anything on this website. You could read something completely different if you prefer, or not read anything technical at all.

I stand by my criticism of this article. The headline did not need to be such blatant advertising of the Intel drives.

Other than just using one of these Flash RAIDs as a swap volume, is there a way for a machine running Linux to use them as RAM? There are lots of embedded devices that don't have expandable RAM, or for which large RAM banks are very expensive, but which have SATA. Rotating disks were too slow to simulate RAM, individual Flash drives probably too slow, but a Flash RAID could be just fast enough to substitute for real RAM. So how to configure Linux to use it that way?

If you're really looking for high-performance storage, you should go with a DRAM-based solution. This has almost no latency and can scale to any interface. Depending on your budget, you can get a SAS 3Gb/s, 2-port unit with 32GB capacity for a bargain $24,000 (http://www.solidaccess.com/products.htm), and if you need more performance or storage space, spring for the serious iron: an FC 4Gb/s, 2-port unit at a mere $375,000.

No need to RAID this puppy. Make sure you spring for the redundant power supplies.

When will someone come up with a hardware or software RAID solution to enable several USB flash drives to appear as a single drive on Windows? With relatively reliable and fast (12MB/s write, 30MB/s read) 16GB flash drives as cheap as £16 each [play.com], I'd love to cram as many as I could inside my Eee and have them appear as a single drive instead of many individual drives.

As SSD drives come into the used market, how will people know how close these drives are to "used up"? That is to say, we will have to worry that cheap drives on eBay will have lots of "bad" spots that can no longer be written. We are going to need a program or device of some kind that can certify the state of a drive so as to set a fair value on it. I expect a lot of unhappy people when used drives get installed and start failing soon after. There will have to be some pretty sleazy warranties to cover used SSDs.

I think you're comparing against SATA drives. People that worry about IOPS are normally using FC drives which are much more closely aligned in price with SSDs. (btw, been a while since I was in the market for FC drives)

SSD drives are not as unreliable as people seem to think. It would take a *minimum* of 5 years of *continuous* writing to wear one out, and that's for the cheaper SSDs like you would find in a netbook rather than the $300 Intel top-of-the-line ones; not to mention that ones being used in a RAID would have a longer life.

Besides even if some blocks go bad you can map around them, the SSD itself might even do it.

Besides, you are unlikely to be using the same drive in 5 years' time, and magnetic drives have a much higher chance of failing in that period anyway.
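That lifetime claim can be sanity-checked with simple endurance arithmetic. All three inputs below are assumptions for illustration (and the model assumes ideal wear leveling with no write amplification), but even flat-out sequential writing lands in the multi-year range; real workloads write far less than flat-out, stretching that much further:

```python
capacity_gb = 64        # X25-E-class SLC drive (assumption)
pe_cycles = 100_000     # SLC program/erase cycles per cell (typical claim)
write_mb_s = 80         # flat-out sustained write rate (assumption)

total_mb = capacity_gb * 1_000 * pe_cycles   # writable MB with ideal wear leveling
seconds = total_mb / write_mb_s
years = seconds / (365 * 24 * 3600)
print(f"~{years:.1f} years of non-stop writing to wear it out")
```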

Where SSDs shine is in disk operations that are usually lagged out by seek times; a big unwieldy database that gets a lot of writes and no downtime, for instance, is happiest when it lives on a striped SSD array.

Coincidentally, this is exactly the type of workload which is most likely to shorten a magnetic drive's life.

I'll be sure to do that, and replace them every 5 years when they run out of write operations.

Winchester drives, on the other hand, use a time-honored complex system of delicate moving parts, and last virtually forever. They certainly do not start experiencing sudden failures if kept in continuous service for more than 5 years.

All modern hard drives are Winchester drives; Winchester drives are just the first iteration, made by IBM, who figured they'd ship two 30MB modules and name the hard drive after the Winchester 30-30 rifle. Who the hell modded you insightful, especially for claiming a system of delicate moving parts lasts virtually forever...

MTBF is a highly inaccurate way to show how long you should expect a drive to live. The whole Seagate Fiasco is a prime example of why NOT to believe them.

It can be a good ballpark figure, to differentiate between enterprise class drives and consumer drives, but should NOT be an expected number.

There are too many things to take into account: temperature surrounding the drive, how many days it's on for, how long per day it's on for, how many writes to the drive, how much voltage is supplied, etc etc..

MTBF is a highly inaccurate way to show how long you should expect a drive to live. The whole Seagate Fiasco is a prime example of why NOT to believe them.

Misuse of a statistical figure is a problem with those misinterpreting it. Obviously things have changed since schools taught the difference between the mean, the mode, the median, and the minimum. If I run an ISP then MTBF is useful for me to calculate costs, both in replacements and labour costs. It's not supposed to be a measurement for consumers though that will be buying single unit quantities.

Buying a hard drive is like buying a washing machine. If I'm lucky it will go on practically forever. On the other hand, if I'm unlucky it could die tomorrow. As Piranhaa says, there are too many variables. All I can go on is that if it comes with a guarantee of 3 years, then I assume the manufacturers have designed it to mostly exceed that figure, otherwise they would end up losing money on the product. I still have to ensure I have a contingency plan in case it breaks down.

Hint: You should learn that people tend to compare things using some "measure". In the disk world, that's MTBF. 2 million hours MTBF is comparable to or better than other enterprise drives. Hence the original poster is, indeed, clueless.

Think of it like this: say you have a 1MB log file that changes every hour, and a 1MB system file that never changes. You keep writing that log file to the same place until you've done it 1,000 times. The drive then says, OK, if that file is really volatile, let's swap the log and system files, so the log file is in a place that's been overwritten twice, and the system file is in a place that's been overwritten ~1,000 times. You can then write to that log file 1,000 more times, and now the average usage of the disk is ~1,000 writes per block.
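A minimal sketch of that swap-the-hot-and-cold-blocks idea (purely illustrative; real flash translation layers are far more sophisticated):

```python
# Toy flash translation layer: two physical blocks, one hot logical file
# ("log") and one cold one ("sys"). When wear skews by >= SWAP_AT erases,
# remap so the cold data takes over the worn block.
SWAP_AT = 1000

wear = {"P0": 0, "P1": 0}               # erase count per physical block
mapping = {"log": "P0", "sys": "P1"}    # logical file -> physical block

def write(logical):
    wear[mapping[logical]] += 1
    hot = max(wear, key=wear.get)
    cold = min(wear, key=wear.get)
    if wear[hot] - wear[cold] >= SWAP_AT:
        for name, phys in mapping.items():
            mapping[name] = cold if phys == hot else hot

for _ in range(2000):
    write("log")                        # hammer the volatile file 2,000 times

print(wear)      # {'P0': 1000, 'P1': 1000} -- wear spread evenly
print(mapping)   # {'log': 'P1', 'sys': 'P0'} -- log moved after 1,000 writes
```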

See, in the enterprise environment that I work in the majority of our big hardware is leased. I am quite willing to use what I can to maintain performance and reliability. That being said my system is built entirely on 15K drives of various sizes. I am not worried about five years or so of read/write that SSD drives have, all I want to see is a track record. I expect to replace most of the drives I have now within five years so this "five year limit" many like to toss out is immaterial to me. Reliability over that lifetime is of more importance.

Besides, the nice benefit of SSD drives is I don't need special enclosures (read: ones that can handle the torque these puppies can put out)

[quote]Given the 100GB per day x 5 year lifespan of Intel's MLC SSDs, there's no cause for concern from a data reliability perspective for the desktop/notebook usage case. High load transactional database servers could easily outlast the lifespan of MLC flash and that's where SLC is really aimed at. These days the MLC vs. SLC debate is more about performance, but as you'll soon see - Intel has redefined what to expect from an MLC drive.[/quote]

I really don't get this obsession with page files these days. Say you have 4GB RAM and a 4GB page file. Memory is cheap these days, so rather than using 4GB of (relatively slow) SSD, why not just get another 4GB of RAM?

SSD shouldn't be for paging. That would become very expensive (even with wear leveling) if you have a minimal amount of RAM (say 256M) to run large (say 16G) operations. It would also be slow since you have the overhead of whatever bus system your hard drive/ssd is connected to.

Technically hard drives aren't supposed to be used for paging either; it's just a cheap and simple trick to avoid having people pay a lot for (expensive) RAM, or have their programs crash when they occasionally run out of RAM. However, if your system is paging heavily, it's better and faster with more RAM.

Anecdote: I worked at a place once where cheap ($500) hardware was sold as dedicated SQL/IIS servers (you could fit 10 of them in 5U), and a lot of customers thought they could run whatever they wanted on them (Microsoft ran MSN for a whole country on one for a while), but they only supported a maximum of 2GB RAM (4GB according to the BIOS, but the modules back then were too expensive). Of course the PHB just said: let them swap; and besides the heavy slowdowns, they ran fairly fine. Well, those heavy users all crashed their software RAIDs in less than a year (the heavy load made Windows get the RAID out of sync, and then you had the first hard drive fail). The temperature was fine, but simply swapping was too much for the cheap hard drives (Maxtor and Seagate) and they all failed.

SSD shouldn't be for paging. That would become very expensive (even with wear leveling) if you have a minimal amount of RAM (say 256M) to run large (say 16G) operations. It would also be slow since you have the overhead of whatever bus system your hard drive/ssd is connected to.

You talk like you know what you're talking about; but then the reader realizes you don't understand what happens when the CPU spends 99% of its life in wait state waiting for paging operations. Swap is not a high-intensity workload; swap workload increases six orders of magnitude faster than CPU workload, meaning when you start swapping, you spend lots of time swapping.

As the hard disk is external, this number increases with CPU speed; a swap operation taking 1,000,000 cycles on a 1GHz CPU (1 ms) will take 2,000,000 cycles on a 2GHz CPU, still 1 ms of wall-clock time.
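The arithmetic being gestured at: disk latency is fixed wall-clock time, so the cycles a faster CPU wastes per swap grow linearly with clock speed (1 ms per swap operation is an assumption for illustration):

```python
SWAP_LATENCY_US = 1000                      # ~1 ms per swap operation (assumption)

for ghz in (1, 2, 3):
    cycles_per_us = ghz * 1000              # cycles per microsecond at this clock
    wasted = cycles_per_us * SWAP_LATENCY_US  # cycles the CPU stalls, per swap
    print(f"{ghz} GHz CPU: {wasted:,} cycles per swap")
```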

I don't think anyone should be using a page file at all if you have 4 GB or more of RAM. Maybe even 2 GB. It just doesn't make sense. With that much memory what good is a 512 MB page file going to do really? And if you're swapping more than 512 MB of RAM to disk your machine is going to be thrashing like mad and unusable anyway.

It's stupid that many OS's allocate 2 times your RAM as a page file. Are you really going to swap 8 GB of RAM to disk? I mean seriously, that would be unusable.

I'm Betaing Windows 7. Before going to bed I set up a swap partition for it. After getting up the next morning and checking, it was full.

I have *no idea* what W7 put in there while I was sleeping.

In any modern operating system, including Windows, swap isn't just used for out-of-physical-memory conditions. It's also used to "page out" portions of the operating system and libraries, shared objects, DLLs, etc., that aren't being used at the moment. This actually speeds your system up by allowing more memory to be used as disk read/write cache.

I've looked at Linux boxes with 64GB of memory in them and only using 25% of that. I usually get asked by someone, "wasn't 64GB enough? Why is there some usage in swap right now?" It's normal, I explain. The kernel just pages out sections of Linux that aren't needed, to free up more RAM for filesystem caching.

I think perhaps Windows 7 just has a more aggressive way of doing this, probably because if you need to use some obscure Windows Directmedia SuperDRM doubleplusgood Plugin X, it's just as fast to reload it out of swap into memory as it is to load the binary from disk. But 99% of home users will never load that plugin so it can stay safely swapped out, giving you more precious memory for applications and disk cache.

I'm no expert, but wouldn't that be a redundant statistic? If it handles normal reads/writes faster than a disk drive, then couldn't you presume paging would be faster as well?

Although it would be interesting to see a RAM-less PC try to run on SSDs only... somehow using normal data read/write and memory read/write on the same SSD (if that's possible). Guess that's what we'll end up with eventually anyway, where your amount of MEM is the amount of free space you have on your SSD, no longer separate components.

Most people have this mistaken belief that SWAP is interacted with as often as RAM (hundreds or thousands of times a second, at least; RAM is interacted with sometimes hundreds of thousands of times per second). They think swap is an actual extension of RAM, not a long-term slow storage shelf.