If you use consumer-oriented hard drives in an enterprise environment that needs high uptime, you're going to have a bad time.

For hard drives, I'd go with a Western Digital RE-series drive, or an equivalent from another manufacturer that's rated for RAID use. For consumer SSDs, the Intel 320 series and Samsung 830 are among the most reliable; an SLC-based enterprise drive would be best for durability, but it comes with a much higher price tag.

You could try cheaper drives in a RAID 6 or RAID 10 plus a hot spare, but if uptime is key, it's best not to skimp.

Yesterday, HP gave me a quote on their 200 GB Enterprise Mainstream SSD (which is slower than their Enterprise Performance SSD). Price? $2,219.

My thought would be to buy four higher-quality consumer SSDs (Intel, Samsung, possibly Mushkin?), put three in RAID 5, and use the fourth as a hot spare. Then buy a consumer HDD (maybe WD Red) and schedule a nightly backup to it. Additionally, we do have offsite backup.

Would this be sufficient in terms of reliability? Am I going to run into garbage-collection issues or slowdowns over time? Am I going to burn through the SSDs' limited write cycles, since this is for a database?

That would probably do well enough for reliability. Performance-wise, it's tough to say. Write speed will already be hampered by RAID 5, so the lack of TRIM support shouldn't hurt much more. When I was running two Intel SSDs in RAID 0 on my desktop, I didn't really notice any speed problems from not having TRIM support, but if this is going to be a heavily written database, it might be a problem.

I'd read through some SSD reviews, here or elsewhere, that talk about slowdowns on heavily used drives, and decide how much of an issue that will be for your use case.

The main issue with using consumer mechanical disks in a RAID array is that the read retry timeout is long enough that the RAID controller may drop the disk from the array prematurely when soft read errors occur. This isn't a total show-stopper, but it is definitely non-optimal. You should use RE (RAID Edition) drives in any critical servers -- the retry behavior is tuned for use in a RAID array, the drives are tested more rigorously, and the warranty is longer.

In a pinch (if seriously cost constrained), you could use Caviar Blacks. I wouldn't go lower than that.

You may want to consider RAID-6, which provides an extra level of redundancy.

And DO NOT try to use RAID as a substitute for regular backups. You need both for a mission critical server.

The years just pass like trains. I wave, but they don't slow down.-- Steven Wilson

For only 14 users, I think your RAID5 + hotspare and nightly backup should cover it.

That's unlikely to be enough users to hit a garbage-collection wall, and with SSD array rebuild speeds, you could survive two of the four disks failing and still maintain availability. If you massively overprovision (say you use only 128GB of each 256GB disk), you'll lower the write amplification and also massively increase the amount of activity needed to put the drives into a tortured state. You'd still get 256GB from a RAID 5 in this configuration, which is more than the 200GB HP quote you mentioned.

The only problem would be if a second drive failed before the hot spare was up to parity with the first dead disk. On SSDs with a proper PERC controller, I wouldn't expect this "vulnerable" period to be longer than 10-15 minutes for 256GB drives.
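That 10-15 minute figure is easy to sanity-check: a rebuild has to write one full drive's worth of data to the hot spare, so the vulnerable window is roughly capacity divided by sustained rebuild throughput. A rough sketch (the function name and throughput figures are my own assumptions; real controllers throttle rebuilds under load):

```python
def rebuild_minutes(capacity_gb: float, throughput_mb_s: float) -> float:
    """Estimate the RAID rebuild window: time to write one full
    drive's worth of data to the hot spare at a sustained rate."""
    seconds = (capacity_gb * 1000) / throughput_mb_s  # GB -> MB
    return seconds / 60

# A 256GB drive rebuilt at ~300 MB/s from an SSD array gives a
# window of roughly 14 minutes, in line with the guess above.
# A 7200 RPM disk sustaining ~100 MB/s would take about 43 minutes.
```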

You could improve the usefulness of the local mechanical backup by running bit-level or block-level mirroring to it. You'd probably find you could afford to do it several times an hour, even with multi-gigabyte databases, thanks to the miracle of block-level efficiency. If you had to manually bail on the SSD array and fall back to the mechanical disk, you'd probably only have to roll back a few minutes. I'm no expert on free/open-source block-level tools, but rsync or DeltaCopy are probably worth looking at first.
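To see why block-level sync is so cheap, here's a minimal Python sketch of the core idea: compare two file images block by block and note only the blocks whose checksums differ. The names and block size are my own; real tools like rsync use rolling checksums and handle insertions, which this sketch does not.

```python
import hashlib

BLOCK = 4096  # bytes per block

def changed_blocks(old: bytes, new: bytes):
    """Return indices of blocks that differ between two images.
    Assumes in-place edits (typical for database files); a real
    tool also detects shifted data via rolling checksums."""
    changed = []
    n = max(len(old), len(new))
    for i in range(0, n, BLOCK):
        a = old[i:i + BLOCK]
        b = new[i:i + BLOCK]
        if hashlib.md5(a).digest() != hashlib.md5(b).digest():
            changed.append(i // BLOCK)
    return changed
```

A multi-gigabyte database where only a handful of pages changed since the last sync needs only those pages copied to the mechanical disk, which is why syncing several times an hour is affordable.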

Some people ask me why I have always enclosed my signature in spoiler tags; There is a good reason for that, but I can't elaborate without giving away the plot twist.

Someone in the office just suggested keeping our network storage on the other server instead, leaving the new server with the single task of handling the database (SQL and Dynamics CRM). This is counter-intuitive to me because the HDDs in the older server are small, and we can purchase any size we want new. However, if network storage becomes a separate issue, we're really only using 30 GB for our database currently. If we double that to allow for future growth, add 30 GB for the OS and SQL (a wild guess), and add 60 GB so we aren't fast approaching a full drive (to lengthen SSD life and keep the speed up), we'd still be at around 150 GB of total storage needed. Maybe RAID 1 across three high-end consumer 240-256 GB SSDs? Or RAID 1 across two, with the third as a hot spare?

I suggested running the HDD backup nightly because I assumed the drive couldn't keep up with the SSDs. But if there's a way to make it keep up, more frequent is better.

We're in the process of setting up a cheap offsite server that can back up everything every half hour, I think. I'm not quite as involved in that project. But absolutely, we're still including regular backups, not just redundancy.

I don't know that I would describe us as heavy writers in the database, since we do use it for a lot of reading too. Currently, it takes a good second for us to open a contact record in our database. And sometimes writes can take a long time. I just want it to feel as snappy as the Tech Report website. And since it is all internal and uses very few images, maybe that can be a reality, haha.

LukeCWM wrote:I don't know that I would describe us as heavy writers in the database, since we do use it for a lot of reading too. Currently, it takes a good second for us to open a contact record in our database. And sometimes writes can take a long time.

It really, really sounds like your database could use some indexing and/or query-optimization work, plus some bigger software caches. Even the fastest hardware can't help you if what should be a simple one-record fetch turns into a multi-table join/sort.
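For a concrete picture of what "indexing work" buys you, here's a toy sketch using Python's built-in SQLite (the thread's backend is MSSQL, but the principle is identical): without an index, a single-contact lookup scans the whole table; with one, it becomes a B-tree search. Table and index names are invented for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE contacts (id INTEGER, name TEXT, email TEXT)")
con.executemany("INSERT INTO contacts VALUES (?, ?, ?)",
                [(i, f"name{i}", f"n{i}@example.com") for i in range(10000)])

def plan(sql):
    # EXPLAIN QUERY PLAN reveals whether SQLite scans the whole
    # table or seeks through an index for a given query.
    return " ".join(row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM contacts WHERE email = 'n42@example.com'"
before = plan(query)  # a full SCAN of all 10,000 rows
con.execute("CREATE INDEX idx_email ON contacts(email)")
after = plan(query)   # a SEARCH using idx_email
```

The same diagnosis works in MSSQL by inspecting the execution plan for the slow contact-record query.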

There is a fixed amount of intelligence on the planet, and the population keeps growing :(

If you have a HW RAID controller, consumer drives are a no-no unless you're talking about SSDs or planning a ZFS installation. Since this is for DB work, 1TB or even 500GB drives (or smaller) would suffice easily; you want speed over capacity. I would also go with RAID 10 over RAID 5 or 6: there's no parity write penalty, and reads are just as fast on any advanced HW controller. There's also the possibility of surviving a two-disk loss if the stars align on a four-drive setup, with better odds on six.
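The write-penalty point can be made concrete with the usual rules of thumb: each small random write costs about 2 back-end IOs on RAID 10 (one per mirror), 4 on RAID 5 (read data, read parity, write data, write parity), and 6 on RAID 6. A quick sketch with illustrative numbers of my own:

```python
# Rule-of-thumb write penalties: back-end disk IOs generated per
# small random write issued by the host.
WRITE_PENALTY = {"RAID10": 2, "RAID5": 4, "RAID6": 6}

def effective_write_iops(n_disks, iops_per_disk, level):
    """Approximate random-write IOPS an array can deliver."""
    return n_disks * iops_per_disk / WRITE_PENALTY[level]

# Four 15k SAS drives at ~175 IOPS each:
#   RAID10 -> 350 write IOPS, RAID5 -> 175, RAID6 -> ~117.
```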

Unfortunately, after all this, I'm still not sure I've learned enough to confidently make a decision. Yes, SSDs are fantastic for random IO, which should make them excellent for databases. But most of the people raving about them are using Fusion-io PCI-Express SSDs. Their cheapest option for servers costs $6,000, and the only cheaper option is really meant for workstations, at $2,500. Still quite expensive for such low capacities.

I understand that SSDs ship with extra spare storage that gets rotated into use as older sections wear out. I hear enterprise SSDs typically reserve 20% spare storage for this, while consumer drives typically reserve 7%. Two different people speculated that you could just format a consumer SSD at only 80% capacity to achieve the same thing, but I can't find any evidence.

I can't get much info on whether TRIM will work in a server environment, or how bad things get if I have to rely on garbage collection alone. I don't know how idle the drive needs to be for garbage collection to take place, or how quickly it runs into the need for it. My office is only open 9-5, but the server will still be on all night.

I gather there are broad concerns about running SSDs in RAID, especially consumer SSDs, and including RAID 1. I can't find much hard data.

Perhaps an option is to spend a bit more for enterprise SSDs, but to just not purchase them from Dell/HP/IBM, since I think they have an absurd markup on storage and RAM and pretty much any "upgrades". But Newegg doesn't have many options for Enterprise SSDs, and Tiger Direct has none. Where does one go to acquire these without going through Dell?

Also, Samsung has some new enterprise SSDs coming out that sound cool: the SM843 and SM1625. Supposedly they are available now, but I can't find any place that sells them or even mentions a price, haha.

So far, they've written over 5.5 PB (yes, that's petabytes) to the same Samsung 830 over 230 days. It is showing increasing errors. Speculation is that it won't make it to 6 PB. Then again, could any of us really complain if a drive failed after writing just 1 PB? By today's standards, that's a lot of data!

LukeCWM wrote:I understand that SSDs are shipped with extra storage to transition into use as older sections slow down. I hear enterprise SSDs typically have 20% of spare storage for this function, while consumer drives typically have 7%.

Yep, this is called overprovisioning. The NAND chips in SSDs endure 3,000-5,000 write/erase cycles on each page (like an old HDD block) before they wear out. To minimise the number of cycles used up, SSDs use the spare area for garbage collection, for replacing retired pages, and for generally being smart about only erasing and rewriting a page when necessary. More spare area typically means more endurance.
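Those cycle counts translate into a concrete endurance budget: total NAND writes are roughly capacity times P/E cycles, divided by write amplification to get the host-visible budget, and dividing again by a daily write rate gives a lifetime estimate. A rough model with assumed numbers (real write amplification varies with workload and spare area):

```python
def endurance_years(capacity_gb, pe_cycles, write_amp, daily_write_gb):
    """Rough drive lifetime: total NAND write budget divided by
    host writes per day, inflated by write amplification."""
    total_nand_writes_gb = capacity_gb * pe_cycles
    host_budget_gb = total_nand_writes_gb / write_amp
    return host_budget_gb / daily_write_gb / 365

# A 256GB MLC drive at 3,000 cycles with write amplification of 2
# absorbs a busy 100GB/day database workload for over 10 years.
```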

LukeCWM wrote:Two different people speculated that you could just format a consumer SSD at only 80% capacity to achieve the same thing, but I can't find any evidence.

I could try and explain it, but I won't. Read this instead to get some idea of what overprovisioning does.

LukeCWM wrote:I can't get much info on whether TRIM will work in a server environment, or how bad it gets if I need to rely on garbage collection only.

I can't say for certain, but I ran drives in RAID before TRIM was supported and it was fine. Given that TRIM in RAID is still relatively recent, and applies to Intel controllers only, assume that it won't work in a proper server using a PERC or PCI-E SATA controller. As for how bad it gets between garbage collections, it depends largely on the SSD controller, how hard you stress it, and how much overprovisioning you've done; more spare area means it's less likely to get itself into a tortured state. Same article as above, I think. From memory, the SandForce SSDs are the most aggressive with garbage collection, so you'd probably want the Intel 330 series. Maybe try a trio of 180GB models and format them as 100GB each for RAID 5, giving you the 200GB you were originally looking at. If it doesn't work out, it will still cost much, much less than anything from Dell/HP/IBM.

LukeCWM wrote:I gather there are broad concerns about running SSDs in RAID, especially consumer SSDs, and including RAID 1. I can't find much hard data.

Neither can I. I've run SSDs in RAID 0, RAID 1, and RAID 5, and I can't say I've had any problems. I've never really hammered the SSDs in a database server, though, and only one of those arrays was even in an actual server. In RAID 1, the mirror disk gets the same data written as the master, so it shouldn't suffer any more than a single disk. RAID 5 parity updates could affect the write amplification, but I've not seen or heard of any major problems with SSDs in RAID 5.

The one thing to warn against with SSDs is that in RAID 1 or RAID 5 all disks get written to evenly, which means that if they use up all their write/erase cycles, they will probably all fail around the same time. If you are slack about replacing hot spares, the increased chance that two could fail before integrity is restored is more worrying. Both RAID levels survive only one disk failure before the array is lost (this is nothing new), but mechanical drives tend to fail more gradually and are much less likely to fail on the same day. One idea to avoid this "similar lifespan" failure mode is to partition the drives to slightly different sizes. You lose even more total array space, but each drive gets a different amount of spare area to use for replacing worn-out NAND pages. Theoretically that staggers drive lifespans a bit, but I've not seen hard evidence of it in practice.

LukeCWM wrote:Perhaps an option is to spend a bit more for enterprise SSD

Yes, you could, but given that consumer SSDs are so much cheaper, you might as well try them. Even if you get multiple disk failures after 12 months, it'll probably still save you money because of the rate at which enterprise SSD prices are falling. What costs $2,000 today might cost only $800 next year, so you could (in this example) splurge $1,200 on the consumer experiment and still break even.

I think you're overthinking this, though: you have 14 users, right? I really can't imagine 14 users destroying an SSD array that quickly. The people burning through X25-M drives on a weekly basis are running multiple virtual database hosts and writing petabytes to the array daily. Even if you do burn through disks, the performance gain may be worth the risk and drive costs. You'll have to evaluate the price/performance/risk trade-off against your own criteria.


We hired an outside IT expert for consultation today. Unlike the other hardware recommendations we've received, he recommended all SSDs (total of eight). I thought this would cost megabucks, but he said, "Nah, those 180 GB Intel SSDs are only $175 each." I asked if they were enterprise, and he said yes. I asked him what other brands he has installed, and he said he likes Intel the most, but also Mushkin and Crucial, and that he's had bad luck with OCZ. I don't think he realizes these are consumer drives, but he said he has installed them on numerous virtualized servers including databases. He's never had an issue with them. Well, I guess that works for me, haha.

I think based on the help I've gotten in this thread, the links I posted above, and the consultation this morning that we can move forward with SSDs.

Unfortunately, I don't think we can wait for the Intel DC S3700, although it looks like a perfect match at an acceptable price (considering it's enterprise). Nor do I think we can wait for the Samsung SM843 or SM1625. Does anyone have thoughts on the Samsung 830 vs. the 840 Pro? Or, if Intel is best, which model is best suited for this use? I get confused by all of Intel's models: they don't seem linear, since some updates also take steps backwards in other areas.

As another option, have you considered the Intel 320? These are a bit older and not the fastest drives on the market, but they are affordable, have only one known failure mechanism (the 8MB bug) which was fixed in a firmware update a long time ago, use the highly reliable Intel controller from the X25-M, and have power failure capacitors to protect the SRAM cache from corruption in a sudden shutdown. These were never available in 180GB capacities AFAIK, so your consultant must have been referring to the 330 or the 520 series, which are SandForce-based.

TR storage reviews are good, but the tests are aimed at consumers, so they cover the enterprise sector's questions a little less than the Anand storage reviews, which sometimes go into massive, coma-inducing depth.

When looking at storage upgrades it's worth sampling the existing server over a working day to see where the bottlenecks are and which transfer sizes, queue-depths and IOPS you want to improve most. I get Dell storage consultants to turn my perfmon logs into pretty graphs, but I'm sure there are free tools that will do the same.


Out of curiosity (and I apologize if I missed this in the thread), why aren't you considering spinning drives? An SSD RAID setup for 14 people seems like overkill to me. You mentioned uptime as a priority as well; I feel like you'd be better served by four or six enterprise SAS drives in a RAID 10 behind a good RAID controller. An IT consultant recommending consumer drives for this situation also worries me. It sounds great, but don't get blinded by the allure of SSDs (spinning drives have successfully hosted databases for many, many years).

My thought is, why wait if you don't have to? True, we don't have that many users, but if the technology is there, I don't like to choose the inferior option. This is, of course, assuming that SSDs are ready for prime-time in server environments. And maybe the answer is that some are, but consumer SSDs aren't.

I read Google's cached version of Anandtech's review of the new Intel DC S3700. In the conclusion, the author wrote:

I view the evolution of "affordable" SSDs as falling across three distinct eras. In the first era we saw most companies focusing on sequential IO performance. These drives gave us better-than-HDD read/write speeds but were often plagued by insane costs or horrible pausing/stuttering due to a lack of focus on random IO. In the second era, most controller vendors woke up to the fact that random IO mattered and built drives to deliver the highest possible IOPS. I believe Intel's SSD DC S3700 marks the beginning of the third era in SSD evolution, with a focus on consistent, predictable performance.

Maybe it will take the introduction of consistent, predictable performance to make SSDs ready for servers.

Absurdity, are you of the opinion that good spinning drives will be fast enough for a database with so few users?

LukeCWM wrote:Absurdity, are you of the opinion that good spinning drives will be fast enough for a database with so few users?

Yes. Also, like I said above, given the description of the symptoms, I think that the database structure itself needs to be addressed. Even if you go to SSDs and gain massive IOPS, they still won't help you forever if your DBMS is spinning data unnecessarily.


morphine wrote:Yes. Also, like I said above, given the description of the symptoms, I think that the database structure itself needs to be addressed. Even if you go to SSDs and gain massive IOPS, they still won't help you forever if your DBMS is spinning data unnecessarily.

I believe morphine makes a very good point: a poorly structured database, or one with no optimization (proper indexing, etc.), can't entirely be fixed with faster hardware. For only 14 users, spinning disk would definitely be fast enough *IF* the system is optimized. Also, I didn't see it mentioned, but database servers love RAM. At the very least, more RAM can alleviate slow performance through caching. More sophisticated setups can use RAM for temporary databases, and you could even use a combination of spinning disks and SSDs for tempdb. I'm not sure which database system you're using, but more RAM is usually a good idea. Again, though, only if your database system is optimized.

The database is Microsoft Dynamics CRM, and it sits on top of SQL Server. Currently, we are hosting Dynamics CRM 4.0, SQL, and Exchange on a quad-core (no Hyper-Threading) Harpertown server with one processor and 4 GB of RAM (due to a 32-bit OS ... ridiculous, I know). We're purchasing a new server specifically for SQL and Dynamics CRM, and we're upgrading to Dynamics 2011. I'm expecting 24-32 GB of RAM and a single Sandy Bridge processor; if Exchange and all the other duties are included, they will be virtualized in a second OS on the same server, in which case we'll likely add a second processor and double the RAM.

Since we're probably ordering the server early next week, I'm trying to arrive at a storage decision soon.

I will be responsible for a lot of the customization in the new CRM, including things like entity creation and form customization. But this is all still stuff an advanced user could do without ever getting his hands dirty in the mechanics of it. I don't know anything about "structuring the database" or "optimization". Could you be more specific, so I can do some research on my own?

For 14 users, SSDs are probably overkill. Depending on usage patterns for your application, 15k RPM drives could be overkill.

Example: I admin servers and storage at work. We run many databases with a couple hundred simultaneous users. We have it all (and many other servers) running on 24 15k RPM drives without issues or performance problems.

To right-size your storage, you really need an idea of how many IOPS you need. 14 users that only retrieve data every 5 or 10 minutes are a lot different than 14 users that are running large reports continuously.
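A back-of-the-envelope version of that sizing exercise: multiply users by their query rate and the IOs each query touches, then compare against what the spindles deliver. All the numbers below are hypothetical; measure the real workload (e.g. with perfmon) before trusting any of this.

```python
def required_iops(users, queries_per_min, ios_per_query):
    """Peak random IOPS the storage must sustain for the workload."""
    return users * queries_per_min / 60 * ios_per_query

def array_read_iops(n_disks, iops_per_disk):
    """Random-read IOPS: in RAID 10, reads are served by every
    member with no parity penalty."""
    return n_disks * iops_per_disk

# 14 users firing 10 queries/min at 20 IOs per query need only
# ~47 IOPS; even four 15k drives (~700 combined read IOPS) leave
# an order of magnitude of headroom.
```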

"I take sibling rivalry to the whole next level, if it doesn't require minor sugery or atleast a trip to the ER, you don't love her." - pete_roth
"Yeah, I see why you'd want a good gas whacker then." - VRock

Example: I admin servers and storage at work. We run many databases with a couple hundred simultaneous users. We have it all (and many other servers) running on 24 15k RPM drives without issues or performance problems.

To right-size your storage, you really need an idea of how many IOPS you need. 14 users that only retrieve data every 5 or 10 minutes are a lot different than 14 users that are running large reports continuously.

This, basically. If querying for a single contact is currently causing UI lag, I'd wager the bottleneck is either how the database is structured or within the application itself. What backend database program are you using? MSSQL, MySQL, Oracle (hah)? Depending on the nature of the bottleneck, faster disks may be a cost-effective way of making the latency go away, but there are plenty of circumstances where even 1 million IOPS won't make your application faster.

Edit: I'm an idiot, and you answered my question a few posts back. Dynamics and MSSQL should not require anywhere near SSD levels of IOPS. We run a ~350GB production database in MSSQL 2008 that handles many thousands of concurrent IO requests with a <100ms median response time on a pool of storage that can't sustain more than 3K random read/write IOPS. This isn't to say three good consumer SSDs in a RAID 1 with a hot spare is a bad idea; I just suspect your bottleneck is elsewhere.

I see a problem here: it seems the CRM application you're using simply treats some DBMS as a backend, without exposing the implementation to you. Without knowing that, it's hard to optimize.

What you *can* do, however, is give the server plenty of RAM (the 24GB you mentioned sounds good), then google how to tune cache sizes and RAM usage for the DBMS in question, which I assume is SQL Server. Simply tuning the caches and buffers the database system can use will help a lot, perhaps even enough to keep the current working set (i.e., the data you access most often) in memory.

Also, having gobs of RAM allows the operating system (and the DBMS itself) to keep more file data in memory, further helping matters.


LukeCWM wrote:The database is Microsoft Dynamics CRM, and it sits on top of SQL. Currently, we are hosting Dynamics CRM 4.0 and SQL (and Exchange) on a quad core (no hyper threading) Harpertown server with 1 processor and 4 GB of RAM (due to 32-bit OS ... ridiculous, I know).

Exchange and MSSQL on the same box with 4 GB of RAM, and less than that usable? I'd say that's the bottleneck right there, unless they're running on 100Mbps NICs. Though even then...

I was under the impression SSDs would be nice because of their faster access time for each request a user makes. But if I'm understanding everyone correctly, the current bottleneck may be elsewhere, and the benefit of SSDs isn't so much about reducing the wait for one request as it is about serving many requests in a very small amount of time. If SSDs are less about pleasing one user at a time and more about pleasing many users at once, I can see that spinning drives are a viable option for a business our size.

That said, a Seagate 15k RPM SAS 6 Gb/s 300 GB drive is $210 on Newegg (and much more from Dell, I would assume). That's only slightly less per GB than a consumer SSD, although it completely sidesteps the concerns about reliability, used-up write cycles, slowdowns, and group failure. If none of those turn out to be real issues (which has yet to be proven), it would make sense to go with SSDs simply because they perform much better at a similar cost.

I wish we could wait for the new Intel S3700. I think that would solve all questions of endurance, data reliability, and consistent speed. Even if we had to pay MSRP, I think $2.35/GB is not so bad considering you get all of these advantages and also some of the best speeds we've seen yet.

LukeCWM wrote:The database is Microsoft Dynamics CRM, and it sits on top of SQL. Currently, we are hosting Dynamics CRM 4.0 and SQL (and Exchange) on a quad core (no hyper threading) Harpertown server with 1 processor and 4 GB of RAM (due to 32-bit OS ... ridiculous, I know).

Exchange and MSSQL on the same box with 4 GB of RAM, and less than that usable? I'd say that's the bottleneck right there, unless they're running on 100Mbps NICs. Though even then...

Are you running small business server or something?

Yes, it's a Small Business Server. Hardware for the replacement will be ordered next week. But maybe everyone's right that storage isn't the current bottleneck. I just want to be forward-thinking enough to buy the best product for the money now, since we rarely purchase.