So I am planning some storage here and I've got some questions and would love some input. We need some fast storage: I have a mix of about 40 designers and engineers who are bringing my current storage servers to their knees. Process Explorer shows the I/O delta bouncing around between 8 and 80 GB. Copying files and opening files has become painful.

So my plan is to replace the current servers (5-year-old Dell PE2950s running WSS 2003) with new R710s running WSS 2008. I'll keep my 12 TB PowerVaults and move them to the new servers for old data and stuff that doesn't need to be super fast. Here is where the fast part and the questions come in. I want to add an OCZ Z-Drive R4 (http://www.oczenterprise.com/ssd-products/z-drive-r4-c-series.html) for our engineers and designers to work off of. The 1.2 TB card has max read/write speeds of 2,000 MB/s and max IOPS of 260,000. The question is: if I have 4 bonded Gb Ethernet ports on the server, am I wasting that 2,000 MB/s read/write speed? In other words, will the Ethernet become the bottleneck?
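My own napkin math so far, assuming ideal bonding and ignoring protocol overhead (so treat it as a best case):

```python
# Napkin math: can 4 bonded GbE ports feed a 2,000 MB/s card?
# Assumes ideal bonding and ignores protocol overhead.
link_MBps = 4 * 1000 / 8   # 4 x 1 Gbps = 4,000 Mbps ~= 500 MB/s
ssd_MBps  = 2000           # Z-Drive R4 rated sequential speed

print(f"network ceiling:     {link_MBps:.0f} MB/s")
print(f"SSD ceiling:         {ssd_MBps} MB/s")
print(f"unused SSD headroom: {ssd_MBps - link_MBps:.0f} MB/s")
# -> the bonded links top out around 500 MB/s, so for big sequential
#    transfers the Ethernet, not the card, looks like the bottleneck.
```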

25 Replies

Hard to say without isolating their workload and measuring their storage IOPS to the point of failure. The route I'd look at is a RAID controller with the CacheCade feature, which uses SSDs as a read/write cache for the rest of the array. That way you front your slower storage with fast cache and everything that needs it benefits.
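The general shape of the idea, if it helps picture it (a toy LRU sketch, not LSI's actual CacheCade algorithm):

```python
# Toy sketch of an SSD cache fronting a slow array -- the general shape
# of the approach, not LSI's actual CacheCade logic.
from collections import OrderedDict

class SlowArray:
    def read(self, block):
        return f"data-{block}"          # stands in for a spinning-disk read

class SSDReadCache:
    def __init__(self, capacity_blocks, backend):
        self.cache = OrderedDict()      # block -> data, kept in LRU order
        self.capacity = capacity_blocks
        self.backend = backend

    def read(self, block):
        if block in self.cache:         # hot block: served at SSD speed
            self.cache.move_to_end(block)
            return self.cache[block]
        data = self.backend.read(block) # cold block: pay the disk penalty once
        self.cache[block] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least-recently-used
        return data

cache = SSDReadCache(capacity_blocks=1024, backend=SlowArray())
print(cache.read(42))                   # miss: goes to the slow array
print(cache.read(42))                   # hit: served from the SSD tier
```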

Another benefit: Dell's on-site coverage would extend to this storage, instead of relying on OCZ's warranty support for your business needs.

So I am far, far, far from an expert on storage. What I do know is our current RAID 10 gets slower and slower the more people have files open.

I'm really not sure whether it's throughput or IOPS that's the issue. We've recently added a bunch of engineers and designers who all have multiple CAD or Illustrator files open, all with links to other parts/drawings. I have no idea whether that kind of use is random or not.

The network is definitely not an issue right now; the servers are only pushing a few megs on average.

I did think about going with 15K SCSI in RAID 10, but this option turned out to be not much more expensive. And since I'm trying to plan for the future while the company is still growing rapidly, I'm not concerned about $5,000. I would rather have too much than just meet current demand.


So I am far, far, far from an expert on storage. What I do know is our current RAID 10 gets slower and slower the more people have files open.

Not everyone's a storage engineer, but there are some great tools out there to help with this.

Dillon5095 wrote:

I'm really not sure whether it's throughput or IOPS that's the issue.

There are a lot of tools to help you. You can use Perfmon to track reads/writes per second and read/write latency. (If you're a Dell shop, Dell DPACK will make pretty graphs for you; HDS has a similar tool partners can deploy for you; and Microsoft's MAP Toolkit, also free, can do this.)
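For example, if you export a counter log to CSV, even a quick script can tell you whether it's IOPS or latency that's hurting. A rough sketch; the file name and column matching are assumptions, so adjust them to your own export:

```python
# Rough sketch: summarize a Perfmon counter log exported to CSV.
# File name and exact column layout are assumptions -- match your export.
import csv, statistics

with open("perfmon_export.csv", newline="") as f:
    rows = list(csv.DictReader(f))

def series(counter_suffix):
    # Perfmon prefixes each column with \\HOSTNAME, so match on the tail.
    key = next(k for k in rows[0] if k.endswith(counter_suffix))
    return [float(r[key]) for r in rows if r[key].strip()]

reads  = series(r"\PhysicalDisk(_Total)\Disk Reads/sec")
writes = series(r"\PhysicalDisk(_Total)\Disk Writes/sec")
lat    = series(r"\PhysicalDisk(_Total)\Avg. Disk sec/Read")

print(f"avg IOPS:  {statistics.mean(reads) + statistics.mean(writes):.0f}")
print(f"peak IOPS: {max(r + w for r, w in zip(reads, writes)):.0f}")
print(f"avg read latency: {statistics.mean(lat) * 1000:.1f} ms")
# Sustained read latency over ~20 ms usually means the disks are the problem.
```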

Dillon5095 wrote:

I did think about going with 15K SCSI in RAID 10, but this option turned out to be not much more expensive. And since I'm trying to plan for the future while the company is still growing rapidly, I'm not concerned about $5,000. I would rather have too much than just meet current demand.

If it's growing, I'd be looking for a solution that scales, and PCIe cards do not scale particularly well.

What's the current disk/RAID/cache configuration in the 5-year-old servers?

What's the current NIC speed (end to end)? A few other things to look at: 2008 R2 will let you use tons of memory as read cache to speed up loads, and SMB 2.0 (used from Server 2008 on) can provide better performance for file access.


Another benefit: Dell's on-site coverage would extend to this storage, instead of relying on OCZ's warranty support for your business needs.

OCZ's support is more like "next business day after we admit what the problem is." After the fiascos with some of the SandForce controllers causing random blue screens, I'm not sure I'd want to put them in production either. They claim they're business class, but they have zero track record in the enterprise. For that, everyone's been using STEC/Zeus for SSDs, and for PCIe, Fusion-io is the only one I've really seen stand behind their products (HP resells them for this reason).

Also, the lack of VMware support makes these things pretty much a niche case, in my opinion.


So I am far, far, far from an expert on storage. What I do know is our current RAID 10 gets slower and slower the more people have files open.

That's how ALL storage works. The more you share it, the slower it goes. No exception there.

Part of the joy of some workloads is that it can make sense to buy smaller drives and just scale out toward infinity. Alternatively, you can mix workloads on the server so some larger bulk functions take up space but not performance.
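A toy queueing model shows why sharing hurts so much: response time blows up as the disks approach 100% busy. A sketch only; the 8 ms service time is an assumed figure for one random I/O on a 7.2K spindle:

```python
# Toy M/M/1 queueing model of shared storage: as utilization climbs,
# per-I/O response time explodes. The service time is an assumed figure.
service_ms = 8.0   # ~one random I/O on a 7.2K SATA spindle

for util in (0.50, 0.70, 0.90, 0.95, 0.99):
    response_ms = service_ms / (1 - util)   # M/M/1 mean response time
    print(f"{util:5.0%} busy -> {response_ms:7.1f} ms per I/O")
# 50% busy feels fine (~16 ms); 95% busy feels broken (~160 ms).
```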

There are a lot of tools to help you. You can use Perfmon to track reads/writes per second and read/write latency. (If you're a Dell shop, Dell DPACK will make pretty graphs for you; HDS has a similar tool partners can deploy for you; and Microsoft's MAP Toolkit, also free, can do this.)

Thanks for these; I'm checking out DPACK right now.

John773 wrote:

What's the current disk/RAID/cache configuration in the 5-year-old servers?

Ha! This is one of the problems for sure. The IT company that set these up long before I got here was awful, and I'm still cleaning up their mess. I do think new servers and moving to WSS 2008 alone will help tremendously. With the SSDs I'm trying to satisfy the boss's thirst for new tech/cool factor. It may sound silly, I know, but at a company full of geeks this kind of thing actually counts big when pitching ideas to the higher-ups. We're always buying cool toys around here: 3D printers, CNCs, UV printers, etc. So when you say you want to spend $30k to upgrade the servers, no one gets excited. But when you say you want to upgrade the servers and have a portion of them dedicated to mind-blowing speed, you get a better response.

As for the size of the disks, that's not a huge concern. Honestly, I think the smaller size will help keep our departments more organized (it's a disaster right now). You want to be on the fast storage? Then you make room by moving old projects off into the old containers.

John773 wrote:

What's the current NIC speed (end to end)? A few other things to look at: 2008 R2 will let you use tons of memory as read cache to speed up loads, and SMB 2.0 (used from Server 2008 on) can provide better performance for file access.

Currently 1 GigE. Two ports on the box: one I use for user traffic, the other for backups, snapshots, that kind of thing.

We went through this with our servers when specifying a system for a virtual world. My Dell rep insisted I run their DellPack.exe for 24 hours on each server and then send him the results. They compiled the results and told me what IOPS and so on those servers were pulling. He wanted to make sure he was speccing a system that would handle our total IOPS needs.

The results file is proprietary, so you'll need a Dell rep to build the reports after gathering the data, but it was worth it.

I had to take the .exe extension off DellPack.exe to post it here. Just rename it back.

It was dirt simple to run. I think it defaults to running for 24 hours on the server. There's really no installation routine: you simply copy the file to each server you want to measure and run the EXE from there. I put it in a folder on the desktop of most of them, HPs and Dells alike. It asked a couple of questions, ran for 24 hours, and produced a little file in that folder when it was done.

If you want instructions, PM me. If you want the file from Dell, give 'em a yell. They will set you up with a rep.

What's the current disk/RAID/cache configuration in the 5-year-old servers?

Ha! This is one of the problems for sure. The IT company that set these up long before I got here was awful, and I'm still cleaning up their mess. I do think new servers and moving to WSS 2008 alone will help tremendously. With the SSDs I'm trying to satisfy the boss's thirst for new tech/cool factor. It may sound silly, I know, but at a company full of geeks this kind of thing actually counts big when pitching ideas to the higher-ups. We're always buying cool toys around here: 3D printers, CNCs, UV printers, etc. So when you say you want to spend $30k to upgrade the servers, no one gets excited. But when you say you want to upgrade the servers and have a portion of them dedicated to mind-blowing speed, you get a better response.

As for the size of the disks, that's not a huge concern. Honestly, I think the smaller size will help keep our departments more organized (it's a disaster right now). You want to be on the fast storage? Then you make room by moving old projects off into the old containers.

This makes me think that you don't really know. :-) Don't guess. Measure. The MAP Toolkit, if you have a Windows environment, is free to use and gets you results in a non-proprietary format.

Naa, I know, and it boggles my mind. My understanding is that both servers were originally intended to be identical. However, C: on one of them is RAID 0 (I know); on the other it's RAID 5. One has two storage volumes, the other has one, so we have different drive letters on the two boxes, which makes replication and backup confusing.

Like I said, now that I finally get to set them up properly (start over), that alone is going to help tremendously.

Naa, I know, and it boggles my mind. My understanding is that both servers were originally intended to be identical. However, C: on one of them is RAID 0 (I know); on the other it's RAID 5. One has two storage volumes, the other has one, so we have different drive letters on the two boxes, which makes replication and backup confusing.

Like I said, now that I finally get to set them up properly (start over), that alone is going to help tremendously.

I meant more that you don't know the space/IOPS requirements. That's what you'd use the MAP Toolkit to measure.

So I am planning some storage here and I've got some questions and would love some input. We need some fast storage: I have a mix of about 40 designers and engineers who are bringing my current storage servers to their knees. Process Explorer shows the I/O delta bouncing around between 8 and 80 GB. Copying files and opening files has become painful.

So my plan is to replace the current servers (5-year-old Dell PE2950s running WSS 2003) with new R710s running WSS 2008. I'll keep my 12 TB PowerVaults and move them to the new servers for old data and stuff that doesn't need to be super fast. Here is where the fast part and the questions come in. I want to add an OCZ Z-Drive R4 (http://www.oczenterprise.com/ssd-products/z-drive-r4-c-series.html) for our engineers and designers to work off of. The 1.2 TB card has max read/write speeds of 2,000 MB/s and max IOPS of 260,000. The question is: if I have 4 bonded Gb Ethernet ports on the server, am I wasting that 2,000 MB/s read/write speed? In other words, will the Ethernet become the bottleneck?

Is this card overkill for my needs?

Thanks

@Dillon5095,

Thanks for using Dell for your server and storage needs. Have you checked out our free DPACK tool for measuring performance and storage needs? Many Spiceworks community members use it and have given very positive feedback about it.

You say IOPS, but you also talk about throughput. Which is the bottleneck? If you have high throughput needs, a RAID 10 with SATA may actually make more sense.

Seagate Constellations are rated at 115 MB/s transfer for streaming reads/writes (that's 920 Mbps), so in theory ~4 disks would saturate your reads and ~8 disks your writes.
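The arithmetic, roughly; a sketch assuming ideal bonding and rated drive speeds, so real-world numbers will be lower:

```python
# Napkin math behind the disk counts above. Assumes ideal NIC bonding
# and the drives' rated sequential speed, so treat it as a ceiling.
disk_MBps = 115             # Constellation rated streaming transfer
link_MBps = 4 * 1000 / 8    # 4 bonded GbE ports ~= 500 MB/s

read_disks  = link_MBps / disk_MBps   # RAID 10 reads use every spindle
write_disks = 2 * read_disks          # writes hit both mirror halves
print(f"~{read_disks:.1f} disks to saturate reads, ~{write_disks:.1f} for writes")
# -> roughly 4-5 drives for reads, 8-9 for writes
```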

Now, if you need random reads/writes, then the SSD may be needed.

That's $6,500 versus $1,000 worth of SATA drives.

For random writes he needs something with a log-structured file system, and for high IOPS at a low $ per TB he needs something with automatic storage tiering. Those two requirements shrink the whole selection set to a very reasonable number. Throw in a few more constraints and you'll be done.
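Roughly, the log-structured trick looks like this (a toy sketch of the general idea, not any vendor's implementation):

```python
# Minimal sketch of the log-structured idea: random block writes become
# sequential appends to a log, with an index mapping LBA -> log offset.
log = []      # the append-only log (sequential on disk)
index = {}    # LBA -> position of the newest copy in the log

def write_block(lba, data):
    index[lba] = len(log)    # remember where the newest copy lives
    log.append(data)         # always append -- one sequential write

def read_block(lba):
    return log[index[lba]]   # follow the index to the latest version

# Two VMs writing to far-apart LBAs still produce adjacent log appends:
write_block(1000, "vm1-a"); write_block(4000, "vm2-a")
write_block(1001, "vm1-b"); write_block(4001, "vm2-b")
print(read_block(4000))      # -> vm2-a
```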


I'll have to dig more into this. I've got VDI (the mother of all blenders) showing 100% of steady-state writes being served as cache hits from one of my old AMSes in the field. (Although I think profile data may be stored elsewhere, so this is just the linked clones, which may be close enough to out-of-order with normal command queueing given our load.) HDS has some crazy caching algorithms, and I've never been able to get a straight answer out of them other than "read the patents."

I'm coming off vacation and am going to request a few days in the fall to do a short white paper on VDI storage sizing in education, as we're expanding existing deployments and adding another school district. I want to get some "real meat" data on VDI and what drives usage, especially on student pools (actual idle-station percentage by time of day, the application outliers in terms of IOPS/CPU usage, remote usage numbers, time spent surfing Wikipedia vs. porn, etc.).

I'll have to dig more into this. I've got VDI (the mother of all blenders) showing 100% of steady-state writes being served as cache hits from one of my old AMSes in the field. (Although I think profile data may be stored elsewhere, so this is just the linked clones, which may be close enough to out-of-order with normal command queueing given our load.) HDS has some crazy caching algorithms, and I've never been able to get a straight answer out of them other than "read the patents."

I'm coming off vacation and am going to request a few days in the fall to do a short white paper on VDI storage sizing in education, as we're expanding existing deployments and adding another school district. I want to get some "real meat" data on VDI and what drives usage, especially on student pools (actual idle-station percentage by time of day, the application outliers in terms of IOPS/CPU usage, remote usage numbers, time spent surfing Wikipedia vs. porn, etc.).

To be a pig you need to think like a pig. OK: you're VM1, I'm VM2. Your LBA range is 1000-2000 and my LBA range is 4000-5000. You write two blocks and I write two blocks. The writes land far apart, so the device still has to do TWO random writes to the media. HOW. CAN. CACHE. HELP.


Why couldn't I absorb a one-second 10,000 IOPS spike and then destage it from cache over the next 5 seconds, if there's zero I/O after the spike? You still have to feed the IOPS eventually, but given how bursty some workloads are (I get that steady-state OLTP doesn't work for this, but others that do short, quick bursts of small I/O should, right?).

I get that a log-style system can fight steady streams of random writes, but assuming my cache destages in large streaming blocks, why wouldn't this work?
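The math I'm picturing, a sketch using the numbers above:

```python
# Napkin math on riding out a burst in write cache (numbers from above).
burst_iops   = 10_000   # the one-second spike
burst_secs   = 1
destage_secs = 5        # idle window available to drain the cache

backend_iops = burst_iops * burst_secs / destage_secs
print(f"backend must sustain ~{backend_iops:.0f} IOPS to drain in time")
# -> ~2,000 IOPS -- fine for bursts, but only if the next burst
#    actually waits for the cache to finish draining.
```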

Why couldn't I absorb a one-second 10,000 IOPS spike and then destage it from cache over the next 5 seconds, if there's zero I/O after the spike? You still have to feed the IOPS eventually, but given how bursty some workloads are (I get that steady-state OLTP doesn't work for this, but others that do short, quick bursts of small I/O should, right?).

I get that a log-style system can fight steady streams of random writes, but assuming my cache destages in large streaming blocks, why wouldn't this work?

Fighting a pulsating workload can be done. But the random I/O does not go anywhere.

Large blocks will only happen if the VM file system or the hypervisor file system does clustering (keeping blocks near each other) very well. I would not take that as a given.