Domodude

Instead of looking at a single NAS with more drives, how about adding a second NAS?

That's always a possibility, but I would like to concentrate everything in one box. My aim is to have a 3 or 4 TB RAID5 array with a spare drive already installed. There should also be room for expansion, because 3 TB will soon turn out to be too little (no matter how big your drive is...). So that requires at least 3x1 TB (data) + 1 (RAID5 parity) + 1 (spare) = 5 drives.
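A quick sanity check of that drive arithmetic — a minimal sketch, assuming 1 TB data drives and one hot spare (the helper function name is made up for illustration):

```python
import math

def raid5_drives_needed(usable_tb, drive_tb, hot_spares=1):
    """Total drives for a RAID5 set: enough data disks for the usable
    capacity, plus one disk's worth lost to parity, plus hot spares."""
    data_disks = math.ceil(usable_tb / drive_tb)
    return data_disks + 1 + hot_spares

print(raid5_drives_needed(3, 1))  # 3 data + 1 parity + 1 spare = 5
print(raid5_drives_needed(6, 1))  # the eventual 6 TB target needs 8
```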

All those larger RAID solutions are ridiculously expensive! Maybe just an old PC with a lot of bays? How much CPU would you need for, let's say, a box with eventually 6 TB of RAID5 storage plus a spare drive?

Well that's just great... I don't think I want to buy 100 Fiire Chiefs just so I could use one. Thank you for the heads-up, Tim. I did look at the Gyration remotes, but I just don't like that awful Windows logo right in the middle... It would be kind of insulting to have all this nice Linux thing going and then have to look at that stooopid logo every day. Any ideas? I actually called Gyration and asked them if they would manufacture, now or in the future, the exact same remote but without the Windoze logo. Their answer: "Hell no... We won't go...", and he referred me to a company called Fiire. How about that? So I asked him about manufacturing a remote where (just like some ladies' watches and cellphones) you could open a clear plastic flip cap where the logo is and put in any picture you wanted. He said no to that also. But he did say that I could open up the remote myself and change the logo... but clarified that that would void the warranty... How about that now?

Just build a small form factor PC with hardware you have lying around, or a really inexpensive mobo/CPU combo, get a 3Com or other widely available/inexpensive NIC, put in some controllers for extra storage (scale as you need) and run FreeNAS.

It's going to be cheaper, and then you will not be limited in the number or size of the drives you decide to use or add.

Regards,

Seth



Is your aim to have highly reliable disk? Because your design is already "wasting" 2 disks and not achieving reliability commensurate with that level of wastage. If you are so keen to get reliability that you are prepared to give up one disk for parity and another for a hot spare, then you definitely shouldn't be using RAID5. Not only is it nowhere near as reliable as most people incorrectly think, it can be less recoverable because of the striping in a lost-array-config scenario. It is also vastly slower than any other RAID type (both read and write, and particularly for random access). No serious enterprise uses RAID5 any more; it was always just a compromise to reduce cost by "wasting" the minimum number of disks. Typically RAID 10, RAID 01 or meta RAIDs are used.

You could set up 4 x 2 TB disks in a mirror set; this would give you 4 TB of storage with fewer disks, probably for a similar price. And you would have high performance both during normal operation and during degraded operation, without the need for a hot spare. Plus, even if the RAID completely collapses in a disaster, all the data is on both pairs of disks in a form that can be read by a non-RAID system, making it far more recoverable.
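The capacity trade-off between the two layouts can be sketched like this (hypothetical helper names; RAID5 loses one disk's capacity to distributed parity, RAID10 half its disks to mirroring):

```python
def raid5_usable_tb(n_disks, disk_tb):
    # One disk's worth of capacity goes to (distributed) parity.
    return (n_disks - 1) * disk_tb

def raid10_usable_tb(n_disks, disk_tb):
    # Half the disks hold mirror copies.
    return (n_disks // 2) * disk_tb

# 5 x 1 TB as RAID5 with the 5th disk as a hot spare: 4 active disks.
print(raid5_usable_tb(4, 1))    # 3 TB usable
# 4 x 2 TB as two mirrored pairs (RAID10), no spare needed:
print(raid10_usable_tb(4, 2))   # 4 TB usable
```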


Actually, most large organisations are using RAID5/6 for most data, as it performs nearly as well as RAID-10 in transactional random I/O in most cases, and you get a lot more capacity for your $. The exception is where there is a very high write ratio. Most modern disk subsystems will mask any RAID5 write penalty in the disk controller cache.

One of my customers actually did the test with a disk subsystem and 64 x 15K rpm drives, set up as either RAID5 or RAID-10, and the "crossover" point where RAID-10 became faster than RAID5 was a 50% read/write ratio. At higher write ratios RAID-10 was better; below that, RAID5 was better. This was for a random I/O workload.

This might be different using a server set up to do RAID in software, or with RAID cards (either fakeRAID or real RAID). YMMV, and I have not researched this to get any real numbers.
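One way to frame this disagreement is the classic back-end write-penalty model: each RAID5 random write costs 4 disk I/Os (read data, read parity, write both back), each RAID10 write costs 2 (write both mirrors), and reads cost 1 either way. This is a deliberate simplification that ignores controller cache and full-stripe writes — which is exactly what is claimed to save RAID5 in practice. The drive count and IOPS figures below are illustrative:

```python
def effective_iops(n_disks, per_disk_iops, write_frac, write_penalty):
    """Host-visible random IOPS when each write costs `write_penalty`
    back-end disk I/Os (4 for RAID5 read-modify-write, 2 for RAID10)."""
    backend_iops = n_disks * per_disk_iops
    return backend_iops / ((1 - write_frac) + write_frac * write_penalty)

# 64 x 15K rpm drives at a nominal ~180 IOPS each, as in the test above.
for wf in (0.1, 0.3, 0.5):
    r5 = effective_iops(64, 180, wf, write_penalty=4)
    r10 = effective_iops(64, 180, wf, write_penalty=2)
    print(f"{wf:.0%} writes: RAID5 {r5:.0f} IOPS, RAID10 {r10:.0f} IOPS")
```

Note that with no cache in the model, RAID10 comes out ahead at every nonzero write fraction; the reported 50% crossover is what controller cache and full-stripe optimisations add back for RAID5.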

A bigger concern is data scrubbing, which must be carried out continually, checking all the drives for soft or hard errors before a total drive failure in a RAID array. If this isn't done, then the odds of hitting a hard read error during a RAID rebuild are quite high. This holds for both RAID-10 and RAID5. Of course, a RAID5 array with a large number of disks makes the odds worse, as you have to rely on ALL the other disks in the array being 100% OK. With RAID-10 you only have to rely on one other disk being OK.
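The rebuild-risk point can be made concrete with a rough sketch, assuming the commonly quoted 1-in-1e14-bits unrecoverable read error (URE) rate for consumer drives (the rate and disk sizes here are illustrative, not measured):

```python
def p_ure_during_rebuild(tb_read, ure_per_bit=1e-14):
    """Probability of at least one URE while reading `tb_read` terabytes,
    treating each bit read as an independent trial."""
    bits = tb_read * 1e12 * 8
    return 1 - (1 - ure_per_bit) ** bits

# RAID5 of 5 x 2 TB: rebuilding one failed disk reads all 4 survivors (8 TB).
print(f"RAID5 rebuild:  {p_ure_during_rebuild(8):.0%}")
# RAID-10: rebuilding reads only the failed disk's mirror partner (2 TB).
print(f"RAID10 rebuild: {p_ure_during_rebuild(2):.0%}")
```

The asymmetry is the point: the RAID5 rebuild must read every surviving disk flawlessly, while the RAID-10 rebuild depends on a single disk — which is why scrubbing matters so much more as RAID5 arrays grow.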

I'm sorry, Indulis, I simply do not agree. For the last 8 years I worked for a large multinational as its Infrastructure Operations manager. I was intimately involved in the design, specification, purchasing and project implementation of numerous SAN, DAS, NAS, iSCSI and archival platforms, and ultimately made the decisions on what approach to take. I made these decisions in light of years of experience with storage systems and in conjunction with Professional Services project advice from vendors such as EMC and IBM. On several occasions they even modelled the performance of various suggested configurations.

In point of fact, on high-end systems like SANs, the actual containers are so far abstracted from the RAID subsystems through meta- and hyper-LUNs as to make almost no difference. Nevertheless, in the last 5 years, across hundreds of LUNs, many hundreds of servers, and several data centres, I recall only once implementing a RAID5 array, and that was a compromise at the time (which of course have a tendency to stick and come back to haunt you!)

Notwithstanding that, asking such a vendor to implement a RAID 5 array is invariably met with strange looks and ardent advice to the contrary. In my interactions with peers in other medium and large enterprises with which this organisation operated, none ever considered RAID 5 an "enterprise" solution. In fact, I have to go back 12 years, to my days as a small solutions provider to offices and retail establishments of 10 people or fewer with a single "server" plus dialup modem, before I can recall regularly using RAID 5.

Saying that RAID controller caches mask write penalties is a profound misunderstanding. It is simply wrong. Under sustained load the cache is always full, so it merely defers the writes and shifts the problem rather than removing it... you don't get something for nothing.

Bringing this back to LMCE: performance between different RAID technologies varies dramatically depending on what you are doing, as you pointed out. For instance, random writes on RAID10 are quite poor compared with other technologies (such as RAID5). However, in pretty much every other test RAID10 is far superior to RAID5, and in particular sequential reads are vastly faster; note the vast majority of what LMCE does is sequential reads. Note also that with RAID10 (and equivalents), adding spindles improves performance nearly linearly for most operations. In RAID5, write operations in particular get slower and slower as the subsystem has to read more stripes from more disks to calculate parity before writing it. This in turn causes blocking I/Os within the disk subsystem. Advanced systems can offload some of this to an extent, but never completely.

Transactional read performance is almost irrelevant; this is where "caching" does come in. In every database technology you can name, the db engine sets up a read-through buffer in core and typically has very sophisticated replacement algorithms. E.g. in MS SQL and particularly MS Exchange, the bulk of reads need to come from this buffer for the system to perform acceptably; this is particularly true of Exchange, which commonly achieves 70-90% buffer hits. This demonstrates where a disk cache becomes pointless: the buffer is usually so much larger than the disk cache, and dedicated, that if a read didn't hit the buffer, then it certainly isn't going to hit the disk cache! This is usually even true of high-end Clariion and Symmetrix SANs, which have disk caches of at least 8-16 GB (not MB!)
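The layered-buffer argument can be sketched numerically — the latencies below are hypothetical round numbers, and the point is only the shape of the result, not the exact values:

```python
def avg_read_ms(db_buffer_hit, array_cache_hit,
                ram_ms=0.001, array_cache_ms=0.5, disk_ms=8.0):
    """Mean read latency when the DB buffer absorbs `db_buffer_hit` of
    all reads and the array cache absorbs `array_cache_hit` of the rest."""
    miss = 1 - db_buffer_hit
    return (db_buffer_hit * ram_ms
            + miss * (array_cache_hit * array_cache_ms
                      + (1 - array_cache_hit) * disk_ms))

# At an 85% DB buffer hit rate, the reads that miss it are mostly reads
# a smaller array cache will miss too, so its hit rate stays low and it
# barely moves the average:
print(avg_read_ms(0.85, array_cache_hit=0.05))
print(avg_read_ms(0.85, array_cache_hit=0.00))
```

Either way the average is dominated by the spinning-disk term, which is the poster's point: once the database buffer has taken the easy hits, the array cache has little left to contribute on reads.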