We currently have a Thecus N12000V with 16 TB in RAID 10 (currently about half full). From what I've read online, this nets about 50 MB/s. With so many people connecting to it, most are getting 3 MB/s at best. (Everyone uses SMB; the computers are all Windows 7 Pro, which doesn't support NFS.)

Would it be efficient to have three of these same NAS boxes constantly rsyncing, hook them up to a Server 2008 R2 server, and let it handle load balancing between the three using DFS? Is there a better load-balancing solution? I'm trying my best not to make this a "shopping" question; if I need to make it more specific, please let me know.

2 Answers

I would recommend against a hand-rolled distributed-filesystem arrangement in which different people access files on different boxes, because you will hit significant sync issues as soon as two people modify the same file in different places.
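To make that failure mode concrete, here is a toy Python sketch (the replica dictionaries, the file name, and the `naive_sync` helper are all invented for illustration, not any real sync tool) of what a naive, mtime-based, last-writer-wins sync does when two people edit the same file on different boxes between sync runs:

```python
# Toy model of two NAS replicas, each holding {path: (mtime, contents)}.
# A naive rsync-style bidirectional sync keeps whichever copy has the newer
# mtime, so concurrent edits to the same file silently lose one side's work.

def naive_sync(replica_a, replica_b):
    """Last-writer-wins: for each path, keep the copy with the newer mtime."""
    for path in set(replica_a) | set(replica_b):
        a = replica_a.get(path)
        b = replica_b.get(path)
        if a is None or (b is not None and b[0] > a[0]):
            replica_a[path] = b
        elif b is None or a[0] > b[0]:
            replica_b[path] = a

# Both users start from the same file...
nas1 = {"report.doc": (1.0, "draft")}
nas2 = {"report.doc": (1.0, "draft")}

# ...then edit it on different boxes between sync runs.
nas1["report.doc"] = (2.0, "draft + Alice's figures")
nas2["report.doc"] = (3.0, "draft + Bob's tables")

naive_sync(nas1, nas2)
# Bob's later write wins on both replicas; Alice's figures are gone.
```

Neither user gets an error; one edit simply disappears, which is why a real multi-master setup needs conflict detection rather than plain periodic copying.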

There are a number of distributed filesystems (and distributed block-device arrangements you can build filesystems on) out there (see http://stackoverflow.com/questions/269179/best-distributed-filesystem-for-commodity-linux-storage-farm for a couple of pointers to open-source solutions). These are geared towards fault tolerance, size scalability, and remote-access performance, but by nature they will improve multi-client access scalability too if you have multiple local nodes. Unfortunately you are not going to be able to use them on off-the-shelf NAS boxes that aren't targeted at the enterprise (at least not on any NAS box I've seen).

A speed limit between 50 and 100 MB/s with a gigabit network adaptor could be imposed by either the network interface or the I/O controller managing the RAID, but your key throughput bottleneck is likely to be the drives rather than either of those. As several people pull data off the drives at once, the heads will be bouncing all over the place: getting a bit of data for client 1 from here, moving to get a bit for client 2 from there, then client 3, then back to servicing client 2's next request, and so on. During each head move the drive is not able to transfer data. The I/O controller can reduce this effect using a few tricks (clever elevator algorithms for each drive (see http://en.wikipedia.org/wiki/Elevator_algorithm), clever sequencing of requests between drives when the requested data exists in more than one place, and so on), but there is a limit to how much these methods can help.
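As a rough illustration of why request ordering matters (the cylinder numbers and the `seek_distance` helper are made up, and a single sorted sweep is only an approximation of a real SCAN/elevator pass, not any controller's actual firmware), compare total head travel for servicing the same interleaved requests in arrival order versus elevator order:

```python
def seek_distance(start, order):
    """Total cylinders the head travels servicing requests in the given order."""
    total, pos = 0, start
    for cyl in order:
        total += abs(cyl - pos)
        pos = cyl
    return total

# Pending requests interleaved from three clients (cylinder numbers),
# exactly the "client 1 here, client 2 there" pattern described above.
pending = [98, 12, 85, 20, 90, 15]
head = 50

fcfs = seek_distance(head, pending)          # first-come-first-served order
scan = seek_distance(head, sorted(pending))  # one sweep in cylinder order

print(fcfs, scan)  # the sorted sweep travels far fewer cylinders
```

The elevator-style sweep cuts head travel to a fraction of the arrival-order total on this workload, but note that it cannot reduce travel below one pass across the requested cylinders; that is the limit referred to above.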

If the NAS box has its own RAM for caching rather than relying on the drives to do all of this, you might find it has options somewhere for controlling read-ahead and read sequencing, which can help by further reducing the head movement required to service the same set of concurrent requests. Be careful to test any changes thoroughly, though: you can make things worse rather than better, and the best options for one access pattern can be atrocious for others.

Another option to look into is SSDs, as they negate much of the latency that bogs down spinning-metal drives under random or multi-client access. While it would likely be far too expensive to replace all your storage with shiny new SSD technology, there are halfway options. Hybrid drives are spinning-metal drives with a chunk of SSD storage built in that they use as a cache (and, being non-volatile, they can use it to buffer write operations as well as to speed up reads). Some NAS boxes have built-in support for using SSDs as a large non-volatile cache in front of the larger traditional drives, which removes the need to replace all the drives with hybrids and would likely be more efficient, though IIRC this feature is only on more expensive boxes at the moment.
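The benefit of such a cache for a multi-client workload can be sketched with a toy model (the `SsdReadCache` class, capacity, and block numbers are invented for illustration, not any vendor's implementation): repeated reads of a hot set of blocks hit the cache and never touch the platters, so head movement only happens on misses.

```python
from collections import OrderedDict

class SsdReadCache:
    """Toy LRU read cache standing in for an SSD tier in front of spinning disks."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()  # LRU: oldest entry evicted first
        self.disk_reads = 0         # each one implies a head seek on the platters

    def read(self, block):
        if block in self.cache:
            self.cache.move_to_end(block)  # cache hit: no disk access at all
        else:
            self.disk_reads += 1           # cache miss: go to the platters
            self.cache[block] = True
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)
        return block

# Several clients re-reading a small hot set of blocks.
workload = [1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 1, 2]
cached = SsdReadCache(capacity=8)
for b in workload:
    cached.read(b)

# Without the cache, all 12 reads would seek the disk; with it, only the
# first access of each distinct block does (4 disk reads here).
```

The same idea is why hybrid drives help even with a relatively small flash portion: office-style workloads tend to re-read a small working set.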

I agree the issue is clearly the drives themselves, not the network. We have 10 Gb connections between switches and link aggregation on the NAS box. The solution I'm looking for must work with our current NAS boxes, though; I can't switch to another manufacturer. That's why I'm trying to find a solution that essentially makes the two existing NAS boxes look like one and distributes the work evenly across them.
–
Copy Run Start Jun 17 '13 at 13:52

Thanks. I looked through the settings and contacted Thecus support; that "load balancing" is just between two NICs, not between multiple servers. P.S. Your link was just a link to their knowledgebase homepage.
–
Copy Run Start Jun 12 '13 at 20:56

@CopyRunStart You'll probably want to use 802.3ad mode (Dynamic Link Aggregation); it utilizes all the links and provides load balancing and fault tolerance. Updated the link as well.
–
cole Jun 12 '13 at 21:51

Thanks, I already have that enabled. The bottleneck is not the network but the drives themselves.
–
Copy Run Start Jun 17 '13 at 13:48