Do you really need fileservers all around the world and have each of them be equal masters for all files? Could you get away with 1 being the writable copy and others being mirrors (the rsync solution proposed above)? Or could you have just one file server and have it accessible around the world (e.g. NFS v4 to help hide the worst of the latency)?

I would tend to steer away from Coda, especially considering what they say about stability, and try to see if you can achieve what you need with a more mainline solution (NFS, rsync, some form of version control).

I will say there's a lot out there in the world of distributed and cluster filesystems (and, frankly, most of the free stuff is half-baked). One problem you'll have, though, is that most of the good quality solutions out there are not made to work with replication across wide area networks. They are generally designed for local area replication, because WANs usually have quite limited bandwidth and high latency. If you really want to do WAN replication with a distributed filesystem, you might try looking at GlusterFS, which does not require shared disks and can replicate (Lustre doesn't require shared disks but doesn't do replication itself -- they suggest hardware RAID or block-device replication like DRBD, which would be awful over a WAN). I don't know how well it'll perform, though. Like some others have mentioned, an above-FS replication mechanism might ultimately work better with your particular use-case.*

As notfred mentioned, Coda has nice features, but it's kind of a dead end these days. Coda and AFS were created in the 80s and early 90s at CMU as distributed filesystem research projects. Now that the research is done, Coda is very lightly maintained and it never really had a large user base to begin with. Typically with research software, the research goals are not really in line with creating and maintaining production-ready software.

shank15217 wrote:You could try OCFS2 from Oracle. It's free and is a hell of a lot easier to set up than Coda.

OCFS2 is a shared disk filesystem. The developers point this out in the intro to the OCFS2 1.4 manual: "OCFS2 is a symmetric shared disk cluster file system. ... As the name suggests, such file systems require a shared disk (SAN). All server nodes in the cluster must be able to perform I/O directly and concurrently to the disk. A cluster interconnect that provides a low latency network transport is used for communication in between the server nodes." If file servers are around the world, they're not going to be on a common, low-latency SAN.

* Actually, I've worked as a developer on a high-performance parallel cluster filesystem from one of the large commercial vendors, and one of the "value-added" storage solutions they are developing is essentially designed to deal with use cases similar to yours. Basically, the system provides multi-site WAN caching and reconciliation to support "follow the Sun" development where you have multiple sites all over the world that all want a window on to a coherent data set, and you have updates coming in from different places (but cross-site concurrent write sharing is rare). I'm pretty sure it's not released yet, though.

Thanks for all the responses (and sorry for my slow follow up, I deleted a bunch of stuff from someone's server recently and I've been cleaning up the mess!).

notfred wrote:Do you really need fileservers all around the world and have each of them be equal masters for all files? Could you get away with 1 being the writable copy and others being mirrors (the rsync solution proposed above)? Or could you have just one file server and have it accessible around the world (e.g. NFS v4 to help hide the worst of the latency)?

I was really hoping to be able to do this. I wanted to create a system that would need minimal (zero) training from the users' point of view. All the other options would seem to require some level of training and what with all these offices being staffed by locals, not all of whom speak English very well, I was hoping to avoid that since it would be me spending hours repeating myself over the phone.

I've only used NFS a few times and only at a very basic level... how would NFS4 help hide the latency? Does it do some fancy client side caching or something?

bitvector wrote:I will say there's a lot out there in the world of distributed and cluster filesystems (and, frankly, most of the free stuff is half-baked). One problem you'll have, though, is that most of the good quality solutions out there are not made to work with replication across wide area networks. They are generally designed for local area replication, because WANs usually have quite limited bandwidth and high latency. If you really want to do WAN replication with a distributed filesystem, you might try looking at GlusterFS, which does not require shared disks and can replicate (Lustre doesn't require shared disks but doesn't do replication itself -- they suggest hardware RAID or block-device replication like DRBD, which would be awful over a WAN). I don't know how well it'll perform, though. Like some others have mentioned, an above-FS replication mechanism might ultimately work better with your particular use-case.*

As notfred mentioned, Coda has nice features, but it's kind of a dead end these days.

Yeah, I'd had a quick look at DRBD and got the impression that this kind of job isn't what it and its ilk were designed for. Point taken on the status of Coda. I joined the mailing list and haven't had a single message from it in about 2 weeks now... that's pretty dead!

Well looks like I'm back at some kind of version/document control system... I've been looking long and hard at Alfresco, has anyone reading this actually used it?

Is something like CVS/SVN a possibility? The documents are MS office files (2003), PDFs and jpegs etc. Do you think completely non technical types could handle it? I like the idea of some users with laptops being able to sync directly to a CVS server and then work off-line.

notfred wrote:Or could you have just one file server and have it accessible around the world (e.g. NFS v4 to help hide the worst of the latency)?

I've only used NFS a few times and only at a very basic level... how would NFS4 help hide the latency? Does it do some fancy client side caching or something?

Yes, NFSv4 is meant to let the client cache its writes locally and just notify the server that it has updated the file. The client doesn't need to write the data to the server immediately; it can wait for a while. If any other client requests the file, though, that forces the cached writes to be flushed to the server first.
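To make that behaviour a bit more concrete, here is a toy model in Python. It is only a sketch of the idea (cache writes locally, tell the server the file changed, flush when someone else asks for it) and not the actual NFSv4 protocol; the `Server` and `Client` classes and all their method names are made up for illustration.

```python
class Server:
    """Hypothetical file server that tracks who holds unwritten changes."""

    def __init__(self):
        self.files = {}    # path -> content actually stored on the server
        self.writers = {}  # path -> client holding cached (unflushed) writes

    def note_update(self, path, client):
        # A client says "I changed this file" without sending the data yet.
        holder = self.writers.get(path)
        if holder is not None and holder is not client:
            holder.flush(path)  # the previous writer must flush first
        self.writers[path] = client

    def read(self, path, client):
        # Another client wants the file: force the cached write out first.
        holder = self.writers.get(path)
        if holder is not None and holder is not client:
            holder.flush(path)
        return self.files.get(path)


class Client:
    """Hypothetical client with a write-back cache."""

    def __init__(self, server):
        self.server = server
        self.dirty = {}  # path -> content not yet written to the server

    def write(self, path, content):
        self.dirty[path] = content  # cheap: no bulk transfer over the WAN
        self.server.note_update(path, self)

    def read(self, path):
        if path in self.dirty:
            return self.dirty[path]  # served from the local cache
        return self.server.read(path, self)

    def flush(self, path):
        # Push the cached write to the server and give up writer status.
        if path in self.dirty:
            self.server.files[path] = self.dirty.pop(path)
        if self.server.writers.get(path) is self:
            del self.server.writers[path]
```

In this model a user saving a document repeatedly only pays the WAN cost when somebody at another site actually opens the same file.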

cheesyking wrote:Is something like CVS/SVN a possibility? The documents are MS office files (2003), PDFs and jpegs etc. Do you think completely non technical types could handle it? I like the idea of some users with laptops being able to sync directly to a CVS server and then work off-line.

If you are using centralized repository control, I'd go with svn over cvs always these days since svn does everything better (unless there is a particular legacy reason). The general thing about cvs/svn in this case is that they are designed with the primary use case of doing source control. Although they work for binary content like office docs and jpegs, they're optimized for source code (text) and smaller files. It's possible the repository revision data will get really huge with repeated revisions of opaque binary files.

To me this all depends on your primary objective/use case. Are you trying to do version control? Or are you trying to synchronize replicas? If it is the latter, stuff like rsync probably makes more sense. rsync has a master/replica assumption, though: it assumes one side is the master copy and makes the other side look like it (so if you made changes on both sides, one side's changes will get lost). rdist works on the same assumption and lets the master push to multiple replicas. Unison is a similar tool (it is a separate project, not a new version of rsync) that does two-way synchronization. It assumes both sides may be modified and tries to reconcile them (but doing something like rdist with Unison would be hard because there's no "master," so each sync between a pair of hosts could induce further changes on each side). If you can make some guarantees about concurrent updates or replica "ownership," this would be easier to manage with such tools.
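The difference between the two models can be sketched in a few lines of Python. This is a toy illustration only, with replicas modelled as plain dicts of path to content; it is not how rsync or Unison are actually implemented, and the function names `mirror` and `reconcile` are made up for the example.

```python
def mirror(master, replica):
    """One-way sync: make the replica look exactly like the master.

    Anything changed only on the replica is lost.
    """
    replica.clear()
    replica.update(master)


def reconcile(a, b, last_sync):
    """Two-way sync: propagate one-sided changes, report conflicts.

    last_sync is the state both sides agreed on after the previous sync.
    Returns the list of paths changed differently on both sides.
    """
    conflicts = []
    for path in sorted(set(a) | set(b) | set(last_sync)):
        base, va, vb = last_sync.get(path), a.get(path), b.get(path)
        if va == vb:
            merged = va            # both sides agree (or both deleted it)
        elif va == base:
            merged = vb            # only b changed since the last sync
        elif vb == base:
            merged = va            # only a changed since the last sync
        else:
            conflicts.append(path)  # both changed: leave for a human
            continue
        for side in (a, b, last_sync):
            if merged is None:
                side.pop(path, None)  # propagate a deletion
            else:
                side[path] = merged
    return conflicts
```

With `mirror`, an edit made at a branch office simply disappears on the next run. With `reconcile`, it is propagated unless the same file was also edited elsewhere, in which case somebody has to decide which version wins; that is where guarantees about replica "ownership" make life easier.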