Posted by timothy on Tuesday September 10, 2013 @01:32PM from the phrase-your-answer-in-the-form-of-a-cookie dept.

Koookiemonster writes "Our company has many projects, each one with a folder on a Samba drive (Z:\). Our problem is syncing only the programmers' current projects (~30 at any time) between Z:\ and their C:\Projects\ folder on five Windows 7 laptops. If we synced the whole Z:\ drive, our projects folders would be filled with too many subfolders, making them difficult to navigate. The folders contain OpenPCS projects (PLC) and related files (Word, Excel, PDF documents); a typical project folder is 50 MB. Is there any easy-to-use, low-budget sync software with scripting, so that we could e.g. only sync folders that exist locally?" (Read more details, below, of what Koookiemonster is looking for.)

"Many programs do support selective sync, but choosing what to sync is awkward; projects and who works on them change daily. It is important that subscribing to a project is as easy as copying it from Z:\ to C:\projects\. The Z:\-folder with all of our current and past projects is located on a desktop PC running Ubuntu Linux. It can share files e.g. via Samba or FTP. All PCs are on the same (W)LAN. Off-site backups of Z:\ are taken care of via rsync. The company has three programmers, who usually handle their own projects alone, but very often others need to add files to projects. Bigger projects need more programmers. Currently we use FreeFileSync with a custom piece of Javascript to make batch files that synchronize e.g. folders C:\projects\123_ProjectName\ and Z:\123_ProjectName\ if the local folder exists. However, that solution lacks versioning, real-time sync and deletion support. It only syncs when we press a button, and then older files are overwritten by newer files (two way sync; older files go to a "sync-deletions"-folder).

PS. Bonus points for solutions that allow renaming project folders without renaming them on all laptops."

Perhaps you haven't spent enough time with git. I've used it for years to manage data stores with content ranging from rST documentation bound for rendering to a very widely read open source "howto" site (with constant edits and merges from a small team of technical writers) to large scale development projects. In fact, I use it for all my file and source control to this day, and my employer's dev group uses it as well to manage a rather extensive codebase responsible for driving an enterprise cloud hosting provider's operations.

If you've had problems with git, you should be aware that there's a huge community out there ready and willing to assist you with whatever workflow you've decided to adopt. One of the nicer things about git is the fact that you can use it in a very SVN-like manner if you like, or you can make as many branches as you want and manage things in a very distributed manner. Again, I suspect your primary problem is lack of experience.

This. Git is great if you want to maintain a full set of everything on every system - every copy is a full backup. If you want to do selective work, then svn checkout of particular branches is the way to go.

You apparently don't have the slightest clue how git works, nor how to use it properly. I certainly am no git guru, but I have no issues with using multiple git repos in projects, building sub-systems out of them, or branching and merging, which it truly excels at. I wouldn't wish SVN on anyone who wants to use branching. It's about as brain-dead as it can get and still "function", barely. The only thing SVN has going for it is that it is almost atomic, whereas CVS, MKS, etc, absolutely aren't. ClearCase is a separate system that can be used, but requires a full-time expert admin for anything more than basic code repos. At least it handles branching almost sensibly. Mercurial was in the running, but at the time git had (and still has) a larger user base with active improvements in the various tools associated with it for my needs, thus I chose git.

Learn a tool before you wrongly despise it publicly. You are wrong on every count:

You don't get a full copy of everything on every system if you don't want it. You can just check out a single branch.

Every copy is not a full backup. See previous line item.

Selective work is easier in git, you can clone the item(s) desired, branch locally, and merge when complete, pushing only if desired.

Lastly, you can essentially "checkin" every change for a full history of what you did on a local branch, revert, merge other branches, etc, with no effect on the main dev branch(es). And you can do all this without even being connected to a "main" server. Maintaining parallel branches with constant merges is cake compared to SVN and other central repo schemes. There really is no situation where I'd rather use SVN or anything like it that I can think of, when I have a choice.
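The offline workflow described above can be sketched with a throwaway repository. Folder and branch names here are made up for illustration, and the inline `-c user.name/-c user.email` flags just keep the example self-contained:

```shell
# Everything below happens locally; no central server is involved.
BASE=$(mktemp -d)
cd "$BASE" && git init -q proj-123 && cd proj-123
git -c user.name=Dev -c user.email=dev@example.com \
    commit -q --allow-empty -m "initial"
git checkout -q -b fix-plc-timing        # private topic branch
echo "TIMER_MS=50" > plc.conf
git add plc.conf
git -c user.name=Dev -c user.email=dev@example.com \
    commit -q -m "Adjust PLC timing constants"
git checkout -q -                        # back to the starting branch
git merge -q fix-plc-timing              # fold the work back in
# Publishing is a separate, optional step once a server exists:
#   git push origin HEAD
```

Every commit, branch, and merge above works with no network connection at all, which is the point being made.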

Or just about any source code control system (most allow binaries as well). I can't think of any that don't allow syncing just files that are wanted. Further, these systems are designed to safely share the files, not just "sync"; files can be locked and made exclusive, or conflicts between multiple authors resolved, etc. This is essentially technology that's 40 years old.

I wouldn't use git. Git requires you to always clone the entire repository. That's fine if the repository is just source code and text files, but the more binary files you have the less attractive this can be.

I still favor svn over git myself, but it just suits my workflow better than git.

Seconded. I use a 65GB git repo across four machines (two of them virtual) to sync my entire work history on a daily basis. Occasional 200MB files in there, but mostly smaller stuff (source, binaries, PDFs, word documents, datafiles). Never had a problem so far. Admittedly, no branching, just keeping a few hundred thousand files in sync across four machines so that I don't have to worry where my files are when I go somewhere with my laptop only.

git is a rather poor choice for binaries due to the way it stores things. SVN is the better choice for binaries, especially for a situation where you want a central repository server.

But really, the answer is "nearly any centralized version control system" is a better choice than trying to rsync/unison stuff around. With the advantage that everyone *knows* they have the latest (by asking the server) and you get full revision history for everything.

Meh, I know you are posting in jest, but honestly, I've seen people from Unix-land who also have no clue about source control/revision control systems. Proficient software people (Unix, Windows or whatever) know how and when to use such a system. Everybody else, well, you get what the submitter is describing :/

That's true! However, I definitely have seen the syndrome of people not acknowledging software not created by Microsoft, or asking for bizarre or counterproductive requirements (SharePoint). Most Unix users who don't understand this stuff at least tend to follow along with whatever the team uses or the project manager suggests.

However I definitely have seen the syndrome of people not acknowledging software not created by Microsoft

Yup, seen that. And I also saw the reverse: people going to insane lengths to refuse to use a Microsoft tool or system, despite it obviously being the best fit for their particular problem. There are quite a few such specimens (of both categories) on Slashdot, and, while the logic contortions can be funny at times, I'm annoyed to see how often misunderstood ideological purity trumps technical arguments.

Let's imagine your mother has received a PDF document in her email, and she has to add it to the repo. Would you really make her use git? Just making her email the file to you isn't a good solution in this case.

I would not make her use command line git, but I would make her use one of the many GUI clients available for git. If she can be trained to use whatever you're using now, she can be trained to use one of these.

I would be nice to her and spend some time hanging out with her after the training, of course. She's my mom. I would also expect her to need repeated training, but I would expect that of anyone who is not tech savvy.

Let's imagine your mother has received a PDF document in her email, and she has to add it to the repo.

If your programming staff have the computer skills of the stereotypical mother that you are using as your example, then your company is in much worse trouble than just not having a good source control system.

I've heard tons of good things about SourceTree [atlassian.com], a freeware Git / Mercurial client originally for Mac but recently released for Windows. I know it's used by lots of non-technical folk for non-coding purposes (think designers managing Photoshop projects), so it might work for your scenario too. I haven't used it myself, though, so I can't comment from personal experience.

By the way, you could set things up so that you'd have a different repository for each project, similar to your set of root folders, instead of a single monolithic repository.

Create one repository per project (or per client), with a sensible naming scheme for the SVN URLs. For instance, put your repositories under /var/svn or /srv/svn, and give them all a 'proj-XYZ' style name (or 'client-XYZ').
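That layout can be sketched in a couple of commands. The root directory and client codenames below are illustrative, and the checkout URL assumes svn+ssh access to the server:

```shell
# One repository per client under a common root (illustrative names).
SVNROOT=$(mktemp -d)        # in real life: /srv/svn or /var/svn
for client in abc def xyz; do
    svnadmin create "$SVNROOT/proj-$client"
done
# Developers would then check out e.g.:
#   svn checkout svn+ssh://server/srv/svn/proj-xyz C:\Clients\xyz
```

Keeping one repository per client also means a project can be renamed inside its repository without breaking the other checkouts.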

We store all of our projects in SVN as well. Most of the files that end up in there are Word documents, Excel spreadsheets, plus a bit of HTML, scripts, and other things. So we're dealing with less technical users as well.

Syncing monotone between machines is easy. You use the sync command and everything is copied back and forth. Yes, it will distinguish the changed from the original version, and it will *not* copy the wrong way.

You don't need to merge before syncing. You don't have to sync all projects. But if you do, everything will automatically be backed up.

There are UIs like TortoiseGit for Git. But I'd recommend evaluating other DVCSes too, like Mercurial (& TortoiseHg). They may fit your needs better. Don't waste your time evaluating too much; they are all perfectly fine. But do pick a *distributed* VCS instead of a simple VCS. They are Just Better(tm).

Putting a PDF into git in a way does not make sense, but then it does not hurt either. Sooner or later you'll notice it saves your day and therefore did make a lot of sense. It makes much more sense than put

OwnCloud fails in a large multi-user environment over low-speed connections, and with office files when users are working in the same synced folder. We tried it; for a single user in a folder it worked great, but once we added others (especially with low-speed or low-quality connections, a.k.a. traveling users) it failed miserably. We got many, many conflicts for no reason, and many 0-byte files due to cross-sync issues.

I've found BitTorrent Sync great for selective syncing... You select a folder to sync and it gives you a hash that you can give to someone else, who just needs that hash and can specify any folder for those files. If you have a project wiki or something, you can put the hash alongside the project name, and anyone who wants that synchronized folder can put in that hash. It does LAN syncing so it doesn't always have to hit the web; you could set up a server, assign each project to a sync target and sh

We ditched OwnCloud for GoodSync. OwnCloud in theory works OK. However, if someone with a laptop is away from the Internet, it is really bad about losing all the client settings and just quits syncing. Also, in the right circumstances, it can generate a huge number of files by versioning during collisions.

GoodSync is not perfect, either, but it is far superior to OwnCloud. Is it free? No. Work better? Yes.

In theory, GoodSync has support. I was not thrilled with it. We kind of got cross and they thought I was

Again, try the new version 1.4. As discussed in https://github.com/owncloud/mirall/issues/29 [github.com], inotify on Linux does not guarantee that the sync client will get notified of file changes under all circumstances, so we do need to poll. Any constructive hints are welcome.

It's time someone constructed a version-control file system. I mainly use subversion and it can be a real pain when you used mv instead of svn mv. (And other related commands). These things should be transparent (not that that's an indictment of Subversion).

6. VSS - My experiences with it were not good. I hear it's improved. I'd look at it ONLY if I was in a Microsoft only shop.

VSS is a deprecated tool at this point, last updated in 2005, and only having extended support from Microsoft at this time. Team Foundation Server [wikipedia.org] is Microsoft's current source control offering.

It's not great (not my first choice), but it is not that awful either. I like the concept of a vob and the powerful access control models that come with it. It's just the idiotic complexity of attributes and objects in ClearCase that uglifies it.

I wonder how a company even gets into this situation. Granted, I don't think many schools teach the fundamentals of how to actually manage a project but I think even in the greenest of startups there should be at least ONE person who's held a real job before and has seen this. Nothing here is new. SCCS was started in 1972. Having multiple programmers on a project has existed since computers were invented.

My guess is that you get some clueless business-oriented guy making a startup ("I'll get rich, my id

Each user pulls down the directories relevant to him/her from the overall repository, updates at will from the central source, and pushes up changes at will with a couple of mouse clicks.

TortoiseSVN will even give you handy little icons on your local folders in Explorer to tell you if what you have in your local directory isn't synced with the central server, and it's two clicks to make that happen. I actually think you don't want to "force synchronization" on an ongoing basis, seems like a great way to overwrite a lot of your developers' (and others') ongoing work.

Yours is the answer I wanted to write and is the solution I've used for just such situations. However, it violates one of the first ridiculous requirements: "It is important that subscribing to a project is as easy as copying it from Z:\ to C:\projects\."

Koookie, that may be an easy way to copy files, but it is a miserably difficult way to manage projects - even for developers. SVN takes a little time to set up and understand, and requires maybe a day's worth of training for any developer who hasn't used source control before.

Our approach (we use SVN) is to create working copies for every single client under a C:\Clients\xyz scheme. Where 'xyz' is the 3-letter codename for each client.

Each of those working copies is created with a script (stored in a 'svn-scripts' repository) that does a "file children only" checkout. The svn command for that is "svn checkout --depth files URL targetdir".

Because we only bring down immediate file children in the root of the repository, it doesn't take up much space. And bringing down a par
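For anyone wanting to try the scheme above, here is a self-contained demo of a "files only" sparse checkout. The throwaway file:// repository, layout, and project name are all made up; in real life the URL would be your server's, e.g. svn+ssh://server/srv/svn/proj-xyz:

```shell
# Build a throwaway repository with one root file and one project folder.
BASE=$(mktemp -d)
svnadmin create "$BASE/repo"
mkdir -p "$BASE/layout/123_ProjectName"
echo notes > "$BASE/layout/README.txt"
echo plc   > "$BASE/layout/123_ProjectName/main.pcs"
svn -q import "$BASE/layout" "file://$BASE/repo" -m "initial layout"

# The sparse checkout: only immediate file children come down.
svn -q checkout --depth files "file://$BASE/repo" "$BASE/wc"

# Later, "subscribe" to one project by deepening just that folder.
svn -q update --set-depth infinity "$BASE/wc/123_ProjectName"
```

The `--set-depth infinity` update is the "subscribe" step: name a folder that isn't in the working copy yet, and SVN pulls down just that project in full.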

We used to use ViceVersa Pro to sync our team but eventually moved over to Plastic SCM, which has been friggin' awesome. It not only supports code, but also art assets. Plus it has the best support for branching. One team can be working on a branch specific to one project while another works on a second branch, and the main trunk stays clean and buildable. You can even have developers run their own local repository on their desktop/laptop and have them replicate/merge either on a schedule or when they connect to the LAN (if they work offsite a lot).

You only bring stuff back to the main trunk when you're ready to merge a branch back in. You can even merge branches separately from the trunk. We check everything into it: code, art, and our Doxygen output. It has saved us orders of magnitude in time.

Syncing files like this is a mess. Perhaps you should look beyond share drives. You are trying to solve the technical problem, but if you step back you might see a business problem. Consider three alternative approaches:
1) Keep the files on the share drive and do not mirror them locally.
2) Use a source control system (e.g. Git, TFS, Subversion).
3) Use a groupware / content management / document management system (e.g. SharePoint, Confluence, QDMS, Lotus Notes, Microsoft Exchange, Drupal, SAP, GroupWise).

You have programmers. You have multiple projects. They might be working offline. For this, you really need a distributed source control system such as git or mercurial. I personally recommend mercurial as it's got good Windows tools (TortoiseHg and HgScc for Visual Studio integration). You can put your "pure" repository on your share, then have the programmers push to it -- or, better yet, have an "incoming" for each project to which anyone can push, then a "pure" to which only project leads have write access.

If you hire experienced, competent programmers, they will be able to solve this issue for you. First they will suggest using version control (it seems frighteningly likely from your writeup that you're not currently using it). Probably git, but there are other good ones.

At that point the problem will become redefined. What you want is a script that:
- Iterates through the local working directory.
- Finds project folders that are NOT being worked on and are also currently clean (no uncommitted changes), and removes those local copies.
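The "find clean checkouts" step above can be sketched in a few lines, assuming a hypothetical layout where each project folder under ~/projects is its own git repository:

```shell
#!/bin/sh
# List project checkouts with no uncommitted changes -- the safe
# candidates for local removal. (Layout is an assumption: one git
# repository per folder under ~/projects.)
for dir in "$HOME/projects"/*/; do
    [ -d "$dir/.git" ] || continue
    if [ -z "$(git -C "$dir" status --porcelain)" ]; then
        echo "clean: $dir"
    fi
done
```

`git status --porcelain` prints nothing for a clean working tree, which makes it easy to script against; a wrapper could then delete or archive whatever this loop reports.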

I was at one company where most programmers were independent. It was sort of an odd business model that derived from the mainframe world; an author would write an application, write the docs, package it up, maintain it, and also collect a declining percentage of royalties. If the program was successful you could hire a developer to assist.

So my boss and I were the only people working on 2 programs. My boss learned with a "Learn C in 20 Days" book and previously had only done mainframe assembler and servi

Because I flat out refuse to believe the entire team doesn't know any better.

It's way more common than you'd think. I worked for a company back in 2000 with close to 20 developers. Not a single one of them knew how to do version control. Instead, they all resorted to making copies of files, tacking on dates or times or just numbers, and relying on the backup tapes if they had to undo a particular change. It was hellish, and you'd hear at least one conversation per week where they were trying to figure out who had the latest version of XYZ.

I rolled out VSS (hey, it was 2000, our choices were limited) with SourceOffSite for the remote workers. If I did it today, it would be either Hg, git or svn.

Very few schools back in the 90s or early part of the decade taught VCS concepts or forced students to use them. This is slowly changing with the advent of things like 'github' which has a big mindshare and introduces people to the concept of VCS.

Yep. In a former job at a very large financial services firm, I worked on their source control/build/packaging/deployment systems, and I was stunned by the number of developers who not only resisted using source control, but actually would not even acknowledge the value of it. One time, I raised gasps in a meeting of several dozen devs by making the statement "If you have a problem using source control, you have no business being a professional software developer." Yes, that was a controversial statement.

If you don't want to do version control as others have suggested, then I recommend Super Flexible File Synchronizer. It is a great product with lots of options in regards to what does and does not get synced. It is inexpensive to boot.
http://www.superflexible.com/ftp.htm [superflexible.com]

Well, actually, plain rsync will probably be sufficient for your needs, and rsnapshot is probably a little more than your needs. I suppose the only thing you need to configure is an rsync server on Windows; a nice writeup can be found here: http://www.stillnetstudios.com/snapshot-backups-howto/ [stillnetstudios.com]

I recently used this to configure my wife's Windows PC, so that it will work with rsnapshot, and back up all her projects. After configuring the rsync server on Windows, the rsync operation

You said you have programmers working on these projects. They probably each have their own preferred way to do this, why not ask them? If they can't come to a consensus, you could have them write their own solution.

"I'm a contractor. I have a team of carpenters who are tasked with building a house. It seems this is going to require the driving of a large number of nails. My team of carpenters would like to know what sort of tool or mechanism would work best to drive these nails. Right now, we have one guy who holds the nail while another guy hits it with his thermos. This does eventually drive in the nail, but 90% of the time the nail bends, and it's denting our thermoses. I wonder if there exists some genius, super-carpenter bad-ass out there who might be able to suggest a better way."

I second git. However, git scales badly with large files. If this is a problem, you could have a look at git-annex: http://git-annex.branchable.com/ [branchable.com] The concept needs some time to grasp, but it's really powerful.

Git is absolutely not a good first version control system for people who are clueless about version control. (Such as, evidently, your developers).

Git requires prior experience with at least two simpler version control systems. In git, you often run into scenarios that require you to understand its complicated repository representation so that you can choose the best steps to unravel them, based on understanding the ramifications of each approach.

The implementation of git is not hidden from the user behind a robust set of "no brainer" use cases.

The decentralized model alone will confuse the heck out of workers with no prior version control experience.

Use a system that has a centralized server from which working copies are checked out, like Subversion.

SVN or Git for the code repository. Easy to set up, lots of Windows and Linux tools, command-line based if you want that.

However, for documentation I recommend Confluence, or one of the many free wiki-based collaborative solutions. These let people post their documents on a Wikipedia-like site. Documents become automatically searchable, and people can collaborate on the documentation with version control built in. Confluence allows you to drag and drop PPTs, import and export Word documents, and drop Excel files into the page.

Sparkleshare (http://sparkleshare.org/) is a "transparent" front end for Git which turns it into a simple file sharing tool. This would probably be appropriate for most of the actual "file sharing" applications the OP mentions (gaining many of the advantages of Git while keeping the complexity hidden until it's needed), while obviously any source code projects should find their way into some kind of version control repository, probably Git as well, with TortoiseGit (http://code.google.com/p/tortoisegit/) being the obvious client choice.

The developers at my company, who use Windows laptops, keep a lot of their code in VisualSVN Server [visualsvn.com] and then use either AnkhSVN [collab.net] to do the checkout straight into Visual Studio, or TortoiseSVN [tortoisesvn.net] if you want to be able to right-click in any folder and check out to that location.

I agree that a VCS is the best choice, as suggested above. If, for some reason, you don't want to use one (e.g. too many binary files grow your repo too much), unison [upenn.edu] is a great choice. It uses the rsync algorithm to transfer files, and keeps track of which files were modified on which side.

I'm curious about a couple of things. Of course pretty much everyone here is screaming "source control." But how is it that you have a programmer working on THIRTY projects at one time? Perhaps I'm misunderstanding your use of "project", but I think I would go crazy if I were trying to juggle thirty different projects. Perhaps some sort of consolidation is in order.
You also mention that if you sync the whole thing, navigation would be a problem with too many subfolders. A good source control system will help alleviate this.

This works great with only one caveat: most office documents are binary files (or are treated like binary files by the SCM), so you'll need to put a process in place to lock the file in question prior to editing to prevent people stomping over others' changes to them.

Maybe, maybe not. MS Word has its own functionality to merge changes from different versions, so you might not want to lock it down. The big problem with locks is people forgetting to check things back in and locking everybody out.

No. Only those that require a "checkout; modify; check in" work pattern. Git, SVN and others follow a "modify, merge, commit" pattern which requires no locking. Though SVN can be configured to "require" (more like "request") locking on binary files which would help.
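The "request locking on binary files" bit mentioned above is SVN's svn:needs-lock property: files carrying it are checked out read-only until someone takes the lock. A self-contained demo with a throwaway file:// repository and an example filename (in real use the URL would be your server's):

```shell
# Throwaway repo with one "binary" document.
BASE=$(mktemp -d)
svnadmin create "$BASE/repo"
mkdir "$BASE/layout"
echo doc > "$BASE/layout/ProjectPlan.docx"
svn -q import "$BASE/layout" "file://$BASE/repo" -m "add plan"
svn -q checkout "file://$BASE/repo" "$BASE/wc"

# Mark the file so fresh checkouts get it read-only until locked.
svn propset -q svn:needs-lock '*' "$BASE/wc/ProjectPlan.docx"
svn -q commit -m "Require locking on the plan" "$BASE/wc"

# Before editing, take the lock; committing the edit releases it again.
svn lock "$BASE/wc/ProjectPlan.docx"
```

The read-only bit is only a reminder, not enforcement, but in practice it stops most "Bob and I both edited the spreadsheet" accidents.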

This is also one reason to use something other than binary only or proprietary format for documentation, or to split the documentation up into many small files. Text or basic HTML works great. Wikis or wiki-like servers are also good (though you may end up in a fight against the MS faction in IT that insists you use Sharepoint instead, at which point you're better off changing companies).

The problem with things like docx is that while it is textual, it is not necessarily structured uniformly, so that minor changes can show up as large, hard-to-merge diffs.

That's why you pair Git with GitLab (a private/local version of GitHub). Or if you want to purchase something, go with RTC or JIRA. RTC & JIRA have the added advantage of issue/bug tracking, but all offer you the capability of adding a SharePoint/wiki-like site to your revision tracking.

Obvious Microsoft questioner is obvious (syncing "C:" and "Z:"?). A Microsoft shill would have recommended Azure and VS2013 Cloud Services for source and revision control. Which I would never, ever do - give all my source code to another firm? Hells no. I hesitate to put anything in GitHub for the same reason.

This works great with only one caveat: most office documents are binary files (or are treated like binary files by the SCM), so you'll need to put a process in place to lock the file in question prior to editing to prevent people stomping over others' changes to them.

That's typically not needed. MS Office products, for instance, have built-in diff review-n-merge capabilities. And even without such a capability, a src-controlled binary is simply that, a coarse-grained resource. Whoever commits last is the one whose changes prevail. This is what is done with SharePoint or TeamSite anyway. No need for locking.

You still need to lock your files before you bang hard on that awesome Office built-in functionality. It doesn't change the fact that you diff and merge your changes with Bob from accounting while he's working on the same thing, you check yours in, then he checks his in, and magically your changes disappeared. I wasn't arguing about the value of the internal revision tracking of the tools in question; your changes will still get blown away the first time you work on a document at the same time as 2-5 other people.

Having no need for Windows, I really don't keep up with its capabilities. But, really, this is the year 2013 --- Windows is still that pathetic, that basic tasks like syncing files between multiple computers take special software that doesn't just come with the OS? Why is anyone still using that crap? Are corporations really so utterly incompetent on IT issues that they'd put up with shit like this because they don't know any better?

Dude, even on Linux you need special software to sync files (rsync, git, whatever). Since file syncing is app-specific functionality, this is not an OS problem. This is an operator problem, and I've seen Linux/Unix sysadmins doing the same kind of crap job as the one described in the original question/article/whatever.

This is a solved problem on both Windows or Linux/Unix. But incompetent IT staff exists on both domains. You can't defensively program against stupid (nor should you.)

"But why 'sync' stuff at all? Why not work directly on the file server (Z: in this case)? Why mess with 'local' files that will even need synchronization?"

Because that way each developer would step on the others' toes. I bet that when working on an idea they still go renaming their files like foo.1, foo.2, etc., and that more or less fortnightly, if not more frequently, there are discussions about why you didn't ask me about file bar, now you have overwritten it with your own incompatible version.

As long as the system allows for concurrent deployment of multiple processes, multiple installation paths, and doesn't start services as daemons on conflicting ports, you can have hordes of developers on the same development server. This doesn't typically work well for OS, embedded, or resource intensive development projects, but it's a breeze for web projects and some basic application development. Still, this is 2013, and it costs less to get someone setup with a company laptop than it does to give them