Today all of our git repositories live at one site, A. Users at site B have to go over ssh to clone or to push changes. These are bare repos that are updated only through pushes.

Ideally, for git clone/push performance, I'd like to limit having to go over ssh.

I'd like to have a copy of git repo X live on both site A and site B, with some syncing mechanism between them. Or have X live on both sites, but only allow pushing to A (and have that set up correctly at clone time on B).

I'm worried about the case where someone on site A pushes changes to the repo at site A at the same time that someone on site B pushes a truly conflicting change to the repo at site B.

Is there some 'sync'ing solution built into git for distributed open repos like this?

Or a way to have a clone from X set the origin/parent to the X from the other site?

2 Answers

I know this isn't the answer to the main question you asked, but it's going to be way, way easier if you don't have to deal with integrating two possibly-conflicting central repositories - sure, you could come up with a setup to automatically sync them, but at some point, conflicts have to be human-resolved.

If users from site B can ssh into site A, they should be able to push/pull directly from the repo on site A - push/pull via ssh is one of the main ways to work with remotes in git. Look at the git-fetch or git-push man page - you'll see ssh URLs among the list of supported URL forms.

So at site B, you'd just add a remote:

git remote add siteA ssh://user@site.A/path/to/repo.git

If you really have bad bandwidth limitations and need to push/pull very frequently, I suppose this could indeed be a performance problem? I've never had any trouble with it myself.

You could theoretically give the two central repos post-update hooks which immediately push to the other central repo. This would work well until a push happened in both directions at the same time (maybe that's unlikely?) - then you'd need a true merge, requiring a non-bare repository, and possibly a human integrator to deal with conflicts. But as long as there's no simultaneous push, the repos will always be in the same place, and you don't have to worry about conflicts. If B gets updated, A gets updated too, and a user trying to push something conflicting into A will be forced to resolve it themselves.
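A minimal sketch of such a hook, just to make the idea concrete. Everything named here is an assumption for the demo: two local bare repos (siteA.git and siteB.git in a temp directory) stand in for the two sites, and in a real setup the hook would push to an ssh URL instead of a local path. Only the A-to-B direction is shown; B would get the mirror-image hook.

```shell
#!/bin/sh
# Demo: local bare repos stand in for site A and site B (hypothetical paths).
set -e
work=$(mktemp -d)
git init -q --bare "$work/siteA.git"
git init -q --bare "$work/siteB.git"

# Install a post-update hook on A that forwards every updated ref to B.
mkdir -p "$work/siteA.git/hooks"
cat > "$work/siteA.git/hooks/post-update" <<EOF
#!/bin/sh
# Arguments are the names of the refs that were just updated on A.
# If B has meanwhile received a competing push, this push is rejected as a
# non-fast-forward and someone has to merge by hand.
# Deleted refs are not handled in this sketch.
exec git push --quiet "$work/siteB.git" "\$@"
EOF
chmod +x "$work/siteA.git/hooks/post-update"

# A developer at site A clones, commits, and pushes to A only.
git clone -q "$work/siteA.git" "$work/dev"
git -C "$work/dev" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first commit"
git -C "$work/dev" push -q origin HEAD:refs/heads/main

# The hook should have carried the same commit over to B.
a_tip=$(git --git-dir="$work/siteA.git" rev-parse refs/heads/main)
b_tip=$(git --git-dir="$work/siteB.git" rev-parse refs/heads/main)
echo "A=$a_tip"
echo "B=$b_tip"
```

The hook only helps while pushes are fast-forwards on the other side; it does nothing to resolve the simultaneous-push case described above.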

There are a couple of ways to do this. It sounds like you mostly want a closer place for people to pull from at the remote site B.

Since git repos don't make many changes to existing files (mostly just the refs/heads files), something like rsync works just great for making backup/duplicate copies of repos. That way, you could have a site B repo that people at site B can fetch/pull from.
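Something along these lines would do for the rsync copy. This is a sketch under assumptions: mirror_repo is a made-up helper name, the host and paths in the usage comment are placeholders, and it would typically be run from cron at site B. A copy taken while a push is in flight on A may be briefly inconsistent; the next run should bring it back in line.

```shell
#!/bin/sh
# mirror_repo SRC DST: copy a bare repo with rsync (hypothetical helper).
mirror_repo() {
    # -a copies the tree as-is, including permissions and timestamps.
    # --delete removes refs and packs at DST that no longer exist at SRC.
    # The trailing slashes copy the contents of SRC into DST instead of nesting it.
    rsync -a --delete "$1/" "$2/"
}

# Example invocation at site B (placeholder host and paths):
# mirror_repo siteA:/path/to/repo.git /srv/git/repo.git
```

Because --delete makes the copy follow A exactly, anything pushed directly into the site B copy would be thrown away on the next run, so this only fits a pull-only mirror.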

This doesn't quite address the both-sites-push-to-the-same-branch issue you're worried about. That issue is non-destructive, though: in the worst case, someone would have to create a merge commit and push it to both repos to reconcile the two sets of patches.
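To show what that worst case looks like, here is a rough walk-through with two local bare repos standing in for the sites. The repo paths, the remote names siteA and siteB, the branch name main, and the file names are all made up for the demo, and the two changes are chosen so that they merge cleanly; a real conflict would need manual resolution before the commit.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
git init -q --bare "$work/siteA.git"
git init -q --bare "$work/siteB.git"
git init -q "$work/dev"
# g: run git inside the working clone with a throwaway identity.
g() { git -C "$work/dev" -c user.name=demo -c user.email=demo@example.com "$@"; }

# Seed both sites with the same base commit.
g commit -q --allow-empty -m "base"
g branch -M main
g remote add siteA "$work/siteA.git"
g remote add siteB "$work/siteB.git"
g push -q siteA main
g push -q siteB main
base=$(g rev-parse HEAD)

# Simulate the race: one change lands on A, a different one on B.
echo "from A" > "$work/dev/a.txt"
g add a.txt
g commit -q -m "change pushed at site A"
g push -q siteA main
g reset -q --hard "$base"
echo "from B" > "$work/dev/b.txt"
g add b.txt
g commit -q -m "change pushed at site B"
g push -q siteB main

# Reconcile: fetch A's history, merge it into B's, push the result to both.
g fetch -q siteA
g merge -q --no-edit siteA/main
g push -q siteA main
g push -q siteB main

a_tip=$(git --git-dir="$work/siteA.git" rev-parse refs/heads/main)
b_tip=$(git --git-dir="$work/siteB.git" rev-parse refs/heads/main)
echo "A=$a_tip"
echo "B=$b_tip"
```

Both pushes at the end are plain fast-forwards, because the merge commit has each site's tip as a parent. Nothing is lost on either side.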

To fix that you could make site A the primary repo and site B a read-only slave. There's an option you can set in your .git/config, described in the git-fetch manpage: url.<base>.pushInsteadOf. Say your URLs are "ssh://siteB" and "ssh://siteA". A configuration to support this would be something like this: