Hey, there! Here at Plataformatec we like to do project rotations. It means that every three months or so, developers can swap projects. It has lots of benefits like working with different people, getting out of the comfort zone, sharing skills and knowledge, and the best one: a new developer can spot problems that people working for a longer time in the project may not see, since they are used to it.

But, it comes at a cost. Each project has its own setup process and may slow down the development. We’re using Boxen from GitHub to solve this problem. It works very well and allows us to have a project environment quickly set up.

But recently we have run into a problem that Boxen couldn’t solve. We had a project which has multiple repositories and some of them are too large. It would take some time just to git clone their > 3GB size repos.

Our first thought was creating a tar file with gzip or lzma compression. The problem with it would be when extracting, since file ownership and permissions on it could be a problem just like symlinks. So, the solution was to git clone the smallest repos and git bundle the larger ones. Git bundle is shipped with git, but only a few people know about.

The workflow we have is simple. Someone with the repo already cloned and updated to the origin, type the following command:

$ git bundle create .bundle master

It will create a file called .bundle with a compressed version of the repository containing the master history. So, this file can be shared via a flash drive, AirDrop or netcat (using the internal network) among developers, and it will take way less time than cloning it.

Now that you have the file in hands, it requires two steps to work properly. The first one is extracting the bundle into a cloned repository. This can be achieved by:

$ git clone .bundle -b master

In case you want to clone into a different path other than in the current working directory, you can pass the path after the -b master option. As you can see, the repo is cloned from a bundle file just like it could be an URL or some bare git repo in your machine. That said, just take a look at your origin remote.

It is pointing to the bundle filename, so every time you fetch or push, it will try to do so in this bundle file. To fix that, we go to the second step which is setting the proper remote URL.

$ git remote set-url origin

That’s it. You’re ready to go. If you have multiple repositories to share, you can create a script to automate the cloning and the url setting for the origin. You can share all the repos with this script, for faster and easier setups.

Example

The Ruby language repository size is about 200MB. It is not big enough to require a bundle, but just as an example I guess it would be a nice fit.

For comparison how long does a local git clone take? If it’s faster then the bundle packing / unpacking then you could just clone locally from another developer machine…

Gustavo Dutra

Hi, Petteri!

It is faster cloning from a developer machine over intranet then over internet. Although, as git bundle does not require internet connection, cloning from a bundle package is faster then cloning from another developer repository. Also, it does not require any developer to be around when setting up a new project environment.

Petteri Räty

You still need to get the bundle copied from somewhere so you can’t compare just the clone with a clone over the internet. If you keep a bundle around, (to avoid needing an other developer) in for example local network somewhere, you might as well have a clone there.

Gustavo Dutra

Actually, Patteri, we have a flash drive available for that. So we can clone directly from the flash drive to someone’s computer (there is a script that do all the dirty work inside this flash drive).