Introduction

Git was created by Linus Torvalds to manage the Linux kernel code after BitKeeper withdrew permission for the kernel developers to use their proprietary system free of charge.

Linus specifically designed git to not be like CVS (which he hated from his time at Transmeta), to support a distributed workflow, to provide very strong safeguards against accidental or malicious corruption, and to have very high performance. These latter features have resulted in it becoming very popular in the open source world.

Further general information can be found on the git Wikipedia page. The "why git is better than x" website gives an overview of the various advantages of git relative to other revision control systems.

Basic workflow

At any one time, while working with a git repository, there are three stages work can be saved under (see the gitglossary man page for further definitions):

working tree - the files that are in the current directory,

index - a stored snapshot of the working tree, and

branch - an active line of development.

It is useful to understand this in terms of what is happening behind the scenes. A git repository is actually just a collection of objects, that is, chunks of data, identified by their SHA-1 hashes (usually abbreviated to a unique leading part). The types of objects are

blob - a non-specific object (e.g., the contents of a file),

tree - a collection of blobs and tree references (i.e., a directory),

commit - information about a specific revision, including a commit message, references to parent commits, and an associated tree reference, and

tag - a reference object with a tag message.

The various branches are then just references to commit objects, which chain together in a directed acyclic graph to produce a history. The index is a tree object. Committing the index creates a new commit object that records the commit message, the current index tree, and current commit object as the parent. The current branch reference is then updated to reference the new commit object.
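The object model described above can be inspected directly. The following is a minimal sketch, assuming a throwaway repository in a temporary directory; the file name, identity, and commit message are made up for illustration.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

echo "hello" > greeting.txt
git add greeting.txt                      # writes a blob and records it in the index
git commit -q -m "Add greeting"           # writes a tree and a commit, then moves the branch

commit=$(git rev-parse HEAD)              # commit the current branch points at
tree=$(git rev-parse 'HEAD^{tree}')       # tree recorded by that commit
blob=$(git rev-parse HEAD:greeting.txt)   # blob holding the file contents
index_tree=$(git write-tree)              # tree built from the current index

commit_type=$(git cat-file -t "$commit")
tree_type=$(git cat-file -t "$tree")
blob_type=$(git cat-file -t "$blob")
blob_content=$(git cat-file -p "$blob")

git cat-file -p "$commit"                 # prints the tree, author, committer, and message
git cat-file -p "$tree"                   # prints one line per entry: mode, type, hash, name
```

Right after a commit, the tree built from the index matches the tree recorded in the commit, which is one way to see that committing simply records the current index.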

Local

Working locally with git consists of repetitively working with the files in the working tree, adding them to the index at some point, and then committing the index to the active branch at some further point. Possibly it is easiest to think of the index as the staging area. Changes/updates are accumulated in it (git add <filename>) until some reasonable state of development is reached, and then they are committed to the current development branch (git commit).
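As a minimal sketch of the index acting as a staging area (assuming a throwaway repository and made-up file names), two files are modified below but only one is staged and committed:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > a.txt
echo "two" > b.txt
git add a.txt b.txt
git commit -q -m "Initial commit"

echo "one, revised" > a.txt
echo "two, revised" > b.txt
git add a.txt                             # stage only a.txt

staged=$(git diff --cached --name-only)   # what the next commit will contain
unstaged=$(git diff --name-only)          # changes that exist only in the working tree
git commit -q -m "Revise a.txt"
remaining=$(git diff --name-only)         # b.txt is still modified after the commit
```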

At any point in time, it is possible to fork off a new branch (git branch <branch> <commit> to create it or git checkout -b <branch> <commit> to create it and switch to it) or switch to an entirely different development branch (git checkout <branch>), the latter of which will update the index and the working tree appropriately. Much of the power of git comes from being able to work with several different active lines of development at once.
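A minimal sketch of creating and switching branches, assuming a throwaway repository; the branch names feature and experiment are hypothetical:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"
start=$(git symbolic-ref --short HEAD)    # initial branch name (depends on local configuration)

git branch feature                        # create a branch at the current commit, without switching
git checkout -q -b experiment             # create another branch and switch to it
echo "trial" > trial.txt
git add trial.txt
git commit -q -m "Try something"

git checkout -q "$start"                  # switch back; index and working tree are updated
current=$(git symbolic-ref --short HEAD)
```

After switching back, trial.txt is no longer in the working tree, because it exists only on the experiment branch.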

Different branches can be brought back together in various ways. It is possible to take the changes from a single commit from anywhere and apply them to the current branch (git cherry-pick <commit>). It is also possible to apply all the changes made in another branch to the current branch. This can be done by applying them on top of the current work (git merge <commit>) or underneath it (git rebase <commit>).

Detailed information about all the changes made is available via the logs (git log <commit>).
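The three ways of combining work can be compared side by side. This is a sketch under stated assumptions: a throwaway repository, a hypothetical helper add_file, and made-up branch names (topic, picked, merged, rebased). Each approach starts from the same point so the resulting histories can be inspected with git log.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper for this sketch only: commit a new file named after $1.
add_file() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "Add $1"; }

add_file base
start=$(git symbolic-ref --short HEAD)
git checkout -q -b topic
add_file t1
add_file t2
git checkout -q "$start"
add_file m1

git checkout -q -b picked "$start"
git cherry-pick topic~1 >/dev/null        # apply only the commit that added t1.txt

git checkout -q -b merged "$start"
git merge -q --no-edit topic              # join both histories with a merge commit

git checkout -q -b rebased "$start"
git rebase -q topic                       # replay m1 on top of topic's commits

git log --oneline --graph merged          # history with a merge commit
git log --oneline rebased                 # linear history
```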

Remote

Git is different from revision control software such as SVN in that it does not revolve around a central server. Each git repository is entirely self-contained and technically equivalent to all other git repositories. Distributed work relies on a sharing model of swapping changes (commits) back and forth.

While this can be done entirely by email (git format-patch <revision range> to export and git am <message> to import), this is usually only the case for people submitting patches to large open source projects. In a fully trusted situation, it is usually easier to directly import (git fetch <remote>, or git pull <remote> to fetch and merge) and export (git push <remote>) changes between repositories.
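A minimal sketch of the patch-based exchange, assuming two throwaway repositories (sender and receiver, both hypothetical names) and a plain directory standing in for email transport:

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q sender
cd sender
git config user.name "Sender"
git config user.email "sender@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "Base"
cd ..

git clone -q sender receiver              # receiver starts from the same history
git -C receiver config user.name "Receiver"
git -C receiver config user.email "receiver@example.com"

cd sender
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Add fix"
mkdir "$work/patches"
git format-patch -q -o "$work/patches" HEAD~1..HEAD   # one mail-formatted file per commit

cd ../receiver
git am -q "$work"/patches/*.patch         # recreate the commit from the patch file
applied_subject=$(git log -1 --format=%s)
applied_author=$(git log -1 --format=%an)
```

The recreated commit keeps the original author, while the person running git am is recorded as the committer.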

Although all git repositories are technically equal, it is common for one to be chosen as a stable reference repository, with developers choosing to only export complete and tested changes to it from their personal repositories.

Local work

The following provides some bare-bones commands to get you going. It is strongly recommended to look these commands up in their man pages (man git-<command> or git <command> --help), as there are many useful options and alternative ways they can be run.

Turning on color for various outputs is also recommended. This can be done on a global level (i.e., the option is stored in ~/.gitconfig instead of the project-specific .git/config) by passing --global to git config.
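The original commands are not reproduced here. One way to enable color globally is the color.ui setting; this is an assumption about what was intended, and recent versions of git already default it to auto. The sketch points HOME at a scratch directory so that running it does not touch a real ~/.gitconfig.

```shell
set -eu
# Scratch HOME for the sketch only; drop these two lines in normal use.
export HOME=$(mktemp -d)
unset XDG_CONFIG_HOME GIT_CONFIG_GLOBAL

git config --global color.ui auto         # colorize output when writing to a terminal
color_setting=$(git config --global color.ui)
echo "color.ui = $color_setting"
```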

Saving changes

First add the changes to the index via

git add <filename>

Then commit the changes by running

git commit

Files can be deleted and moved by git rm and git mv.
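A minimal sketch of moving and deleting tracked files, assuming a throwaway repository and made-up file names. Both commands update the working tree and the index together, so only a commit is needed afterwards.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

echo "keep" > old.txt
echo "discard" > junk.txt
git add old.txt junk.txt
git commit -q -m "Initial commit"

git mv old.txt new.txt                    # rename in the working tree and the index
git rm -q junk.txt                        # delete from the working tree and the index
git status --short                        # shows the staged rename and deletion
git commit -q -m "Rename old.txt, remove junk.txt"
files=$(git ls-files)                     # files tracked after the commit
```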

Remote work

The following provides some bare-bones commands to get you going. It is strongly recommended to look these commands up in their man pages (man git-<command> or git <command> --help), as there are many useful options and alternative ways they can be run.

The list of remote repositories git knows about can be obtained via

git remote -v

Duplicating an existing project

Cloning an existing project (git clone) adds a default remote, origin, for pulling future changes from. The last two commands set the system up so that git push origin will default to sending any changes made to the local master branch to the remote master branch.
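The commands this paragraph refers to are not reproduced here. The following is a hedged sketch of what duplicating a project might look like, using a throwaway local repository (upstream, a hypothetical name) in place of a real URL. The remote.origin.push setting is an assumption about how the push default could be configured, not the page's original command.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q upstream                      # stand-in for an existing project
cd upstream
git config user.name "Example User"
git config user.email "user@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "Base"
cd ..

git clone -q upstream copy                # duplicate the project
cd copy
git remote -v                             # lists origin with its fetch and push URLs
remotes=$(git remote)

# Assumed configuration: make a bare "git push origin" send the local
# master branch to the remote master branch.
git config remote.origin.push refs/heads/master:refs/heads/master
push_refspec=$(git config remote.origin.push)
```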

The second line sets the system up so git push <remote> will default to sending any changes made to the local master branch to the remote master branch. The third line sets the default remote to the created remote. The fourth line sets master as the default branch that git pull will merge into the local master branch.
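The four lines described above are missing from this page. A plausible reconstruction, offered only as a sketch, uses git remote add followed by three git config settings. The remote name hub, the bare repository standing in for the remote, and the exact refspecs are all assumptions.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q --bare hub.git                # stand-in for the remote repository
git init -q project
cd project
git config user.name "Example User"
git config user.email "user@example.com"
git symbolic-ref HEAD refs/heads/master   # use the branch name master, as in the text
echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

git remote add hub "$work/hub.git"                                # line 1: create the remote
git config remote.hub.push refs/heads/master:refs/heads/master    # line 2: default push refspec
git config branch.master.remote hub                               # line 3: default remote for master
git config branch.master.merge refs/heads/master                  # line 4: branch that git pull merges

git push -q hub                           # uses the configured push refspec
pushed=$(git --git-dir="$work/hub.git" rev-parse refs/heads/master)
```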

Pushing and pulling

Assuming the default refspec has been set up, commits can be sent to the remote by running

git push <remote>

Similarly, commits can be brought down from the remote by running

git pull <remote>

Before attempting to send the local changes with the former command, it is usually a good idea to run the latter command with the --rebase option to incorporate any existing changes present in the remote repository underneath the local changes (resolving any issues that arise).

It is also possible to just bring the remote changes into the remote-tracking branches (run git branch -a -v to see these) for inspection, without merging, via

git fetch <remote>
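The fetch, pull --rebase, and push sequence can be sketched with two clones of one bare repository. Everything here is an assumption for illustration: the directory names hub.git, alice, and bob, and the explicit branch argument, which avoids relying on a configured default refspec.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q --bare hub.git                # stand-in for the remote repository

git init -q alice
cd alice
git config user.name "Alice"
git config user.email "alice@example.com"
branch=$(git symbolic-ref --short HEAD)
echo "base" > base.txt
git add base.txt
git commit -q -m "Base"
git remote add origin "$work/hub.git"
git push -q origin "$branch"
cd ..

git clone -q hub.git bob
cd bob
git config user.name "Bob"
git config user.email "bob@example.com"
echo "bob" > bob.txt
git add bob.txt
git commit -q -m "Bob's work"

# Meanwhile, Alice publishes another commit.
cd ../alice
echo "alice" > alice.txt
git add alice.txt
git commit -q -m "Alice's work"
git push -q origin "$branch"

cd ../bob
git fetch -q origin                       # update origin/$branch only; nothing is merged
behind=$(git rev-list --count "HEAD..origin/$branch")
git pull -q --rebase origin "$branch"     # replay Bob's commit on top of Alice's
git push -q origin "$branch"              # now a fast-forward, so it is accepted
```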

All of these commands use refspecs, which name references stored under .git/refs, to specify the movement between local and remote branches. At some point, it is recommended to read the information about this in the git-push, git-fetch, or git-pull man pages in order to understand the + option and fast-forwarding.
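A minimal sketch of how the + option interacts with fast-forwarding, assuming a throwaway bare repository (hub.git, a hypothetical name). A refspec has the form <source>:<destination>. After history is rewritten locally, the plain refspec is refused because the remote branch cannot be fast-forwarded, while the + form overwrites it. Forcing discards the old remote commit, so it should be used with care on shared branches.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q --bare hub.git
git init -q project
cd project
git config user.name "Example User"
git config user.email "user@example.com"
branch=$(git symbolic-ref --short HEAD)
echo "draft" > notes.txt
git add notes.txt
git commit -q -m "Draft"
git remote add origin "$work/hub.git"
git push -q origin "refs/heads/$branch:refs/heads/$branch"

git commit -q --amend -m "Draft, reworded"   # rewrite history: a new commit replaces the pushed one

# Not a fast-forward, so the plain refspec is rejected.
if git push -q origin "refs/heads/$branch:refs/heads/$branch" 2>/dev/null; then
    plain_push=accepted
else
    plain_push=rejected
fi

# A leading + allows the non-fast-forward update.
git push -q origin "+refs/heads/$branch:refs/heads/$branch"
remote_head=$(git --git-dir="$work/hub.git" rev-parse "refs/heads/$branch")
```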

Sharing Repositories

All users with a common sponsor belong to the same sponsor group. This gives them read-only access to each other's directories by default, which makes it possible for each of them to pull changes from the others. For example, running git remote add with the path to another user's git repository, followed by git fetch, will add that repository as a remote of the local one and fetch it.

For access from outside SHARCNET's clusters, just make the remote an SSH location

git remote add <remote> <user>@<cluster>:/home/<other user>/<git dir>

where user is your SHARCNET username and cluster is any of the SHARCNET clusters.
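Fetching from another user's directory can be sketched locally. The directories colleague_repo and my_repo and the remote name colleague are hypothetical stand-ins for /home/<other user>/<git dir> and the local repository.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q colleague_repo                # stand-in for the other user's repository
cd colleague_repo
git config user.name "Colleague"
git config user.email "colleague@example.com"
branch=$(git symbolic-ref --short HEAD)
echo "shared" > shared.txt
git add shared.txt
git commit -q -m "Colleague's work"
cd ..

git init -q my_repo
cd my_repo
git remote add colleague "$work/colleague_repo"   # read access to the directory is enough
git fetch -q colleague                    # their branches appear as colleague/<branch>
git branch -r
fetched=$(git rev-parse "colleague/$branch")
theirs=$(git -C "$work/colleague_repo" rev-parse HEAD)
```

Nothing in the local branches changes until the fetched commits are merged, rebased onto, or cherry-picked.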

Cluster Repositories

The --shared=<umask> option can be used with git init to override the default permissions, where <umask> is a standard UNIX 0xxx octal permission specification or one of the keywords group (group writable) or all (world readable).

Creating a repository that all members of the group can write to is an effective way to set up a master repository. For example, the supervisor can do

git init --bare --shared=group

in some master dir under their home directory (so it is backed up), where the --bare option just specifies that a working tree is not required. An existing repository can then be pushed into it via

git remote add master /home/<supervisor>/<master dir>
git push master

New members of the group can then get their own copy to work with by just doing

git clone /home/<supervisor>/<master dir>

For access from outside SHARCNET clusters, /home/<supervisor>/<master dir> would have to be replaced with <user>@<cluster>:/home/<supervisor>/<master dir> as before.
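The whole master repository workflow can be tried out in a scratch directory. In this sketch the temporary directory stands in for /home/<supervisor>, the remote is named shared to keep it distinct from a branch called master, and the branch is named explicitly on push; all three are assumptions made for illustration.

```shell
set -eu
work=$(mktemp -d)                         # stand-in for /home/<supervisor>
cd "$work"
git init -q --bare --shared=group master.git
shared_setting=$(git --git-dir="$work/master.git" config core.sharedRepository)
is_bare=$(git --git-dir="$work/master.git" config --bool core.bare)

git init -q existing
cd existing
git config user.name "Example User"
git config user.email "user@example.com"
branch=$(git symbolic-ref --short HEAD)
echo "work" > work.txt
git add work.txt
git commit -q -m "Existing work"
git remote add shared "$work/master.git"
git push -q shared "$branch"              # publish the existing history
existing_head=$(git rev-parse HEAD)
cd ..

git clone -q master.git member            # what a new group member would run
member_head=$(git -C member rev-parse HEAD)
```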

Sponsor group equivalent access can also be given to other specific groups and/or users by using FACLs.

Granting access this way does not give full permissions. Because named user and group permissions are always masked by the mask permissions (what non-FACL-aware programs call the group permissions), the named user or group gets at most access equivalent to that of the sponsor group (i.e., what was specified with --shared=<umask>).

Permissions for the sponsor group can also be revoked, so that only the named groups and users have access.