Git Workflow

This document describes the typical Git workflow that all Flux members are expected to follow.

Assumptions

There is a main Git repository on git-public.flux.utah.edu.

All Flux members have a login on git-public.flux.utah.edu, and have created a "public" clone of the main repository. There are a number of reasons for each Flux member having this public clone:

It functions as a backup for all your work; the repo is backed up while your laptop or emulab node is not.

Permits easy collaboration with other Flux members as well as external collaborators.

Makes it easy to push changes around within Emulab, say from your laptop to your elabinelab; push to your public clone and then pull from someplace else.

Most branches are local to you and do not pollute the main repository; create as many branches as you like, the rest of us do not see them. This avoids branches that are never resolved (joined) in the main repository.

Lastly, it functions as a record of individual work, e.g., for reporting, proposals, and budget justifications. For many of us, our commit messages are the best record of what we have done.

Policies

Flux members are expected to push their work from their various private repositories to their public one on a regular basis. Need to define regular. Is it daily?

Topic branches are strongly encouraged instead of working on the master branch.

Workflow

Background

It is necessary to understand that a clone of a Git repository can have multiple remotes: the one you cloned from and any number of others. You can push to or pull from any of the remotes, and Git will keep everything straight, since it uses a hash of each object to determine whether a repository already has a particular change. In the Flux workflow discussed later in this section, there is a central repository called emulab-devel. Say you clone this repository twice, perhaps on different machines. Let's call the clones "clone1" and "clone2". Further, you make clone1 a remote of clone2. To summarize the arrangement so far: emulab-devel is the origin of both clone1 and clone2, and clone1 is also a remote of clone2.

Someone has committed a fix to emulab-devel, which you want to pull into both clone1 and clone2. One approach is to go to both repositories, and issue a pull, which will fetch the changes and merge them in:

clone1> git pull origin master
clone2> git pull origin master

Another approach, and the approach that is used in the workflow discussion below, is to pull the changes into clone2 and then push them to clone1:

clone2> git pull origin master
clone2> git push clone1 master

All of the changes that you pulled into clone2 are sent over to clone1. What if you had already pulled those changes into clone1? That is okay, since the changes all have unique hashes; a change in clone2 that is already in clone1 will be ignored.
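The arrangement above can be tried out safely in a throwaway directory. The following is a minimal sketch, not the actual Flux setup: every path is a local stand-in created under a temporary directory, and clone1 is made a bare repository here (an assumption) so that it can accept pushes to its master branch.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
sandbox=$(mktemp -d)

# Stand-in for the central emulab-devel repository, seeded with one commit.
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/master
echo base > README
git add README
git commit -q -m 'Initial commit'
git clone -q --bare "$sandbox/seed" "$sandbox/emulab-devel.git"

# clone1 is bare so it can receive pushes; clone2 is a normal working clone.
git clone -q --bare "$sandbox/emulab-devel.git" "$sandbox/clone1.git"
git clone -q "$sandbox/emulab-devel.git" "$sandbox/clone2"
cd "$sandbox/clone2"
git remote add clone1 "$sandbox/clone1.git"

# Someone commits a fix to the central repository.
cd "$sandbox/seed"
echo fix >> README
git commit -q -a -m 'Fix'
git push -q "$sandbox/emulab-devel.git" master

# Pull the fix into clone2, then push it on to clone1.
cd "$sandbox/clone2"
git pull -q origin master
git push -q clone1 master
# Pushing again is harmless: clone1 already has every object.
git push -q clone1 master
```

After this runs, the master branches of emulab-devel, clone1, and clone2 all name the same commit.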

Getting Started

Create your initial public clone of the main repository by logging into git-public.flux.utah.edu and issuing the following command (replace '<username>' with your account name on git-public):

Note: Git will say "Initialized empty Git repository in ..." Do not worry; your new repository is not really empty.

Create your private repository by cloning your public clone on your work machine. You may create as many private repositories in different places as you like, but for this discussion we'll just work with one.

Add the main repository as a "remote" of your private repository so that you can publish your work with a 'git push' command. Commits to the main repository will flow through your private clone to your public clone; you will not need to log into git-public to manage your public clone.

Initialize a couple of git configuration directives so that git will use your full name and email in commit messages. If you do not do this, it is difficult for others to figure out who has made the commits:
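The four setup steps above can be sketched end to end as follows. This is an illustration under stated assumptions rather than the exact commands for git-public: the repositories are local stand-ins under a temporary directory, "alice" is a hypothetical username, the public clone is assumed to be a bare repository, and the name and email values are placeholders. The remote name "central" matches the name used in the rest of this document.

```shell
set -e
sandbox=$(mktemp -d)

# Stand-in for the main repository (/flux/git/emulab-devel.git in this document).
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/master
echo base > README
git add README
git -c user.name=Seed -c user.email=seed@example.com commit -q -m 'Initial commit'
main="$sandbox/emulab-devel.git"
git clone -q --bare "$sandbox/seed" "$main"

# Step 1 (on git-public): create your public clone of the main repository.
public="$sandbox/users/alice/emulab-devel.git"
mkdir -p "$sandbox/users/alice"
git clone -q --bare "$main" "$public"

# Step 2 (on your work machine): clone the public clone; it becomes "origin".
git clone -q "$public" "$sandbox/work"
cd "$sandbox/work"

# Step 3: add the main repository as a second remote named "central".
git remote add central "$main"
git fetch -q central

# Step 4: record your identity for commit messages (placeholder values).
git config user.name 'Alice Example'
git config user.email 'alice@example.com'

# Show both remotes of the private repository.
git remote -v
```

On the real systems, the local paths would be replaced by the corresponding locations on git-public.flux.utah.edu, such as /flux/git/emulab-devel.git and /flux/git/users/<username>/emulab-devel.git.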

Also, if you are using a Windows machine for development, you'll need to set some additional configuration directives for git to automatically convert to LF line endings on commit and CRLF endings on checkout:

git config core.autocrlf true
git config core.safecrlf true

If you're using a unix-like environment for development, you may want to set these directives in case you add a file that has CRLF line endings:

git config core.autocrlf input
git config core.safecrlf true

This will put these settings in .git/config, and they'll apply to all future commits from that repository. If you want to set these options for all repositories on that machine, you can pass the --global option to have git set them in ~/.gitconfig. Once you've changed these settings, you'll need to check out the master branch of your repository again for them to take effect. To do this, run:

git checkout -f master

(Optional) Set up your public repository to send mail when you push to it. All of the following commands should be run on the git-public machine.

cd into your public git repo for the following commands (/flux/git/users/<username>/emulab-devel.git)

Set the name of your repo - this is the short string that will be included in the subject line of mail messages. Suggested: 'emulab-<username>'. (This will also be placed in an X-Git-Repo header in the mail for easy procmailing).

git config --add hooks.gitmail.reponame emulab-<username>

Set the address that all mail should go to (unless you have a reason to do otherwise, this should go to the Flux commit list).

Set the Reply-To address for the mail to yourself, or to a mailing list on which it's appropriate to discuss the development that's going on in the repository.

git config --add hooks.gitmail.replyto <you@flux.utah.edu>

Tell gitmail to not send mail about commits that are already in the main emulab-devel repo. This is necessary to avoid seeing tons of irrelevant mail when you do a push into your public repo.

git config --add hooks.gitmail.excluderepo /flux/git/emulab-devel.git

Tell gitmail to not include trivial merge commits in the commit email. For a merge to be considered trivial, each file changed must be identical to at least one parent's version of that file. In addition, the commit message for the merge must also be the default one generated by git. Merges are very common when using git, so only including merge commits where files had to be merged (or changed to resolve conflicts) reduces the noise for those reading your commit emails.

git config --add hooks.gitmail.hidetrivialmerges true

Test your configuration - this won't send mail, but will show you debugging output along with the mail that it would have sent:

./hooks/post-receive -d -t

Send a test email - if it works, you're done!

./hooks/post-receive -t

There are plenty of other options to gitmail, which are documented near the top of the script. For example, if you want diffs included in the mail, you can set the 'showcommitextra' option to '-p'.

Normal Work

Now, let's do some development work. First, create a branch in your public repository by pushing the head of the current local branch to a new remote branch. Then check out the remote branch locally. This will also make sure your local branch tracks the remote branch.
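One way to do this is sketched below. The exact commands are an assumption, and the repositories are local stand-ins under a temporary directory; "mybranch" is the example branch name used throughout this document.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
sandbox=$(mktemp -d)

# Stand-ins: a seeded public repository and a private clone of it.
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/master
echo base > README
git add README
git commit -q -m 'Initial commit'
git clone -q --bare "$sandbox/seed" "$sandbox/public.git"
git clone -q "$sandbox/public.git" "$sandbox/work"
cd "$sandbox/work"

# Publish the current head as a new branch "mybranch" in the public repository.
git push -q origin HEAD:refs/heads/mybranch

# Check the new branch out locally; --track makes origin/mybranch its upstream.
git checkout -q --track origin/mybranch

# Show the local branches and what each one tracks.
git branch -vv
```

From this point on, a plain git push or git pull on mybranch talks to the branch of the same name in the public repository.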

If mybranch already exists in your public repository and you want to work on it in this private repository, create a local branch that tracks the branch in the public repository.

git checkout --track origin/mybranch

As you work in the new branch, you should regularly commit your work to the new branch in your private repository.

git commit -a -m 'Made an important change to foo'

Along the way, you may want to fetch changes from the main repository to see what others have been working on:

git fetch central

You can use git log to look at what others have changed:

git log central/master

Then you can bring these changes into your master branch. If you are working on a branch and have uncommitted changes you can first commit them (see above), or you can stash them:

git stash

And then switch back to the master branch and rebase it (merging is discouraged for changes directly on the master branch):

git checkout master
git rebase central/master

Applying Upstream Changes to Topic Branches

When working on a topic branch you may find that there are upstream changes to master that you need to apply to your branch. There are two ways to do this: merging and rebasing.

Which method you use is mostly up to you. Merging is always safe, but adds a merge commit each time. This can make the history more complicated and harder to follow, but it also shows exactly when you applied upstream changes to your branch. Rebasing is useful if you decide that the current master branch head is a better starting point for your topic than where you originally started it. This may happen if someone else committed something to master that directly affects your topic, and you don't want to show that you had to merge those changes in. This helps to keep the history of your branch simple and linear.

Note that you MUST merge if you collaborate with other developers by asking them to pull changes from your branch. Rebasing would change the history and create problems for the other developers when they try to get your latest changes.

If you decide you want to merge changes from upstream into your topic branch mybranch, do the following:
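A minimal sketch of the merge approach follows, with the rebase alternative shown as a comment. It assumes that "upstream" means the master branch of the remote named central; the repositories are local stand-ins under a temporary directory, and the file names are hypothetical.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
sandbox=$(mktemp -d)

# Stand-in central repository and a private clone whose remote is named "central".
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/master
echo base > README
git add README
git commit -q -m 'Initial commit'
git clone -q --bare "$sandbox/seed" "$sandbox/emulab-devel.git"
git clone -q -o central "$sandbox/emulab-devel.git" "$sandbox/work"

# Work on a topic branch.
cd "$sandbox/work"
git checkout -q -b mybranch
echo topic > topic.txt
git add topic.txt
git commit -q -m 'Topic work'

# Meanwhile, someone pushes a fix to the central master.
cd "$sandbox/seed"
echo fix >> README
git commit -q -a -m 'Upstream fix'
git push -q "$sandbox/emulab-devel.git" master

# Merge the upstream changes into the topic branch.
cd "$sandbox/work"
git fetch -q central
git checkout -q mybranch
git merge -q --no-edit central/master

# Rebase alternative (only for branches nobody else pulls from):
#   git rebase central/master

git log --oneline --graph
```

The merge leaves one merge commit on mybranch, which records exactly when the upstream changes were brought in.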

Resolving Merge/Rebase/Stash Conflicts

Sometimes when merging, rebasing, or applying a stash you may see an error message similar to the one below:

CONFLICT (content): Merge conflict in xxx/yyy/zzz

If you see this, you will need to resolve those conflicts by hand and mark the conflicts as resolved by re-adding the changed files.

git add xxx/yyy/zzz

If you were just applying a stash, this is all you need to do. If you were merging or rebasing, you must tell git you've resolved the conflicts as follows:

If merging:

git commit

If rebasing:

git rebase --continue

Publishing Topic Branches

When you have made changes to your private repository, you should push them to your public repository to back them up and enable sharing them with other developers. You can push all branches that exist in both private and public repositories with this command:

git push origin

If you wish to push only a particular branch, you may do so with this command:

git push origin mybranch

Merging Topics with Master

When you have finished development on your topic branch and wish to merge it with master, do the following:
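One possible sequence is sketched below. It is an assumption rather than the official procedure: it first brings the local master up to date with central, then merges the topic branch into master (not the other way around), and finally pushes master to both central and the public clone. All repositories are local stand-ins under a temporary directory.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
sandbox=$(mktemp -d)

# Stand-ins: central repository, public clone, and private clone.
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/master
echo base > README
git add README
git commit -q -m 'Initial commit'
git clone -q --bare "$sandbox/seed" "$sandbox/emulab-devel.git"
git clone -q --bare "$sandbox/emulab-devel.git" "$sandbox/public.git"
git clone -q "$sandbox/public.git" "$sandbox/work"
cd "$sandbox/work"
git remote add central "$sandbox/emulab-devel.git"

# Finished topic work on mybranch.
git checkout -q -b mybranch
echo topic > topic.txt
git add topic.txt
git commit -q -m 'Topic work'

# Bring master up to date, merge the topic into it, and publish.
git checkout -q master
git pull -q --rebase central master
git merge -q --no-edit mybranch
git push -q central master
git push -q origin master
```

Merging in this direction keeps the central branch head on the first-parent history of master, which is what the push hook described under "Push Rejected" checks for.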

If you will not continue development on this topic branch, you can delete it from your private repository and your public repository.

git branch -d mybranch
git push origin :mybranch

When you make a change and it should be installed at Utah

Our current procedure is that the emulab-devel branch called current should closely match the code running on emulab.net. In general, this branch should contain everything from master and also local unstable code we want to test ourselves but aren't (yet) cruel enough to inflict on the world at large. Working on such experimental code proceeds much like normal development on master except the name is changed to current:

git checkout current
# work on local copy here
git commit
# repeat above as necessary until ready to publish local changes
git pull --rebase central current
git push central current

If and when the current branch is judged sufficiently stable to become the new master, the procedure is essentially the same as case 1 in the "Merging Topics with Master" section. First, perform the push to central/current as normal, and then:

More common will be the case when general fixes appearing on master should also be made on current. This is pretty much the same thing in the opposite direction, so first push the change to central/master, and then:

When Things Go Wrong

In an ideal world, the above instructions would be all we would need. We could incrementally add more and more commits which would continuously transform Emulab to a more and more perfect software system. Even if bugs were to be found, bug fixes could also be accommodated in the same manner. However, things are sometimes not that simple...

Code committed by mistake

Sometimes, commits will be made which in hindsight turn out to be a bad idea: either a commit of buggy or premature code, or a commit accidentally pushed to the wrong place. There are various ways to recover from situations like this:

If the commits have not been pushed beyond your personal repositories, it is best to discard them altogether. (There are several advantages to this approach: it cleans up the history; it makes subsequent bisection easier; and it conceals your mistakes.) git reset does this for you:

git reset --soft fabc0de

if you know that fabc0de is the last good commit before the mistake, or

git reset --soft HEAD~2

if you don't know the names of the commits but you know there are two you want reverted. (If things are so bad that you don't even want to keep the pieces for a post-mortem or a second attempt, you could use git reset --hard.)

Otherwise, if the commits have gotten to a public repository (e.g. emulab-devel), you will want to record a reversion in the history. For a simple commit (not a merge), this is simply a matter of identifying the bad commit:

git revert badc0de

This reversion will be a plain commit just like any other, so you can inspect it, push it to other repositories if appropriate, or discard it if necessary (see above).

Push Rejected

If you pull changes from the master branch of emulab-devel into a branch in your local repository, and then attempt to push the resulting merge up to emulab-devel, you will get an error similar to the one below:

remote: Your push to refs/heads/master is being rejected because the
remote: current branch head (03fb39215433e1ee85fb1eaa3a5369218c62bae8)
remote: is not reachable by a first-parent traversal of your commit's
remote: history.
remote:
remote: error: hook declined to update refs/heads/master
To git-public.flux.utah.edu:/flux/git/emulab-devel.git
! [remote rejected] master -> master (hook declined)
error: failed to push some refs to 'git-public.flux.utah.edu:/flux/git/emulab-devel.git'

This was caused by merging master into your topic branch (or pulling emulab-devel's master branch into your master branch). Doing this results in a very hard-to-follow history that makes tasks like reverting merge commits difficult. To prevent this problem, a hook script has been installed that prevents these commits from being pushed. Use one of the following solutions to fix your history so that your push will be successful.

For commits made directly to your master branch (e.g., quick fixes or commits that don't merit their own topic branch), using git pull --rebase is recommended instead of git pull. If you did a regular pull by mistake, do this to get going again:

git fetch central
git rebase central/master
git push

If you just merged master into your topic branch and want to push it upstream, you will need to undo the merge you just did and do it the other way around (meaning merge your topic branch into master). This assumes that you just did the merge and tried to push:
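A sketch of one way to do this follows. It rests on the assumptions stated above (the bad merge is the most recent commit on the topic branch and the working tree is clean), and the specific commands are an illustration, not the official recipe. The repositories are local stand-ins, and the stand-in central repository has no push hook installed.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
sandbox=$(mktemp -d)

# Stand-in central repository and a private clone whose remote is "central".
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/master
echo base > README
git add README
git commit -q -m 'Initial commit'
git clone -q --bare "$sandbox/seed" "$sandbox/emulab-devel.git"
git clone -q -o central "$sandbox/emulab-devel.git" "$sandbox/work"

# Topic work, followed by an upstream fix on the central master.
cd "$sandbox/work"
git checkout -q -b mybranch
echo topic > topic.txt
git add topic.txt
git commit -q -m 'Topic work'
cd "$sandbox/seed"
echo fix >> README
git commit -q -a -m 'Upstream fix'
git push -q "$sandbox/emulab-devel.git" master

# The mistake: master is merged into the topic branch.
cd "$sandbox/work"
git fetch -q central
git merge -q --no-edit central/master

# Undo it: HEAD^ is the first parent, i.e. the topic tip before the merge.
git reset -q --hard HEAD^

# Redo the merge in the other direction and push.
git checkout -q master
git pull -q --rebase central master
git merge -q --no-edit mybranch
git push -q central master
```

The first parent of the new merge commit is now the commit that was at the head of the central master, so a first-parent traversal from the pushed commit reaches it.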

Erroneous merge commits

If the merge has not been pushed to a public repository, then the same git reset approach is preferred for the same reasons as with a simple commit. (Note that you will have to reset HEAD in each branch you are still concerned about.)

Otherwise, you will want to revert one or more branches to the state they were in before the merge. (Typically, only master matters enough to bother.) First, find the name of the bad merge and examine its log:

git log badc0de

Look at the line which reads (for example):

Merge: fabc0de... badbad1...

and identify which branch is which. Assume that fabc0de is a good mainline commit on master, and badbad1 was the head of the branch at the time of merging: since fabc0de was listed first, you want to revert the commit from master with respect to its first parent. That can be done like this:

git revert -m 1 badc0de

That is all that needs to be done on the master branch. Unfortunately, subsequent merges involving the same branches are complicated by the fact that a reversion now exists on master. The easiest way to cope is to fork a new branch from master after the reversion, and duplicate your earlier work onto it. If this is unsuitable for some reason, there are alternative approaches to recovering, although they have drawbacks too.

"CRLF will be replaced by LF in <file>" Errors

You will get this error if git determines it can't convert between <file>'s current CRLF line-ending format and the LF format and back without losing data. To fix this, do one of the following:

If the file is supposed to be text, make sure all line endings are the same (either LF or CRLF, not a mix of both). If this file should have CRLF endings, add it to the .gitattributes file and unset the crlf attribute. You will also get this if core.autocrlf is set to input and all line endings are CRLF.

If the file is supposed to be binary, add it to the .gitattributes file and set the binary attribute.

If you modify the .gitattributes file, make sure to commit your changes and push them up to the emulab-devel repository.
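As an illustration, the two kinds of .gitattributes entries described above might look like the lines written by the sketch below. The file paths are hypothetical, and the repository is a throwaway one created under a temporary directory. Newer versions of Git document the same setting as the text attribute, with crlf kept as a backward-compatible name.

```shell
set -e
sandbox=$(mktemp -d)
git init -q "$sandbox/repo"
cd "$sandbox/repo"

# One entry per path: "-crlf" unsets the crlf attribute (no line-ending
# conversion), and "binary" marks a file as binary.
printf '%s\n' \
    'docs/windows-notes.txt -crlf' \
    'images/logo.png binary' >> .gitattributes

# Ask git how it now classifies each path.
git check-attr crlf -- docs/windows-notes.txt
git check-attr binary -- images/logo.png
```

git check-attr is a convenient way to confirm that an entry matches the intended file before committing the change to .gitattributes.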

Hot Fixes

If you want to make a quick change to the central (emulab-devel) repository, without creating a branch in your local clone, first switch back to the master (stashing any local changes on your branch).
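The preparation step might look like the sketch below, which stashes uncommitted topic work, switches to master, and brings master up to date with a rebasing pull. The choice of git pull --rebase is an assumption based on the recommendation under "Push Rejected". The repositories are local stand-ins under a temporary directory, and the file names are hypothetical.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
sandbox=$(mktemp -d)

# Stand-in central repository and a private clone whose remote is "central".
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/master
echo base > README
git add README
git commit -q -m 'Initial commit'
git clone -q --bare "$sandbox/seed" "$sandbox/emulab-devel.git"
git clone -q -o central "$sandbox/emulab-devel.git" "$sandbox/work"

# Uncommitted work in progress on a topic branch.
cd "$sandbox/work"
git checkout -q -b mybranch
echo wip >> README

# Meanwhile, someone else pushes a change to the central master.
cd "$sandbox/seed"
echo other > other.txt
git add other.txt
git commit -q -m 'Unrelated upstream change'
git push -q "$sandbox/emulab-devel.git" master

# Set the topic work aside, switch to master, and sync with central.
cd "$sandbox/work"
git stash
git checkout -q master
git pull -q --rebase central master
```

Once the hot fix has been pushed, git checkout mybranch followed by git stash pop restores the work that was set aside.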

You are now in sync with the emulab-devel repository. Make your change, and then commit locally:

git commit -a -m "This is a hot fix"

You now need to push this commit to the emulab-devel repository and your public clone (created in Getting Started):

git push central master
git push origin master

Note that this method of making changes is encouraged for small quick fixes only, e.g., edit file and commit.

Making Stable Snapshots

In addition to our bleeding-edge development repository, called emulab-devel, we also maintain a stable repository called emulab-stable. (These repositories are described in the GitRepository page.) The idea is that, on a regular basis, we will update emulab-stable to a stable point of development from emulab-devel. Here, we describe the development-freeze process that we will follow to vet code before putting it into the stable repository.

The first full work week of every odd-numbered month will be a code-freeze week. The code freeze begins on the first Monday of the month and ends on the following Monday. If the Monday is a holiday, then the freeze will start on Tuesday and end the following Tuesday. During a code-freeze week, our job is to test the master branch of the emulab-devel code base. At the end of the week, if all seems well, we will update the master branch of emulab-stable to match the master branch of emulab-devel. If all is not well at the end of the week, we will decide whether to extend the code freeze or to abandon the month's update of emulab-stable.

During a code-freeze week, only bug fixes should be checked into the master branch of emulab-devel. We will enforce this policy by automatically rejecting all commit pushes during the week except those with log messages starting with the magic string "BUG FIX:". We considered more draconian policies, but decided that the "magic string policy" was sufficient and easy to implement. If multiple commits are pushed at once, each individual commit must start with this magic string (if one or more does not, they will all be rejected). Note that this policy does not apply to merge commits, but does apply to all new commits reachable from the merge commit.

At the end of a code-freeze week, the emulab-devel master branch will be tagged with stable-YYYYMMDD (where YYYYMMDD is the current date) and pushed to emulab-stable's master branch. Then all tags which point to ancestor commits of emulab-devel's master branch are pushed to emulab-stable.

EE: I don't understand the need for the process described below. Someone, please elaborate.

At the beginning of a code-freeze week, the emulab-devel master will be merged into a branch called devel-during-freeze. At the end of the week, that branch will be merged back into the master on devel and the merged master will be pushed to the emulab-stable repository. Any new features or other development work should be checked into the devel-during-freeze branch during the freeze week.

Enabling/Disabling code freeze

To enable code freeze on the master branch of the emulab-devel repository, you can run the following command on git-public:

Recovering from rejected pushes

If you attempt to push during the freeze, and one or more of your commits do not contain the "bugfix" keyword, the push will be rejected.

Pushes of a single commit

If you tried to push one commit which was not a bug fix, don't do that. Either keep your commit in your public repository for now, or push it to a different branch (i.e., one other than master) on emulab-devel. Either way, you can merge it into the development master once the freeze is over.

If you tried to push a commit which really was a bug fix but you forgot to include the keyword in your commit message, then you should run:

git commit --amend

in your own repository, and edit the log message to mention "bug fix:" at the start. A subsequent push of that commit should then succeed.

If you pull from devel after creating your bug fix and you find that you need to amend it before pushing, you will need to undo the merge performed by git pull first. To undo the last merge and reset your branch to your bug fix commit, run:

git reset --hard ORIG_HEAD

You may then amend your commit as described above, pull from devel again and push your fix upstream.

Pushes of multiple commits

Things are slightly harder if you tried to push multiple commits and at least one of them did not include the "bugfix" keyword.

First, please double check all the rejected commits for any change which is not a bug fix. If there are any, then save them (e.g. with git stash, git cherry-pick to a temporary branch, or similar) until the freeze is over.

Next, please run:

git rebase -i central/master

to interactively edit the set of commits you are trying to push. (You could instead specify git rebase -i HEAD~4 or similar if you know how many problematic commits are involved.)

The interactive rebase should now open an editor with a list of lines each corresponding to one of the commits in question. You should delete any lines which refer to commits which are not bug fixes (note that these were saved earlier). If there are any commits which really are bug fixes but were not marked with the "bugfix" keyword, then please change the command at the beginning of the line from "pick" to "edit".

Once the editor process exits, it will reapply the correct commits in your history. If there were any commits you marked as requiring editing, it will pause for each one: you can clean up the commit (and presumably insert the necessary keyword if appropriate) by issuing the command git commit --amend followed by git rebase --continue. Once all commits have been fixed, you should be able to push the whole lot to emulab-devel.

(Recent versions of git allow you to use the command reword instead of edit, which might save a little bit of work in the procedure above.)

Pushing Fixes to stable/master Before Code Freeze

It may be necessary to push critical fixes to stable/master between code freezes. If this happens, first commit your fix to devel/master, then do the following:

Use git log to find the commit hash for your bug fix.

Make sure your copy of stable/master is up to date, then create a temporary branch based on it:

git fetch stable
git checkout -b stable-fix stable/master

Use git cherry-pick to apply the fix to your current branch. Since the hash of this new commit will be different from the original one in devel/master, pass the -x option to add the original hash to the commit message. This will make it easier to identify the equivalent commit in devel/master should the need arise.

git cherry-pick -x <commit>

Push the new HEAD to stable/master.

git push stable HEAD:master

Delete your temporary branch:

git checkout master
git branch -D stable-fix

Updating stable/master After Code Freeze

At the end of code freeze, do the following to create a new stable snapshot. These steps assume you have a repository with two remotes, devel and stable, which point to the central emulab-devel and emulab-stable repositories on git-public respectively.

Stash any changes to your current HEAD that have not been committed. If git status states that you have no changes, you may skip this step.

git stash

Fetch the latest changes from both repositories:

git fetch --all

Create a new, temporary branch which will become the new master for stable:

git branch snapshot devel/master
git checkout snapshot

To ensure clean history for those currently tracking the stable/master branch, do an "ours" merge from stable/master into our new branch. If there were no pushes to stable/master since the previous snapshot, no merge commit will be created:

git merge --strategy=ours stable/master

Tag the new HEAD, substituting in the current year for 'YYYY', the month for 'MM' and the day for 'DD':

git tag stable-YYYYMMDD

Disable code freeze on devel/master as described in the "Enabling/Disabling code freeze" section above, and push the current HEAD and new stable tag to both devel/master and stable/master:
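The push step might look like the last two commands of the sketch below. This is an assumption about the exact refspecs, not a record of the original commands. The sketch repeats the preceding snapshot steps against local stand-in repositories with remotes named devel and stable, uses today's date for the tag, and leaves out the code-freeze hook.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
sandbox=$(mktemp -d)

# Stand-ins for emulab-devel and emulab-stable, starting from the same commit.
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/master
echo base > README
git add README
git commit -q -m 'Initial commit'
git clone -q --bare "$sandbox/seed" "$sandbox/devel.git"
git clone -q --bare "$sandbox/seed" "$sandbox/stable.git"

# Development continues on devel only.
echo feature >> README
git commit -q -a -m 'New feature'
git push -q "$sandbox/devel.git" master

# A working repository with two remotes, "devel" and "stable".
git clone -q -o devel "$sandbox/devel.git" "$sandbox/work"
cd "$sandbox/work"
git remote add stable "$sandbox/stable.git"
git fetch -q --all

# Snapshot branch, "ours" merge, and tag, as in the preceding steps.
git checkout -q -b snapshot devel/master
git merge -q --no-edit --strategy=ours stable/master
tag="stable-$(date +%Y%m%d)"
git tag "$tag"

# Push the current HEAD and the new tag to both repositories.
git push -q devel HEAD:master "refs/tags/$tag"
git push -q stable HEAD:master "refs/tags/$tag"
```

Afterwards the master branches of both stand-in repositories point at the snapshot commit, and both carry the new stable tag.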

Delete your temporary branch by checking out whichever branch you were working on before creating the snapshot, then deleting the temporary branch:

git checkout master
git branch -D snapshot

If you had uncommitted changes before creating the snapshot, re-apply them now:

git stash pop

Updating a Submodule

There are two parts to updating a submodule after the source code inside the submodule has been committed and pushed up. For this discussion, we will use the emulab-devel repo as the enclosing repository, and the geni-rspec repository as the submodule within emulab-devel.

Step One: Update the enclosing repository

Update the submodule hash that the enclosing repository stores. In a clean clone of emulab-devel: