A time traveller's guide to Git

Brandon Keepers explains how to rewrite history in Git.

While scientists have crushed the dream of travelling back in time, Git offers control over the fourth dimension when the wrongs of the past need to be corrected. The distributed version control system allows commits to be amended, discarded, reordered and modified to scrub the history of a repository.

But heed the warnings of an experienced time traveller. Git obeys the law of causality; every commit in a Git repository is inextricably linked to the commit before it. Changing one commit alters all the commits that come after, creating an alternate reality. Altering the past can be dangerous and, except in rare circumstances, should only be done if the events being altered have not been observed by anyone else. Branches that have already been pushed to a remote should not be altered.

Join me as we explore ways to rewrite history with Git.

01. Amend recent history

For whatever reason, the human brain seems wired to remember something important just after pressing the 'Send' button on an email, and the right words always come to mind after a conversation is over. Likewise, I often realise I made a mistake immediately after making a commit in Git. The safest and most common form of rewriting the Git history is to amend the latest commit.

This article was written in a Git repository. The first commit was to create a README explaining the purpose of the repository.

Oops, after committing, I realised that I had committed article.md, which was just some notes and the first few sentences of the introduction. I did not intend to commit that file yet, so let's remove it from the history.

$ git rm --cached article.md
rm 'article.md'

The --cached argument to git rm tells Git to stage the removal of the file, but to not actually delete the file from the filesystem. If you also want to delete the file, simply leave that argument out.

We can also make other modifications, just as we would for a new commit, such as editing README.md and staging the changes with git add. Amend the previous commit by passing the --amend flag to git commit.
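The whole amend sequence can be tried safely in a scratch repository. This is a minimal sketch rather than the exact commands used for this article: the temporary directory, the placeholder identity and the file contents are all assumptions, and --no-edit is passed so that Git reuses the existing commit message instead of opening an editor.

```shell
# Throwaway repository (hypothetical) so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

# The first commit accidentally includes article.md.
echo "Purpose of this repository" > README.md
echo "Rough notes" > article.md
git add README.md article.md
git commit --quiet -m "Add README"

# Stage the removal (the file stays on disk), then fold it into the last commit.
git rm --quiet --cached article.md
git commit --quiet --amend --no-edit

commit_count=$(git rev-list --count HEAD)    # still a single commit
tracked_files=$(git ls-files)                # only README.md is tracked
echo "commits: $commit_count, tracked: $tracked_files"
```

The amended commit replaces the original one, so its hash changes even though the message stays the same.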

02. Undo recent history

Sometimes, a commit has so many mistakes that it's easier to just undo it. Maybe it was committed to the wrong branch, or a directory of unwanted files got accidentally added.

$ git reset HEAD^

This tells Git to move the branch back to the previous commit, but to keep the changes introduced by the undone commit in the working directory. git reset is powerful and can be destructive if used improperly. It's worth reading more about it on git-scm.com.

The log now shows that the latest commit is gone, but article.md is still modified.

$ git log --oneline
667f8c9 Add README

$ git status -s
 M article.md

From here, the changes can be committed on a different branch, stashed, discarded or modified and recommitted.
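As an illustration of the first option, the sketch below undoes a commit and recommits the same changes on another branch. The scratch repository, the drafts branch name and the commit messages are hypothetical; only git reset HEAD^ comes from the text above.

```shell
repo=$(mktemp -d)                            # throwaway repository (hypothetical)
cd "$repo"
git init --quiet
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

echo "Purpose of this repository" > README.md
echo "Rough notes" > article.md
git add README.md article.md
git commit --quiet -m "Add README"
original=$(git symbolic-ref --short HEAD)    # whatever the default branch is called

# A commit that should have gone to a different branch.
echo "First draft of the introduction" >> article.md
git commit --quiet -am "Draft on the wrong branch"

# Undo the commit but keep its changes in the working directory.
git reset --quiet HEAD^
status_after_reset=$(git status -s)          # article.md shows up as modified

# Recommit the same changes on a new branch instead.
git checkout --quiet -b drafts
git commit --quiet -am "Draft article"

original_count=$(git rev-list --count "$original")
drafts_count=$(git rev-list --count drafts)
echo "$original: $original_count commit(s), drafts: $drafts_count commit(s)"
```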

03. Maintain a tidy history

If you have used Git with a team, then there is no doubt that you have seen a push get rejected.

While the rejection message looks big and scary, it's actually quite helpful. The hints tell us that since we started our work, one of our team members pushed changes and we need to get them, usually by running git pull. The hint also recommends checking out the note about 'fast-forwards' in the Git docs. I second that recommendation.

Running git pull will fetch the remote changes and create a new commit that merges them with our local changes. While there is nothing wrong with the merge commit, it adds unnecessary complexity to the revision history.

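Running git pull --rebase instead replays our local commits on top of the fetched changes, so no merge commit is needed and the Git history is much cleaner and easier to scan.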
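To see rebasing on pull in action without a real team, the sketch below fakes one. A local bare repository stands in for the shared remote, and two clones named alice and bob stand in for team members; all of these names, files and commit messages are assumptions made for the example.

```shell
# Placeholder identity for every repository created below.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

work=$(mktemp -d)
cd "$work"

# Seed repository, then a bare copy that plays the role of the shared remote.
git init --quiet seed
cd seed
echo "Purpose of this repository" > README.md
git add README.md
git commit --quiet -m "Add README"
branch=$(git symbolic-ref --short HEAD)
cd ..
git clone --quiet --bare seed origin.git
git clone --quiet origin.git alice
git clone --quiet origin.git bob

# A team member pushes first.
cd alice
echo "alice" > alice.txt
git add alice.txt
git commit --quiet -m "Alice's change"
git push --quiet origin "$branch"

# Our local commit is now behind the remote; a plain push would be rejected.
cd ../bob
echo "bob" > bob.txt
git add bob.txt
git commit --quiet -m "Bob's change"

# Replay the local commit on top of the remote changes, then push.
git pull --quiet --rebase origin "$branch"
git push --quiet origin "$branch"

merge_commits=$(git rev-list --count --merges HEAD)
total_commits=$(git rev-list --count HEAD)
latest_subject=$(git log -1 --format=%s)
echo "merges: $merge_commits, commits: $total_commits, latest: $latest_subject"
```

The history ends up as a straight line of three commits with the local change on top, rather than two parallel lines joined by a merge commit.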

Unless a repository is being pushed to multiple remotes, rebasing when pulling is almost always a good idea. I have Git configured to rebase automatically.

$ git config --global branch.autosetuprebase always
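One caveat worth knowing: branch.autosetuprebase only affects tracking branches created after it is set. The sketch below applies the same setting to a single scratch repository, without --global, and also sets pull.rebase, a separate Git option that makes every git pull in that repository rebase, existing branches included. The scratch repository is an assumption made so the example leaves the real global configuration alone.

```shell
repo=$(mktemp -d)                  # throwaway repository (hypothetical)
cd "$repo"
git init --quiet

# New tracking branches in this repository will rebase on pull.
git config branch.autosetuprebase always

# Every `git pull` in this repository rebases, including existing branches.
git config pull.rebase true

autosetup=$(git config --get branch.autosetuprebase)
pull_rebase=$(git config --get pull.rebase)
echo "branch.autosetuprebase=$autosetup pull.rebase=$pull_rebase"
```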

Keeping the revision history tidy may seem superficial, but it helps immensely when managing a large project.

04. Clean up recent history

Sometimes it is not clear until after a few mis-steps that there's a better path. Git's flexibility makes it easy to create checkpoints along the way, offering a point to return to if things go wrong.

In my daily development, I commit as often as possible. Anytime I think to myself, "OK, that is done, now what?", I commit. While this leads to a revision history that accurately reflects the order of events, the noise of many tiny commits can actually inhibit the maintainability of large projects. So once I am ready to share my changes with my team, I review my unpublished commits and clean them up.

An interactive rebase allows commits to be edited, squashed together or completely removed from the recent history of a branch.

While reviewing my progress on this article, I discovered a few embarrassing typos. Since the repository had not been shared with anyone yet, I covered my tracks by fixing the typos in the original commit. I preserved my original mistake, so you can follow along by checking out the typos branch of the repository.

# Rebase 667f8c9..7445019 onto 667f8c9
#
# Commands:
#  p, pick = use commit
#  r, reword = use commit, but edit the commit message
#  e, edit = use commit, but stop for amending
#  s, squash = use commit, but meld into previous commit
#  f, fixup = like "squash", but discard this commit's log message
#  x, exec = run command (the rest of the line) using shell
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#
# Note that empty commits are commented out

As the note explains, commits can be rearranged to change their order, or pick can be changed to one of the other commands.

I moved the two typo fixes to just after the commit where they were introduced and changed pick to fixup to meld them into the original commit. After saving and closing the editor, Git applies the changes.

The log shows that the typo fixing commits are now gone. The fixes were applied to the original commits and there is no evidence of my poor spelling (in this branch).

$ git log --oneline
4787614 first draft of pull --rebase
ee719e9 first draft of reset
00165a8 first draft of amend section
667f8c9 Add README

This rebase worked without any other interaction, but occasionally a rebase will require manual fixes for merge conflicts. If that happens, don't freak out. Simply read the messages. Git will usually help get you out of a bind.
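The same clean-up can be scripted. Note that this sketch swaps in a different technique from the hand-edited list described above: the fix is recorded with git commit --fixup, and git rebase -i --autosquash then arranges the list automatically. Setting GIT_SEQUENCE_EDITOR to true accepts that list without opening an editor. The repository, file contents and the deliberate typo are all made up for the example.

```shell
repo=$(mktemp -d)                            # throwaway repository (hypothetical)
cd "$repo"
git init --quiet
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

echo "Purpose of this repository" > README.md
git add README.md
git commit --quiet -m "Add README"
base=$(git rev-parse HEAD)

# A draft containing a typo ("secton").
echo "A secton about amending" > article.md
git add article.md
git commit --quiet -m "first draft of amend section"
draft=$(git rev-parse HEAD)

echo "Notes about reset" > notes.md
git add notes.md
git commit --quiet -m "first draft of reset"

# Fix the typo and mark the commit as a fixup of the draft that introduced it.
echo "A section about amending" > article.md
git commit --quiet -a --fixup "$draft"

# Autosquash moves the fixup next to its target and changes pick to fixup;
# `true` as the sequence editor accepts the generated list unchanged.
GIT_SEQUENCE_EDITOR=true git rebase --quiet -i --autosquash "$base"

commit_count=$(git rev-list --count HEAD)
fixup_subjects=$(git log --format=%s | grep -c '^fixup!')
article_text=$(cat article.md)
echo "commits: $commit_count, fixup commits left: $fixup_subjects"
```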

05. Rewrite all of history

All the Git commands we have examined so far are useful for modifying recent commits, but sometimes more extreme measures are necessary, whether it is to remove sensitive or extremely large files, or to simply make a project easier to manage.

git filter-branch supports a handful of custom filters that can rewrite the revision history for a range of commits.

My first legitimate use of git filter-branch was on a large project where the server and the client were both in the same repository. As more people were added to the team, and tensions between the hipsters and neck-beards rose, it became obvious that two repositories would be more appropriate.

A simple solution would've been to clone the repository twice, delete the unnecessary files and move the remaining files around. But that leaves two repositories with duplicate histories that take up unnecessary space. Instead, we cloned the repository twice, and used the --subdirectory-filter to create two new repositories that only contained the changes for the relevant parts of the application.
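A sketch of that split, run on a toy repository instead of the real project. The server and client directory names, the files and the commit messages are hypothetical, and the exact invocation we used is not recorded here. FILTER_BRANCH_SQUELCH_WARNING is set because recent versions of Git print a warning and pause before git filter-branch runs.

```shell
export FILTER_BRANCH_SQUELCH_WARNING=1       # skip the warning shown by newer Git

repo=$(mktemp -d)                            # stands in for one of the two clones
cd "$repo"
git init --quiet
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

mkdir server client
echo "api" > server/app.txt
echo "ui" > client/app.txt
git add server client
git commit --quiet -m "Add server and client"

echo "more api" >> server/app.txt
git commit --quiet -am "Server-only change"

echo "more ui" >> client/app.txt
git commit --quiet -am "Client-only change"

# Keep only the history that touches client/, and make client/ the project root.
git filter-branch --subdirectory-filter client -- --all >/dev/null

tracked_files=$(git ls-files)                # app.txt, now at the top level
commit_count=$(git rev-list --count HEAD)    # the server-only commit is gone
echo "files: $tracked_files, commits: $commit_count"
```

Running the same command with server in the second clone would produce the other half of the split.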

$ git rebase -i 7bb9109^

Many people use different email addresses for personal and work projects, which can easily result in commits to a repository using the wrong email address. The --env-filter can modify basic metadata about a commit, such as author information or the commit date.
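Here is a minimal sketch of an --env-filter that swaps one email address for another. Both addresses are placeholders, and the scratch repository exists only so the example can be run safely. The filter is a shell snippet that Git evaluates for each commit with the GIT_AUTHOR_* and GIT_COMMITTER_* variables set.

```shell
export FILTER_BRANCH_SQUELCH_WARNING=1       # skip the warning shown by newer Git

repo=$(mktemp -d)                            # throwaway repository (hypothetical)
cd "$repo"
git init --quiet
git config user.name "Example Author"
git config user.email "personal@example.com" # the "wrong" address (placeholder)

echo "Purpose of this repository" > README.md
git add README.md
git commit --quiet -m "Add README"
echo "More details" >> README.md
git commit --quiet -am "Expand README"

# Rewrite author and committer email on every commit that used the wrong address.
git filter-branch --env-filter '
if [ "$GIT_AUTHOR_EMAIL" = "personal@example.com" ]; then
    export GIT_AUTHOR_EMAIL="work@example.com"
fi
if [ "$GIT_COMMITTER_EMAIL" = "personal@example.com" ]; then
    export GIT_COMMITTER_EMAIL="work@example.com"
fi
' -- --all >/dev/null

author_emails=$(git log --format=%ae | sort -u)
committer_emails=$(git log --format=%ce | sort -u)
echo "authors: $author_emails, committers: $committer_emails"
```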

Suppose that early on in a project, someone committed some extremely large assets, and now everyone that clones the repository has to wait for those assets to download. Or maybe you are open-sourcing a project that has some sensitive data stored in it.
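One commonly documented way to handle this is an --index-filter that removes the file from every commit. The sketch below is an assumption about how that could look, not a record of a real clean-up: secrets.txt and the scratch repository are hypothetical. The --ignore-unmatch flag keeps git rm from failing on commits where the file does not exist.

```shell
export FILTER_BRANCH_SQUELCH_WARNING=1       # skip the warning shown by newer Git

repo=$(mktemp -d)                            # throwaway repository (hypothetical)
cd "$repo"
git init --quiet
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

echo "Purpose of this repository" > README.md
echo "password=hunter2" > secrets.txt        # made-up sensitive file
git add README.md secrets.txt
git commit --quiet -m "Add README"
echo "More details" >> README.md
git commit --quiet -am "Expand README"

# Drop secrets.txt from the index of every commit on every branch.
git filter-branch --index-filter \
    'git rm --quiet --cached --ignore-unmatch secrets.txt' -- --all >/dev/null

tracked_files=$(git ls-files)
commits_touching_secret=$(git rev-list --count HEAD -- secrets.txt)
echo "tracked: $tracked_files, commits touching secrets.txt: $commits_touching_secret"
```

Git keeps a backup of the original branches under refs/original, so the old commits remain in the local repository until those references are deleted and the unreachable objects are pruned.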

All of these changes rewrite the full history of a repository, essentially making it a new repository. A push to the remote that was used originally will be rejected.

$ git push
 ! [rejected] master -> master (non-fast-forward)

It is possible to force Git to push all changes to an existing remote, but remember that this could have adverse effects for everyone else working on the project.

$ git push --force --all
$ git push --force --tags

06. Power and flexibility

Git's powerful features, extreme flexibility and often unintuitive command line may seem overwhelming, but taking time to learn and experiment is a worthwhile investment. When in doubt, pass --help to any Git command to learn more. Understanding how and when to rewrite the revision history will give you complete control over your projects and make them easier to manage.

Brandon Keepers is a maker and breaker of things at GitHub working mostly on Speaker Deck. When his face is not dimly lit by a computer screen, you can find him with a book in his hands, playing racquetball or basketball, running with his dog, or enjoying good food and drink with his wife.