7.1 Git Tools - Revision Selection

By now, you’ve learned most of the day-to-day commands and workflows that you need to manage or maintain a Git repository for your source code control.
You’ve accomplished the basic tasks of tracking and committing files, and you’ve harnessed the power of the staging area and lightweight topic branching and merging.

Revision Selection

Git allows you to specify specific commits or a range of commits in several ways.
They aren’t necessarily obvious but are helpful to know.

You can obviously refer to a commit by the SHA-1 hash that it’s given, but there are more human-friendly ways to refer to commits as well.
This section outlines the various ways you can refer to a single commit.

Git is smart enough to figure out what commit you meant to type if you provide the first few characters, as long as your partial SHA-1 is at least four characters long and unambiguous – that is, only one object in the current repository begins with that partial SHA-1.

For example, to see a specific commit, suppose you run a git log command and identify the commit where you added certain functionality:

Git can figure out a short, unique abbreviation for your SHA-1 values.
If you pass --abbrev-commit to the git log command, the output will use shorter values but keep them unique; it defaults to using seven characters but makes them longer if necessary to keep the SHA-1 unambiguous:

Generally, eight to ten characters are more than enough to be unique within a project.

As an example, the Linux kernel, which is a pretty large project with over 450k commits and 3.6 million objects, has no two objects whose SHA-1s overlap more than the first 11 characters.

A SHORT NOTE ABOUT SHA-1

A lot of people become concerned at some point that they will, by random happenstance, have two objects in their repository that hash to the same SHA-1 value.
What then?

If you do happen to commit an object that hashes to the same SHA-1 value as a previous object in your repository, Git will see the previous object already in your Git database and assume it was already written.
If you try to check out that object again at some point, you’ll always get the data of the first object.

However, you should be aware of how ridiculously unlikely this scenario is. The SHA-1 digest is 20 bytes or 160 bits. The number of randomly hashed objects needed to ensure a 50% probability of a single collision is about 280
(the formula for determining collision probability is p = (n(n-1)/2) * (1/2^160)). 280
is 1.2 x 1024
or 1 million billion billion. That’s 1,200 times the number of grains of sand on the earth.

Here’s an example to give you an idea of what it would take to get a SHA-1 collision.
If all 6.5 billion humans on Earth were programming, and every second, each one was producing code that was the equivalent of the entire Linux kernel history (3.6 million Git objects) and pushing it into one enormous Git repository, it would take roughly 2 years until that repository contained enough objects to have a 50% probability of a single SHA-1 object collision.
A higher probability exists that every member of your programming team will be attacked and killed by wolves in unrelated incidents on the same night.

The most straightforward way to specify a commit requires that it has a branch reference pointed at it.
Then, you can use a branch name in any Git command that expects a commit object or SHA-1 value.
For instance, if you want to show the last commit object on a branch, the following commands are equivalent, assuming that the topic1 branch points to ca82a6d:

$ git show ca82a6dff817ec66f44342007202690a93763949
$ git show topic1

If you want to see which specific SHA-1 a branch points to, or if you want to see what any of these examples boils down to in terms of SHA-1s, you can use a Git plumbing tool called rev-parse.
You can see Chapter 10 for more information about plumbing tools; basically, rev-parse exists for lower-level operations and isn’t designed to be used in day-to-day operations.
However, it can be helpful sometimes when you need to see what’s really going on.
Here you can run rev-parse on your branch.

Every time your branch tip is updated for any reason, Git stores that information for you in this temporary history.
And you can specify older commits with this data, as well.
If you want to see the fifth prior value of the HEAD of your repository, you can use the @{n} reference that you see in the reflog output:

$ git show HEAD@{5}

You can also use this syntax to see where a branch was some specific amount of time ago.
For instance, to see where your master branch was yesterday, you can type

$ git show master@{yesterday}

That shows you where the branch tip was yesterday.
This technique only works for data that’s still in your reflog, so you can’t use it to look for commits older than a few months.

To see reflog information formatted like the git log output, you can run git log -g:

It’s important to note that the reflog information is strictly local – it’s a log of what you’ve done in your repository.
The references won’t be the same on someone else’s copy of the repository; and right after you initially clone a repository, you’ll have an empty reflog, as no activity has occurred yet in your repository.
Running git show HEAD@{2.months.ago} will work only if you cloned the project at least two months ago – if you cloned it five minutes ago, you’ll get no results.

The other main way to specify a commit is via its ancestry.
If you place a ^ at the end of a reference, Git resolves it to mean the parent of that commit.
Suppose you look at the history of your project:

You can also specify a number after the ^ – for example, d921970^2 means “the second parent of d921970.”
This syntax is only useful for merge commits, which have more than one parent.
The first parent is the branch you were on when you merged, and the second is the commit on the branch that you merged in:

The other main ancestry specification is the ~.
This also refers to the first parent, so HEAD~ and HEAD^ are equivalent.
The difference becomes apparent when you specify a number.
HEAD~2 means “the first parent of the first parent,” or “the grandparent” – it traverses the first parents the number of times you specify.
For example, in the history listed earlier, HEAD~3 would be

Now that you can specify individual commits, let’s see how to specify ranges of commits.
This is particularly useful for managing your branches – if you have a lot of branches, you can use range specifications to answer questions such as, “What work is on this branch that I haven’t yet merged into my main branch?”

Double Dot

The most common range specification is the double-dot syntax.
This basically asks Git to resolve a range of commits that are reachable from one commit but aren’t reachable from another.
For example, say you have a commit history that looks like Figure 7-1.

Example history for range selection.

You want to see what is in your experiment branch that hasn’t yet been merged into your master branch.
You can ask Git to show you a log of just those commits with master..experiment – that means “all commits reachable by experiment that aren’t reachable by master.”
For the sake of brevity and clarity in these examples, I’ll use the letters of the commit objects from the diagram in place of the actual log output in the order that they would display:

$ git log master..experiment
DC

If, on the other hand, you want to see the opposite – all commits in master that aren’t in experiment – you can reverse the branch names.
experiment..master shows you everything in master not reachable from experiment:

$ git log experiment..master
FE

This is useful if you want to keep the experiment branch up to date and preview what you’re about to merge in.
Another very frequent use of this syntax is to see what you’re about to push to a remote:

$ git log origin/master..HEAD

This command shows you any commits in your current branch that aren’t in the master branch on your origin remote.
If you run a git push and your current branch is tracking origin/master, the commits listed by git log origin/master..HEAD are the commits that will be transferred to the server.
You can also leave off one side of the syntax to have Git assume HEAD.
For example, you can get the same results as in the previous example by typing git log origin/master.. – Git substitutes HEAD if one side is missing.

Multiple Points

The double-dot syntax is useful as a shorthand; but perhaps you want to specify more than two branches to indicate your revision, such as seeing what commits are in any of several branches that aren’t in the branch you’re currently on.
Git allows you to do this by using either the ^ character or --not before any reference from which you don’t want to see reachable commits.
Thus these three commands are equivalent:

$ git log refA..refB
$ git log ^refA refB
$ git log refB --not refA

This is nice because with this syntax you can specify more than two references in your query, which you cannot do with the double-dot syntax.
For instance, if you want to see all commits that are reachable from refA or refB but not from refC, you can type one of these:

$ git log refA refB ^refC
$ git log refA refB --not refC

This makes for a very powerful revision query system that should help you figure out what is in your branches.

Triple Dot

The last major range-selection syntax is the triple-dot syntax, which specifies all the commits that are reachable by either of two references but not by both of them.
Look back at the example commit history in Figure 7-1.
If you want to see what is in master or experiment but not any common references, you can run

$ git log master...experiment
FEDC

Again, this gives you normal log output but shows you only the commit information for those four commits, appearing in the traditional commit date ordering.

A common switch to use with the log command in this case is --left-right, which shows you which side of the range each commit is in.
This helps make the data more useful:

$ git log --left-right master...experiment
< F< E> D
> C

With these tools, you can much more easily let Git know what commit or commits you want to inspect.