Managing source code with Mercurial

A powerful, flexible system for managing project source code

Managing the source code for a software development project is only
slightly less important than writing it in the first
place. UNIX® and Linux® systems offer a rich selection of
version control system (VCS) packages, each of which takes a slightly
different approach to this common concern. This article focuses on
the Mercurial source code management system, often simply referred
to as hg. Mercurial provides a powerful, modern, and lightweight
solution for source code control that makes it easy for developers
to make and debug their changes to a software project while
maintaining a stable, centralized source code repository that all
project members can depend upon.

William von Hagen has been a writer and UNIX systems administrator
for more than 20 years and a Linux advocate since 1993. Bill is
the author or co-author of books on subjects such as Ubuntu Linux,
Xen Virtualization, the GNU Compiler Collection (GCC), SUSE Linux,
Mac OS X, Linux file systems, and SGML. He has also written
numerous articles for Linux and Mac OS X publications and Web
sites. You can reach Bill at wvh@vonhagen.org.

Source code management on UNIX and Linux systems

Identifying and tracking the changes made by multiple developers
and merging them into a single, up-to-date codebase makes
collaborative, multi-developer projects possible. VCS software,
also referred to as revision control systems (RCS) or source code
management (SCM) systems, enables multiple users to submit changes
to the same files or project without one developer's changes
accidentally overwriting another's.

Linux® and UNIX® systems are knee-deep in VCS software, ranging from
dinosaurs such as the Revision Control System (RCS) and the
Concurrent Versions System (CVS) to more modern systems such as
Arch, Bazaar, Git, Subversion, and Mercurial. Like Git, Mercurial began life as an open source
replacement for a commercial source code management system called
BitKeeper, which was used to maintain and manage the source code
for the Linux kernel. Since its inception, Mercurial has evolved
into a popular VCS system that is used by many open source and
commercial projects. Projects using Mercurial include Mozilla,
IcedTea, and the MoinMoin wiki. See Resources
for links to these and many more examples.

VCS systems generally refer to each collection of source code in
which changes can be made and tracked as a repository. How
developers interact with a repository is the key difference
between more traditional VCS systems such as CVS and Subversion,
referred to as centralized VCS systems, and more flexible VCS
systems such as Mercurial and Git, which are referred to as
distributed VCS systems. Developers interact with centralized VCS
systems using a client/server model, where changes to your local
copy of the source code can only be pushed back to the central
repository. Developers interact with distributed VCS systems using
a peer-to-peer model, where any copy of the central repository is
itself a repository to which changes can be committed and from
which they can be shared with any other copy. Distributed VCS systems
do not actually have the notion of a central, master repository,
but one is almost always defined by policy so that a single
repository exists for building, testing, and maintaining a master
version of your software.

Why Mercurial?

Mercurial is a small, powerful distributed VCS system that is easy
to get started with, while still providing the advanced commands
that VCS power users may need (or want) to use. Mercurial's
distributed nature makes it easy to work on projects locally,
tracking and managing your changes via local commits and pushing
those changes to remote repositories whenever necessary.

Among modern, distributed VCS systems, the closest VCS system to
Mercurial is Git. Some differences between Mercurial and Git are
the following:

Multiple, built-in undo operations: Mercurial's revert, backout,
and rollback commands make it easy to
return to previous versions of specific files or previous sets
of committed changes. Git provides a single built-in revert command with its typical
rocket-scientist-only syntax.

Built-in web server: Mercurial provides a simple,
integrated web server that makes it easy to host a
repository quickly for others to pull from. Pushing requires either
ignoring security or a more complex setup that supports Secure Sockets Layer (SSL).

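History preservation during copy/move operations:
Mercurial's copy and move commands both record where a file came
from, so complete history information is preserved, while Git does
not record copies or renames explicitly and instead detects them
heuristically when it displays history.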

Branches: Mercurial automatically shares all branches,
while Git requires that each repository set up its own branches
(either creating them locally or by mapping them to specific
branches in a remote repository).

Global and local tags: Mercurial supports global tags
that are shared between repositories, which make it easy to
share information about specific points in code development
without branching.

Native support on Windows platforms: Mercurial is written
in Python, which is supported on Microsoft® Windows®
systems. Mercurial is therefore available as a Windows
executable (see Resources). Git on
Windows is more complex—your choices are msysGit, using
standard git under Cygwin, or using a web-based hosting system
and repository.

Automatic repository packing: Git requires that you
explicitly pack and garbage-collect its repositories, while
Mercurial performs its equivalent operations
automatically. However, Mercurial repositories tend to be larger
than Git repositories for the same codebase.

Mercurial and Git fans are also happy to discuss the learning
curve, merits, and usability of each VCS system's command
set. Space prevents that discussion here, but a web search on that
topic will provide lots of interesting reading material.

Creating and using Mercurial repositories

Mercurial provides two basic ways of creating a local repository
for a project's source code: either by explicitly creating a
repository or by cloning an existing, remote repository:

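To create a local repository, use the hg init [REPO-NAME]
command. Supplying the name of a repository when executing
this command creates a directory for that repository in the
specified location. Not supplying the name of a repository
turns the current working directory into a repository. The
latter is handy when creating a Mercurial repository for
an existing codebase.

To clone an existing repository, use the hg clone SOURCE [DEST]
command, where SOURCE is the path or URL of the repository to
copy. Supplying DEST sets the name of the local directory. If you
omit it, Mercurial names the directory after the last component
of SOURCE.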

Note: To use the HTTP protocol to access Mercurial
repositories, you must either start Mercurial's internal web
server in that repository (hg serve -d)
or use Mercurial's hgweb.cgi script to
integrate Mercurial with an existing web server such as
Apache. When cloning via HTTP, you will usually want to specify a
name for your local repository.

After you create or clone a repository and make that
repository your working directory, you're ready to start working
with the code that it contains, add new files, and so on.

Getting help in Mercurial

Mercurial's primary command is hg,
which supports a set of sub-commands that are similar to those in
other VCS systems. To see a list of the most common commands,
execute the hg command with no
arguments, which displays output similar to that shown in Listing 2.

Listing 2. Basic commands provided by Mercurial

Mercurial Distributed SCM

basic commands:

 add        add the specified files on the next commit
 annotate   show changeset information by line for each file
 clone      make a copy of an existing repository
 commit     commit the specified files or all outstanding changes
 diff       diff repository (or selected files)
 export     dump the header and diffs for one or more changesets
 forget     forget the specified files on the next commit
 init       create a new repository in the given directory
 log        show revision history of entire repository or files
 merge      merge working directory with another revision
 pull       pull changes from the specified source
 push       push changes to the specified destination
 remove     remove the specified files on the next commit
 serve      export the repository via HTTP
 status     show changed files in the working directory
 summary    summarize working directory state
 update     update working directory

use "hg help" for the full list of commands or "hg -v" for details

This short list displays only basic Mercurial commands. To
obtain a full list, execute the hg help
command.

Tip: You can obtain detailed help on any Mercurial command
by executing the hg help COMMAND
command, replacing COMMAND with the name of any valid
Mercurial command.

Checking repository status

Before you check in changes, it helps to see exactly what has
changed. You can use the hg status
command to see any pending changes to the files in your
repository. For example, after creating a new file and modifying an
existing one, you see output like that shown in Listing 3.

Listing 3. Status output from Mercurial

$ hg status
M Makefile
? hgrc.example

In this case, the Makefile file is an existing file that has been
modified (indicated by the letter M at
the beginning of the line), while the hgrc.example file is a new
file that isn't being tracked (indicated by the question mark ? at the beginning of the line).

Adding files to a repository

To add the hgrc.example file to the
list of files that are being tracked in this repository, use the
hg add command. Specifying one or more
file names as arguments explicitly adds those files to the list of
files that are being tracked by Mercurial. If you don't specify any
files, all new files are added to the repository, as shown in Listing 4.

Listing 4. Adding a file to your repository

$ hg add
adding hgrc.example

Tip: To add all new files automatically and, at the same time, mark
any files that are missing from the working directory for removal,
you can use Mercurial's handy hg addremove command.

Checking the status of the repository shows that the new file has
been added (indicated by the letter A
at the beginning of the line), as shown in Listing 5.

Listing 5. Repository status after modifications

$ hg status
M Makefile
A hgrc.example

Committing changes

Checking in changes is the most common operation in any VCS
system. After making and testing your changes, you're ready to
commit those changes to the local repository.

Before committing changes for the first time

If this is your first Mercurial project, you must provide some
basic information so that Mercurial can identify the user who is
committing those changes. If you do not do so, you'll see a
message along the lines of abort: no username
supplied... when you try to commit changes, and your
changes will not be committed.

To add your user information, create a file called .hgrc in your
home directory. This file is your personal Mercurial configuration
file. You need to add at least the basic user information shown in
Listing 6 to this file.

Listing 6. Mandatory information in a user's .hgrc file

[ui]
username = Firstname Lastname <user@domain.tld>

Replace Firstname and Lastname with your first and
last names; replace user@domain.tld with your email
address; save the modified file.

You can set default Mercurial configuration values that apply
to all users (which should not include user-specific information)
in the /etc/mercurial/hgrc file on
Linux and UNIX systems and in the Mercurial.ini file, located in the
Mercurial installation directory, on Microsoft Windows systems.

The standard commit process

After creating or verifying your ~/.hgrc file, you can commit your changes
using the hg commit command,
identifying the specific files that you want to commit or
committing all pending changes by not supplying an argument, as in
the following example:

$ hg commit
Makefile
hgrc.example
committed changeset 1:3d7faeb12722

As shown in this example output, Mercurial refers to all changes
that are associated with a single commit as a changeset.

When you commit changes, Mercurial starts your default editor
to enable you to add a commit message. To avoid this, you can
specify a commit message on the command line using the -m "Message.." option. To use a different
editor, you can add an editor entry in
the [ui] section of your ~/.hgrc file, following the editor keyword with the name of the editor
that you want to use and any associated command-line options. For
example, after adding an entry for using emacs in no-window mode as my default editor,
my ~/.hgrc file looks like that
shown in Listing 7.

Listing 7. Additional customization in a user's .hgrc file

[ui]
username = Firstname Lastname <user@domain.tld>
editor = emacs -nw

Tip: To maximize the amount of information that Mercurial
provides about its activities, you can add the verbose = True entry to the [ui] section of your Mercurial configuration
file.

Pushing changes to a remote repository

If you are working in a clone of a remote repository, you will
usually want to push your changes back to that repository after
committing them to your local repository. To do so, use Mercurial's hg push command, as shown in Listing 8.

Pulling changes from a remote repository

If you are using a clone of a remote repository and other users
are also using that same repository, you want to retrieve the
changes that they have made and pushed to that repository. To do
so, use Mercurial's hg pull command, as
shown in Listing 9.

As shown in the output from this command, hg pull only
adds the remote changesets to your local repository. It does not
change the files in your working directory. You must run the
hg update command to apply the
associated changes to your working files. This command
identifies the ways the working directory has been updated, as
shown in Listing 10.

Listing 10. Updating your repository to show changes

Undoing changes in Mercurial

Mercurial provides the following built-in commands that make it easy to undo
changes, whether or not they have been committed:

hg backout CHANGESET: Creates a new
changeset that reverses the effect of the specified
changeset. Unless you specify the --merge option when executing this command,
you may have to merge that new changeset into your current revision
before you can push it to a remote repository.

hg revert: Restores one or more files
to their previous versions when you specify their names, or
restores all files to their previous versions when you specify the
--all command-line option.

hg rollback: Undoes the last
Mercurial transaction, which is commonly a commit,
pull from a remote repository, or a push to this
repository. You can only undo a single transaction.

See the online help for all of these commands before attempting to
use them!

Summary

Mercurial and other distributed source code management systems are
the wave of the future. Mercurial is open source software, and
pre-compiled versions of Mercurial are available for Linux, UNIX,
Microsoft Windows, and Mac OS® X systems. This article highlighted
how to use Mercurial to perform a number of common VCS tasks,
showing how easy it is to get started using Mercurial. For more
advanced purposes, Mercurial provides many more advanced commands
and configuration options to help you manage your source code and
customize your interaction with a Mercurial installation.

Resources

Learn

The Mercurial
home page is a great starting point for getting information
about Mercurial, and it also provides links to many other sources of
related information.

Mercurial: The Definitive
Guide by Bryan O'Sullivan is the definitive work on
Mercurial. The complete text of the book is available online in HTML and epub
formats, which should hold you until your paper copy comes in the
mail.

Joel Spolsky's hginit.com site
provides a great introductory tutorial for using and working with
Mercurial.

The MercurialEclipse
plug-in provides support for Mercurial within the Eclipse
Integrated Development Environment.

Fog Creek Software's Kiln provides free
trials and student/start-up
versions of its online, Mercurial-based hosting service that is similar
to online Git hosting services such as GitHub, repo.or.cz, and so on.

Bitbucket provides free hosting for open source projects in its
online, Mercurial-based hosting service, as well as paid hosting plans
for larger groups of developers.
