The New Breed of Version Control Systems

A version control system enables developers to keep historical versions of
the files under development and to retrieve past versions. It stores version
information for every file (and the entire project structure) in a collection
normally called a repository.

Inside the repository, several parallel lines of development, normally
called branches, may exist. This can be useful to keep a maintenance
branch for a stable, released version while still working on the bleeding-edge
version. Another option is to open a dedicated branch to work on an
experimental feature.

Version control systems also let the user give labels to a snapshot of
a branch (often referred to as tags), to ease later extraction. This
is useful to signify individual releases or the most recent usable development
version.
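
As a rough illustration of the idea, a tag can be thought of as a label that points at one snapshot of a branch. The following Python sketch is a toy model under that assumption; the class name TaggedBranch and its methods are invented for this example and do not correspond to the interface of any system discussed in this article.

```python
class TaggedBranch:
    """Toy model of one branch with labeled snapshots (hypothetical)."""

    def __init__(self):
        self.snapshots = []   # snapshots[i] is the tree at revision i + 1
        self.tags = {}        # label -> revision number

    def commit(self, tree):
        # Store a copy so later edits to `tree` do not alter history.
        self.snapshots.append(dict(tree))
        return len(self.snapshots)

    def tag(self, label):
        # Label the most recent snapshot of the branch.
        self.tags[label] = len(self.snapshots)

    def checkout(self, label):
        # Retrieve the snapshot a label points at.
        return dict(self.snapshots[self.tags[label] - 1])


trunk = TaggedBranch()
trunk.commit({"main.c": "v1"})
trunk.tag("release-1.0")
trunk.commit({"main.c": "v2"})
released = trunk.checkout("release-1.0")
```

Here released holds the tree as it was when the label was applied, even though the branch has moved on since.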

Using a version control system is an absolute must for a developer of a
project above a few hundred lines of code, and even more so for projects
involving the collaboration of several developers. Using a good version
control system is certainly better than the ad-hoc methods some developers use
to maintain various revisions of their code.

Traditionally, the de facto standard open source version control system was CVS, but lately many others have emerged that aim to be better in some or every way. This article provides an overview
of several alternatives.

Common Features

Version control systems come in all shapes and sizes, but there are common
guidelines for their design. Some systems support atomic commits,
which means that the state of the entire repository changes all at once.
Without atomic commits, each file or unit changes separately and so the state
of the entire repository at any one point may not be preserved.
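
The difference can be sketched with a toy in-memory repository. This is only a minimal sketch: the ToyRepository class, its two commit methods, and the non_empty validation rule are all hypothetical names made up for the example, and none of the systems below is implemented this way.

```python
class ToyRepository:
    """A toy in-memory repository; not modeled on any real VCS."""

    def __init__(self):
        self.files = {}      # path -> content of the latest revision
        self.revision = 0    # number of commits accepted so far

    def commit_atomic(self, changes, validate):
        """Apply all changes or none of them."""
        staged = dict(self.files)          # work on a copy
        for path, content in changes.items():
            if not validate(path, content):
                return False               # self.files is untouched
            staged[path] = content
        self.files = staged                # swap in the new state at once
        self.revision += 1
        return True

    def commit_per_file(self, changes, validate):
        """Apply changes one file at a time, stopping at the first failure."""
        for path, content in changes.items():
            if not validate(path, content):
                return False               # earlier files stay changed
            self.files[path] = content
        self.revision += 1
        return True


def non_empty(path, content):
    # Hypothetical validation rule standing in for a failed write.
    return content != ""


changes = {"a.c": "int a;", "b.c": ""}     # the second change is rejected

atomic = ToyRepository()
atomic_ok = atomic.commit_atomic(changes, non_empty)

per_file = ToyRepository()
per_file_ok = per_file.commit_per_file(changes, non_empty)
```

Both commits fail, but only the per-file variant leaves the repository in a half-applied state, with a.c changed and b.c not.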

Most common VCSs allow merging of changes between branches. This means
that changes committed to one branch can be applied to the trunk or to
another branch as well, with one automatic (or at least semi-automatic)
operation.
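
One common way to do this is a three-way merge, which compares both branches against their common ancestor. The Python sketch below works on whole files only and is an assumption-laden illustration: the function name three_way_merge and the sample trees are invented here, and real systems merge at the level of lines within a file.

```python
def three_way_merge(base, ours, theirs):
    """Merge two trees (path -> content) against their common ancestor.

    A missing path is treated as None, so additions and deletions
    are handled by the same comparisons as edits.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o              # both sides agree
        elif o == b:
            result = t              # only their side changed
        elif t == b:
            result = o              # only our side changed
        else:
            conflicts.append(path)  # both changed differently
            result = o              # keep ours; needs manual resolution
        if result is not None:
            merged[path] = result
    return merged, conflicts


base = {"main.c": "v1", "util.c": "v1", "README": "v1"}
trunk = {"main.c": "v1", "util.c": "v2-trunk", "README": "v2-trunk"}
branch = {"main.c": "v2-branch", "util.c": "v1",
          "README": "v2-branch", "new.c": "v1"}

merged, conflicts = three_way_merge(base, trunk, branch)
```

The conflict list is where the "semi-automatic" part comes in: files changed differently on both sides are reported for a human to resolve.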

A distributed version control system allows the cloning of a remote
repository, producing an exact copy. It also allows changes to propagate from
one repository to another. In non-distributed VCSs, a developer needs
repository access in order to commit changes to the repository. That leaves
developers without repository access as second-class citizens. With a
distributed VCS, this is a non-issue, as each developer can clone the master
repository and work on it, later propagating his changes to the master
repository.
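
The clone-and-propagate workflow can be sketched as follows. This is a toy model, assuming a repository is just an ordered list of changesets with unique identifiers; the Repo class and its clone and pull methods are hypothetical and do not mirror the commands of any particular system.

```python
class Repo:
    """Toy repository holding an ordered list of changesets (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.changesets = []   # list of (changeset_id, description)

    def commit(self, description):
        # The identifier combines the repository name and a position,
        # which keeps it unique across repositories in this toy model.
        changeset_id = f"{self.name}-{len(self.changesets) + 1}"
        self.changesets.append((changeset_id, description))
        return changeset_id

    def clone(self, name):
        # An exact copy of the history, owned by someone else.
        other = Repo(name)
        other.changesets = list(self.changesets)
        return other

    def pull(self, source):
        # Copy over only the changesets this repository lacks.
        known = {cid for cid, _ in self.changesets}
        new = [cs for cs in source.changesets if cs[0] not in known]
        self.changesets.extend(new)
        return len(new)


master = Repo("master")
master.commit("initial import")

alice = master.clone("alice")      # no commit access to master needed
alice.commit("fix typo")

pulled_first = master.pull(alice)  # propagate her work back
pulled_again = master.pull(alice)  # nothing new the second time
```

The developer works entirely in her own clone, and her change reaches the master repository only when it is pulled.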

Another common factor is whether the repository allows versioned file
and directory renames (and possibly copies as well). If a file changes
location, will the repository preserve its history? Can changes applied to the
organization of the older files be applied to the new organization?
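
One way to picture versioned renames is to attach history to a file's identity rather than to its path. The sketch below takes that approach purely as an assumption for illustration; the RenameAwareHistory class is hypothetical, and the systems in this article each track renames in their own way.

```python
class RenameAwareHistory:
    """Toy history that follows a file across renames (hypothetical)."""

    def __init__(self):
        self.path_to_id = {}   # current path -> stable file identity
        self.log = {}          # file identity -> list of log messages
        self.next_id = 1

    def add(self, path, message):
        file_id = self.next_id
        self.next_id += 1
        self.path_to_id[path] = file_id
        self.log[file_id] = [message]

    def modify(self, path, message):
        self.log[self.path_to_id[path]].append(message)

    def rename(self, old, new):
        # Only the path changes; the identity and its log are kept.
        file_id = self.path_to_id.pop(old)
        self.path_to_id[new] = file_id
        self.log[file_id].append(f"renamed {old} -> {new}")

    def history(self, path):
        return list(self.log[self.path_to_id[path]])


h = RenameAwareHistory()
h.add("src/util.c", "add utility functions")
h.modify("src/util.c", "fix off-by-one error")
h.rename("src/util.c", "lib/util.c")
followed = h.history("lib/util.c")
```

Asking for the history of the new path returns the entries recorded under the old one as well, which is exactly what a system without versioned renames cannot do.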

CVS

CVS, the Concurrent Versions System, is a mature and relatively reliable version control system. Many large open source projects, including KDE, GNOME, and Mozilla, use CVS. Most open source hubs such as SourceForge support it as a service, which in turn has led many other projects to use it.

Despite its popularity, CVS has its limitations. For example, it does not
support file and directory renaming. Furthermore, binary files are not handled
very well. CVS is not distributed and the commits are not atomic. As there are
already better alternatives that aim to be a superset of its functionality, you
are probably better off starting a new project by using something else.

Subversion

Subversion aims to create a
better replacement for CVS. It retains most of the conventions of working with
CVS, including a large part of the command set, so CVS users will quickly feel
at home. Aside from that, Subversion offers many useful improvements over CVS:
copies and renames of files and directories, truly atomic commits, efficient
handling of binary files, and the ability to be networked over HTTP (and
HTTPS). Subversion also has a native Win32 client and server.

Subversion has recently entered its beta period after being alpha for a long
time. As such it may still have some minor quirks, and its performance in some
areas is lacking. Nevertheless, it is very usable for beta-stage software, and
was so even during a large part of its alpha stage.

The HTTP (or HTTPS)-based Subversion service is difficult to deploy in
comparison to other systems, as it requires setting up an Apache 2 service with
its own specialized module. There is also an "svnserve" server that is less
capable but easier to set up (and faster) and uses a custom protocol.
Moreover, Subversion's support for merging is limited and resembles that of
CVS (i.e., merges to branches where files were moved will not be performed
correctly). It is also relatively resource intensive, especially with large
operations.

Subversion is extensively documented in the free online book, Version Control with Subversion. The rudimentary online help system supplied by the Subversion client can also prove
useful for reference. Subversion has many add-ons, but they are still less
mature than their CVS counterparts.

Arch

GNU Arch is a VCS originally created by
Tom Lord for his own version control needs, as well as those of other free
software projects. Arch was initially prototyped as a collection of shell
scripts, but its main client now is tla, which is written in C and
should be portable to any UNIX. It has not been ported to Win32; while it is possible to do so, it is not a priority for the project.

Arch is a distributed version control system. It does not require a special
service in order to set up a network-accessible repository, and any remote
file service (such as FTP, SFTP, or WebDAV) is a suitable Arch service. This makes setting up a service incredibly easy.

Arch supports versioned renames of files and directories, as well as
intelligent merging that can detect if a file has been renamed and applies the
changes cleanly. Arch aims to be superior to CVS, but there are still some
individual features missing. Arch is a post-1.0 system and, as such,
is declared mature and stable for any use.

Arch is documented with a very basic online help system and a tutorial.

OpenCM

OpenCM is a version control system
created for the EROS project. OpenCM does
not aim to be as feature-rich as CVS is, but it does have a few advantages.
OpenCM has versioned renames of files and directories, atomic commits,
automatic propagation of changes from branch to trunk, and some support for
cryptographic authentication.

OpenCM uses its own custom protocol for communicating between the client and
the server. It is not distributed. Since OpenCM is not very feature-rich, it is
possible that other systems will better suit your needs. However, you may
prefer using OpenCM if one or more of its features is attractive to you.

OpenCM runs on any UNIX and on Windows under the Cygwin emulation layer. It
features a CVS-like command set and is well documented.

Aegis

Aegis is a source configuration
management (SCM) system created by Peter Miller. It is not networked, and all
operations are done via UNIX file-system operations. As such, it also uses the
UNIX permissions system to determine who has permission to perform what
operation. Despite the fact that Aegis is not networked, it is still
distributed in the sense that repositories can be cloned and changes can be
propagated from one repository to the other. Allowing network access requires
using a file system such as NFS.

Being an SCM system, Aegis tries to assure the correctness of the code that
was checked in. Namely, it:

Manages automated tests, prevents check-ins that do not pass the previous tests, and requires developers to add new tests.

Manages reviews of code. Check-ins must pass the review of a reviewer to get into the main line of development.

Has various other features that aim to ensure code quality.
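
The checks above can be sketched as a simple gate in front of the main line of development. This is a loose illustration only: the review_gate function, the dictionary layout of a change, and the sample test are all hypothetical and are not taken from Aegis's actual commands or data formats.

```python
def review_gate(change, existing_tests):
    """Return the reasons a change may not be checked in (empty if none)."""
    problems = []

    # 1. The change must not break any test that already exists.
    failed = [name for name, test in existing_tests.items()
              if not test(change["code"])]
    if failed:
        problems.append("fails existing tests: " + ", ".join(sorted(failed)))

    # 2. The change must come with at least one new test.
    if not change["new_tests"]:
        problems.append("adds no new tests")

    # 3. A reviewer must have approved the change.
    if not change["approved_by"]:
        problems.append("not reviewed")

    return problems


# Hypothetical existing test: the code under change is an `add` function.
existing_tests = {"adds_small_numbers": lambda add: add(2, 2) == 4}

good_change = {
    "code": lambda a, b: a + b,
    "new_tests": ["adds_negative_numbers"],
    "approved_by": "reviewer",
}
bad_change = {
    "code": lambda a, b: a - b,   # breaks the existing test
    "new_tests": [],
    "approved_by": None,
}

good_result = review_gate(good_change, existing_tests)
bad_result = review_gate(bad_change, existing_tests)
```

Only a change with an empty list of problems would be allowed into the main line in this model.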

Its command set reflects this philosophy and is quite tedious if you desire
only a plain version control system.

Aegis is documented in several troff documents that are then rendered into
PostScript. As such, it is sometimes hard to browse the documentation to find
exactly what you want. Still, the documentation is of high quality.

Monotone

The Monotone Version Control System was created by Graydon Hoare, and exhibits a different philosophy than all of the above systems. It is distributed, with changesets propagated to
a certain depot that can be a CGI script, an NNTP (Usenet news) receiver, or
SMTP (email). From there, each developer pulls the desirable changes into his
own copy of the repository.

This may have the unfortunate effect of causing the history or current state
of the individual repositories to fall out of sync with each other, as
individual repositories do not receive the appropriate changes, or receive
inappropriate ones.

Monotone supports renames and copies of files and directories. It has a
command set that aims to be as CVS-compatible as possible, with some necessary
deviations due to its different philosophy. It should be portable to Win32, but
has not been explicitly ported yet.

Monotone is still under development, and may still have some behavioral
glitches. The Monotone developers expect to resolve these problems as work
continues.

All in all, Monotone holds a lot of promise, and is well worth
examining.

BitKeeper

BitKeeper is not an open source
version control system, but is listed here for completeness because some open
source projects use it. BitKeeper is very reliable and feature-rich, supporting
distributed repositories; serving over HTTP; file and directory copies and
renames; patch management; tracking changes from branch to trunk; and many
other features.

BitKeeper comes under two licenses. The commercial license costs a few
thousand dollars per seat (lease or buy). The gratis license is available for
development of open source software, but has some restrictions, among them a non-compete clause and a requirement to upgrade
the system as new versions come out, even if they have a different license.
Furthermore, the source code is not publicly available, and binaries exist only
for the most common systems, including Win32.

A handful of projects use BitKeeper, including some of the Linux kernel
developers and the core MySQL developers. It has been the subject of much
controversy in the Linux Kernel Mailing List. Due to its license, BitKeeper is
not suitable for open source development, as this will alienate more
"idealistic" developers, and impose various problems on the users who choose to
use it. If you are working on a non-public project and can afford to pay for
BitKeeper, it is naturally an option.

Conclusion

You probably should not use CVS, as there are several better alternatives,
unless you cannot get hosting for something else. (Note that GNU Savannah provides hosting for Arch, and there is documentation for using it with SourceForge.) You should also not use
the free version of BitKeeper because of its restrictions.

Other systems are nicer than CVS and provide a better working experience.
When I work in CVS, I always take a long time to think about where to place a
file or how to name it, because I know I cannot rename it later without
breaking its history. This is no problem in other version control systems that
support moving or renaming. One project in which I was involved decided to
rename its directories and, as a result, split the entire project history.

The author is a software professional who has been experimenting with programming since 1987 and with various UNIX technologies since 1996. He graduated from the Technion with a B.Sc. in Electrical Engineering, and has been heavily involved as a Linux and open source user, developer, and advocate.

His most successful project so far was Freecell Solver, but he also headed
several other projects, and contributed to other projects such as Perl 5,
Subversion, and the GIMP.