Tutorials, howtos and other writing relating to the Perl programming, Network Engineering and Open Source systems administration arenas.

2012/04/13

Using a DVCS for your code and documents: Part 1

This document is the first in a series that outlines the very basics of using a Distributed Version Control System (DVCS) to manage and store changes and updates to documents you write. Although a DVCS is used primarily for software code, I've come to use one for my blog posts, poetry, documentation, etc. If you've never heard of revision control systems before, you might want to do a quick search online for what they do and what they are for.

Back in the day, I used software such as CVS and SVN to house my code changes. This was tedious, though. I had to set up a server, keep it running 24x7, and ensure it was properly backed up and maintained. I worked at an ISP, so these things weren't a problem for me. However, for others, having a dedicated server doesn't make a lot of sense.

Recently, I decided to give Bitbucket a try. It, like GitHub, provides free hosting of your document repositories. The most interesting and useful feature of a DVCS is that it is indeed distributed. If my CVS or SVN server went down, that was it... work stopped. With a DVCS, I can clone my repository, and if the remote server goes down, I can use my current clone just as if it were the original repository. I can clone it again, or even use it to store new changes.

Go on over to Bitbucket and set yourself up an account. Once you've done that, navigate to where you can create a new repository. I'm going to name mine "Test" for this example. I am going to make it public, so that you can see my repository at the end of this post. I don't need a bug tracker, so I'm leaving that option unchecked, as well as the Wiki option. Although I've written patches against Git repositories, I like Mercurial, so I'm going to use that. (The command "hg" is for Mercurial, so you'll see it often in my examples.)

Once you've created the repository (hereinafter: repo), click the link that states "I'm starting from scratch".

Create a repo directory on your computer, and change into it:

$ cd ~
$ mkdir repos
$ cd repos

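Now copy the 'clone' line that you see on the Bitbucket page (it begins with "hg clone"), paste it on your command line, and then change into the directory that the clone creates. Mine is named "test".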

Let's get right into using your repository. I will go through the basic commands as we encounter them.

Start by creating a new file, and adding some text to it. I use vim, but you can of course use any editor of your choosing.

steve@ub:~/repos/test$ vim test.pl

Save the file. Here's what my new file looks like:

#!/usr/bin/perl
use warnings;
use strict;
print "Hello, world!\n";

Ok, we have a new file with some text in it. Let's check the status of the file in relation to our repository. The command 'hg' is for Mercurial:

steve@ub:~/repos/test$ hg status
? test.pl

The "?" before the filename means that this file is unknown to the repository. There will be many cases where you won't want to add certain files to a repository, but we'll deal with that in a later post. For now, we want to add this file:

steve@ub:~/repos/test$ hg add test.pl

Note that you can also call "hg add" with no filenames. This will add ALL untracked files in the working directory (recursively). Now let's re-check the status of our repository:

steve@ub:~/repos/test$ hg status
A test.pl

The "A" before the filename means that you have added a new file. It has not been committed to the repository yet. Let's do this now:

steve@ub:~/repos/test$ hg commit -m "-initial import"

Output:

abort: no username supplied (see "hg help config")

Whoops! What happened? Well, Mercurial (hg) records who made each commit, and it doesn't know who you are yet. We'll fix that momentarily.

First, let's focus on the "commit" command to hg. The "-m" flag tells hg that you want to add an inline message for this change. If you omit the -m and the following message, you will be dropped into your default editor to write one out there. You can cancel a commit simply by exiting your editor without saving.

Now, back to the missing username. While in your repository, create a new file named "hgrc", and add a [paths] section and a [ui] section to it.

The "default" directive under the [paths] category is the link to your repository on Bitbucket. Under the [ui] section, the "username" is the email address/account you signed up to Bitbucket with. I don't add anything further... I prefer to just type my password out manually when I need to. Once your 'hgrc' file is created, move it into the .hg directory:

steve@ub:~/repos/test$ mv hgrc .hg/

Now rerun your commit:

steve@ub:~/repos/test$ hg commit -m "-initial import"

Went off without a hitch. Committing saves your changes as a changeset in your local repository. To send them to the master (in this case, Bitbucket), we use "push". Let's upload the local commits now:

steve@ub:~/repos/test$ hg push

Done. We added a file with "hg add", committed the changes via "hg commit", and uploaded the single changeset with "hg push". There's a problem though. My program was supposed to say hello to the universe, not just the world! Edit the test.pl file to print "Hello, universe!\n"; instead of "Hello, world!\n";, and then save the changes.

Now commit this update ("hg commit"), this time without the '-m' flag so it opens your editor. Add the following line in the commit message, and then save:

- replaced world with universe in print statement

Oh, man! I wanted to insert a comment saying what the print line is doing, but I forgot. Edit test.pl so it looks like this:

Notice this time the output found two changesets. This is because we made two commits before pushing either of them. A rule of thumb is "commit early, commit often". I follow the same rule with push.

So, we have our program created, and it runs great. We have made changes, and saved these changes. Let's see the basics on viewing the changes we've made. "hg log" shows you a list with a brief set of details for all the changesets you've committed. They appear in reverse chronological order. The hexadecimal string after the "N:" (the local revision number) in the "changeset:" line identifies the specific changeset. This is much more complicated than how I'm describing it, so we'll focus on these details in a later post.

The log is great for history, but what if we need to see more information... such as the list of files changed, and all the lines in the commit message as opposed to just the first? Adding the "-v" flag to 'hg log' will show you the files changed, as well as all the lines you added to your commit message. Here's an example from one of my real repositories:

Within each commit, we can now see what files we changed, and the list of comments we made per changeset. What if we need to see the actual changes themselves? No problem... add the "-p" (patch) flag to 'hg log':

steve@ub:~/repos/test$ hg log -p

...wait! That lists ALL of our changesets (commits). That's too much information for what we want. I want to know about the last commit only right now. In Mercurial, the most recent changeset in the repository is called the "tip". Other revision control systems may refer to this as HEAD. Let's check out the actual changes like we tried above, but only the most recent change. Again, the '-p' flag means "patch". The '-r' flag means "revision". We want to see the actual physical changes (-p) to the most recent revision (-r):

steve@ub:~/repos/test$ hg log -p -r tip

This tutorial series is primarily designed to describe the command-line usage of a DVCS application. The web-based display of the storage facility is outside the scope of this document, but it can be very handy. Here is what my online repo looks like after completing the examples in this post.

I'll end this post here. You've learnt the very basics of how to clone a Mercurial repository from your free Bitbucket account, how to commit changes into changesets, how to push the changesets back into the master repository, and how to do some basic review of the changes that you've made. In the next episode, we'll delve into how you can perform more advanced reviews of your changes, revert your working directory to a previous change, and create branches to manage different change tracks. I'll also explain, with examples, how a DVCS differs from non-distributed versioning systems. We'll also touch on the ".hgignore" file, which allows you to use "hg add" without adding files you don't want included.

Thanks for reading. If you've read any of my other posts, you know I appreciate all feedback, good or bad, either in the comments below or privately via email.