A Small Matter of Programming

Wednesday, August 29, 2012

A tagged union (also known as a variant) is an aggregation of two or more types. An object of such a type can be any one of a predefined set of types. Tagged unions are insanely useful for modelling many real-world concepts. A network connection might be either ready, open or disconnected – and each of these states has different pieces of data attached to it. A message that comes over that network connection might be a heartbeat, a broadcast packet, or an RPC call – and again, each type is associated with a certain set of data fields. Tagged unions pop up everywhere in the real world, and the core C++ language has no decent way to model this concept.
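
To make the gap concrete, here is a minimal sketch of what modelling the connection example looks like with only core language features: an enum tag next to a plain union. All names here (ConnectionTag, Connection, make_open, describe and so on) are hypothetical and chosen for illustration, and the data fields are guesses at what each state might carry.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical tag: records which state the connection is in.
enum ConnectionTag { TAG_READY, TAG_OPEN, TAG_DISCONNECTED };

// Hypothetical per-state data; the ready state carries none.
struct OpenData { int socket_fd; };
struct DisconnectedData { int error_code; };

struct Connection {
    ConnectionTag tag;  // says which member of `data` is meaningful
    union {
        OpenData open;
        DisconnectedData disconnected;
    } data;
};

Connection make_ready() {
    Connection c = Connection();  // value-initialise, so nothing is left indeterminate
    c.tag = TAG_READY;
    return c;
}

Connection make_open(int socket_fd) {
    assert(socket_fd >= 0);
    Connection c = Connection();
    c.tag = TAG_OPEN;
    c.data.open.socket_fd = socket_fd;
    return c;
}

Connection make_disconnected(int error_code) {
    Connection c = Connection();
    c.tag = TAG_DISCONNECTED;
    c.data.disconnected.error_code = error_code;
    return c;
}

// Every consumer has to switch on the tag by hand. Nothing stops a caller
// from reading data.open while the tag says TAG_DISCONNECTED.
std::string describe(const Connection& c) {
    std::ostringstream out;
    switch (c.tag) {
    case TAG_READY:
        out << "ready";
        break;
    case TAG_OPEN:
        out << "open on fd " << c.data.open.socket_fd;
        break;
    case TAG_DISCONNECTED:
        out << "disconnected, error " << c.data.disconnected.error_code;
        break;
    }
    return out.str();
}
```

The tag and the union are only kept in sync by convention, which is the kind of protection a library-provided variant is meant to add.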

Enter Boost Variant, a library that provides tagged unions to C++. Conceptually, it sounds like just what we want – a Boost Variant is a template type that you can instantiate to hold a client-defined subset of types, and it provides protections to make sure you correctly unwrap the variant to the right type. The problem is that taking advantage of this type safety involves writing a lot of ugly code!

I will present in this post my library function that sprinkles in C++11 lambdas to make Boost Variants a little more palatable.

Sunday, June 17, 2012

It's commonly said that Apple's Mac OS X UI is better than the competing UIs of Windows, Gnome and KDE. This is so widely believed that the competitors have even started imitating some of the distinctive features of OS X – just look at Windows 7's or Gnome 3's one-per-application task bar icons, or Unity's global menu bars. But is OS X really a good example to follow? After spending a year in Apple land on a Macbook Air with OS X Lion and trying to believe in its philosophy, I'd have to say that its window management interface is markedly inferior to that of Windows (95 through to 7) and Windows-inspired desktops like Gnome 2 and KDE 3.

In Windows XP, a running instance of an application generally has one window, which corresponds to exactly one task bar icon, and closing the window is tantamount to closing the application. This is easy to understand. The user just needs to understand that applications are windows, and the windows form a stack, just like a stack of papers on a physical desk. You interact with whichever one is on top - and in the case that they don't overlap, the one that's "on top" is easily identified by its blue border. And when you switch to work on another window, that window will always rise to the top.

Mac OS has a more abstract concept of an application. In OS X, regardless of how many windows it has, each application always has one icon in the dock, one entry in the Command-Tab list, and can always be dismissed completely with a single Command-Q combo. On paper, this sounds like a good idea – after all, a word processor's many windows do all run in the same process and it makes sense that users should be able to deal with the windows in aggregate. But after trying to embrace this concept over the past year, I just cannot get over some basic usability issues that it brings.

Tuesday, October 11, 2011

Many would say that Real Programmers should need nothing more than a text editor and a compiler, but it's hard to deny that a good IDE is a useful tool for many humans. Whether a project is large or small, I think it wouldn't be a stretch to say that an IDE can double, triple, and perhaps even quadruple productivity. Features like code completion, code generation, automatic refactoring and live syntax error highlighting can make code-writing significantly more fun and productive.

But a little-recognised fact of programming in a team is that you spend most of your time reading and understanding code written by other people. So although IDEs are usually associated with writing code, I'd say that it's their code-reading features that are even more useful. Of the code-reading features that IDEs commonly provide (amongst syntax highlighting, outline views and documentation browsing), the most useful are the code navigation features. Some examples of use-cases are:

Jump to the definition of a variable

Find calls to a function

Find references to a class

Find subclasses of a class

Find references to an object

Any decent IDE will give you these code navigation features, but unfortunately, there is not a great range of C++ IDEs to choose from in Linux. This leaves a gap for text editors like Vim to rule the roost, but Vim on its own is a far cry from, say, Visual Studio. While there are tools like ctags and cscope that purport to provide these code navigation features, they tend to give quite bad results because the underlying problem requires the editor to have access to a fully-fledged C++ parser.

In this post, I will introduce a set of tools that I've been working on, that bring solid code navigation features to Vim, backed by a real C++ parser: Clang. Meet CLIC – the Clang Indexer for C/C++.

Saturday, January 29, 2011

Software products are complex constructions, with engineers often only understanding a tiny fraction of the code-base. Programmers often come from a background of solitary hacking, and this culture follows through to many of the software engineering companies they end up working in. Code gets written and checked in without a whiff of scrutiny. With time, it grows more complex, more siloed, more error-prone and less maintainable. Developers leave and others inherit a codebase they don't understand. In the end, schedules slip, things break, customers are unhappy and developers are stressed.

If your company writes software and doesn't get it peer reviewed, it needs to start doing so now because the above problems can all be solved or mitigated with code reviews.

Monday, January 3, 2011

One of the prominent features of Clojure is a core set of immutable data structures with efficient manipulation operations. Two of the most innovative and important are the persistent vector and persistent hash map.

As a little project I set myself in order to get to know Haskell better, I have been porting these structures to Haskell. I think the port is now at a state where the basics are there and usable, so I've put it up on my GitHub. The API provides Data.PVector (the persistent vector) and Data.PHashMap (the persistent hash map). The interface for both has been kept as consistent as possible with Data.Map.

Tuesday, September 14, 2010

In my day job, we often deal with a multitude of git branches - whether we’re keeping branches for maintenance releases, supporting deprecated APIs for certain customers, or working with experimental features. Although git’s model entices you to create more and more branches, it brings the burden of keeping them up to date through periodic merging of branches.

While merges are important for keeping code up to date, errors in merge commits are more common and more impactful than in normal commits. Firstly, merges have multiple parents, which makes it very hard to see, from history, what the programmer actually did to resolve merge conflicts. Secondly, reverting a bad merge can turn into a headache in itself. Thirdly, a large proportion of merge conflicts happen when dealing with someone else's code because by nature, branching is a multiplayer game. In essence, merge errors are easy to make, hard to fix, and hard to find, so it really does pay to improve the merge process.

Yet, I have found that the tools and interfaces available for performing merges do not equip programmers sufficiently to do them effectively. Often, a programmer will simply run git merge and hope that git will take care of the large majority of the hunks. Of the hunks that conflict, the usual merge strategy of the human at the helm is to use the surrounding context to roughly guess what the intended program is supposed to look like.

This article will hopefully demonstrate that there can be a much more measured process to the resolution of merge conflicts which takes the guesswork out of this risky operation.