While some people may argue that Clojure is already obfuscated by default, I think it is about time to organise an official International Obfuscated Clojure Coding Contest, similar to the IOCCC. This idea was born out of my own attempts to fit my Clojure experiments into a single tweet, that is, 140 characters.

Winning IOCCC entry: flight simulator

The plan

First, get some feedback from the Clojure community on this idea. You are invited to share your thoughts as comments on this blogpost. I will also tweet about this idea. If that particular tweet gets 100+ retweets, I will go ahead with the next step in the plan, which is establishing the rules for this contest.

These rules are also open to discussion. At the moment I’m considering for example a category ‘Code fits in a single tweet’ and another one like ‘Code size is limited to 1024 characters’.

After these preliminary steps I will set up a website, find a jury to judge the submissions and will continue from there.

Inspiration

Since Clojure is such a powerful language, there are also plenty of opportunities to make the code more challenging to read. Mike Anderson has already created a GitHub project called clojure-golf with some tricks.

You are also invited to violate the first rule of the macro club: Don’t write macros. And obviously the second rule (write macros if that is the only way to encapsulate a pattern) should be ignored as well.

Generating code on the fly is of course a breeze in a ‘code-as-data’ language like Clojure.
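As a minimal taste of that (a toy example of my own, not an actual contest entry): an expression is assembled as a plain list and then handed to eval.

```clojure
;; Build an expression as ordinary data...
(def expr (cons '+ (range 1 4)))   ; the list (+ 1 2 3)

;; ...then evaluate the data as code.
(eval expr)                        ; => 6
```

From here it is a small step to programs that rewrite, mangle, or regenerate themselves before evaluation.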

Finally

So if you think you can create a fully functional chess engine in 1024 characters, a Java interpreter in a single tweet, or have managed to make the Clojure REPL self-conscious with your obfuscated code, leave a comment. Also, if you have suggestions for rules, want to help set up a website, want to be a judge, or want to help in another way, I would love to hear from you.

Not a very catchy title this time, since this post will be mostly about hardcore nerdy coding. In my previous post I talked about the business value of elegant code. I argued that cleaning up an existing codebase in the maintenance phase still makes a lot of sense, but only if it can be done cheaply. One of the ways to make it cheap (cost-effective might be a better word if you need to sell this to your management) is of course to automate the refactoring process.

The problem

By automating I mean automating both the detection of code smells and the fixing of them. This is in contrast to tools like Sonar that are very good at detecting but don't fix anything. It is also in contrast to most IDEs, which allow you to select a piece of code and choose, for example, the 'Extract method' action from the menu. That certainly is helpful, but your IDE will probably not detect whether it is needed. It will just execute what you tell it to do.

Let me first show you an example of some Java code that I would like to refactor automatically:

public class A {
    int answer() {
        return (42);
    }
}

In case you haven't noticed already: in the code above the return statement has an extra pair of parentheses. You can find many discussions on the internet about why this is or isn't a good thing, but personally I don't like them for the simple reason that return is a keyword and not a function. Extra parentheses just add visual noise. So what I would like to see instead is:

public class A {
    int answer() {
        return 42;
    }
}

This simple refactoring is already surprisingly difficult if you want to do it with a set of regular expressions, since you need the context of the return statement. For example, you have to be sure it's not part of a comment or a string. The alternative is to fully parse the code and create an abstract syntax tree (AST). You can create one yourself using, for example, ANTLR and a grammar for the language of your choice. I decided to use the Eclipse Java development tools (JDT) in combination with Clojure. The scope of my experiment: being able to refactor the above example.
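To illustrate why the regex route fails, here is a deliberately naive rewrite in Clojure (the pattern and the sample snippets are my own illustration): it handles the plain statement, but also corrupts a string literal that merely contains the same text.

```clojure
(require '[clojure.string :as str])

;; A naive attempt: strip the parentheses after `return`
;; with a single regular expression.
(defn naive-strip [src]
  (str/replace src #"return \((.*)\);" "return $1;"))

;; It handles the real statement...
(naive-strip "return (42);")
;; => "return 42;"

;; ...but it also rewrites the inside of a string literal,
;; where the code must stay untouched:
(naive-strip "String s = \"return (42);\";")
;; => "String s = \"return 42;\";"
```

A real parser knows it is inside a string literal; a regular expression does not.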

Preparation

I got the idea for this blogpost from 'A complete standalone example of ASTParser'. That post lists all eight Eclipse libraries you need; you will have to add them to your own local Maven repository. Assuming you are using Leiningen 2, you can follow the steps described by Stuart Sierra. For example, to add an artifact to the local Maven repository called maven_repository within my Leiningen project, I used:

As you will notice, creating an AST with the Eclipse JDT takes only a few lines of code: first I create a JLS3 (Java Language Specification 3) parser, then I tell the parser where it will get its source (in this case a string), and finally I create the AST. Next I need some helper functions:

Next there are four functions that zoom in on the code we would like to refactor. You will notice that I use doseq quite a lot, since the actual AST manipulation (the refactoring) is done in place and thus has side effects. This is not always avoidable when using Java libraries from within your Clojure code. We could, however, write an immutable version that leaves the original AST intact by returning copies of it; such functionality is supported by the Eclipse JDT.

In this code we first get to the expression within the parentheses, hence the double .getExpression. Note: this code only strips one level of parentheses. Next we make a copy of the expression and finally we assign it back to our return statement, effectively removing the outer parentheses.

This code is easy to test via the REPL. You will see something similar to:

Recently I ran into the following Java code (actual variable names replaced by x and y to anonymize it):

if (x != 0) {
    y = 4 - x;
} else {
    y = 4;
}

So what's the problem with this code? There is the use of the magic number 4; maybe we could test x for equality to zero and swap the if and the else branches. And of course the whole test is nonsense: this 5-line code snippet could just as well be replaced by:

y = 4 - x;

But again, is there a problem with the original code? It works as expected. The compiler will probably optimize this code anyhow and get rid of the test. The unit tests and functional tests passed. So at least from a business (or end-user, if you will) point of view there is no problem at all.

And of course there are costs involved with refactoring the above code. If this happens during active development costs are probably quite low and the benefits will outweigh those costs. But if this is a maintenance project the situation might be different. The actual code is already in production and might be mission critical for the company. And if there is no (fully automated) continuous delivery in place the cost of this seemingly small code improvement can be one or two orders of magnitude higher than during development.

Now let's try to make a business case for the above code change in a maintenance project. If you did an MBA, you learned that business case = benefits - costs. An MBA is hardly more than that. Usually this formula results in overly optimistic hockey-stick curves with an ROI of less than a year. But I digress. So we have to do two things here: minimize the costs and maximize the benefits.

To start with the benefits: you can make a case that refactoring your code will usually result in less code and certainly in higher quality. If the code becomes more pleasant to work with, it will also result in more motivated and productive developers, instead of the attitude "I am not going to touch this mess that John left behind unless I really have to." In general: cleaning up your code base and reducing it to half its size will leave you with about a quarter of the maintenance costs. That might sound ambitious, but over the last 20 years I have seen a lot of non-trivial pieces of software (ranging from 10k to 500k LOC) for which this was no problem at all.

On the costs side of the equation you need a couple of things to minimize these costs:

Continuous delivery: automated building, unit testing, functional testing, integration testing, performance testing, version control, automated deployment, etc. Push hard to get these into place in your organisation. If you really can’t have continuous delivery on short notice, at least make sure that you have tests covering the code that you are going to change. Avoid the temptation to do ‘easy changes’ without them.

Any modern IDE will help with trivial refactorings. I call them trivial because most if not all IDEs only support refactoring at the syntax level. You can extract methods, rename variables, etc. But the IDE has no clue what the code actually means, so refactoring at a semantic or design level is mostly not supported.

Automate your refactoring! This seems to contradict the previous point, but there are tools available (usually as plug-ins for Java IDEs) that can take refactoring one step further by looking at the AST of your code and recognising patterns. I wouldn't be surprised if they could detect and fix the example I started with.

In my next blogpost I will go into detail on what levels of refactoring exist and how they could (at least theoretically) be automated. Until then:

Have fun refactoring your legacy code base!

(You might be wondering why I inserted the image of an old instrument in this post. My point is that in the past the design of many things went far beyond pure functionality. For example, scientific devices were often real pieces of art. I like code to be more than 'just functional' and to have a certain elegance about it.)

I first learned about Reactive Extensions (Rx) at the beginning of this month, when it was open-sourced by Microsoft. Although I found a few scattered references on the internet on how to get Rx working with Mono, I had to jump through quite a few hoops. This blogpost is a detailed account and will hopefully save you a couple of hours.

Getting Reactive Extensions

When you are using Windows, this is pretty straightforward. But then again, in that case you are probably using .NET and not reading this blogpost at all. However, when you are using Linux or OS X it gets a bit more complicated: in that case your only option is to use NuGet.

Getting NuGet

I didn't download the recommended version (NuGet.exe Bootstrapper 2.0) but used the NuGet.exe Command Line. This didn't work out of the box. According to this excellent blog post, you first have to import some root certificates so that Mono will trust NuGet:
$ mozroots --import --sync

Getting Rx-Main

OK, so let's finally get Rx. I started with the latest and greatest (Rx-Main 2.0.21114 at the moment of writing) but couldn't get it working. However, version Rx-Main 1.0.11226 does seem to work with Mono. To see all available versions, enter:
$ mono NuGet.exe list Rx-Main -AllVersions

If you save this code as rx.cs, you are ready to compile your first Rx program. Make sure that you have System.Reactive.dll in the same directory, or set the library path for the Mono compiler using the -lib option. Assuming the dll is in the same directory as your source, just type:
$ mcs -r:System.Reactive rx.cs

This will create an rx.exe that can of course be executed with:
$ mono rx.exe

Next steps

This is all you need to get Rx and Mono working. I tried both Mono 2.10.x and 3.0.x on OS X and Linux. As mentioned before, I only got this running with Rx 1.0.x, which uses a single dll. In Rx 2.0.x this dll is split up into several dlls. However, trying to compile against those leads to:

Unhandled Exception:
IKVM.Reflection.MissingMemberException: Member ‘System.IComparable`1’ is a missing member and does not support the requested operation.

I haven’t investigated this any further yet, but it might very well be a Mono versus .NET incompatibility.

Have fun hacking Rx and Mono and please let me know if you have any questions or remarks.

Unless you are looking for a specific item, the interface returns 100 items at a time in XML format. You also get a resumption token so you can query for the next 100 items. I imagined it would be useful to abstract from this using a lazy sequence in Clojure. So let me show you the resulting code and a brief explanation:

We will have to parse some XML that is returned, so we start by adding some convenient libraries. If you are not familiar with Clojure zippers, please look them up in the documentation and the numerous blogs; they make navigating XML almost painless.

The routine build-query builds up a query. If there is no resumption token yet, the resulting query loads the first data; otherwise, it continues with the next batch of records. Currently the Rijksmuseum API supports two kinds of queries: you can either ask for a list of records (using verb=ListRecords) or ask for a specific record (using verb=GetRecord and an identifier). The API documentation has all the details.

We will first start with a couple of helper routines. Basically they extract the information we are interested in from an XML stream:

lazy-seq is a macro that creates a lazy sequence out of a body of expressions. Next we create a query, parse the resulting XML and create a zip structure, all in one single line of code. All we have to do then is extract the records (called works here), extract the resumption token and call ourselves recursively. Don't be afraid of stack overflows: the lazy-seq macro takes care of this.
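Since the original listing is not shown here, the following self-contained sketch captures the same pattern; fetch-page is a stub standing in for the real HTTP call plus XML parsing, and its name and return shape are assumptions for illustration.

```clojure
;; fetch-page is a stub for the real HTTP call + XML parsing.
;; It returns the records of one batch plus a resumption token
;; (nil when the collection is exhausted).
(defn fetch-page [token]
  (let [start (or token 0)
        total 250]
    {:records          (range start (min (+ start 100) total))
     :resumption-token (when (< (+ start 100) total) (+ start 100))}))

;; One lazy sequence over all batches: a new page is only fetched
;; when the consumer actually reaches it.
(defn record-seq
  ([] (record-seq nil))
  ([token]
   (lazy-seq
     (let [{:keys [records resumption-token]} (fetch-page token)]
       (concat records
               (when resumption-token
                 (record-seq resumption-token)))))))

(take 5 (record-seq))
;; => (0 1 2 3 4)
```

Swapping the stub for the real API call keeps the consumer code unchanged: callers simply take as many records as they need.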

Now we are ready to use our lazy sequence. The next example creates a list with the image URLs of the first 10 items in the collection:

Don't go overboard with requesting all items in the collection at once. Retrieving 1000 items takes about 1 minute, so the calls to the API are most probably throttled. Anyhow, have fun lazily stealing works of art!

Let's suppose that your old car one day suddenly stops working and it is beyond repair. So now you are in the market for a new (or second-hand) car. You have a mental model of what you are looking for: it should be roomy enough for your family and 2 dogs, it should be safe, and of course your new car should have sufficient power. Together with your wife you decide that you want an MPV. So you walk into the nearest car dealership. To your surprise it is a rather nondescript building, and when you enter it the showroom is completely empty. Luckily there is this friendly car sales guy, and you start to talk to him: "Hi, we are looking for a new car and ...". Before you can finish your sentence he raises his hand to stop you, smiles at you and walks back to the counter. He returns with a key and hands it over to you. "Here is the key to your car. If you just follow me, I will show it to you." Completely flabbergasted, you and your wife are led into a parking lot behind the building, where he points to a Mazda MX-5 Miata. You start to protest and try to explain that your 3 children and 2 dogs are not going to fit into a convertible, but he ignores your protests. "I'm sorry Sir, as you can see this is the only car we have got. And I think it is just right for you. Good luck!"

Now the above scenario might sound a bit absurd, but this is what I see a lot in IT consulting: your current project has come to an end, and in some magic gathering that often goes by the name "Project Allocation Meeting" or "Resource Meeting" the sales guys and girls at a typical IT company have decided that the very first project that comes along is miraculously the perfect fit for you. And you can already start tomorrow. Does that remind you of that single car for sale in an otherwise empty dealership?

Let's first have a look at the criteria for a good project allocation process:

Limited waste from non-billable hours. The business model for IT consulting is rather straightforward: revenue equals the number of billable hours multiplied by the hourly rate of a consultant, summed over all consultants. Neither parameter is very scalable, so it is tempting to maximize the number of billable hours by leaving no gaps between projects.

The right fit: projects should be challenging enough that a consultant can develop his skills. Working below his level is eventually going to make him leave the company; working far above his level will only result in a burn-out. In general, the assignment should be a good fit for the career path of the consultant. That will make him more valuable to customers, so you can ask higher rates later.

Physical location: a project at a nearby customer will limit traveling time and will result in overall happiness. Long traveling times will make it unlikely that the consultant is going to put in some extra hours for either the customer or his own company.

Other criteria, for example: does the customer's culture fit with that of the consultant? Putting a consultant who thrives on freedom into a restrictive or formal organization like an insurance company is not going to make him very happy.

You will notice that the first point is a rather short-term optimization. The next three points will have far more impact on an IT consulting company, but their effects are not immediately noticeable. Therefore an IT company has to make a careful trade-off between optimizing for billable hours (which is easy: a straightforward computer algorithm can do it for you!) and long-term sustainability and profit. This is quite difficult, and it is one of the reasons most consulting companies behave like the weird car dealership I introduced at the start of this blogpost. That brings me to the conclusion.

So we can choose the company we would like to work for, the place we want to live, and our own car. And yet in IT consulting, others decide which assignments fit us best. What I would like to propose is a very simple project allocation process with a minimum amount of overhead: all project information is always visible to every consultant, very similar to a car dealer that has many cars on display. So you can pick the right car, or in this case the right project, for you. That comes with the freedom of waiting for a better opportunity and passing on a project. Of course it also comes with the responsibility for balancing the number of non-billable hours. That could be done by putting a cap on that number or by introducing some kind of reward mechanism.

Most important is that an IT consultant can perfectly well make these trade-offs himself. Self-organization and responsibility at the lowest possible level, instead of old Soviet-style planning!

At parties I usually try to avoid mentioning that I work in IT. There are a couple of reasons for this. First, I often get a reaction like "Oh cool, I have this weird problem with Windows XP and my new printer. Since you are an expert, you can help me with that!" My answers range anywhere from polite (trying to explain what I do for a living, which is not fixing Windows problems) to a simple "No, I can't help you".

The second question I often get is more like a remark or even an accusation: "Why are software projects always late, expensive and unpredictable? By now everything is already available as standard libraries, so why is writing software not as simple as clicking existing components together?" These remarks mostly come from people whose software experience is limited to small 100-line Visual Basic programs, or from hobbyists who have tinkered a bit with Excel. Often they add examples like building a new house or bridge, arguing that this is way more difficult than building a simple piece of software and yet at the same time very predictable.

Lately I have started telling those people (and others, when giving a Scrum/Agile training) a story from my own experience, about installing laminate flooring in our house.

When I planned for this do-it-yourself activity, my first estimate was: 3 bedrooms and 1 hallway should take at most one weekend. Next I created an initial work breakdown structure (WBS):

Remove old carpet and floor panels: 1 hour

Install 50 m2 underlayment: 2 hours

Install 50 m2 laminate flooring: 8 hours

Nice: if I worked hard enough I could finish this on a single Saturday and have the Sunday for relaxing, spending time with my wife and kids, or even writing a blogpost! So I started, and everything went more or less according to plan: the first two steps took a little less time than planned, and in another hour I had installed the first 5 m2 of laminate. Since installing a floor is as easy as writing software, I figured I could scale from 5 m2 to 50 m2, so that would take 10 hours instead of the planned 8. Well, not that bad.

But then I discovered that it would look way better if the laminate went a bit under the skirting boards, instead of against them, which would leave some visible gaps. This was a bit of a setback, since it meant I probably couldn't finish the job in one day. Luckily I still had the Sunday to finish the work. Then disaster hit: while removing the skirting boards I discovered they were quite old and nailed into the wall with really long nails. So two things happened: some of the boards broke, and part of the plaster fell off, damaging the walls.

Note: not my actual wall…

I decided to do a little bit of refactoring and reuse the old floor panels that I had removed from the first bedroom as skirting boards. Without going into all the details, my new WBS looked like this:

Remove old carpet and floor panels: 1 hour

Install 50 m2 underlayment: 2 hours

Remove old skirting board: 2 hours

Install 50 m2 laminate flooring: 10 hours

Sawing 40 m floor panels into new skirting boards: 4 hours

Grinding 40 m skirting boards: 2 hours

Using a plunge router to add a nice profile to the skirting boards: 2 hours

Painting the skirting boards twice: 8 hours

Repair damaged walls: 2 hours

Remove remaining nails: 1 hour

Fixing new skirting boards to the walls: 4 hours

Some additional woodwork for the door posts: 8 hours

So my carefully planned 11 hours blew up to 46 hours: about 400%. What's worse, my initial lead time of 1 day ultimately became 6 months. This seemingly predictable set of tasks behaved like a real software project after all, with lots of unforeseen problems and new functionality arriving during the project.

If I had foreseen all those problems, I might not have started at all. On the plus side, I ended up with skirting boards that are way more beautiful than the original ones. What's more, I learned how to operate a plunge router, making me a better craftsman, which will be useful in future projects.

Conclusion: writing software indeed is as predictable and easy as installing laminate flooring.

Recently I stumbled upon a white paper by Roger Sessions called “The Mathematics of IT Simplification”. In this paper he describes an approach called synergistic partitioning, which is based on the mathematics of sets, equivalence relations, and partitions. One of the topics that comes up in this article is how many ways there are to partition N elements. The answer to this is well-known and called the Nth Bell number.

The 52 partitions of a set with 5 elements

The Wikipedia page that explains the Bell number also shows two implementations on how to calculate them, one in Ruby and the other one in Python. The Ruby version needs 17 lines of code, the Python version 12 lines. I have the peculiar habit of intriguing/annoying my colleagues at work with the statement that using Clojure the solution to any programming problem fits into one single tweet. In other words: 140 characters.

The first function (append-ele) appends a single new element to a row of the Bell triangle (explained in the Wikipedia article). In several ways this implementation is suboptimal: concat can only concatenate sequences, so I have to convert the single element to a list first. What's worse, appending at the end of a sequence is expensive, since it is O(N), while prepending takes only a single operation.

The function next-row calculates the next row in the Bell triangle. This row always starts with the last element of the previous row. I implemented this by a simple reduce over the elements of the previous row.

And finally the function bell returns a sequence with the first n Bell numbers. I use loop/recur to avoid a stack overflow. The sequence of Bell numbers is constructed in reverse order, so at the end I have to call reverse.
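The original listing is not reproduced here, but a reconstruction along the lines described above (the exact bodies are my own guess) could look like this:

```clojure
;; append-ele appends one element to a row of the Bell triangle;
;; concat only concatenates sequences, hence the (list x).
(defn append-ele [row x]
  (concat row (list x)))

;; next-row: the new row starts with the last element of the
;; previous row; each following element is the sum of its left
;; neighbour in the new row and the element above it.
(defn next-row [row]
  (reduce (fn [acc x] (append-ele acc (+ x (last acc))))
          (list (last row))
          row))

;; bell: the first n Bell numbers, built in reverse order and
;; reversed at the end; loop/recur avoids blowing the stack.
(defn bell [n]
  (loop [i 1, row '(1), acc '(1)]
    (if (= i n)
      (reverse acc)
      (let [nr (next-row row)]
        (recur (inc i) nr (cons (first nr) acc))))))

(bell 5)
;; => (1 1 2 5 15)
```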

While this version is already pretty short (10 lines, without the printing), this still won’t fit in a single tweet. So the next step was to remove the first two helper functions and inline all the code into a single function. Warning: this is not good coding practice! With this disclaimer, here is the resulting code:

Without the whitespace this implementation is 170 characters. Almost there! As you can see there are still some 'expensive' keywords like concat and reverse (6 and 7 characters!) and some annoying conversions from single numbers to lists. The only way to avoid this is to use a specialized data structure like a vector instead of a sequence:

Finally the code (without the indentation) fits into 140 characters. One caveat: I use conj, which nicely appends an element to the end of a vector. However, this behaviour of conj is not guaranteed across collection types: it may add an element anywhere; when you use a list, for example, the element is prepended. And since the triangle's rows keep growing, the above algorithm is still O(N^2).
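For reference, a vector-based variant along the lines described (my own reconstruction, not the exact tweet-sized original) could look like this:

```clojure
;; conj appends at the end of a vector, and peek reads that end,
;; so no concat, reverse or single-element lists are needed.
(defn bell [n]
  (loop [i 1, row [1], acc [1]]
    (if (= i n)
      acc
      (let [nr (reduce #(conj %1 (+ %2 (peek %1))) [(peek row)] row)]
        (recur (inc i) nr (conj acc (first nr)))))))

(bell 5)
;; => [1 1 2 5 15]
```

Stripped of whitespace, a form like this comes in well under the 170 characters of the sequence version.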

I will leave it as an exercise for the reader to implement the Bell number calculation as a lazy sequence so that you can use for example (take 9 (bell)). Have fun.

Recently I did a project using Scrum in short (one-week) iterations. The acceptance testers weren't part of the team and asked for (well, actually demanded) release notes with every increment we shipped. We told them that wasn't a problem at all, since we keep track of all our user stories and issues in JIRA. We had already created an account for them at the start of the project, so end of story, we thought. Almost.

This didn't work out, because they experienced JIRA as a bit too complicated and didn't want to dig up all the information themselves every Friday. We realized that a basic introduction to JIRA might help, but that we could help them even more by defining a couple of filters. So I asked what they needed, created the filters to produce this information, showed them how to use them, and again concluded: end of story and back to real work. Well, almost.

They still preferred to have a document containing the release notes attached to the email that announced every new release: mainly because they had always done it like that, and also because it was easier to print a document. We decided to take the path of least resistance: use our own JIRA filters and waste one or two hours per release copying JIRA issues to Word, formatting them in tables, fighting to get the layout somewhat correct, etc. At least it gives a project manager a reason for his existence. End of story. Almost.

Because as IT guys (and girls) we don't like boring repetitive work, especially not when it has to be done late at night, when we have finally got that release shipped. And certainly not when we can't see the added value of duplicating information from one format (JIRA) to another (Word). Our default solution is automating the work (also known as yak shaving). So I came up with this set-up, based on JIRA, Google Docs and some Google Apps Script:

The source is JIRA. In the past I have written a couple of blog posts on how to import data using Ruby (and SOAP), using JavaScript (directly into a Google Docs spreadsheet) or using ClojureScript (using REST). The report is based on a template that I created in Google Docs. Here you can already include, for example, the company logos, the disclaimers, etc.

Some sample code (note this doesn’t use JIRA) to create a document using a template:

As you can see in the picture, I also use a timed trigger. This fires the script every Friday at, for example, 6:00 PM. I belong to the minority of people who think that distributing Word documents is not very professional (unless you want to co-author them with others), so I prefer to create a PDF. This is done in the next step. Some sample code on how to do this:

And finally we have to send the release notes to the right people. For this I created a new group in Google Contacts. The next code snippet shows how I read the email addresses from this group and how I create an email with the PDF as an attachment:

This concludes my brief description of how to generate release notes. There is room for improvement. For example, right now the email is scheduled at a fixed time. To make your reports look more 'genuine' (as in: a lot of work to create) you could use a ClockTriggerBuilder to generate your own triggers that fire at a more or less random time, preferably late on a Friday night, of course.

Final remark: it is almost always better to include testers in your team. Even at the cost of a lot of initial energy and frustration, it’s worth the end result. The solution I described is only a patch for a very bad process.

Let me start this blogpost with a little fictitious story about two project managers:

Paul and Stephen were sitting outside the conference room, waiting for the steering committee to call them in. Paul looked at Stephen with a smug expression on his face. "I'm so glad that I spent that extra money on hiring two extra developers right from the start. That was quite expensive, but since one of our developers quit during the project and another one was ill for almost a whole month, this was money well spent. We made the deadline! So how about your project?" Stephen sighed. His project hadn't been that successful. Their test server had failed a couple of months ago, and ordering a new one had taken quite some time. This had caused a severe delay in his project. If only he had invested in buying a backup server. At the project start the team had created a list of risks. They hadn't forgotten about this one, but the chance of it happening was considered very low. How wrong they were!