Cogito ergo sum

Tuesday, 26 November 2013

Late last night, fuelled by energy drinks for the first time since university, after a frantic hacking session to put out some fires discovered at the last minute, I prepared my first ever 1.0.0 release. I wanted to share some retrospective thoughts about this.

It's worth mentioning that the project uses a slightly modified implementation of Semantic Versioning. So 1.0.0 is a significant release: it indicates that the project is no longer in beta; rather, it's considered stable, mature even. Any public API the project provides is considered frozen for the duration of the 1.X.X release train. Any mistakes we have made in terms of interface design (and I fully expect we'll discover plenty of them), we are stuck with for a while. This part is a little bit frightening.

Oh, I should specify that the project is Thermostat, an open source Java monitoring tool. Here's the release announcement from our announcement list archives. My last post (woah, have I not posted anything since February? Bad code monkey!!) also mentioned it.

Thermostat consists of a core platform including a plugin API, and ships with several useful plugins. Leading up to this release, our focus has been primarily on the core platform and API. Releasing 1.0.0 is somewhat exciting for us as we can move into primarily maintenance mode on the core, while building out new features as plugins. Writing brand new code instead of lots of tweaking and refactoring of existing code? Yes, please!

But what I really want to write about isn't the project itself, but the process and the things I learned along the way. So, in no particular order:

Estimation is hard

This project was started by two engineers about two and a half years ago. There was an early throwaway prototype, then a new prototype, which eventually became today's code base but looks nothing like it. Over time things started to look more and more reasonable, and we started thinking about when we'd be releasing a 1.0 version. I want to say that probably for more than a year, we've been saying "1.0 is around the corner". And each time we said it, we believed it. But we were, until recently, obviously wrong. Now there are various reasons for this, some better than others. In that time, we identified new requirements that we decided we couldn't release 1.0 without implementing. Naturally, estimates must be revised when new information appears. But a lot of it is simply believing that some things would take significantly less time than they actually did. I want to think that this is something that improves with experience, and I will be mindful of this as we move into building out new features and/or when I'm one day working on a new project.

Early code and ideas will change

When I think back to the early days of this project, before it even had a name, it's hard to imagine. This is because it is so incredibly different from where we ended up. Some parts of our design were pretty much turned inside out and backwards. Entire subsystems have been rewritten multiple times. We've used and abandoned multiple build systems. And this trend doesn't seem to be slowing down; we've had ideas brewing for months about changes targeting the 2.X release train that will change the picture of Thermostat in significant ways again. One really awesome result of this is that nobody working on the project can afford to indulge their ego; any code is a rewrite candidate if there is a good reason for it, no matter who wrote it originally or how elegantly. And everyone understands this. Nobody gets attached to one implementation, one design. It's nice to be working in a meritocratic environment. It's a sort of freedom: freedom from attachments, and freedom to innovate.

Good test coverage helps make changes safe

So this one is something that's probably been noted by a lot of developers. I know I've been taught this in school, read it in various places, and so forth. But it is working on Thermostat that has really driven it home for me. In the early days, we didn't really have any tests. It made sense at the time; we didn't really know where we were going, the code base was small and undergoing radical changes very regularly. But time went on, and it became clear that this project was going to be around for a while, and both the code base and the group of contributors were growing. So, we started adding tests. Lots and lots of tests. No new code was accepted without tests, and over time we filled in gaps in coverage for pre-existing code. The happy result has been an ability to make very invasive changes with the confidence that side effects will be minimal, and likely detected at test time. I cannot exaggerate the number of times I've been thankful we put in the effort to get our unit and integration tests to this level.

Automation is king

Have a repetitive, error-prone task? Script that. Over time Thermostat has grown a collection of useful little helper scripts that save contributors time and effort, over and over again. From firing up typical debug deployments, to release management tasks, to finding source files with missing license headers; we write this stuff once and use it forever. These kinds of things go into version control of course, so that all developers can benefit from them. Also, testing automation. The common term used is of course Continuous Integration Testing, and for ages we've been using a Jenkins instance to run tests in sort of a clean room environment, catching problems that may have been hidden by something in a developer's environment. This has saved us a lot of pain, letting us know about issues within hours of a commit, rather than discovering them by accident days, weeks, or months later and having to wonder what caused the regression. I'll have to insist on a similar setup for any non-trivial project I work on.
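To give a flavour of the kind of helper meant above, here is a minimal sketch of a missing-license-header check. It is not Thermostat's actual script: the function name find_missing_license and the default marker text "Copyright" are assumptions for illustration, and it relies on grep's -L option (available in GNU and BSD grep) to list files that do not contain the marker.

```shell
# Hypothetical helper: list .java files under a directory that lack a license marker.
# The default marker "Copyright" is an assumption; pass the text your headers really contain.
find_missing_license() {
    dir="${1:-.}"
    marker="${2:-Copyright}"
    # grep -L prints the names of files in which the marker does NOT appear.
    find "$dir" -type f -name '*.java' -exec grep -L -- "$marker" {} +
}
```

Calling find_missing_license src prints one path per offending file, which makes it easy to wire into a pre-commit check or a CI job.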

That's all I have to say. Hopefully it won't be so long before my next post. I've actually been meaning to make a "battle station" write-up; I'm a remote employee, and invested time and money in a convertible standing desk setup and some clever mounting techniques to keep my workspace neat despite the number of devices involved. Until then, Adieu!

Sunday, 10 February 2013

I started this blog mostly to share useful information. But, while I'm here, I may as well mention the work I do (well, hopefully it is also useful). I haven't shared anything about this before: my day job revolves around building an open source tool for monitoring, profiling, tuning, and instrumenting Java applications, called Thermostat. It's not exactly feature complete, but we dropped a pre-release tarball recently. Read the announcement and find more details here. An important aspect of Thermostat's design is the plugin API, which we're getting close to considering functional and stable. If you have a use case for adding custom monitoring modules to a fairly standard existing set of run-time data, consider trying it out.

I'll try to write again with some more generally useful content so this blog doesn't just become a venue for self-promotion.

Thursday, 13 December 2012

Some days, you learn things that you immediately know will be useful again and again.

Several months ago, I started incorporating the mq extension into my mercurial workflow. Mutable local history changed my workflow entirely, allowing me to work on larger overall changes locally, but in organized units of change. Aside from aesthetics, this had an effect similar to what probably anyone experiences who learned how to program and then learned how to use version control: once you realize what it does for you, you can work a lot faster without fear of breaking things because, well, you can always revert. Well, mutable local history made things even more forgiving for me.

Every so often, however, I run into a messy merge, where some patch series I've been working on for some time overlaps with a lot of the changes others have pushed in the meantime. I should say that I'd really only gotten used to the qinit|qnew|qpop|qpush commands. So, merging with upstream changes went kind of like this:
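A minimal sketch of that pop/pull/push cycle, assuming every patch is popped and pushed at once (-a) and that the pull also updates the working directory (-u). The function name mq_merge_upstream is hypothetical, and the exact flags used at the time may have differed.

```shell
# Hypothetical wrapper around the manual mq merge cycle (flags are assumptions).
mq_merge_upstream() {
    # Unapply every applied mq patch so the working directory matches the last real changeset.
    hg qpop -a &&
    # Fetch upstream changesets and update the working directory to them.
    hg pull -u &&
    # Reapply the whole queue; hunks that no longer apply are written to *.rej files.
    hg qpush -a
}
```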

At this point comes some manual inspection of a bunch of FooFactory.java.rej files and the corresponding FooFactory.java files, munging my changes into the conflicting sections. If there are any conflicts, that is. Sometimes I am lucky and there aren't.

This was my workflow, until yesterday. I had been working on some improvements to a set of widely-used classes and interfaces, and someone else had been pushing a bunch of package and bundle reorganisation work. When I qpush-ed over this, I learned a thing: qpush (even with -m) does not seem to be aware in any way of file renames. It turns out that this meant that most of this 200kB patch simply could not be applied.

This was unacceptable. No way am I applying all that manually. I made a fresh local clone and set out on a web hunt to find a solution. The mercurial wiki page for MqExtension was pointing to a page titled MqMerge which didn't sound a lot different from what I was already doing, but the page did seem quite outdated and referred to some work being done to create a rebase extension. It seems this work has gone quite well, and its wiki page is even up-to-date. And, as it turns out, the merging done by the rebase command tracks file renames properly, and when there are merge conflicts it knows to open a 3-way vim session to help me sort them out. Where have you been all my life?!?

So now my workflow looks more like this:

$ hg qnew patchname # and now make some changes
$ hg qrefresh # possibly repeat this and previous a number of times
$ hg pull
$ hg rebase -s <rev1> -d <rev2>

Here, rev1 is the oldest applied patch tracked by mq, and rev2 of course is the most recent changeset in the newly pulled upstream changes. The series of applied patches is merged in sequence onto the new parent. Even the mq tags follow the rebase, so once it's done I still have mutable local history until I'm ready to qfinish. And sorting out the conflicts takes me far less time because of the 3-way vim session.
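One way to avoid looking up the two revisions by hand, sketched under a few assumptions: the mq and rebase extensions are both enabled, mq's qbase tag is used for the oldest applied patch, and tip is the newly pulled upstream head (which only holds if the pull actually brought in changesets). The function name rebase_queue_onto_upstream is made up for illustration.

```shell
# Hypothetical helper: pull upstream, then move the applied mq patches on top of it.
# Assumes mq and rebase are enabled, and that 'tip' is the freshly pulled upstream
# head; if the pull brought in nothing new, 'tip' is still the last applied patch.
rebase_queue_onto_upstream() {
    # Fetch upstream changesets without touching the working directory.
    hg pull &&
    # qbase is the mq tag on the oldest applied patch; rebase it and its descendants.
    hg rebase -s qbase -d tip
}
```

If the destination is something other than tip, say a named branch head, the same idea works with that revision passed to -d instead.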

Game-changing. Really. For as long as I am involved with projects that use mercurial, learning that rebase and mq play nice is going to be saving me time and headaches over and over and over.