Wednesday, April 28, 2010

I think of quality in two ways. One is the product quality which users perceive, and the other is the habitability of the code itself. One says whether the function points and UX are well-selected, and the other says whether it is worth your time to try to work in the code. One can have crap code that seems great externally, and one can have beautiful code that doesn't do anything a user is interested in doing (the popularly-presented dichotomy). Likewise one could have poor functionality written poorly, or great functionality written well. Those who consider themselves professionals or craftsmen will desire great functionality written well.

I often get to work in code that started out as good functionality badly written. I get to work on the teams that clean up the mess and make the system more useful, scalable, and performant. As we work, the code gathers external quality. It becomes increasingly virtuous. Smaller, simpler, cleaner code starts to run faster. With less duplication, bugs are fixed once-for-all rather than as buggy duplicates are discovered. This month is better than last month, but next month should be better still.

Because there are two kinds of quality, and one can make progress on either front, we see two kinds of progress. There is linear progress toward completion of a specific feature, and then there is radial progress which improves the code base overall and makes work easier for many colleagues.

Developers who make linear progress without radial progress ultimately drive code into an increasingly unliveable state. Eventually bugs are hard to resolve, changes are hard to make, testing is costly, refactoring is painful, and the design is more a collection of warts and hacks than a system any sane beings would have intentionally conceived.

Radial progress does not contribute much to one's ability to close a particular development request. Instead it improves the ability of the team to complete features next month and the months after that. It tends to affect many source files, so it is uneconomical in projects that don't have a significant investment in automated testing.

We made a lot of radial progress today.

We started with a change we wanted to make, but instead of slapdash hacking we took The Next Step and looked at the tests that covered the area of change. We found the unit tests were poorly written. These tests had poor fault isolation and failed for the wrong reasons. This prompted us to take The Next Step and make them less fragile so that they would stay 'green'.

From reverse-engineering the unit tests, we saw that the code under test was nigh indecipherable. It was riddled with poor naming, inconsistency, huge argument lists, lying comments, and slapdash implementation. Support routines did little to clarify the code, and in some cases further obscured its intent. Tests were fragile and indirect.

We decided to take The Next Step by refactoring some tests for readability. We replaced some loops with LINQ, after which it was obvious that the code had unnecessary looping. Each readability improvement showed some error in linear thinking that almost certainly slowed initial development of this part of the code base.
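A minimal sketch of how that kind of rewrite can expose redundant looping. It is in Python rather than C# and LINQ, and every name in it (`Invoice`, `unpaid_total_loops`, `unpaid_total_query`) is invented for illustration; it is not the code from that session.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    # Hypothetical test data type, invented for this sketch.
    customer: str
    amount: float
    paid: bool


def unpaid_total_loops(invoices, customer):
    """Loop-heavy style: filter into a list, then scan that list twice."""
    matching = []
    for inv in invoices:
        if inv.customer == customer:
            matching.append(inv)

    total = 0.0
    for inv in matching:
        # The inner loop only "finds" the invoice we are already holding,
        # so it adds nothing except a second pass over the list.
        for candidate in matching:
            if candidate is inv and not candidate.paid:
                total += candidate.amount
    return total


def unpaid_total_query(invoices, customer):
    """Query style: one pass, with the filter conditions stated in one place."""
    return sum(
        inv.amount
        for inv in invoices
        if inv.customer == customer and not inv.paid
    )
```

Written as a single query, the filter and the sum sit side by side, and it is plain that the inner loop in the first version was never needed.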

Writing bad tests is hard work. It would have been easier for the original authors to write them better, or to refactor them as they went, but the authors were almost certainly more focused on making linear progress than on making the code clean. I can't really blame them. People under pressure cut corners and sometimes "it works" is all they think they can afford. People do the things that make them successful, and sometimes they are rewarded or punished for the wrong things. As is, it suffices to say that they did hard work the hard way, and so we found the code in a poor state of development.

The more we introduced explanatory variables, simplified expressions, and eliminated duplication, the more we could see the next set of flaws to address. We took The Next Step and started reworking the test harness. Soon we found variables being created and passed from routine to routine only to be ignored completely at the lower levels. We cleaned up the Object Mother utility, obviating hundreds (if not thousands) of unnecessary lines of code, and I suppose we should take The Next Step and remove the unnecessary code from the Object Mother users.

This sounds like weeks of work, but was only hours. We closed one ticket, opened another, attended a long planning meeting and a lunch-and-learn. We took breaks for all the usual reasons. We broke up our pairing session at 5:00 CDT. We understand the code much better, and the tests explain the code better.

Sure, the code cleaning could turn into a few man-days of effort, but it is effort that will benefit everyone who wanders into this area of the code in the future. This area of the code base is being actively developed and has outstanding bugs, so we will likely recoup all our investment in the course of completing very few change requests. We are aware that we can't afford to spend time gold-plating this module, but we can make it habitable before we move on.

On the XP mailing list someone asked what distinguishes a truly great agile development team. Let me offer that the difference is that a great team will take The Next Step to make the code habitable on the inside in addition to making it useful externally. Because of this tendency, each month gets better, each quarter more productive. Each year, the system requires less patience of all those associated with it, and provides more rewards.

That, my friends, is great.

Late addition: Of course Ron Jeffries wrote about this first, and better.

We say "embrace change" and "accept changes to requirements, even late in the process." We also rant about focus and that developers should neither be interrupted nor allowed to multitask. To a lot of people, this sounds just like a contradiction. Are we hypocrites?

There is no conflict in these statements. The problem is with the organization.

Change In A Change-Resistant Culture

If, for example, I plan a release in 10 months, and assign each of my developers a ten-month task (or a couple of five-month tasks), then I am not only at capacity, I'm likely well over and don't know it. Now, in month two I have a change I would like to see. I can't get any of my developers to put aside their work and take my task. Those stupid developers!! What's wrong with them?

Nothing is wrong with them. They are doing the work they are assigned, and since they have no slack or reserve capacity they know that the new task puts their release in jeopardy. They also know that they are being reviewed, graded, rated, and ordered by their ability to get those assigned changes into the next release. If they drop a task to do the new work, they will fail. If they don't, they'll be deemed uncooperative, which may also cause career damage. They can work overtime to do all the work that they were assigned (which is probably an ambitious goal anyway) and try to absorb the new change by stealing time from their significant others, sleep periods, side projects, and other obligations, but this will make them less likely to recognize shoddy work and less eager to correct it. Assigning work this way creates a reserve-free system in which any response to any change is costly both professionally and personally. No wonder developers become demotivated.

Embracing Change In A Healthy Organization

Instead, what if we have a release in 10 months (or so)? We break all the features into pieces, where each piece is valuable to the end user, but each one can be completed in a week or two. Maybe there's an admin screen, a data entry screen, and a small bit of processing for the first feature. Rather than assign these to an individual (tying up his next few months), we assign them to the team as a whole. Now each of the people working on this feature will come "free" (i.e., have completed a task) in a week or two. They will be able to switch pair programming partners, take on new features, make revisions, etc.

Now I can introduce changes between tasks, or instead of planned tasks. These task boundaries are frequent, so change is much more welcome. In addition, a developer is only committing to one change or feature at a time, and so only an uncompleted feature is at risk instead of the programmer's career.

In addition, the code for a completed feature is available much sooner, so that it can be tested, demonstrated, and evaluated. A few months in, it can even enter an alpha-testing program to get end-user feedback and determine the effect of the change on performance and scalability.

Smaller Assignments Are The Secret

If you can't accept change, it is because the work assignments are too large. If you are going to want changes often, then you need to have "change injection opportunities" more often. If changes are daily, then every day there should be some programmer(s) coming available to make the changes. Furthermore, they must be at work in a system that does not punish them (career-wise or by depriving them of the things they hold dear personally) for accepting orders for change.

You don't have to have an Agile organization or an Agile team to harness the power of smaller work assignments. There have been many methodologies, including some from very large contracting companies, that have long touted "a series of small successes" as the best way to ensure the success of a project.

Even before Agile, early feedback was known to be important to a good user experience and to dissolving contract discomfort.

The trouble is that working this way erodes the strong central authority many managers have enjoyed, and which may have brought them respect as "strong leaders" in the eyes of their managers. That is a topic for a whole different conversation.

If your development organization can't accept change without interruption and multitasking, then:

Your work assignments are too large, too personal, and too well enforced.

Friday, April 16, 2010

I wake up with odd memories. I think my brain is just getting exercise, staving off senility maybe.

Yesterday I woke with a memory of a bug fix from long ago, probably 15 years or more. I was given a project which led me into the permissions-checking code of a billing system. When I got there, I saw the code was written in such a way as:

I know that's silly, but I am recreating from an old memory, and I remember the code having some silly uses of null & the like.

I remember noticing right away that if a permission did not exist, it meant that the user can perform the activity. GetPermission() read from a database table, I think, so that all a user had to do to get unlimited access was to drop records from a table. Furthermore, if an installation tech didn't deny access to unpaid features, the customer got them for free.

I don't want to reflect on the validity of the business model here, whether restricting features and paying for use of them is a good idea or whatever. It was a simple code error and a simple code fix to me and I only needed to know if the behavior was intended or accidental.

I quickly consulted some people who told me that it was not the right behavior. They said that some users had paid extra for features and would be angry if other customers were getting them for free. A little research showed that most customers had most features for free. As requested, I fixed the feature so that having no permission meant having no feature. They figured that they were safe since they didn't ship documentation for features that weren't covered in the contract (security by obscurity). I had approval for the fix, and completed the ticket quickly, and you'd think that was the end of the story.
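The difference between the old behavior and the fix can be sketched roughly as below. This is Python written for illustration, not the billing system's code: the table layout, `get_permission`, and both `can_perform_*` functions are assumptions standing in for whatever GetPermission() and its callers really looked like.

```python
from typing import Dict, Optional, Tuple

# Hypothetical stand-in for the permissions table: (user, activity) -> allowed.
PermissionTable = Dict[Tuple[str, str], bool]


def get_permission(table: PermissionTable, user: str, activity: str) -> Optional[bool]:
    """Return the stored flag, or None when no record exists."""
    return table.get((user, activity))


def can_perform_fail_open(table: PermissionTable, user: str, activity: str) -> bool:
    """Old behavior as described: a missing record means the user may proceed."""
    permission = get_permission(table, user, activity)
    if permission is None:
        return True  # no record, so access is granted by default
    return permission


def can_perform_fail_closed(table: PermissionTable, user: str, activity: str) -> bool:
    """Fixed behavior: only an explicit grant allows the activity."""
    return get_permission(table, user, activity) is True


if __name__ == "__main__":
    table = {("alice", "run_reports"): True, ("bob", "run_reports"): False}
    # A user with no record at all gets in under the old rule, not the new one.
    print(can_perform_fail_open(table, "carol", "run_reports"))
    print(can_perform_fail_closed(table, "carol", "run_reports"))
```

Under the first rule, deleting a deny record is enough to gain access; under the second, every feature a customer already relies on needs an explicit grant before the change ships.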

To sales and marketing, however, it was a fiasco. After the next release there were angry people asking why they couldn't do their normal reporting and administrative tasks, why they couldn't do various normal feats that they could do only the week before with no issue. We had taken back from our users the features that we'd granted.

People did not rely on the manuals to tell them what the software did. That would be silly. Interfaces are rightly made explorable so people can learn them for themselves. Omitting docs did not lock down features.

Not testing the permission systems wasn't all that smart either, because nobody realized that it was "in promiscuous mode" until one of us programmers stumbled onto it.

Maybe the most damning error, though, was taking back what we had given. Whether they were entitled to the extra features or not, they'd had them. The genie was out of the bottle, the toothpaste sans tube. A technical correction was a marketplace error.

It was a decent lesson to learn, though I soaked the blame for screwing up the customers by fixing the code. I had permission, had talked it over, and had an approved ticket to fix it, but ultimately it was my fingers typing the code and my hands testing the correction, so I had to wear it.

I got in more trouble for fixing things than I ever did for breaking them. There's a lesson in that, too.