Tuesday, September 29, 2009

I (re)tell a story frequently about a place I worked in the 90s. There was a piece of code with an absurd cyclomatic complexity score, running literally hundreds of lines, and called from myriad places in the code base.

The code was written to check whether two ranges overlapped. Being written in a poor 4GL, it took four parameters representing the starting and stopping dates of two time ranges. It was such a simple task for code so horrible and lengthy. In C it would look rather like this:

#include <stdbool.h>

bool ranges_are_overlapping(int a, int b, int c, int d);

It quickly became clear to even the most casual reader that the ranges were a..b and c..d. Well, sort of. The code was defensive and was written with the understanding that the range-defining arguments could arrive out of order, so a might come after b, or c after d.

Now, of course, even once we square away the range markers, there are a whole host of possibilities. A could be less than, equal to, or greater than C, and B could likewise be greater than, less than, or equal to D.

Of course, each permutation of the range start and end relationship would require a full repetition of the comparisons of the range starts and ends complete with cut-n-paste-n-edit inner blocks. We do not attempt to recreate the entire mess here.

I bet the programmer responsible for this routine wrote more lines of code that day than anyone else on the team. What he lacked in efficiency, he made up in diligence.

The code was correct in its results, but it was huge and slow and tedious to desk-check. It had no tests, automated or otherwise.

Straighten out the begin/end so you don't need a Cartesian explosion of if statements, and then look at the ranges. When either range ends before the other starts, there is no overlap. Otherwise, you have overlap.
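As a rough sketch of those two steps in C (not the code I actually wrote back then, which was in the 4GL): the helper names min_int and max_int are made up for illustration, the dates are plain ints as in the signature above, and I am assuming ranges that merely touch at an endpoint count as overlapping.

```c
#include <stdbool.h>

/* Hypothetical helpers; the names are illustrative only. */
static int min_int(int x, int y) { return x < y ? x : y; }
static int max_int(int x, int y) { return x > y ? x : y; }

/*
 * Returns true when range a..b overlaps range c..d.
 * Endpoints may arrive in either order. Ranges that touch at a single
 * point count as overlapping (an assumption about inclusive endpoints).
 */
bool ranges_are_overlapping(int a, int b, int c, int d)
{
    /* Step 1: straighten out begin/end for each range. */
    int start1 = min_int(a, b), end1 = max_int(a, b);
    int start2 = min_int(c, d), end2 = max_int(c, d);

    /* Step 2: no overlap only when one range ends before the other starts. */
    if (end1 < start2 || end2 < start1)
        return false;
    return true;
}
```

Normalizing first is what collapses the Cartesian explosion: once each range has a known start and end, only two comparisons are left.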

It is hardly a miracle of profound logic or deep mathematical insight. Nor of excessive typing and careful editing.

Young Tim actually had to work that out on paper. It worked great, was much smaller and simpler, and ran much faster than the ugly monstrosity that was there before.

The reason I'm writing this is not to express that I'm revolted by bad code or to brag that I replaced it with good code. It is not to ridicule diligent-yet-misguided junior programmers. The point of this blog is what happened next.

What Happened Next

I was called into the boss' office. Someone noticed my code improvement! He showed me the old code and the new, both printed out in neat stacks on his desk. I grinned and said "Yes, it's much faster now."

He didn't smile back. He frowned.

He told me that it was clear from the original code that the author had thought through all of the possible scenarios and had accounted for them. Mine, on the other hand, showed no such diligence. I clearly had not put any real effort into my work. My answer was too small. Even though they couldn't make it fail in testing (yet), he knew that I must have left something out. As a result of disbelief in my abilities, he was rolling back my change.

I was upset that a small working algorithm was about to be tossed out and a messy steaming pile of code put back in. I was more upset with the accusation that I did not think through the cases, and that (despite all evidence to the contrary!) I had written an insufficient piece of code.

I tried to defend my solution, even pulling out my scrap paper with all the overlap scenarios on it, but it was too late. The decision was made. I and my solution were obviously inferior. Chastised, I returned to work at my desk.

I began to lose interest in the company where I had previously intended to spend my entire career. In time I left and have had a good time of it.

The Payoff

Why blog about it in 2009? Because my TDD associates have blogged and tweeted about how, in TDD, the code becomes more generic as the tests become more specific.

If the Tim of 1994 had known about TDD, he would have built up a large and specific base of tests covering all the cases represented by the old code. His tests would have demonstrated that he'd thought through all the permutations, and the simpler solution might have been fielded. Tim would have been saved a moment of humiliation, and that poor application would have gotten a boost in performance.

TDD would have made the code better, and it would have improved the experience I had there. It would have given me visible evidence of my thinking. With a body of unit tests, there is proof that we've thought things through. An oblique, small solution cannot provide that on its own.
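Here is a minimal sketch of what that body of proof might have looked like, written as a table of scenarios in C. The case names, the struct, and count_failing_cases are all hypothetical, and the compact function at the top is only a stand-in for the code under test (again assuming that touching endpoints count as overlap).

```c
#include <stdbool.h>
#include <stddef.h>

/* Compact stand-in for the function under test (inclusive endpoints assumed). */
bool ranges_are_overlapping(int a, int b, int c, int d)
{
    int s1 = a < b ? a : b, e1 = a < b ? b : a;
    int s2 = c < d ? c : d, e2 = c < d ? d : c;
    return !(e1 < s2 || e2 < s1);
}

/* One row per scenario: a name, the four inputs, and the expected answer. */
struct overlap_case {
    const char *name;
    int a, b, c, d;
    bool expected;
};

static const struct overlap_case cases[] = {
    { "first entirely before second",   1, 2, 5, 8, false },
    { "first entirely after second",    5, 8, 1, 2, false },
    { "first overlaps start of second", 1, 6, 5, 8, true  },
    { "first overlaps end of second",   7, 9, 5, 8, true  },
    { "first contains second",          1, 9, 5, 8, true  },
    { "second contains first",          6, 7, 5, 8, true  },
    { "identical ranges",               5, 8, 5, 8, true  },
    { "touching at one endpoint",       1, 5, 5, 8, true  },
    { "first range given backwards",    6, 1, 5, 8, true  },
    { "both ranges given backwards",    2, 1, 8, 5, false },
};

/* Returns the number of failing cases; 0 means every scenario passed. */
int count_failing_cases(void)
{
    int failures = 0;
    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        const struct overlap_case *t = &cases[i];
        if (ranges_are_overlapping(t->a, t->b, t->c, t->d) != t->expected)
            failures++;
    }
    return failures;
}
```

The table is the scrap paper made executable: each named row is a scenario a skeptical reader can check off, which the two-line algorithm cannot show on its own.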

Young programmers: consider this advice. The more elegant solutions you devise will need a body of proof if you are to survive clue-challenged technical managers. If you don't do TDD for the sake of the code, do it for yourself.

Tuesday, September 22, 2009

I was reading Uncle Bob's latest blog this morning about messy code and technical debt. I wanted to make a comment about the problems programming shops face, but decided to do it here instead.

The problem with clean code is twofold:
1) people who can't see it don't believe in it
2) some people who should be able to see it don't believe in it

People who can't see it don't believe in it.

One of the heartbreaking lessons from the Big Ball Of Mud talk on Wednesday at Agile2009 is that people working two levels of management above your head do not know that the code is messy. Joe and Brian popped up a slide of Mike Rowe, and quipped that you can't bring him out to wade in the muck in a way that non-programmers can understand. Oh, the code is stinky and messy and bad, but only you can see it.

If you can't see the difference between clean and ugly code, it all sounds like a "programmer myth". It seems daft to take time for refactoring. After all, when the programmers finish refactoring the code doesn't do anything new, but the programmers feel better. How much money do we lose to make programmers feel better?

We need quality (in low bug count, low regression count, sustainable productivity) but can't afford time for quality practices (TDD, pairing, and clean code). Discounting this dubious "clean code" thing, it must be because the programmers aren't very good. Which is right, as far as it goes. Better programmers make better code which can be enhanced more readily. But doesn't that imply that our fastest programmers must be our best programmers?

Some people who should be able to see it don't believe in it.

Not all programmers can see mess. If they could see it, then they wouldn't make so much of it.

What if one makes a new program by copying an existing program and hastily hacking it into a workable shape (ignoring duplication and testing) and drops it into the release for tomorrow? Isn't that a big win for my team? If it's done quickly, doesn't that make me a good programmer?

Maybe the jury is out until we hear back from the users. Is my responsibility to hack code out quickly, or to make stuff that works in actual users' hands? What about when my peers come along to fix something: have I helped or hindered them? Quick hacks stop well before they reach 'done.' Though hacks look good in the short term, they are just deferring work to post-release. It would be wrong to reward this behavior.

A number of otherwise capable and productive programmers can't tell mess from brilliance. Their code is complex, confusing, implicit, indirect, cryptic, and poorly organized, but it works and they feel good about it. They may have reached some level of success for continually pouring out working code, yet their code is a shambles. James Grenning would say such a person is like a cook who never cleans the kitchen.

The primary factors determining how quickly we will program today are the quality of the code we're working in, and our ability to do work well. Clean, clear, obvious, straightforward code makes us better and faster; poor code makes us slower and more likely to make mistakes. John Goodsen from RadSoft always told me that the secret to going fast was not to slap things together but to make fewer, more correctable mistakes. This level of disciplined work is not a waste of time, but a small-yet-potent investment in future productivity.

We've learned that the longer a bug remains undetected, the more it will cost to locate, isolate, and eliminate it. Cleaner code will reduce the incidence of bugs and TDD will also speed discovery of bugs. Ugly code will encourage the creation of bugs and lack of TDD will allow them to remain undetected for longer periods. Sending bugs out to the customers erodes good will, which nobody wants. As a coping mechanism, exhaustive manual testing is costly in time and money. Code cleaning and TDD together are a waste preventative rather than a waste of money and time.

Duplication of code is a common form of "messy code", generally caused by copy-and-paste programming. It is particularly ugly because developers may fix one copy of the code (perhaps in a report) not knowing that it has been duplicated elsewhere (perhaps in another report or screen). Later we report bugs that look like recurrence/regression but really they are just bug duplicates. Going back to fix a bug multiple times is an expensive waste of user patience. Eliminating duplication is waste removal, not actually a form of waste at all.

Cleaning our code and testing our code make us go faster, but the effects are not immediate. It may not seem obvious that we are going faster by taking time to clean and refactor our code, by using TDD and pair programming, but these are the practices that we use to avoid having code returned by QA or unhappy users. If we measure from the time we pick up an assignment until the time it really works for our users, we find that TDD, Refactoring, Pair Programming and like practices greatly speed development. If we only measure from the time we pick up until we release the buggy feature, then all these practices seem to slow us down. You have to choose the measurements that really matter.

Where does this leave us?

If some programmers can't tell clean code from messy code, most managers cannot tell, and most sales and product people can't tell, and if the benefits of refactoring trail the initial feature work by weeks, months, or years, then aren't we without hope of improvement?

We are without hope of external rescue. It is unlikely that any non-developers in authority will mandate or even approve the practices that will get us out of our mess. If things are going to be better, it will be because we make them better. We don't need permission, but if we care about our products then we do need to use hygienic practices in our daily programming.

I saw Ben and Phil speak at Agile2009, partly because I know Ben and expect him to someday have a large impact on the world of software development, based on his intellect and personality. Ben and I were coworkers briefly at Object Mentor, and I got to work with him a very little. I don't know Phil, but enjoyed meeting him. They are at Improving Works now, the commercial entity behind Infinitest (a Continual Testing tool). I asked if I could borrow the death spiral steps for an InAFlash card, and they graciously agreed.

I produced the card and my spin on the points and turned it over to my partner Jeff Langr for review. Jeff pointed out that it was interesting enough, and true enough, but as an extract from the context of the talk it is not particularly helpful.

I've had similar remarks about my blog posts from Bob Koss. He suggested that an article is good to the extent that it helps people do their jobs. I realized that he was still right, and Langr was channeling that same wisdom.

Jeff suggested that we provide remedies for each of the steps along the way so that people can get some real benefit from the card. While I am not crazy about violating the boundaries of the single 3x5 card (bleed-over) I realized that there is value in the idea.

Jeff started the remedies list, I joined in, and Jeff reproduced the cards. I thought I liked the Daniel Black font, but now I am not so sure. Either way, the new cards are far more useful than the old one was alone.

As a blogger, tweeter, writer, coder, and email correspondent, I need to be more focused on whether I'm actually helping people do their jobs instead of providing sparkling commentary and a personal touch. I've been working on that, and will work harder in the future to make it so.

While we all have personalities, this Agile stuff is about delivering value frequently.

Tuesday, September 8, 2009

If your test has a trainwreck in it, DO NOT start building the object chain in your setup so that the trainwreck will execute in your test. That will take forever, and the payoff is next to zilch. Instead, extract the trainwreck expression to a private virtual method you can override in the test.

Advanced students may note that the public method is in the wrong class now. It doesn't use any local methods or variables, indicating very low cohesion. Time to push it one class deeper in the chain and then reevaluate it to see if parts of the expression need to be pushed further down.

This trick will take you from having to test "in context" with tons of object-chain construction to a new situation where testing is absurdly simple. Use trainwreck removal for all complex indirect accesses including singletons.
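To make the idea concrete, here is a minimal sketch in C. C has no virtual methods, so I am swapping in a function pointer as the overridable seam; in a language with classes you would override the extracted method in a test subclass instead. Every type and function name below (order, customer, account, customer_balance, order_is_affordable) is hypothetical.

```c
#include <stdbool.h>

/* Hypothetical object chain, invented for this sketch. */
struct account  { int balance; };
struct customer { struct account *account; };
struct order    { struct customer *customer; int total; };

/* The trainwreck now lives in exactly one place. */
static int real_customer_balance(const struct order *o)
{
    return o->customer->account->balance;
}

/* The seam: production code leaves it alone, a test can repoint it. */
int (*customer_balance)(const struct order *) = real_customer_balance;

/* Code under test never walks the chain itself. */
bool order_is_affordable(const struct order *o)
{
    return customer_balance(o) >= o->total;
}

/* Test-side override: no customer or account has to be built. */
static int fake_balance_of_100(const struct order *o)
{
    (void)o;
    return 100;
}

/* Runs the code under test against the fake, then restores the seam. */
bool affordable_with_fake_balance(int total)
{
    struct order o = { 0 };   /* customer pointer stays NULL on purpose */
    int (*saved)(const struct order *) = customer_balance;
    bool result;

    o.total = total;
    customer_balance = fake_balance_of_100;
    result = order_is_affordable(&o);
    customer_balance = saved;
    return result;
}
```

The test never constructs a customer or an account, which is the whole point: the chain is replaced at the seam instead of being built in setup.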

A friend of mine told me that his company is doing an agile project for integrating some systems. They've already had a few deployments, and their bug count is below 1/10th the expected level. Ahhh. That's how it's done.