I now think that choosing Roman numerals to represent parts in my series of posts about agile was a bad idea. It is not very scalable! This post, while related to agile, is not part of the series anyway.

A few months ago, I read a blog post about agile development. I think it was called 10 points to consider to make a good agile team. (Unfortunately I couldn’t find the article again.) One of the first few points was – I am paraphrasing – “Recruit the best programmers/developers to the team.” When I was reading it, I thought, what sort of advice is that!

If you can get excellent, top-notch, superstar developers on your team, then being productive is not a big deal. But it is obviously not practical as a general practice. To start with, terms like excellent, top-notch and brilliant are quite vague. Even if you are experienced enough to figure out what they mean and can determine whether a candidate matches these vague criteria, not every software team in every company will be able to find such people. As a rule, if a developer is considered excellent, she will be expensive compared to the average developer. If you are working in a large company, different teams will compete fiercely to poach the better developers, and not all of them are going to get the people they prefer. And since superstars are in high demand, there is always the fear of losing them!

Today I was reading an excellent article by Laurent Bossavit of Institut Agile (in French) named Fact and Folklore in Software Engineering. The article is about the oft-quoted statement that “the best programmers are 10 times better than the worst”. While I had heard this statement in my early days, I had never thought about it in depth and considered it an insignificant observation for practical software production. Bossavit goes into the history and evolution of this statement in great depth.

What attracted me most was the rigor with which the article was written. It is an excellent review in the scientific sense. He traces the first published record of this statement to a 1968 paper by H. Sackman et al. While this paper describes an experimental study, what it actually measures is differences on debugging tasks. Bossavit then investigates further papers that repeat this claim, either by referring to the original study or by referring to other studies that supposedly replicate the result. He then looks at a 2008 blog post by Steve McConnell which discusses this problem and surveys papers published after 1968 that apparently support the claim. The fun starts when he tries to verify the references, which turn out to be mere repetitions, circular references or opinion pieces. (I am not going to repeat this part. Read the post.)

A common problem in software development is the subjectivity of measurement. There are many metrics and methodologies for measuring developer productivity, both for predicting and for rewarding. But most of them are quite arbitrary. It is one thing to introduce a measure with the express acknowledgement of its arbitrary nature, for example the Story Points used by many agile teams. But when people take these numbers, come up with complicated calculations of velocity and then plug them into traditional project management schemes, things start to crumble.

Another thing I noticed while reviewing the articles was that they all measure the initial time to solve a certain task. In practical software development, the initial solution to a problem is just that. The total cost of that solution will not be clear until it goes into production and actual users start using it. So, just because a person can finish a solution to a problem faster than everybody else does not mean that she is the most productive.

I think the real challenge for a developer, especially a person in a lead/mentor role, is to figure out how to achieve sustainable productivity with an average team comprised of average people. If you don’t have superstars in the team, you don’t have to fear losing those superstars!

My reference to Adam Savage in Part III was not just incidental. I think it is a very profound one, especially in software development.

Adam Savage, in a later podcast (unfortunately, I was unable to find it), explains how the phrase “Failure is always an option” represents a fundamental fact about scientific enquiry. Unlike what we see in many movies, where mad scientists work like crazy and are then heartbroken when their experiment “fails”, scientific enquiry, especially experimental enquiry, thrives on failure. There might be a favored outcome for an experiment, but if the real outcome is different, it provides data. Failures in many cases provide vastly more data than success. Even when one gets the expected results, they will most likely be falsified later by someone else.

It is hard not to notice the similarity between this and software development.

Every claim about a piece of software is eminently falsifiable.

As software users, many of us are faced with the mysterious workings of an application. But this is due neither to supernatural intervention nor to Heisenbergian uncertainty. It is just the simple, classic phenomenon of not having enough information about the inner workings of the software. However, while developing software, one cannot really appeal to ignorance. Many things in software development resemble Murphy’s law on steroids. Things are guaranteed to go wrong, and they will always go wrong in detectable ways.

Software development has always been considered, and has always striven to be, an engineering discipline. This is why we try to create one engineering process after another to make it behave more like other engineering projects. But the history of these tight engineering controls is dismal at best, and even when they worked, they did so by curtailing innovation and creativity to the extreme. Agile/Extreme programming was in many ways a response, a resistance movement if you will, against this tyranny of process. It concentrates on the human element (like the Dow Chemical commercial – but then they went and bought Union Carbide) and on creativity. Instead of trying to control and limit change, agile methodologies embrace change.

Successful software development demands a lot of intellectual commitment from the people involved. It is more like pursuing a scientific experiment. Here, we have this hypothesis. What is the best way, in terms of representational accuracy, maintainability and overall usability, to model it? There is always a multitude of choices. The optimality of any one of these choices is unlikely to be clear at the beginning of the process. So we have to start with hypotheses and empirically show whether each assertion is true or false. Irrespective of which answer we get, the data we collect during this process puts the problem in a better light. We get to define more variables and get the values for more constants. Maybe we have to go back to the proverbial “drawing board” and adjust our original hypothesis, except that the drawing board here is a permanent fixture of the workspace. One difference from a scientific experiment, though, is that at the end of even a partially successful experiment, we get something more tangible.

This is the spirit of scientific enquiry. This is why I think software development should be treated less like an engineering discipline and more like a research activity.

Agile in many ways does this. It unseats many of the mechanistic visions of earlier methodologies by focusing more on team dynamics and accepting change - constant change - as a welcome phenomenon. I am sure many of you are familiar with the old adage that the cost of change in software development increases exponentially as the project progresses, which leads to the axiom that we should try to reduce change and capture as much as possible at the beginning. This is a wrong premise. The usual dynamic is that the user will find more things to change as the feature/component nears completion. Users may find that many of their original assumptions were not accurate. We could always shut the user out until we announce that everything is done, and then tell them to live with what they have, just as they did with their last home improvement job. There was a time when this would have worked. But users now understand more about software and its nature. We can no longer afford to blame the user for everything that went wrong… “they changed the specification, they don’t know what they want!”

There are some parallel efforts at resurrecting the engineering credentials of software. One such attempt is Intentional Programming. One assumption I had earlier was that the current problems in software development exist just because it is a new industry, and that it will eventually find its true calling. However, the nature of software, that of modeling real world scenarios, makes it unlikely that this will happen soon. The complexity of human society, individuals, interactions, even that of our artificial systems like banking and finance, is so great, and our ability to model them, or even understand them, is still at a very, very early stage. Software, which tries to create virtual worlds and information models about them, and sometimes even helps create this understanding, is bound to be complex and tentative.

That takes us to the next parallel between software development and scientific enquiry: the tentativeness of the solutions we create. There are so many factors that will reduce the overall usability of a system and create obsolescence: changes in practices, the advent of new hardware or software technology, changes in social expectations etc. Even when we successfully produce a model that satisfies the requirements, we have to constantly question the viability of that model. The challenge could be a new human-computer interaction paradigm like multi-touch or Kinect, the ubiquity of small form factor devices, a change in financial regulations, the expectation of connectivity with the rest of the digital world, the disappearing boundaries of office and home etc. Just as there are no sacred theories or laws in science, there is no sacred software. There are no eternal killer features. There are no “only I can do this”es.

What this means is that if I don’t poke holes in my model, someone else will. And if that someone else is a customer with the cudgel of a Service Level Agreement, it could bring us a world of hurt. So the best way is to do this proactively. The main function of a software developer is to think about how to break what we have done, how to negate the hypothesis, how to falsify what we just proved.

So, go ahead, break your Fitnesse tests. Break the build and if you cannot fix it within the day, buy Donuts (or Parippuvata) for the whole team. As long as you take the code to a better place, it will all be forgotten.

So we do TDD. We proudly announce the number of unit tests and the percentage coverage as part of the scrum achievements. We make demands on minimum coverage (for a brief while when we had TFS, it was a check-in constraint). But what do we actually gain by testing? Is there a law of diminishing returns in testing?

Uncle Bob, in his first presentation at our company, demonstrated the bowling example. It is such a simple, eye-opening experience to see how easy it is to over-specify.

As I mentioned earlier, we were used to very delayed gratification. There were no demands on checking in code; in fact, we encouraged private branches for doing stuff. Sometimes it took months before the changes could get into the build.

The fun part about unit testing (especially if you have a “run this test” context menu) is the instant gratification, even in failure. You should first write a test that fails, so says TDD.

In a non-TDD style of development we always expect to succeed. The first time I press build after a series of changes, I expect it to pass. The first time I run the application after a change, it had better not crash. Even under the best circumstances a full build and run of the application, which was required if the change was in any of the core units, takes quite a while. Once the application comes up, we need to log in and navigate to the view where the change is to see its effect. If we were to find something not working it would be a downer. So we expected to succeed all the time, thereby accumulating heartbreak upon heartbreak.

What changes when tests become the primary focus of development? When you write a unit test to model a new behavior, the first attempt is not even supposed to compile. Sometimes, if we are just fixing a behavior, we might have a unit test that compiles without changes, but it definitely should not be passing. So, most times, we are specifically looking for a failure. It grows on you. I am no longer ashamed of a compilation error. It is a piece of information, sometimes quite a valuable insight into the change that I am going to make. Since no airplanes fall from the sky (and no angels die) when a unit test breaks, we can afford to do this over and over. Every failing test gives us yet another insight into the problem, one more thing to do; every passing test makes us look for the next best way to fail.
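The fail-first loop can be sketched in a few lines. This is a minimal Python illustration, not the Delphi code we actually wrote; it borrows the bowling example mentioned earlier, and every name in it (`spare_test`, `score_v1`, `score_v2`, `passes`) is made up for the sketch.

```python
def spare_test(score):
    # The test is written first: a spare (5 + 5) earns the next roll (3)
    # as a bonus, so the game total should be 10 + 3 + 3 = 16.
    assert score([5, 5, 3] + [0] * 17) == 16


def score_v1(rolls):
    # First attempt: just add up the pins. This ignores the spare bonus,
    # so the test above is expected to fail against it.
    return sum(rolls)


def score_v2(rolls):
    # Second attempt: walk the ten frames and add strike/spare bonuses.
    total, i = 0, 0
    for _ in range(10):
        if rolls[i] == 10:                      # strike: next two rolls are the bonus
            total += 10 + rolls[i + 1] + rolls[i + 2]
            i += 1
        elif rolls[i] + rolls[i + 1] == 10:     # spare: next roll is the bonus
            total += 10 + rolls[i + 2]
            i += 2
        else:                                   # open frame
            total += rolls[i] + rolls[i + 1]
            i += 2
    return total


def passes(test, implementation):
    # Run one test against one implementation and report red or green.
    try:
        test(implementation)
        return True
    except AssertionError:
        return False


if __name__ == "__main__":
    print("v1 passes:", passes(spare_test, score_v1))
    print("v2 passes:", passes(spare_test, score_v2))
```

The failure of `score_v1` is the useful part: it says exactly which rule is still missing before any more code gets written.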

Accepting failure as not just a normal outcome but a desired outcome makes things much less stressful. If we fear failure, we will build safety measures for every imaginable way something can fail. The problem here is that there are more imaginable ways to fail than plausible ones. And there are far more plausible ways to fail than probable ones.

In a very fast-paced environment things do go wrong from time to time. Since failure is welcome, there needs to be a way to celebrate it. This is why we invented the Blame Game™. When something goes wrong, when the build turns red, when a test “works on my machine” but nowhere else, when you wipe out the changes for 50 Fitnesse scripts because of one wrong merge, we blame. Of course, the blamee doesn’t have to accept it. There can always be comebacks, as long as they are more logically consistent and better evidenced than “dog ate my hard drive”. The key is to embrace the failure.

What TDD teaches us, through not just unit testing but aggressive acceptance testing, is to fail often and fail gracefully. As we all know, if your millions of assertions never fail, they are as good as absent. The value of a test is realized when it fails.
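One rough way to see why an assertion that never fails is as good as absent is to run it against a deliberately broken implementation. The sketch below is only an illustration in Python under that assumption; the functions (`is_even`, `is_even_broken`, `vacuous_test`, `real_test`, `fails`) are hypothetical and not from our code base.

```python
def is_even(n):
    # The behavior we actually want.
    return n % 2 == 0


def is_even_broken(n):
    # A deliberately wrong version: claims every number is even.
    return True


def vacuous_test(implementation):
    # Only checks the return type, so both versions above pass it.
    assert isinstance(implementation(3), bool)


def real_test(implementation):
    # Checks actual behavior, so the broken version is caught.
    assert implementation(4) is True
    assert implementation(3) is False


def fails(test, implementation):
    # True when the test raises an assertion failure for this implementation.
    try:
        test(implementation)
    except AssertionError:
        return True
    return False


if __name__ == "__main__":
    print("vacuous test catches the bug:", fails(vacuous_test, is_even_broken))
    print("real test catches the bug:", fails(real_test, is_even_broken))
```

A test that stays green against the broken version adds to the count and the coverage number, but it tells us nothing.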

*This is one of the best memes to come out of MythBusters, promoted constantly by Adam Savage. There is one podcast where he describes why it is a fundamental principle for him. I hope it is not copyrighted by Adam or Discovery Channel.

The point at which I stop reading an article about agile development is when it starts quoting from the agile manifesto. No, I do not have any quarrel with the manifesto; I think it is an excellent minimalist document. However, when people start to preach about it, I tune out.

The same is the case when someone brings up a specific set of practices, usually with a cute name, as The Process. What I know from my last 7 years is that the only process that stays is changing processes.

This was very true in the beginning. We went back and forth and back to SCRUM as the organizing process, but played with its format and deliverables for quite a long time. We have tried weekly iterations as well as those that are longer than a month. We filled our walls with multi-colored post-it notes. We built weird-looking shared Excel sheets and used formal project management software with custom extensions to track the sprint.

We had been using unit testing to some extent even before our full plunge into the agile pond. But the development and testing phases were mostly separated. After introducing SCRUM and co-located teams, we were not sure how to interact with each other for a while. The lingering mistrust between the programmers and testers was quite palpable.

Most of our automated acceptance tests were UI tests. While these are in some sense the ultimate integration tests, precisely due to this overarching scope, they require most of a feature, including a near-final UI, to be ready before they can be run. In those early days of UI automation, the tests were extremely sensitive, and just adding one control to a form would break a whole bunch of tests in mysterious ways. We actually had a campaign, “does it break automation?”. (These days we ask: how can I come up with a breaking acceptance test, or a directed failure in existing tests, to implement a feature or to fix a bug? More about that later.)

One of those days Michael Feathers came to our work and told us to go find fracture points and start clawing from there – I am paraphrasing. We had been looking at a really huge block of rock and wanted to see nice score marks: tap it with a soft mallet and there you get the nicely shaped pieces. He wanted us to look for fissures, cracks. That is what we did. Delphi, the language of our code base, is not very refactor friendly. After 5 years of yearly releases, the original architecture was starting to form into a tangled web. (Hmm, so we have a huge rock with a tangled web around it. See, I have my metaphor still going strong. How many of you have pulled the dry stems of climbers from rock surfaces? You have to be very careful or they will break. They do have a tendency to go into cracks!) Though we were a bit unconvinced about the feasibility of a test driven agile development strategy for our code base (we of course wanted to build from scratch!), we looked at our Java brethren with green eyes! They have all the tools, full reflection, managed code… We wanted it!

Once we started to pull at these stems, things started to happen. We tried to follow TDD as much as possible, but at the unit test level. However, once these frequent changes started to spill over and break automation, things became serious. Mind you, we were also working at breakneck speed for a new release. While there was some consideration for the additional burden of process adoption, it did not change the deliverables much. So, if the automation is not passing, we cannot say a feature is done. If the feature is not done, we do not get to stand in front of everyone and get an ego boost during the sprint retrospective.

Our early sprint retrospectives started with a science fair-y demo of our features in a lab. We even got to sell our new ideas. After a few sprints, though, we decided to do the demo in a formal fashion, PowerPoint or actual demo, one at a time, with a specified amount of time. Then we decided to do the demo in the team rooms and let the stakeholders walk from room to room. Then we decided to do it as a presentation for everyone, then we went back to team rooms, then we decided to record them and post them the previous day…

They all worked.

Coming back to the automation dilemma, it was soon clear that UI-centric acceptance automation was not enough to support the new way of development. The tests were quite complex and time consuming to write and maintain. They also took awfully long to execute, making them unviable as part of continuous integration. If we were to have some confidence in what we were doing - remember, much of the code we wanted to refactor was in units that we seldom touched – we needed acceptance tests that ran with every build, or at least once or twice every day. Jealousy is a good motivation. We had been drooling over Fitnesse from the moment Uncle Bob showed us what it can do. There was no support for Delphi in Fit at that time, so we ported Fit to Delphi Win32 and started writing some tests. This was around the same time that Delphi came out with a .Net version. We had to try it. We managed to compile enough of our code in .Net to allow us to cover the core and common business rules. This cross-compiling exercise also created an opportunity to redefine layer boundaries by package restructuring. So we abandoned our Win32 Fitnesse plan and started using the .Net version of the code to write Fitnesse tests for core functionality. This, along with the Business Objects Framework that was introduced mostly through unit testing, finally started to carve into the block, making the cracks bigger and bigger.

We had a very supportive upper management during this transition stage. But as the release progressed, each sprint they would find that some of the things that were supposed to be done were not done. This naturally brought up the question of the accuracy of the estimates. Even though we were quite aware of the arbitrariness of the numbers we put in the remaining work column, it had never really sunk in. This gave rise to a series of estimation games and strategies. We had long planning sessions upfront. Longer planning sessions at the beginning of the sprint. More accountability for estimates. Complexity, risk and confidence factors, 0 to 1, 1 to 10, percentage… Attempts at accurate time reporting. And the all-powerful velocity. We must have multiplied and divided every measurable and quantifiable aspect of the development process with every other to come up with a velocity unit. Ideal team, developer days, 2 hours allowance every day for meetings, fractional contributors.

They all worked…

Even when many of them did not bring forth the result we hoped for. But when they didn’t, we got a chance to learn why they did not. Isn’t that the spirit of any scientific enquiry!

The noise level about agile software development is deafeningly high these days. Maybe it has already peaked, which is probably a good thing.

My real encounter with agile development in a production environment happened in 2003, when the company I am working for decided to adopt agile practices. We were playing with Fish philosophy before that. That was quite amusing and often gave me an Office Space feel. It was like someone was trying to pour happiness down my throat.

When I was sitting in the early presentations and crash courses on agile, I was quite skeptical about its adoptability for our code base. Our experience of the first three years of agile was presented at the 2005 agile conference (Teaching a Goliath to Fly). 5 years after that paper, we are still agile, more so.

So, among all the other noise, I will add mine as well.

I am planning to write a series of posts about the interesting facts, realizations and revelations during this time, my reflection on the larger state of affairs etc.

First of all, I believe the spirit of agile development is its lack of rigidity. Unlike the earlier, well defined software development life cycles, which specified the artifacts to be used in each stage of the development (sometimes down to their visual/text format), agile presents some basic principles. There are, of course, attempts by many to present such over-specified artifacts in agile as well, but they are the exception, not the rule.

It all comes down to the realization that the software development process is quite messy. But any complex human endeavor is messy, especially ones that involve a lot of abstractions. In many engineering practices, we have managed to come up with systems and practices that control this messiness to a very large extent. However, if you have ever been associated with a construction project, it is easy to realize that even with all the plans and architectural drawings and RFPs etc., the final product turns out to be quite different from our original conception in time, resources and form. But since we know the costs involved in making a change, we just try to live with it.

In the case of software, there is a common assumption that it can be altered quite easily – that it is soft, and malleable. It is also quite abstract even in its final form. It is a model of the real world, a simulation of a series of behavior patterns of the storefront clerk, or of the physical movement of a bunch of trucks. Such models communicate tersely with the user, mostly verbally or through highly iconized visuals. We create this model through a series of layered abstractions from real world observations, verbal descriptions, mathematical equations and finally the implementation tools (programming language, testing tools, modeling tools etc.)

Many early attempts at controlling the messiness of software development did so by controlling change. Mimicking other engineering disciplines, we tried to create detailed design artifacts, elaborate triaging procedures for change control and sometimes downright scare tactics! Every stage of the development created huge walls of artifacts between itself and its predecessor and successor. And as with any wall, they seeded animosity. I remember that in my previous job, I met an actual tester only after several months. But even before meeting one in the flesh, I was quite happy to trash them and had developed quite a distaste for their ways. We did the same to the “architecture team”, whom I never met in my two years there.

A significant symbolic gesture we made, prompted by Ken Schwaber, was to demolish the walls of our cubes. Our cubes had 5ft high walls and you could only see the person sitting right across from you. The window cubicles were a status symbol. The layout also separated the programmers from the testers and from the BAs quite effectively by placing them in different areas of the floor. It was a shock to a lot of people to lose the privacy of their cubes. Many complained that the new “team rooms” were too noisy; Ken complained that we were too quiet. (The team I am in now has the honor of being the noisiest!)

We were officially following SCRUM and agile. But the nature of our products made this adoption quite challenging. Since we are developing packaged software, there is not a lot of direct and immediate interaction with the end users, and the release cycle typically spans a year or more. The adoption of new versions by customers is even slower. There were serious doubts about the predictability of an iterative process. We were changing things so very often. Trying new ways of planning, inventing complex sticky note schemes, pairing and not pairing, fighting over differences between unit testing and acceptance testing.

Now, if we look at kings in real life, who are overly pompous, overspending, egomaniacal AHs, this title may suit.

Oh, no, I have nothing against the Volt. I would buy one (or another plug-in) when it is time for my next car. But there have been stories similar to this doing the rounds on the web. It looks like the manufacturers want to make this a story. Look at us, we have 10 million lines of code!

The first of my problems with this statement is: lines of what code? Since a lot of the code is controller and other microprocessor code, it could be just instructions. In that case, 10 M instructions is not that big a code base. It might be a good thing to have only that much. But if this is 10 M lines of code in some high level language like C, things look much different. Then the question is, why the heck so many lines of code!

There was a time when people used to boast about the number of lines of code in their code base. There were even places which used to pay per line of code. But if someone these days brings up the number of lines of code with a sense of achievement, unless it is to show how few lines there are, it is unimpressive.

A lot of people assume that all the work in creating a software product is in finishing it up for the first release. But the fact is, that is only the beginning of the work. There is not just fixing bugs but also keeping up with user demands for new features, accommodating new scenarios etc. So, the total cost of a software development process is overwhelmingly decided by its maintenance costs.

That is why the best software product is the one that does not have any lines of code. So, the Chevy Volt is 10 million times worse than the theoretical best.