This code reads pretty much as the author would have explained how to handle the “digits” variable: If it only has four digits, prepend a zero. If it only has three digits, prepend two zeros. Simple and to the point, and move on. Save the clever code for something that matters.

The author of this code was an Agnostic. He didn’t mind using a case statement instead of if-elsif-elsif, he paid attention to the features of his chosen programming language, but he didn’t have strong convictions about there being One True Way to express every set of choices.

His code “Just Worked.”

Donning The Hair Shirt

The Agnostic’s code carried on, quietly working, until one day another programmer chanced upon it. Our second programmer was an Ascetic. Ascetics believe in code that can do powerful things with a small set of core axioms, like recursion and first-class functions.

The Ascetic spotted a bug: What if “digits” only had two digits? The objective of the code was clearly to have at least five digits, with leading zeros. The Ascetic cracked his knuckles and went to work.

The Ascetic checked his code in, where another programmer, a Librarian, reviewed it. She saw immediately that it could and should be re-written:

def formatted_zip_code(digits)
  digits.rjust(5, '0')
end

Librarians believe that there is a library for everything, and good languages have huge libraries providing lots of built-in functionality. Programming is the art of searching the libraries for the one magic incantation that does exactly what you want.

And for a short time, all continued to be well.

Purity of Essence

One day, management hired a senior programmer with much experience working in the bowels of BigCo. He was brought in to provide “Adult Supervision” for the programming team. He went right to work, reviewing the code base and compiling a list of sins against his religious convictions. He was an OO Purist.

I am told that a cathedral built by Purists is beautiful to behold, constructed lovingly of small bricks, no more than 50 lines per class, no more than 10 classes per package, no more than two instance variables per class, no more than one level of indentation per method, no more than one dot per line, and absolutely no else clauses.

The Purist considered ifs, cases, and even shortcut booleans to be deficiencies in code, places where the proper approach, and indeed the only approach, is to use polymorphism and dependent types.

The Purist seized upon the twice-rewritten code as an opportunity to show how such a thing ought to be written:

I have also omitted his ZipCodeFactory that would take a normal string and figure out which particular subclass of BaseZipCodeDigits to construct.

Of course, the Librarian, Ascetic and the Purist hated each other with a passion fueled by the narcissism of small differences into a raging fire of religious cleansing. Much politicking and fighting broke out over the code.

Meanwhile, the Agnostic carried on, quietly coding while the others bickered over how to properly rewrite what he would write. One day, he noticed that some unit tests were failing and raised the issue in Campfire.

The Return of the Agnostic

“Hey,” he asked, “What happened to that piece of code? It was for zip codes, it was only supposed to pre-pend one or two zeros when importing zip codes from CSV files. Anything with fewer than three digits is supposed to be an invalid code. Empty strings shouldn’t be converted to five zeros, as far as I know.”

“What’s up with that?”
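To make the disagreement concrete, here is a minimal sketch contrasting the Librarian's rjust version with a length-specific version that follows the Agnostic's description. The name pad_three_or_four is hypothetical, and returning other input unchanged is an assumption: the post only says that shorter input is invalid, not what the original code did with it.

```ruby
# The Librarian's version, as quoted in the post: pads anything shorter
# than five characters, including the empty string.
def formatted_zip_code(digits)
  digits.rjust(5, '0')
end

# Hypothetical sketch of the rule the Agnostic describes: only three- and
# four-digit strings are padded. Leaving everything else untouched is an
# assumption made for illustration.
def pad_three_or_four(digits)
  case digits.length
  when 4 then "0#{digits}"
  when 3 then "00#{digits}"
  else digits
  end
end

# Print both results side by side for a few inputs.
['2134', '501', '12', ''].each do |input|
  puts "#{input.inspect}: rjust=#{formatted_zip_code(input).inspect} " \
       "case=#{pad_three_or_four(input).inspect}"
end
```

The two versions agree on three- and four-digit input and diverge on everything shorter, which is exactly the range the failing unit tests were about.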

The Librarian, Ascetic, and Purist were silent for a moment, and then immediately resumed arguing, this time over whether #rjust should have another parameter, whether the regular expression in a recursive routine should be changed, or whether the simplest thing was to simply change the method for the EmptyDigits, OneDigit, and TwoDigits classes.

The Agnostic listened for a moment, tried to interject a few times, then sighed and reached for his text editor.

Update: Reviewing the comments made elsewhere, I see that this post fell into the Fizzbuzz Trap: By quoting a programming problem, no matter how banal and contrived, the article was bound to provoke a huge amount of discussion around the correct way to solve the problem, while ignoring the point of the post. This is my fault: I should have known better than to post a snippet of code. As many of you have noted, the post is not about the Agnostic or his code, it’s about the dynamic of programmers eager to rewrite code in their own image, and the hypothesis that our (I am equally guilty of this behaviour) motivation for doing so is to emphasize the small differences between ourselves and others.

For no matter how good or bad the Agnostic’s code is, why did the Librarian rewrite the Ascetic’s code? Why did the Purist rewrite it again? When they discovered that the Agnostic’s code actually met some requirement, why weren’t they talking about documentation, or the process around tests? Why were they all, including the Agnostic, eager to rewrite it one more time?

Seriously, the post is not about whether the Agnostic should have documented his code, or where the zip code validation should live. However, I should have known that the moment I included a piece of code to provide local colour, the resulting debate over its fitness was inevitable.

To be perfectly fair, this is allegorical. But yes, all of the programmers knew that the digits belonged to a zip code. However, they were not necessarily aware of the subtlety of the “business rule” that zip codes less than 00200 are invalid.

That being said, there were unit tests for the appropriate cases, and the programmers suggesting/making changes did not review all of the possible tests before blindly making assumptions about what was going on.

In other words, they assumed that there was a bug rather than assuming that the desired behaviour was somewhat irregular.

Great post. That's exactly it. I started way back in the ascetic camp, then moved to purist view briefly before staying on the librarian side for a long time. I didn't become agnostic until I was the poor fool who had to stay up all night to find and fix the bugs.

It turns out that when you are really really sleepy your favorite pieces of code are always the most 'obvious' ones. Thinking is not fun in the middle of the night, and it shouldn't be necessary all of the time.

Not surprisingly, elegance comes from taking something 'hard' and finding the simplest, most obvious answer, thus making it look 'easy'.

So Agnostics are supposed to be allowed to enter code that's undocumented and subject to future breakage?

You mention the "business rule" that zip codes less than 00200 are invalid. While that's currently true, it's not complete: there are a number of other unassigned 3-digit prefixes. There's also nothing preventing the USPS from assigning those numbers.

Just because there are unit tests for a specific behavior doesn't mean that specific behavior is right. Unit tests are code, and can have bugs.

The idea that new code is better than old is patently absurd. Old code has been used. It has been tested. Lots of bugs have been found, and they've been fixed. There's nothing wrong with it. It doesn't acquire bugs just by sitting around on your hard drive. Au contraire, baby! Is software supposed to be like an old Dodge Dart, that rusts just sitting in the garage? Is software like a teddy bear that's kind of gross if it's not made out of all new material?

Back to that two page function. Yes, I know, it's just a simple function to display a window, but it has grown little hairs and stuff on it and nobody knows why. Well, I'll tell you why: those are bug fixes. One of them fixes that bug that Nancy had when she tried to install the thing on a computer that didn't have Internet Explorer. Another one fixes that bug that occurs in low memory conditions. Another one fixes that bug that occurred when the file is on a floppy disk and the user yanks out the disk in the middle. That LoadLibrary call is ugly but it makes the code work on old versions of Windows 95.

Each of these bugs took weeks of real-world usage before they were found. The programmer might have spent a couple of days reproducing the bug in the lab and fixing it. If it's like a lot of bugs, the fix might be one line of code, or it might even be a couple of characters, but a lot of work and time went into those two characters.

When you throw away code and start from scratch, you are throwing away all that knowledge. All those collected bug fixes. Years of programming work.

I think we can debate what Agnostic should have/could have done, but the real point of the story is that refactoring other people’s code to suit your own religious convictions is difficult.

I am not going to say it cannot or should not be done, just pointing out how easy it is to break something in the zeal to make it “better.”

Joel continues:

problems can be solved, one at a time, by carefully moving code, refactoring, changing interfaces. They can be done by one programmer working carefully and checking in his changes all at once, so that nobody else is disrupted. Even fairly major architectural changes can be done without throwing away the code. On the Juno project we spent several months rearchitecting at one point: just moving things around, cleaning them up, creating base classes that made sense, and creating sharp interfaces between the modules. But we did it carefully, with our existing code base, and we didn't introduce new bugs or throw away working code.

I am not trying to give a black-and-white prescription here. Just pointing out how much the Ascetic, Librarian, and Purist have in common despite the fact that each of them thinks the other two are complete Bozos.

To those of you saying the function should have been commented to state its purpose -- what would you like that comment to say? "Formats a ZIP code"? The function name says that already. "Formats a ZIP code by prepending one or two zeros to it if needed"? That's exactly what the code says; there's no need to be redundant.

Looks like someone is treating ZIP codes as integers somewhere, which I'd call a bug.

The case I have seen repeatedly is that someone uses a tool like Excel to export a CSV of some kind. Excel helpfully treats values consisting exclusively of digits as integers unless you go out of your way to insist that it leave them alone.

So... CSV files you import often have artefacts like this that need to be handled.
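The round trip described here can be sketched without a spreadsheet. Converting an all-digit field to an integer and back stands in for what the export does; both method names are hypothetical, and the three-or-four-digit repair rule is taken from the Agnostic's description rather than from any real import code.

```ruby
# Simulate a tool that stores an all-digit field as an integer and writes
# it back out as text: the leading zeros are lost along the way.
def through_spreadsheet(field)
  field.match?(/\A\d+\z/) ? field.to_i.to_s : field
end

# Hypothetical repair step on import: only three- and four-digit values
# are assumed to be ZIP codes that lost their leading zeros.
def restore_leading_zeros(digits)
  digits.length.between?(3, 4) ? digits.rjust(5, '0') : digits
end

exported = through_spreadsheet('02134')
puts exported                        # prints 2134
puts restore_leading_zeros(exported) # prints 02134
```

Fields that are not purely digits pass through the simulated export untouched, so only values that could have been mistaken for integers are affected.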

I think you misrepresent the Purist. An OO approach would be to have used a strongly-typed ZipCode class that encapsulates both the validation logic and formatting rules for zip codes. I'd rather maintain that than any of the other solutions you give. I'm calling Strawman! ;)
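For illustration, a class along the lines this comment suggests might look like the sketch below. Everything in it is an assumption: the class name comes from the comment, but the validity rule (three to five digits) and the padding are pieced together from the discussion, not taken from any real code base.

```ruby
# Hypothetical value class that keeps validation and formatting together.
class ZipCode
  # Assumed rule: three to five digits are acceptable input.
  VALID_INPUT = /\A\d{3,5}\z/

  def initialize(digits)
    raise ArgumentError, "invalid ZIP code: #{digits.inspect}" unless digits.match?(VALID_INPUT)

    # Restore leading zeros lost before the value reached this class.
    @digits = digits.rjust(5, '0')
  end

  def to_s
    @digits
  end
end

puts ZipCode.new('2134') # prints 02134
```

With this shape, a two-digit or empty string raises an ArgumentError at construction time instead of being silently padded.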

An OO approach would be to have used a strongly-typed ZipCode class that encapsulates both the validation logic and formatting rules for zip codes.

That is the view from the rest of the system, but how does the Purist implement the validation and formatting within the ZipCode class? The Purist eschews the bloated heavyweight classes that are “Procedural programs in OO Clothing.” For example, a class that is not polymorphic in practice is a sure tip-off that an OO language has been used to provide encapsulation but is no more OO than a similar program written in a modular language like Modula or Ada.

Thus, the Purist implements the classes you see above inside a ZipCode class or package rather than building a class and hiding if-then-else logic inside it.

Now as to the post: your summary does not match my own, which has something to do with the importance or lack of importance people place on there being One True Way to express a certain algorithm relative to the result it generates.

My summary, which you can obtain by reading the title, is that people fasten upon unimportant distinctions in order to generate a sense of self.

The "Real WTF?" here is how any of these people got to be professional programmers without learning to properly comment their code! (Kudos to FP for catching onto that quickly too) ;) Agnostic could have saved everyone the ensuing b!tchfest with: # pad 3&4 digits w/ 0 only.

If you see that, you're going to go ask someone who was involved in the original coding or at least search for dependencies in your code-base before changing that behavior...

The guy who religiously puts in little comments like that though would also be part of a new category, ("Know it All", maybe?) and would likely know the appropriate "one-liner" to spit out upon the merest mention of the problem domain. Something like (forgive my lack of ruby cred if i've mistyped something):

The problem with using unit tests as your specification is that you're splitting your specification from the code in a major way. If there had been something, anything in the code that had pointed out that this is not only formating a number as a zip code, but performing incomplete validation on the number, this whole argument wouldn't have happened. Instead, the answer is "You needed to read another file, in a different directory structure, to understand what this is really supposed to do."

The real problem is we shouldn't even be discussing the details of the problem, but rather whatever point raganwald was trying to make. Unfortunately, allegories are a tempting but useless communication vehicle that deliberately obscures a point in hopes that it'll be more interesting (or harder to discuss).

Which made it considerably slower, really. What did he think he was doing, messing with perfectly working code for no good reason? Just putting his stamp on it?

Oh, wait. Who wrote that part of the code in the first place? Looks like it was the same jackass. Maybe there's sometimes a good reason to pave over perfectly working code, say, in the quest for simpler code?

However, I like to kid myself that, if the tests were passing, I wouldn't have touched that code. While I'm kidding myself, I like to think that, if I had touched the code inanely, the tests would have told me about it pretty bloody sharpish and I'd've rolled back my changes and either gone to investigate, or moved onto something productive.

As Reg says, we're in Allegoryland here, so spending any serious time on this particular code is silly, but I still say that we programmers should strive to write code so that, when we return to it a week later with our Ascetic/Librarian/Purist hats on, we're going to resist the urge to tamper because we've written code that attempts to express the why as well as the what.
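One way to express the why as well as the what, sketched under assumptions: the constant name and the comment wording are hypothetical, and the rule itself is taken from the Agnostic's explanation in the post.

```ruby
# ZIP codes imported from CSV files may have lost one or two leading zeros.
# Anything shorter than three digits is treated as invalid input and is
# deliberately left alone rather than padded.
LENGTHS_MISSING_LEADING_ZEROS = (3..4)

def formatted_zip_code(digits)
  return digits unless LENGTHS_MISSING_LEADING_ZEROS.cover?(digits.length)

  digits.rjust(5, '0')
end

puts formatted_zip_code('501') # prints 00501
puts formatted_zip_code('12')  # prints 12
```

The library call the Librarian wanted is still there, but the guard and its comment make the irregular boundary visible to the next person who feels the urge to tidy up.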

The Agnostic is to blame 100%: he coded it, but didn't put a comment to the boundaries of his implementation. Had the comment he made during the argument been clearly written in the code, the other goofs wouldn't have modified the little utility.

The Agnostic is to blame 100%: he coded it, but didn't put a comment to the boundaries of his implementation.

So… Do I understand you to say that if a piece of code does X but you don’t see a comment explaining that it is supposed to do X, any programmer ought to be free to “fix” it to do Y because there’s no comment saying that it is not supposed to do Y?

And that said programmers have no responsibility to research whether such changes break any unit tests? Or to talk to that programmer? Or to look at the context of the code to figure out what it does?

And furthermore that programmers are free to do this whenever they like, without reifying the change from X to Y as a bug or feature where the team can discuss it and prioritize it against other work?

Putting it in that light, do you not think that the other three programmers have some ’splaining to do?

I think that responsibility does not neatly add up to 100% in team contexts. While we can construct a course of action for any one person that might have avoided this allegorical problem, there is a much deeper dynamic at work that needs to be addressed.

Using comments or cruise control or reviewing every check-in to prevent programmers from rewriting code in their own image is ultimately a losing game of cops and robbers.

The way forward from my perspective is to address the dynamic that motivated the rewrites in the first place.

The Ascetic, for example. We look at his code first, but even his thought that the code had to be wrong because it wasn’t mathematically regular: Isn’t that a strong reflection on his aesthetic desire rather than any objective fault in the Agnostic’s code?

Check out a similar anecdote by (your relative) Keith Braithwaite refactoring a gnarly business logic loop into a clean mathematical abstraction (one I believe raganwald will appreciate) and how his team reacted.

The unspoken point is that there is absolutely nothing you can do to prevent someone from taking your code and screwing it up for you.

Lack of comments isn't the problem. You could add a dissertation on why the code does what it does and people will still argue that you're wrong, or invent some far-fetched edge case that you didn't handle properly.

This is the weirdest use of the title agnostic I've ever seen. If anything you can maybe call the first guy a pragmatist. If the library exists and he knew about it, you're telling me the guy wouldn't use it? On purpose?

What is he proclaiming to be beyond attainable knowledge? And in what way does it so affect his programming that he earns that title?

Agnosticism is the philosophical view that the truth value of certain claims — particularly metaphysical claims regarding theology, afterlife or the existence of God, gods, deities, or even ultimate reality — is unknown or, depending on the form of agnosticism, inherently unknowable.

The agnostic doesn't know if there's a better way, and believes that we may not be able to know. A pragmatist believes that what he is doing is the best way.

There is a marked difference between the two, and I deliberately chose to label him an agnostic to hint that he doesn't necessarily believe the ascetic, librarian, and purist are wrong.

Where a self-identified pragmatist would argue with them that his code was better because it was pragmatic, the agnostic acknowledges that there are times when his approach is not the best way.

What's interesting to me is that every time an anecdote is given, and the author gives out titles like 'pragmatist', or 'purist', then there is automatically a bias in people's minds drawing from the connotations of the word used. As a result, no commenter would ever label themselves a purist, for example.

But in actual fact we're all of these at different times and, I've found, we may be a purist to one person and a pragmatist to another.

End of the day, we are paid to write code and provide value to the business, and the business doesn't care either way.

But I will say there have been very few pieces of code I've looked at that haven't made my fingers itch... Perhaps it has less to do with trying to emphasize differences and more to do with placing our mark of craftsmanship on what we've worked on?

When I'm not being a FP evangelist on the web, I moonlight as a Christian. One of the things that always astounds me in the Christian church is how the closer two people are in theological understanding, the more likely they are to consider the other a complete idiot. As a corollary, the incidence of "One True Way"-style talk is inversely related to the diversity of the group.

I have no idea why this is true, but I suspect it relates to the incredible annoyance of a musical instrument being a half step out of tune.

This weird behavior happens in development environments, too: people who all appreciate the same goal in development (writing productive code) can really get hung up over some silly stuff.

(BTW, clearly the first programmer needed to define a type of "ZipDigits" which prevented the wrong number of digits from compiling in the first place.)

Although, really, does he want to force a storage medium? That's pretty cruel, particularly if another storage medium would be more efficient. So he doesn't just need OneDigitZipCode, TwoDigitZipCode, et al., he also needs CharOneDigitZipCode, ShortOneDigitZipCode, StringOneDigitZipCode, etc., etc.

Oh, and that's really tough to remember all those class names, so he'd want to make sure to create a ZipCodeImplFactory. And, if he wants to use that through Spring, he'll want a ZipCodeImplFactoryInterface...

Although, really, does he want to force a storage medium? That's pretty cruel, particularly if another storage medium would be more efficient. So he doesn't just need OneDigitZipCode, TwoDigitZipCode, et al., he also needs CharOneDigitZipCode, ShortOneDigitZipCode, StringOneDigitZipCode, etc., etc.

That is FAR too simple. Let’s use Generics! If they are missing (as they are from Ruby), we can still use parameterized types with some eval meta_magic :-)

actually, treating "int"-iness does appear to be the only valid use of such a constraint, but as a programmer it's not always in your power to reformat every instance of old data you're required to pull in. so this is a perfectly normal function to find in use on the data layer. it correctly handles everything from "999"-"99999-9999", including foreign post codes. my "one-liner" earlier was mistaken, as it mangles zip+4's and foreign postal codes, despite being called "digits"... my function shouldda been more like:

digits.size.between?(3, 4) ? "00#{digits}"[-5..-1] : digits

that's exactly the same functionally (assuming it's valid ruby or w/e), but still gives you a reason to argue points for style.

i think that's the intent of the original article. ignore the possible repercussions of changing the function result a bit.

"However, I should have known that the moment I included a piece of code to provide local colour, the resulting debate over its fitness was inevitable."