2013-03-27

My notes on "Clean Code" (Prentice Hall)

These are my notes on the book "Clean Code" by Robert C. Martin from Prentice Hall.

I am in agreement with almost everything that this book says; the following notes are either quotes from the book which I found especially interesting and/or especially well written, or they point out issues that I disagree with, or errata which objectively need correction. If some point from the book is not mentioned below, then I agree with it.

Page xxii (Foreword):

Back in my days working in the Bell Labs Software Production Research organization (Production, indeed!) we had some back-of-the-envelope findings that suggested that consistent indentation style was one of the most statistically significant indicators of low bug density. We want it to be that architecture or programming language or some other high notion should be the cause of quality; as people whose supposed professionalism owes to the mastery of tools and lofty design methods, we feel insulted by the value that those factory floor machines, the coders, add through the simple consistent application of an indentation style. To quote my own book of 17 years ago, such style distinguishes excellence from mere competence.

Page 6:

[…] what if you were a doctor and had a patient who demanded that you stop all the silly hand-washing in preparation for surgery because it was taking too much time? Clearly the patient is the boss; and yet the doctor should absolutely refuse to comply. Why? Because the doctor knows more than the patient about the risks of disease and infection. It would be unprofessional (never mind criminal) for the doctor to comply with the patient. So too it is unprofessional for programmers to bend to the will of managers who don’t understand the risks of making messes.

Page 13:

We are willing to claim that if you follow these teachings, you will enjoy the benefits that we have enjoyed, and you will learn to write code that is clean and professional. But don’t make the mistake of thinking that we are somehow “right” in any absolute sense. There are other schools and other masters that have just as much claim to professionalism as we. It would behoove you to learn from them as well. Indeed, many of the recommendations in this book are controversial. You will probably not agree with all of them. You might violently disagree with some of them. That’s fine. We can’t claim final authority. On the other hand, the recommendations in this book are things that we have thought long and hard about. We have learned them through decades of experience and repeated trial and error. So whether you agree or disagree, it would be a shame if you did not see, and respect, our point of view.

Page 14:

Indeed, the ratio of time spent reading vs. writing is well over 10:1. We are constantly reading old code as part of the effort to write new code.

Page 32

www.fitnese.org should read www.fitnesse.org

Page 35

I find all this “very small methods” discussion rather misdirected. If you do this you end up with a multitude of tiny methods that have very long and hard to read and understand names (necessary to capture their painstakingly narrow specificity) and whose existence is atrociously artificial: they serve no purpose other than to pre-chew a perfectly sized meal into tiny pill-sized chunks that might be swallowed without chewing. I think it is not pragmatic, that is, not feasible in real-world scenarios. From the narrative my guess is that the author has never really practiced it in reality: he only briefly saw it in a few screens of source code from a colleague’s project.

The absurdity continues with the classification of methods into four categories, depending on whether they accept zero, one, two, or the incredibly high number of three arguments. (Four arguments being, presumably, unthinkable.) Puh-leez, gimme a break! As Albert Einstein said, things should be as simple as possible, but not simpler. Unless it is violating the "Single Responsibility Principle" (SRP), a method needs to accept as many arguments as necessary to do its job. There are two ways to reduce the number of arguments to a method which is not violating the SRP: one is to introduce more member variables to the class which contains the method, (which is evil, because they are state,) and the other is to introduce new classes, the sole purpose of which is to group values together so as to give the appearance of a single parameter. But introducing more (artificial) entities in a system is far worse than having a few methods with plenty of arguments. The presence of lots of methods with lots of arguments might be taken as a hint that perhaps we are missing an entity that our system really needs, and nothing more. If our system does not really need a new entity, then methods with plenty of arguments is the way to go, really.

Luckily, on page 176, Chapter 12: Emergence, in the section titled Minimal Classes and Methods the book provides somewhat of a correction, by stating:

In an effort to make our classes and methods small, we might create too many tiny classes and methods. So this rule suggests that we also keep our function and class counts low

and

High class and method counts are sometimes the result of pointless dogmatism.

Page 46

The “real solution” offered seems wrong to me.

Page 48

Some programmers follow Edsger Dijkstra’s rules of structured programming. Dijkstra said that every function, and every block within a function, should have one entry and one exit. Following these rules means that there should only be one return statement in a function, no break or continue statements in a loop, and never, ever, any goto statements.

Okay, no problem with the “never, ever, any goto statements”. Actually, I am glad the author begins by saying “Some programmers follow…”, indicating that he included this piece of nonsense in the book only because it is popular enough to be perhaps worth mentioning, not because he really adheres to it, nor to actively suggest that the reader should adhere to it. I think the author should have either skipped such a lame “argument from authority”, or he should have provided justification for it, so as to make it an argument by reasoning rather than an argument from authority. But I am afraid that in this case ol' Edsger is not backed by any valid reasoning.

Page 49

In the end, I wind up with functions that follow the rules I’ve laid down in this chapter. I don’t write them that way to start. I don’t think anyone could.

Okay, I am glad we are in agreement on this one.

Page 53

Chapter 4: Comments – The author is entitled to have his own opinion about comments. I have my own opinions, which are very different from his.

Page 100 - “Data Transfer Objects”

The quintessential form of a data structure is a class with public variables and no functions. This is sometimes called a data transfer object, or DTO. DTOs are very useful structures, especially when communicating with databases or parsing messages from sockets, and so on. They often become the first in a series of translation stages that convert raw data in a database into objects in the application code.

Somewhat more common is the “bean” form shown in Listing 6-7. Beans have private variables manipulated by getters and setters. The quasi-encapsulation of beans seems to make some OO purists feel better but usually provides no other benefit.

(emphasis mine.)

Page 176

Consider, for example, a coding standard that insists on creating an interface for each and every class. Or consider developers who insist that fields and behavior must always be separated into data classes and behavior classes. Such dogma should be resisted and a more pragmatic approach adopted.

This brings to mind certain funny ways of doing things at my current workplace.

Page 185

Recommendation: Learn these basic algorithms and understand their solutions.

Uh, esqueez me, algorithms don’t have solutions. Problems have solutions. Luckily, what was described was not algorithms, it was problems.

Page 187

“Attempts to repeat the systems can be frustratingly” should read “Attempts to repeat the failures can be frustrating”.

“trial an error” should read “trial and error”.

Page 272

Where it says “any function that used to take an int for a month, now takes a Month enumerator” the word enum should have been used instead of enumerator.

Page 276

I also deleted all the final keywords in arguments and variable declarations. As far as I could tell, they added no real value but did add to the clutter. Eliminating final flies in the face of some conventional wisdom. For example, Robert Simmons strongly recommends us to “. . . spread final all over your code.” Clearly I disagree. I think that there are a few good uses for final, such as the occasional final constant, but otherwise the keyword adds little value and creates a lot of clutter. Perhaps I feel this way because the kinds of errors that final might catch are already caught by the unit tests I write.

And clearly, I agree with Robert Simmons. First of all, the 'final' keyword is compiler - enforced documentation, which is the best kind of documentation: when I see an entity marked as final, I know right away that I do not have to worry about the possibility that it may change as I read the rest of the code. Thus, the 'clutter' argument is blown out of the water. Secondly, there is never a valid reason to let a run time test catch an error that could have been caught at compile time: it is a well documented fact in the industry that the sooner you catch an error, the less it costs to fix it. And last but not least, a non-final entity is immensely more complex than a final entity. Therefore, the more entities are final in the code, the smaller the overall complexity of the code. And we strive for less complex code, don't we?

Page 282

“So I pushed it up” should probably read “So I pushed it down”, since judging by many other sentences, the author considers “down” to refer to descendants and “up” to refer to ancestors. (See page 279, “So I pushed it down”, and page 282, “the getDayOfWeek method is another one that should be pulled up from SpreadSheetDate” and “I pulled the implementation up into DayDate” and “So I pulled them all up from SpreadsheetDate” etc.)

Page 283

“making the all three methods much clearer” should read “making all three methods much clearer”.

Page 300

And in the FEET_PER_MILE case, the number 5280 is so very well known and so unique a constant that readers would recognize it even if it stood alone on a page with no context surrounding it.

The world is not the USA. Most of the world uses the metric system, and knows very little about miles and feet. When given a number in "United States customary units", many people outside the USA know how to convert it to metric, but it is rare to meet someone who is proficient in converting from metric to US units, and even more rare to meet someone who has the slightest clue, or the slightest interest, in converting between US units. Therefore, the number 5280 means absolutely nothing to the vast majority of the population of the planet. As a matter of fact, there are so many non-USAians working as software engineers in the USA, that even in the USA it would be a bad idea to assume that anyone who sees the number 5280 in a piece of code will know what it stands for.

Page 301

In “Declaring a variable to be an ArrayList when a List will due is overly constraining”, the word “due” should be “do” instead.

Page 302

“G30: Functions Should Do One Thing” -- no, functions should do as many things as necessary to accomplish their job with no duplication of code. Splitting functions into much smaller functions looks good as an exercise on paper, but in real life scenarios things are not so beautiful. A clean cut purpose will have to be devised for each one of the smaller functions, and a nice descriptive name will have to be invented for it. Lots of local variables may have to be passed as parameters to it, and more than one of them may need to be modified, which is impossible in Java. (That's one of the reasons why I love C#.) And at the end of all this, the reader will still be left wondering why it is a separate function when it is only used from one and only one place in the code. These problems could be alleviated by introducing more classes into the system, but then again I do not think that a design with one class more than necessary is better than a design with one method longer than necessary. Of course my way of doing things requires comments: there where each piece of code would be documented with a self-explanatory method name, I would preface that same piece of code with a descriptive comment
within a long method
. That's why I also disagree with the author on the issue of comments.

Page 302

“You must saturate the gradient before you can reticulate the splines, and only then can you dive for the moog.” -- wow, sounds very cool, but it will be confusing to non-native speakers of English, who are bound to try to look for meaning in the sentence, whereas, I suppose, there is none. This is really a minor issue, I am not really complaining about it. I have to admit that it adds some fun to the reading even for a non-native speaker such as me. On the other hand I am afraid that this book will also be read by plenty of people whose English is just barely up to task, and who will, undoubtedly, hit their dictionaries in an effort to discover the meaning of these words, and failing to make sense after examining all synonyms in their language, they will probably call a friend who knows better English, only to puzzle them too with a sentence that was not meant to make any sense in the first place.

Page 302

You might complain that this increases the complexity of the functions, and you’d be right. But that extra syntactic complexity exposes the true temporal complexity of the situation.

Very well said. Einstein's famous quote is also about the exact same thing: “Things should be as simple as possible. But not more simple.”

Page 311

In the sentence “choose names the reflect the level of abstraction of the class or function you are working in” the first “the” should be “that”.

Page 312

Where it says “This is emphasized by the fact that there is a function named renamePage inside the function named doRename!” it should say “This is aggravated by the fact that there is a call to a function named renamePage inside the function named doRename!”