In Praise Of Small Code

Keeping class size small--no more than fits on one screen--goes a long way toward reducing complexity in code, which in turn makes the code easier to test.

If you've been doing object-oriented programming for a while, you've surely run into the seemingly endless essays on testability. The debate focuses on how to write code to make it more amenable to automated testing. The topic is particularly intriguing to exponents of test-driven development, who argue that if you write tests first, your code will be inherently testable.

In real life, however, this is not always how it happens. Developers using test-driven development frequently shift to the standard code-before-tests approach when hacking away at a complex problem or one in which testability isn't easily attained. They then write tests after the fact to exercise the code, and modify the code to increase code coverage. There are good reasons why code can be hard to test, even for the most disciplined developers.

A simple example is testing private methods; a more complex one is handling singletons. These are issues at the unit testing level. At higher levels, such as user acceptance testing (UAT), a host of tools help provide testability. Those products, however, tend to focus principally on the user interface aspect (and startlingly few handle GUIs on mobile devices). In other areas, such as document creation, there is no software that provides automated UAT-level validation because parsing and recreating the content of a report or output document is often an insuperable task.

What makes code untestable, however, frequently isn't any of these things. Rather, it's excessive complexity. High complexity, generally measured with the suboptimal cyclomatic complexity measure (CCR), is what the agile folks correctly term a "code smell." Intricate code doesn't smell right. It generally contains more defects, and it's hard--sometimes impossible--to maintain. Fortunately, there are many techniques available to the modern programmer to reduce complexity. One could argue that Martin Fowler's masterpiece, Refactoring, is dedicated almost entirely to this topic. (Michael Feathers' Working Effectively With Legacy Code is the equivalent tome for the poor schlemiels who are handed a high-CCR code base and told to fix it.)
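To make the measure concrete, here is a rough sketch of how cyclomatic complexity can be approximated: start at 1 and add one for each decision point. The function name, the choice of which node types count as branches, and the two sample snippets are all assumptions made for illustration; real tools count somewhat differently.

```python
import ast

# Node types treated as one extra decision point each. This selection is an
# assumption of the sketch; real complexity tools differ in what they count.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def approx_cyclomatic_complexity(source: str) -> int:
    """Return 1 plus the number of decision points found in the source."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two short-circuit branches.
            complexity += len(node.values) - 1
    return complexity


# Hypothetical sample code: one branch.
SIMPLE = """
def sign(n):
    if n < 0:
        return -1
    return 1
"""

# Hypothetical sample code: several branches packed into one function.
TANGLED = """
def classify(n, strict):
    if n < 0 and strict:
        return "error"
    for d in range(2, n):
        if n % d == 0:
            return "composite"
    return "prime" if n > 1 else "unit"
"""

print(approx_cyclomatic_complexity(SIMPLE))   # 2
print(approx_cyclomatic_complexity(TANGLED))  # 6
```

The second function is only seven lines long yet scores three times higher than the first, which hints at why the measure alone is a blunt instrument.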

My question, though, is how do you avoid creating complexity in the first place? This topic too has been richly mined by agile trainers, who offer the same basic advice: Follow the open-closed principle, obey the Hollywood principle, use the full panoply of design patterns, and so on. All of this is good advice; but ultimately, it doesn't cut it. When you're deep into a problem such as parsing text or writing business logic for a process that calls on many feeder processes, you don't think about Liskov substitution or the open-closed principle. Typically, you write the code that works, and you change it minimally once it passes the essential tests. In other words, as you're writing the code there is little to tell you, "Whoa! You're doing it wrong."

For that, you need another measure, and I've found one that is extraordinarily effective in reducing initial complexity and greatly expanding testability: class size. Small classes are much easier to understand and to test.

If small size is an objective, then the next question is, "How small?" Jeff Bay, in a brilliant essay entitled "Object Calisthenics" (in the book The Thoughtworks Anthology), suggests the number should be in the 50- to 60-line range--essentially, what fits on one screen.
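A limit like this is easy to check mechanically. The sketch below is one possible way to flag oversized classes in Python source, assuming Python 3.9 or later; the function name, the 60-line default, and the sample source are illustrative choices, not anything prescribed by Bay's essay. It counts every line from the `class` statement to the end of the body, blank lines and comments included.

```python
import ast

MAX_CLASS_LINES = 60  # upper end of the 50- to 60-line range (an assumed default)


def oversized_classes(source: str, limit: int = MAX_CLASS_LINES) -> list[tuple[str, int]]:
    """Return (class name, line count) for every class longer than limit."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            # Lines from the "class" statement through the last line of the body.
            length = node.end_lineno - node.lineno + 1
            if length > limit:
                found.append((node.name, length))
    return found


# Hypothetical input: a 2-line class and a 71-line class.
sample = "class Small:\n    x = 1\n\nclass Big:\n" + "    y = 0\n" * 70

print(oversized_classes(sample))  # [('Big', 71)]
```

Run over a source tree as part of a build, a check like this supplies the "Whoa!" signal at the moment the code is written rather than at review time.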

Most developers, endowed as we are with the belief that our craft does not and should not be constrained to hard numerical limits, will scoff at this number of lines--or any number of lines--and will surely conjure up an example that is irreducible to such a small size. Let them enjoy their big classes. But I suspect they are wrong about the irreducibility.

Lately, I've been doing a complete rewrite of some major packages in a project I contribute to. These are packages that were written in part by a contributor whose style I never got the hang of. Now that he's moved on, I want to understand what he wrote and convert it to a development style that looks familiar to me and is more or less consistent with the rest of the project. Since I was dealing with lots of large classes, I decided this would be a good time to hew closely to Bay's guideline. At first, predictably, it felt like a silly straitjacket. But I persevered, and things began to change under my feet. Here is what was different.

Big classes became collections of small classes. I began to group these classes in a natural way at the package level. My packages became a lot "bushier." I also found that I spent more time managing the package tree, but this grouping feels more natural. Previously, packages were broken up at a rough level that dealt with major program components, and they were rarely more than two or three levels deep. Now, their structure is deeper and wider and is a useful road map to the project.

Testability jumped dramatically. By breaking down complex classes into their organic parts and then reducing those parts to the minimum number of lines, each class did one small thing I could test. The top-level class, which replaced its complex forebear, became a sort of main line that simply coordinated the actions of multiple subordinate classes. This top class generally was best tested at the UAT level, rather than with unit tests.
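As an illustration of that shape, here is a minimal sketch with two small classes and a coordinating top-level class. The domain (loading key=value settings) and every class and method name are hypothetical; they are not taken from the project described above.

```python
class LineCleaner:
    """Drops blank lines and '#' comments."""

    def clean(self, lines):
        stripped = (line.split("#", 1)[0].strip() for line in lines)
        return [line for line in stripped if line]


class PairParser:
    """Turns one 'key = value' line into a (key, value) tuple."""

    def parse(self, line):
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"not a key=value line: {line!r}")
        return key.strip(), value.strip()


class SettingsLoader:
    """Main line: coordinates the small classes and holds no logic of its own."""

    def __init__(self, cleaner=None, parser=None):
        self.cleaner = cleaner or LineCleaner()
        self.parser = parser or PairParser()

    def load(self, text):
        lines = self.cleaner.clean(text.splitlines())
        return dict(self.parser.parse(line) for line in lines)


# Each small class can be exercised on its own.
print(LineCleaner().clean(["a=1  # note", "", "# only a comment", " b = 2 "]))
print(PairParser().parse("b = 2"))
print(SettingsLoader().load("a=1\n# c\nb = 2\n"))
```

Each subordinate class does one thing that a unit test can pin down in a line or two, while the coordinator only wires them together, so exercising it end to end tells you more than unit-testing it in isolation.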

The single-responsibility principle, which states that each class should do only one thing, became the natural result of the new code, rather than a maxim I needed to apply consciously.

And finally, I have enjoyed an advantage foretold by Bay's essay: I can see the entire class in the integrated development environment without having to scroll. Dropping in to look at something is now quick. If I use the IDE to search, the correct hit is easy to locate, because the package structure leads me directly to the right class. In sum, everything is more readable, and on a conceptual level, everything is more manageable.

Andrew Binstock is editor in chief of Dr. Dobb's. You can write to him at alb@drdobbs.com.

An interesting point you make there. You celebrate the birth of many new classes. Sadly, the combination of no proper package strategy and an explosion in the number of classes is a recipe for disaster.

In particular, the problem of where to put classes that are shared among other classes wrecks most package-class distribution algorithms.

I've seen projects with 6000 classes making anyone who has to develop in them a depressed lump of sobbing meat.

I do believe simple objects are the way to go, but one page seems a little extreme.