What's a Method to Do?
How to Maximize Cohesion While Avoiding Explosion
by Bill Venners
First Published in JavaWorld, April 1998

Summary
In this installment of the Design Techniques column,
brush up on how -- and why -- to divide a class's functionality among
its methods. I demonstrate how to
maximize method cohesion while keeping the total number of methods to a
manageable level.

In last month's Design Techniques column, I told half of
the method design story: minimizing method coupling. In this month's
installment, I'll reveal the other half of the story: maximizing method
cohesion.

As with last month's column, "Designing fields and methods," the
principles discussed may be familiar to many readers, as they apply to
just about any programming language. But given the vast quantity of
code I have encountered in my career that didn't benefit from
these basic principles, I feel it is an important public service to
address the basics in the early installments of this column. In
addition, I have attempted in this article to show how the basic
principles apply in particular to the Java programming language.

Cohesion
Methods do things. On a low level, they do things such as accept data
as input, operate on that data, and deliver data as output. On a higher
level, they do things such as "clone this object," "print this string
to the standard output," "add this element to the end of this vector,"
and "add this much coffee to this cup object."

Minimizing coupling (the topic of last month's article)
requires you to look at methods on a low level. Coupling looks at how
the inputs and outputs of a method connect it to other parts of the
program. By contrast, maximizing cohesion requires that you
look at methods on a high level. Cohesion looks at the degree to which
a method accomplishes one conceptual task. The more a method is focused
on accomplishing a single conceptual task, the more cohesive that
method is.

Why maximize cohesion?
The more cohesive you make your methods, the more flexible (easy to
understand and change) your code will be. Cohesive methods help make
your code more flexible in two ways:

If your method is focused on a single conceptual task, you can more
easily choose a method name that clearly indicates what your method
does. For example, a method named int convertOzToMl(int
ounces), which converts ounces to milliliters, is easier to
comprehend at first glance than a method named int convert(int
fromUnits, int toUnits, int fromAmount). At first glance, you
could guess that the convert() method may be able
to convert ounces to milliliters, but even if that were so, you would
need to do more digging to find out what fromUnits value
represents ounces and what toUnits value represents
milliliters. The convertOzToMl() method is more cohesive
than the convert() method because it does just one thing,
and its name indicates what that thing is.

Cohesive methods help make your code more flexible because changes
are easier to make when you can draw upon a set of methods, each of
which performs a single conceptual task. Cohesive methods increase the
odds that when you need to change a class's behavior at some point in
the future, you'll be able to do so by writing code that invokes
existing methods in a new way. In addition, changes to an existing
behavior are more isolated if that behavior is encased in its own
method. If several behaviors are intermixed in a single (non-cohesive)
method, changes to one behavior may inadvertently add bugs to other
behaviors that share that same method.
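
To make the naming contrast from the first point concrete, here is a
minimal sketch of the two signatures side by side. The UnitConverter
class name, the unit codes, and the conversion factor of roughly 29.57
milliliters per U.S. fluid ounce are assumptions added for illustration;
only the two method signatures come from the discussion above.

```java
// Hypothetical helper class; only the two method signatures come from the article.
class UnitConverter {
    // Assumed unit codes for the general-purpose method.
    static final int OUNCES = 0;
    static final int MILLILITERS = 1;

    // Approximate milliliters per U.S. fluid ounce (an assumption of this sketch).
    private static final double ML_PER_OZ = 29.5735;

    // Cohesive: one task, and the name says what that task is.
    static int convertOzToMl(int ounces) {
        return (int) Math.round(ounces * ML_PER_OZ);
    }

    // Less cohesive: the caller must find out which codes mean which units.
    static int convert(int fromUnits, int toUnits, int fromAmount) {
        if (fromUnits == OUNCES && toUnits == MILLILITERS) {
            return (int) Math.round(fromAmount * ML_PER_OZ);
        }
        if (fromUnits == MILLILITERS && toUnits == OUNCES) {
            return (int) Math.round(fromAmount / ML_PER_OZ);
        }
        if (fromUnits == toUnits) {
            return fromAmount;
        }
        throw new IllegalArgumentException("Unsupported unit code");
    }
}
```

A call such as UnitConverter.convertOzToMl(8) can be read without
consulting any documentation, whereas UnitConverter.convert(0, 1, 8)
cannot.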

Low cohesion
As an example of a method that is not very functionally cohesive,
consider this alternate way of designing a class that models coffee
cups:
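
A minimal sketch of such a class might look like the following. The
names ADD and SPILL appear in the discussion below; the RELEASE_SIP
name, the constant values, the rule that a sip never removes more than
the cup holds, and the exception thrown for an unknown action are all
assumptions made for illustration.

```java
class CoffeeCup {
    // ADD and SPILL are named in the article; RELEASE_SIP and all values are assumed.
    public static final int ADD = 0;
    public static final int RELEASE_SIP = 1;
    public static final int SPILL = 2;

    private int innerCoffee;

    // One method, three conceptually different jobs selected by 'action'.
    public int modify(int action, int amount) {
        int returnValue = 0;
        switch (action) {
        case ADD:
            innerCoffee += amount;
            // returnValue stays 0, which means nothing to the caller
            break;
        case RELEASE_SIP:
            // remove at most what the cup currently holds
            int sip = Math.min(amount, innerCoffee);
            innerCoffee -= sip;
            returnValue = sip;
            break;
        case SPILL:
            // the amount parameter is ignored here
            returnValue = innerCoffee;
            innerCoffee = 0;
            break;
        default:
            // assumed behavior for an invalid command
            throw new IllegalArgumentException("Unknown action: " + action);
        }
        return returnValue;
    }
}
```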

CoffeeCup's modify() method is not very
cohesive because it includes code to do tasks that, conceptually, are
quite different. Yes, it is a useful method. It can add, sip, and
spill, but it can also perplex, befuddle, and confuse. This method is
difficult to understand partly because its name, modify(),
isn't very specific. If you tried to make the name more specific,
however, you would end up with something like
addOrSipOrSpill(), which isn't much clearer.

Another reason modify() is hard to understand is that some
of the data passed to it or returned from it is used only in certain
cases. For example, if the action parameter is equal to
CoffeeCup.ADD, the value returned by the method is
meaningless. If action equals
CoffeeCup.SPILL, the amount input parameter
is not used by the method. If you look only at the method's signature
and return type, it is not obvious how to use the method.

Figure 1: Passing control down to modify().

See Figure 1 for a graphical depiction of this kind of method. In this
figure, the circle for the action parameter is solid
black. The blackened circle indicates that the parameter contains data
that is used for control. You can differentiate data that is used for
control from data that isn't by looking at how a method uses each piece
of input data. Methods process input data and generate output data.
When a method uses a piece of input data not for processing, but for
deciding how to process, that input data is used for control.

To maximize cohesion, you should avoid passing control down into
methods. Instead, try to divide the method's functionality among
multiple methods that don't require passing down control. In the
process, you'll likely end up with methods that have a higher degree of
cohesion.

By the way, it is fine to pass data used for control back up from a
method. (Throwing an exception is a good example of passing control
up.) In general, up is the direction control should go: Data used for
control should be passed from a method back to the method that invoked
it.
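
As a hedged illustration of control flowing upward, the sketch below has
add() reject a negative amount by throwing an exception, which leaves
the invoking method to decide how to react. The negative-amount rule and
the CoffeeCupClient class with its tryToAdd() method are assumptions
made for this example only.

```java
class CoffeeCup {
    private int innerCoffee;

    // Assumed rule for this sketch: a negative amount is an error.
    public void add(int amount) {
        if (amount < 0) {
            // Control goes up: the caller decides what happens next.
            throw new IllegalArgumentException("amount must not be negative");
        }
        innerCoffee += amount;
    }
}

// Hypothetical caller that acts on the control information passed up to it.
class CoffeeCupClient {
    static boolean tryToAdd(CoffeeCup cup, int amount) {
        try {
            cup.add(amount);
            return true;
        } catch (IllegalArgumentException e) {
            // The invoking method, not add(), chooses how to handle the problem.
            return false;
        }
    }
}
```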

Medium cohesion
To increase the method cohesion of the previous CoffeeCup
class, you could divide the functionality performed by
modify() into two methods, add() and
remove():
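
A sketch of this intermediate design, consistent with the description
that follows, could look like this. The method bodies, including the
rule that a sip never removes more than the cup holds, are assumptions
for illustration.

```java
class CoffeeCup {
    private int innerCoffee;

    // No control data needed: this method always does the same thing.
    public void add(int amount) {
        innerCoffee += amount;
    }

    // Still takes control data: 'all' decides which job gets done.
    public int remove(boolean all, int amount) {
        int removed;
        if (all) {
            // spill: the amount parameter is ignored
            removed = innerCoffee;
            innerCoffee = 0;
        } else {
            // sip: remove at most what the cup currently holds
            removed = Math.min(amount, innerCoffee);
            innerCoffee -= removed;
        }
        return removed;
    }
}
```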

This is a better design, but it's not quite there yet. Although the
add() method does not require you to pass down control,
the remove() method still does. The boolean parameter
all tells the remove() method whether to remove all of
the coffee (a spill) or only some of it (a sip). In
the case of a sip, the amount parameter indicates the
amount of coffee to remove (the size of the sip). The graphical
depiction of the remove() method, shown in Figure 2, shows
a blackened circle heading down for the all parameter just
as modify() had a blackened circle heading down for the
action parameter. It also includes a parameter,
amount, that is not always used, just as the amount
parameter of modify() is not always used. For remove(), if
all is false, amount indicates
the amount of coffee to remove. If all is
true, amount is ignored.

High cohesion
A better design for the CoffeeCup class is to divide remove()
into two more methods, neither of which accepts control data as input or
has parameters that are used only part of the time. Here
remove() has been divided into
releaseOneSip() and spillEntireContents():
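
The following sketch shows one way this design could be written. The
three method names come from the text; the sipSize parameter name and
the method bodies are assumptions for illustration.

```java
class CoffeeCup {
    private int innerCoffee;

    // Adds the given amount of coffee to the cup.
    public void add(int amount) {
        innerCoffee += amount;
    }

    // Removes one sip and returns the amount actually removed,
    // which is never more than the cup currently holds.
    public int releaseOneSip(int sipSize) {
        int sip = Math.min(sipSize, innerCoffee);
        innerCoffee -= sip;
        return sip;
    }

    // Empties the cup and returns the amount that was spilled.
    public int spillEntireContents() {
        int all = innerCoffee;
        innerCoffee = 0;
        return all;
    }
}
```

Every parameter is now used on every call, and every return value is
meaningful.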

As you can see, the process of removing input data used for control
yields more methods, each with a more focused functionality. Instead of
indicating your wishes to one comprehensive method by passing down a
command as a parameter, you call a different method. For example,
instead of passing CoffeeCup.SPILL to modify(), you call
spillEntireContents() directly.

As described earlier, this approach to method design yields code that
is easier to understand because each method is responsible for
performing one conceptual function, and the method's name can describe
that one function. Such code is also easier to understand because the
data passed in and out are always used and valid. In this example,
add(int), int releaseOneSip(int), and
spillEntireContents() are easier to understand at first
glance than the int modify(int, int) from the
low cohesion example.

In addition, this approach to method design yields code that is more
flexible, because it is easier to change one functionality without
affecting the others. For example, if you wanted to make some
adjustments to the spilling behavior of the coffee cup class with
modify(), you would have to edit the body of
modify(). Because the code for spilling is intermingled in
modify() with the code for sipping and adding, you might
inadvertently introduce a bug in the adding behavior when you enhance
the spilling behavior. In the CoffeeCup class with
separate methods for adding, spilling, and sipping, your chances are
better that you can enhance the spilling behavior without disturbing
the adding and sipping behaviors.

Reducing assumptions
Functionally cohesive methods also increase code flexibility because
they make fewer assumptions about the order in which particular actions
are performed. Here is an example of a method that is not very
functionally cohesive because it assumes too much:
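
A sketch of such a method, matching the description below, might look
like this. The field names come from the text; the ratio of one part
cream and one part sugar to 30 parts coffee, the constant names, and the
accessor methods are assumptions for illustration.

```java
class CoffeeCup {
    private int innerCoffee;
    private int innerCream;
    private int innerSugar;

    // Assumed ratios: 1 part cream and 1 part sugar per 30 parts coffee.
    private static final int CREAM_FRACTION = 30;
    private static final int SUGAR_FRACTION = 30;

    // Adds coffee, then assumes the caller also wants cream and sugar.
    public void add(int amountOfCoffee) {
        innerCoffee += amountOfCoffee;
        innerCream += amountOfCoffee / CREAM_FRACTION;
        innerSugar += amountOfCoffee / SUGAR_FRACTION;
    }

    // Hypothetical accessors, included only so the effect can be observed.
    public int getCoffee() { return innerCoffee; }
    public int getCream() { return innerCream; }
    public int getSugar() { return innerSugar; }
}
```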

This CoffeeCup object keeps track not only of the amount
of coffee it contains (innerCoffee), but also of the
amount of cream (innerCream) and sugar
(innerSugar). As you would expect, the add()
method accepts an amount of coffee to add, then increments
innerCoffee by that amount; however, add()
doesn't stop there. It assumes that anyone wishing to add coffee to a
cup also would want to add some cream and sugar, in fixed amounts
relative to the amount of coffee added. So add() goes
ahead and adds the cream and sugar as well.

The design of this method reduces code flexibility because later, if a
programmer wanted to add coffee with cream, but no sugar, this method
would be of no use. A more flexible design would be:
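
One possible shape for that design is sketched here. The method names
addCoffee(), addCream(), and addSugar() are the ones used later in the
text; the accessor methods are assumptions added so the result can be
inspected.

```java
class CoffeeCup {
    private int innerCoffee;
    private int innerCream;
    private int innerSugar;

    // Each method changes exactly one ingredient.
    public void addCoffee(int amount) {
        innerCoffee += amount;
    }

    public void addCream(int amount) {
        innerCream += amount;
    }

    public void addSugar(int amount) {
        innerSugar += amount;
    }

    // Hypothetical accessors, included only so the effect can be observed.
    public int getCoffee() { return innerCoffee; }
    public int getCream() { return innerCream; }
    public int getSugar() { return innerSugar; }
}
```

With this version, coffee with cream but no sugar is simply a call to
addCoffee() followed by a call to addCream().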

These methods are more functionally cohesive because they each do one
thing. This design is more flexible because the programmer can call any
of the methods at any time and in any order.

Cohesion is high-level
Although all of the examples of functionally cohesive methods so far
have accepted, processed, and returned a very small amount of data,
this is not a characteristic shared by all functionally cohesive
methods. Cohesion means doing "one thing" on a high level, as in
performing one conceptual activity -- not at a low level, as in
processing one piece of data. For example, perhaps your virtual
café has a regular customer, Joe, who always wants his coffee
prepared with 30 parts coffee, 1 part cream, and 1 part sugar. If so,
you could create a method such as:
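
A sketch of such a method follows. The VirtualCafe class and the
prepareCupForJoe() name come from the text below; the method's signature,
the constant names, and the CoffeeCup accessors are assumptions for
illustration.

```java
class CoffeeCup {
    private int innerCoffee;
    private int innerCream;
    private int innerSugar;

    public void addCoffee(int amount) { innerCoffee += amount; }
    public void addCream(int amount) { innerCream += amount; }
    public void addSugar(int amount) { innerSugar += amount; }

    // Hypothetical accessors, included only so the effect can be observed.
    public int getCoffee() { return innerCoffee; }
    public int getCream() { return innerCream; }
    public int getSugar() { return innerSugar; }
}

class VirtualCafe {
    // Joe's recipe: 30 parts coffee, 1 part cream, 1 part sugar.
    private static final int JOES_CREAM_FRACTION = 30;
    private static final int JOES_SUGAR_FRACTION = 30;

    // One conceptual task: prepare a cup the way Joe likes it.
    // The signature is assumed; 'amount' is the amount of coffee.
    public static void prepareCupForJoe(CoffeeCup cup, int amount) {
        cup.addCoffee(amount);
        cup.addCream(amount / JOES_CREAM_FRACTION);
        cup.addSugar(amount / JOES_SUGAR_FRACTION);
    }
}
```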

The prepareCupForJoe() method is functionally cohesive
even though, on a low level, it performs exactly the same function as
the add() method, shown earlier, that wasn't functionally
cohesive. The reason is that, conceptually, this method is preparing a
cup of coffee for Joe, just the way he always likes it. The
add() method, on the other hand, was adding coffee to a
cup, and incidentally, also adding cream and sugar. Although
add(), a member of class CoffeeCup, did not
allow coffee to be added to a cup without also adding cream and sugar,
prepareCupForJoe() involves no such restriction. Because
prepareCupForJoe(), a member of class
VirtualCafe, invokes addCoffee(),
addCream(), and addSugar() from class
CoffeeCup, it in no way prevents other methods from
filling a cup with only coffee and cream, or any other combination in
any proportion.

Method explosion
Although these guidelines can help you create code that is more
flexible, following the guidelines too strictly can lead your code down
the wrong alley. Factoring methods, such as modify(), into
more cohesive methods, such as add(),
releaseOneSip(), and spillEntireContents(),
leads to more methods. At some point your class will get difficult to
use simply because it has too many methods.

As with all guidelines, you must use this one wisely and selectively.
In general, you should maximize the cohesion of your methods. In
general, you should avoid passing control down into a method. But there
are some circumstances in which you should violate both of
these principles.

For example, you may have a class that must respond differently to each
of 26 different keypresses, one for each letter of the alphabet.
Passing down the key to a method named void handleKeypress(char
key) could reasonably be called passing down control, especially
if the first thing you do is a switch(key), then have a
case statement for each letter of the alphabet. But in this case, a
single handleKeypress(char key) likely is easier to use
and just as easy to understand as 26 separate methods
handleAKeypress(), handleBKeypress(),
handleCKeypress(), and so on.
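
The single-method version might be sketched as follows. The
handleKeypress(char key) signature comes from the text; the
KeypressHandler class, the lastAction field, and the placeholder
behavior in each case are assumptions, and only three of the 26 cases
are written out.

```java
// Hypothetical class; only the handleKeypress(char key) signature is from the article.
class KeypressHandler {
    // Placeholder state so the effect of a keypress can be observed.
    private String lastAction = "none";

    public void handleKeypress(char key) {
        switch (key) {
        case 'a':
            lastAction = "handled a";
            break;
        case 'b':
            lastAction = "handled b";
            break;
        case 'c':
            lastAction = "handled c";
            break;
        // cases 'd' through 'z' would follow the same pattern
        default:
            lastAction = "ignored";
            break;
        }
    }

    public String getLastAction() {
        return lastAction;
    }
}
```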

Thus, you need to strike a balance between maximizing method cohesion
and keeping the number of methods in a class to a reasonable level.
There's no hard and fast rule to help you do this -- you just have to
use your own judgment. After all, that's why they pay you the big
bucks.

Cohesion in coding and writing
When you write software in a commercial environment, you aren't just
writing down instructions for a machine, you are communicating your
intentions to other human beings -- fellow programmers who may someday
need to read your code to understand how to use or modify it. In a
sense, when you code, you are writing, and many of the principles of
good writing can be applied to good coding.

For example, a well-named, cohesive method in a Java program is
analogous to a well-written paragraph in expository writing. A
paragraph should have a main idea, usually stated in a topic sentence.
Just as the topic sentence indicates to a prose reader the main idea of
a paragraph, a descriptive method name tells a code reader what service
that method performs. In a paragraph, every sentence should directly
support the main idea. A paragraph, like a cohesive method, should be
about one thing (indicated by the topic sentence), and all the
sentences in the paragraph should be focused on that one thing.

A paragraph that doesn't stick to its topic, like a method that isn't
cohesive, is harder to understand than a paragraph that focuses
exclusively on its topic. Coupling, the topic of my previous article,
doesn't really have an analog in the writing domain, but as I said
previously in this article, a cohesive method maps to a well-written
paragraph. Another metaphor for cohesion is glue. There are many kinds
of glue, some of which emanate a stronger smell than others, which may
give you a headache in certain cases. One of my favorite glues was the
kind of paste we used in kindergarten. The lid had a brush attached to
it, and we would -- oops, I just spilled some mocha on my printout --
cut out different colors of construction paper and glue them together.
Those were the days...

See what I mean? What do coupling, glue, headaches, kindergarten,
mochas, and construction paper have to do with the difficulty in
understanding a non-cohesive paragraph? Not much. In fact, these
sentences make up an example of a paragraph that doesn't stick to its
topic.

Unfortunately, I have encountered many methods, functions, and
subroutines over the years that were as much an amalgam of unrelated
parts as that paragraph. And while an unfocused paragraph may
be somewhat amusing to read, the unfocused functions I've encountered
have usually brought me more anguish than amusement.

So please try to keep in mind as you program in Java that you aren't
just giving instructions to a Java virtual machine (JVM), you are
communicating through your code to your programming peers. Making your
code easier to understand and change will help you earn the respect and
admiration of your colleagues, but it can also help you in a more
direct way. Don't forget that a few years or months (or minutes) down
the road, the person called upon to maintain your code just might be
you.

Conclusion
A set of well-named, functionally cohesive methods serves as an outline
for someone trying to understand a class. It gives a good overview of
what a given object does and how it should be used. You should try to
declare one method for each conceptual activity your object can
perform, and give each method a name that describes the activity as
specifically as possible.

Cohesion is not a precise measurement. It is a subjective judgment.
Likewise, the process of deciding whether a particular piece of data is
used for control or whether a particular class has too many methods is
subjective. The point of these guidelines is not to define a precise
measure for good method design, but to suggest a mental approach to
take when designing methods. This approach can help you create code
that is more flexible and more easily understood. As with all
guidelines, however, they are not laws. When you design methods,
ultimately you should do whatever you think will best communicate to
your programming peers what the method does.

The gist of this article can be summarized in these guidelines:

The maximize-cohesion mantra
Always strive to maximize the cohesion of methods: Focus each method on
one conceptual task, and give it a name that clearly indicates the
nature of that task.

The watch-what's-going-down guideline
Avoid passing data used for control (for deciding how to
perform the method's job) down into methods.

The method-explosion-aversion principle
Balance maximizing method cohesion (which can increase the number of
methods in a class) with keeping the number of methods in a class to a
manageable level.

The golden rule
When designing and coding, do unto other programmers (who will be
maintaining your code) as you would have them do unto you (if you were
to maintain their code).

Next month
In next month's Design Techniques I'll continue the
mini-series of articles that focus on designing classes and objects.
Next month's article, the fourth of this mini-series, will discuss
designing objects for proper cleanup at the ends of their lifetimes.

A request for reader participation
Software design is subjective. Your idea of a well-designed program may
be your colleague's maintenance nightmare. In light of this fact, I am
trying to make this column as interactive as possible.

I encourage your comments, criticisms, suggestions, flames -- all kinds
of feedback -- about the material presented in this column. If you
disagree with something, or have something to add, please let me know.

About the author
Bill Venners has been writing software professionally for 12 years.
Based in Silicon Valley, he provides software consulting and training
services under the name Artima Software Company. Over the years he
has developed software for the
consumer electronics, education, semiconductor, and life insurance
industries. He has programmed in many languages on many platforms:
assembly language on various microprocessors, C on Unix, C++ on
Windows, and Java on the Web. He is the author of the book Inside the Java
Virtual Machine, published by McGraw-Hill.
Reach Bill at bv@artima.com.

This article was first published under the name "What's a Method to
Do?" in JavaWorld, a division of Web Publishing, Inc., April 1998.