No, Seriously, It's Naming

I just read David Bryant Copeland’s post
It’s not Naming That’s Hard—It’s Types which he wrote in
response to Katrina Owen’s
What’s in a Name? Anti-Patterns to a Hard Problem and I feel
compelled to say a few words. Katrina’s post provides some
suggestions around the perennial developer challenge of naming things
in code. Along the way she asserts that including the type of a
variable in its name is typically an antipattern, and “type
information is just not that compelling”–something David took great
exception to.

David argues that the actual problem is we don’t have enough types:
“types are a better way to solve the problems Katrina identifies in
her post.”

He then proceeds to turn Katrina’s perfectly reasonable bit of code…

But after unflipping my table, taking a long walk, and having a big
mug of coffee, I think I’m ready to speak again. The truth is I can
understand why David went down this road. In fact, it’s perfectly
reasonable Kingdom of Noun thinking, and possibly something I would
have done a few years back. But I’ve come to believe it’s backwards,
so I’ll make my case:

Performance

Let’s just get this one out of the way because it’s the easiest but
not the one I want to focus on: David’s version wraps every single
character in a Ruby object–this is spectacularly wasteful of
memory.
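To make the memory point concrete, here’s a hypothetical sketch (the `WrappedChar` class is illustrative, not David’s actual code) of what per-character wrapping costs:

```ruby
# Illustrative wrapper (not David's actual code): one extra heap object
# per character, on top of the String it holds.
class WrappedChar
  attr_reader :char

  def initialize(char)
    @char = char
  end
end

word = "pneumonoultramicroscopicsilicovolcanoconiosis"

plain   = word.chars                                 # one short String per character
wrapped = word.chars.map { |c| WrappedChar.new(c) }  # those Strings PLUS a wrapper each

# Each wrapper carries its own object header and instance-variable table,
# at least doubling the allocations for every word processed.
puts "#{plain.length} strings vs #{plain.length + wrapped.length} objects"
```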

Someone’s going to point out that we shouldn’t be using Ruby if we
care about performance. This argument is specious: we shouldn’t use
Ruby if performance is what we care about most. I certainly tend to
spend machine cycles to save human cycles, but don’t assume you’ll
never find yourself wishing for better performance in a part of an
application where Ruby was otherwise a good choice.

Naming

The naming situation didn’t even improve–now we just have a
proliferation of things to name! Really! Look again!

Complexity

There’s one very obvious difference between the two solutions:
length. Length is not, itself, a deal-breaker, but
all things being equal, it is a disadvantage. Every line of code is
an opportunity for bugs.

If there’s a bug in both of these implementations, which would you
rather work on? The one with two functions[1], testable in
isolation, and completely referentially transparent? Or the one with
two classes, three times as many lines, and state?
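For reference, the two-function shape being described might look like this (a sketch of the style, not Katrina’s verbatim code):

```ruby
# Canonical form of a word: case-folded, letters sorted. A pure function
# of its input -- referentially transparent, trivially testable.
def normalize(word)
  word.downcase.chars.sort
end

# Two words are anagrams when they share a canonical form but aren't
# simply the same word in a different case.
def anagram?(subject, candidate)
  subject.downcase != candidate.downcase &&
    normalize(subject) == normalize(candidate)
end

puts anagram?("listen", "silent")  # true
puts anagram?("listen", "Listen")  # false: same word
```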

But more importantly, ask yourself: why is there more code?

Look at the methods David had to (re-)implement: to_s, ==, and
<=>. What do the contents of these methods have to do with
finding anagrams? Nothing. This is what’s called “incidental
complexity”; this code exists solely to solve challenges introduced by
the shape of the solution–it doesn’t actually relate to the details
of the problem we’re trying to solve (the “intrinsic complexity”).

Looking at the contents of those extraneous methods, all they do is
allow access to underlying capabilities of the (wrapped) String
class. Katrina’s version didn’t require this rigmarole because she
didn’t wrap the String class.
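The shape of that incidental complexity looks roughly like this (an assumed reconstruction, not David’s exact code):

```ruby
# Assumed reconstruction of the wrapper shape under discussion. Note that
# to_s, ==, and <=> say nothing about anagrams; they exist only to tunnel
# back through to the String we wrapped.
class Word
  include Comparable

  def initialize(string)
    @string = string
  end

  def to_s
    @string
  end

  def ==(other)
    to_s == other.to_s
  end

  def <=>(other)
    to_s <=> other.to_s
  end
end

# Without the delegation above, even this much would fail:
puts Word.new("cat") == Word.new("cat")  # true -- but only because we wrote ==
```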

Let’s put it this way: let’s say we have a perfectly good humanoid
robot, and we want to teach that robot to serve us gin martinis.
Katrina’s solution says: “OK, robot, follow these commands to make the
martinis happen”. Simple! David’s solution says “robot, I’m gonna
throw a bag over your entire body, then cut two arm holes (so you can
reach the booze, of course!), then eye holes, and then you can make
martinis!”.

And if this whole one-step-forward-two-steps-back thing weren’t
strange enough, it’s actually even stranger–not just any
bartending-capable bag will do–it has to be exactly the one labeled
Word[2].

If we think about this in the abstract, David’s version is saying to
us: “I could do my job perfectly fine as a function that manipulates
what you already have, but instead I want you to transform yourself
into a new format I invented before I’ll let you use my functions.” Why do
we want this?

Think about what we’re doing here–we’re creating two new classes
(unnecessary) in order to annotate that certain values can participate
in certain interactions. If we really feel it’s important to call
these out, programming provides a mechanism for this–it’s called an
interface. An interface is a way to tell a type checker or a human
that one thing works with another. This idea exists because a value
cannot reasonably change its identity (what wrapping it in a class
does) just to participate in an interaction with a piece of code.
Expecting it to assumes that the receiving bit of code is the most
important thing in the program and thus the rest of the world should
conform to it. This kind of thinking leads to a proliferation of
odd, single-use wrapper classes and incompatible objects.

Now, Ruby doesn’t have reified interfaces. It’s built on the idea of
duck typing, which says that an interface should be descriptive
(“what can you respond to?”) rather than prescriptive (“what are
you supposed to respond to?”). We can argue about whether or not
that is a good idea, but that is quite definitely the Ruby way, and
one can at least say it does lead to some useful and fun tricks (at
the expense of safety and informative annotation).

But both duck typing and reified interfaces capture the central
point–to the extent possible[3], we should avoid demanding that
arguments be certain things, we should only demand that they do
certain things.
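In Ruby terms, that means checking what an argument responds to, not what class it is (the function below is illustrative):

```ruby
# Demand that the argument *does* something (yields its characters),
# not that it *is* something (a String, a Word, ...).
def letter_counts(word)
  unless word.respond_to?(:chars)
    raise ArgumentError, "expected something that responds to #chars"
  end

  word.chars.tally  # Enumerable#tally requires Ruby >= 2.7
end

puts letter_counts("banana")  # {"b"=>1, "a"=>3, "n"=>2}
```

Anything with a sensible `chars`–a `String`, or some future type we haven’t imagined–can participate, no bag required.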

I can understand how we got here. There’s a temptation to scope
objects down to the bits we care about; to make them clean, and
their possible operations easily enumerable. It’s a prophylactic. A
form of isolation. But this presumes that we know everything anyone
will ever want to do with the objects. And that means we need to
know the future! When we really think about writing good software,
software that can survive the long-haul of requirements changes, of
developer turnover–are we really going to look back and think “thank
Minsky I blocked access to the length method on String, that
really could have cooked my goose!”, or are we just being overly picky
about aesthetics?

One might suggest that we’re abstracting away the details of the
String class, and abstraction enables flexibility and code reuse,
right? Well, let’s look at the details and decide–starting with a
question:

What’s true about a String that’s not true about a Word? Answer:
we all know WTF a String is!

But more importantly, so does every debugger, REPL, and test
framework–they all know how to work with these things. I can
serialize them, transmit them, clone them, visualize them, compare
them. I know they’re value objects. I know their concurrency
semantics. I know their performance implications.

In David’s post he says:

Strings (and Hashes) are great for exploring your domain, but once
you understand your domain, data types will make your code easier to
understand and easier to change.

I vehemently disagree. Hold fast to pure data, and only yield ground
under exceptional circumstances. In your career you’ll be burned
many, many more times by the opacity and statefulness of objects than
you will reap the rewards of transparently reworking objects’ innards.

When you’re trying to recreate a complex application state to
understand a bug, you’ll be much happier if your data is composed of
core data types rather than a graph (possibly with cycles!) of dynamic
objects (possibly opaque!) which may be dynamically generating
branches as you traverse it.

When you’re trying to test a piece of code in isolation, you’ll want
to feed it pure data, not spend hours trying to figure out which
series of constructors can manufacture the appropriate tree. In fact,
if the piece of code you want to test requires running a constructor
function, you literally can’t test it in isolation.[4]
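For example (a hypothetical function, but the testing point is general): with pure data, the fixture is a literal you can type inline.

```ruby
# Group words that are anagrams of each other (illustrative function).
def anagram_groups(words)
  words.group_by { |w| w.downcase.chars.sort }.values
end

# The test input is plain data -- no constructors to satisfy, no object
# graph to assemble first.
groups = anagram_groups(["listen", "silent", "google"])
puts groups.inspect  # [["listen", "silent"], ["google"]]
```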

With pure data you can dump a readable version from a live production
server, take it home with you, and have a perfect snapshot of a real
bug. With a graph of objects? God help you.

So, back to the original question–is wrapping values in objects like
this a composable abstraction?

No! It doesn’t compose at all. Let me say that again: insisting
upon object identities is antithetical to the idea of composable
abstraction. And this is why the promise of OO as a silver bullet did
not come true–we were sold the idea that objectifying something would
make it more reusable, but what we ended up with is something we can’t
use anywhere but a single location!

So what went wrong? Well, as Rich Hickey points out in
Simple Made Easy (49m22s), abstraction that just hides things
is not important. It’s a kind of faux abstraction. Real
abstraction is about not needing to know things. And this code does
the exact opposite of that: instead of not caring specifically what it
operates on, it chooses to operate on a single new thing. This is
the opposite of abstraction.

Now, lest you think I’m coming down on David too hard, I want to
mention something–David is a smart guy. You know how I know? He wrote
something really good about exactly this problem, and I recommend
you go over to his blog and read it. It’s called
Dishonest Abstractions are Not Abstractions. He says:

In my book, I encourage the reader to use JavaScript and learn SQL,
because the tools given to you by Rails aren’t abstractions—they are
extra things to learn that provide at best a marginal increase in
productivity, and that productivity only applies during the least
time-consuming part of software development: typing in source code.

In your head, substitute “functions” for “JavaScript and learn SQL”,
and “wrapper classes” for “Rails”. He continues:

These tools don’t meet any higher-order need a developer has. They
provide the ability to execute code only and when compared to the
technologies they replace, they appeal more to aesthetics than the
ability to better deliver quality software.

Please don’t think I’m being facetious–I think he’s making an
important point, and I think that point also happens to apply nicely
to the problem at hand.

Well, that’s basically my rant.

I will yield this: one thing we lost along the way is the knowledge
about what types we’re expected to use with code. Try a comment.
Or even a pre-condition / guard clause. Hold fast to true
simplicity–it’s the best friend a developer has.
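A guard clause records the expectation right where it matters, without wrapping anything (a sketch; the function names are illustrative):

```ruby
def anagram?(a, b)
  a.downcase != b.downcase && a.downcase.chars.sort == b.downcase.chars.sort
end

# The guard documents and enforces the expected type at the boundary,
# while the value itself stays a plain String.
def find_anagrams(subject, candidates)
  raise ArgumentError, "subject must be a String" unless subject.is_a?(String)

  candidates.select { |c| anagram?(subject, c) }
end

puts find_anagrams("listen", ["enlist", "google", "inlets"]).inspect
# ["enlist", "inlets"]
```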

I’ll end with this slide borrowed from Simple Made Easy[5]; I
hope you’ll see how well it applies to this conversation:

Happy hacking. 👋

Update:

In fairness, I figured I’d share what my solution to this “problem”
would look like so I’m equally subjected to scrutiny:
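A sketch in that spirit (illustrative; not necessarily the original snippet): plain strings in, plain strings out, no classes.

```ruby
# Canonical form of a word, as a pure function over a plain String.
def normalize(word)
  word.downcase.chars.sort
end

# Select the candidates that are anagrams of the subject. All values
# stay ordinary Strings throughout.
def anagrams_for(subject, candidates)
  candidates.select do |candidate|
    candidate.downcase != subject.downcase &&
      normalize(candidate) == normalize(subject)
  end
end

puts anagrams_for("listen", ["enlists", "google", "inlets", "banana"]).inspect
# ["inlets"]
```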

[1] In general, don’t use a class when a function will do. Classes are namespaces + functions + mutable state. Don’t give mutable state places to live.

[2] OK, there’s probably a special place in hell for people who mix metaphors, but you likely know what I mean.

[3] The reason for this hedge is that at the very bottom, things may change a touch. If our built-in types aren’t objects, we’ll need to worry about their identities. But that’s fine for value objects.

[4] This is another thing Rich says really well in Simple Made Easy (~56m): “information is simple. The only thing you can possibly do with data is ruin it. Don’t do it. … If you leave data alone you can build things once that manipulate data and use them all over the place.”

[5] If you haven’t watched it, go now! It’s an important talk, and one of the reasons I write Clojure.