Thursday, January 15, 2009

I was at the No Fluff Just Stuff conference this year (5th year in a row for me), listening to Neal Ford talk about development concepts. He made a casual remark that caught and held my attention long after he had moved on to other topics. He said:

Never use primitive types in your domain model

As you veteran developers know, the core of many well-built applications is a clear, concise domain model. This forms the basis for key abstractions and is the framework around which you typically hang your business logic. Most of my experience over the years has been with developing large, complex desktop applications where a good domain model is key. I assume that for medium-to-large enterprise web applications, a good domain model is also highly valued.

So what exactly does the quote above mean? I'll give you some examples. The first is from one of our early domain models where we represented network topology elements (computers, routers, etc). One of the data members of our network representation was bandwidth, as in set-the-bandwidth-of-this-network-to-X. We assumed that the number was megabits per second. Of course, we wrote some independent utility scaling routines to go to kilobits per second and gigabits per second.

Do you see the inferior design? Of course, we should never have used a primitive type (like double) to represent bandwidth. What we should have done was create a Bandwidth class and pass it to the "set" and "get" routines. This class would have had the concept of units (possibly a class like BitRate) and would have encapsulated all the conversion routines in one nice package. Maybe you would find lines of code like

double capacity = bw.inMegabitsPerSec();

And, importantly, a bandwidth would be constructed like this

Bandwidth bw = new Bandwidth (3.5, MegabitsPerSec);

That is so much easier to read and is far clearer than

double bw = 3.5;
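
To make the idea concrete, here is a minimal sketch of what such a Bandwidth class might look like. It is not the class we actually wrote; the BitRate enum, its constant names (for example MEGABITS_PER_SEC rather than the MegabitsPerSec used above), and the choice to store the value internally in bits per second are all assumptions made for illustration.

```java
// Hypothetical unit type; each constant records how many bits per second it represents.
enum BitRate {
    KILOBITS_PER_SEC(1e3),
    MEGABITS_PER_SEC(1e6),
    GIGABITS_PER_SEC(1e9);

    private final double bitsPerSec;

    BitRate(double bitsPerSec) {
        this.bitsPerSec = bitsPerSec;
    }

    double toBitsPerSec(double value) {
        return value * bitsPerSec;
    }

    double fromBitsPerSec(double bits) {
        return bits / bitsPerSec;
    }
}

// Immutable value object: the unit is stated once, at construction,
// and every conversion lives here instead of in scattered utility routines.
final class Bandwidth {
    private final double bitsPerSec; // assumed internal representation

    Bandwidth(double value, BitRate unit) {
        if (Double.isNaN(value) || value < 0) {
            throw new IllegalArgumentException("bandwidth must be a non-negative number");
        }
        this.bitsPerSec = unit.toBitsPerSec(value);
    }

    double inKilobitsPerSec() {
        return BitRate.KILOBITS_PER_SEC.fromBitsPerSec(bitsPerSec);
    }

    double inMegabitsPerSec() {
        return BitRate.MEGABITS_PER_SEC.fromBitsPerSec(bitsPerSec);
    }

    double inGigabitsPerSec() {
        return BitRate.GIGABITS_PER_SEC.fromBitsPerSec(bitsPerSec);
    }
}
```

With something like this in place, a caller writes new Bandwidth(3.5, BitRate.MEGABITS_PER_SEC) and asks for inKilobitsPerSec() when it needs a different scale. Nobody has to remember which unit a bare double was supposed to carry.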

A second example comes from some code that I traipsed across not a month ago. We have a switchboard-like, sockets-based server that handles communications between applications within a given suite. When sending a message through this switchboard to another application, you identify the target application by name. The application names and versions that form a suite come from a configuration file. We allow aliases/monikers for the applications, and we try not to be case sensitive. Part of this is convenience, and part of it is because the applications are written in diverse technologies (C++, Perl, Java, C#).

I was looking at this code for the first time, and I saw a lot of string manipulation going on. There was trimming (of spaces), folding of cases, and substring matching, and a lot of it. In tracing through the code, I found a string that was already lower case being set to lower case again. Clearly, somewhere along the line, the programmer (or programmers) lost track of the flow of data through the application and of where messages were entering the system. The string hacking was so pervasive that it made the code harder to read.

So what was missing here? Instead of using a string primitive type, how about a class called ApplicationName? It would encapsulate all the concepts of case insensitivity and of aliases. I even found code later on that emitted information to logs and to the end user, so it could even encapsulate the concept of a "pretty name" (a standard, human-consumable name).
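
Here is a rough sketch of how such an ApplicationName class might be shaped. The constructor signature, the matches and prettyName methods, and the decision to normalize by trimming and lower-casing are my assumptions for illustration, not the code from our switchboard. Substring matching is left out to keep the example short.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical value object: trimming and case folding happen in exactly one place.
final class ApplicationName {
    private final String key;        // normalized form used for comparison
    private final String prettyName; // human-consumable form for logs and the end user
    private final Set<String> knownNames = new HashSet<>(); // normalized name plus aliases

    ApplicationName(String prettyName, String... aliases) {
        this.prettyName = prettyName.trim();
        this.key = normalize(prettyName);
        knownNames.add(key);
        for (String alias : aliases) {
            knownNames.add(normalize(alias));
        }
    }

    // The only place in the code base that trims and lower-cases a name.
    private static String normalize(String raw) {
        return raw.trim().toLowerCase(Locale.ROOT);
    }

    // True if the raw text from a message or config file refers to this application.
    boolean matches(String raw) {
        return raw != null && knownNames.contains(normalize(raw));
    }

    String prettyName() {
        return prettyName;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ApplicationName
                && key.equals(((ApplicationName) other).key);
    }

    @Override
    public int hashCode() {
        return key.hashCode();
    }

    @Override
    public String toString() {
        return prettyName;
    }
}
```

Raw strings would be converted to an ApplicationName once, at the point where a message or configuration entry enters the system. After that, the rest of the code compares ApplicationName instances and never needs to lower-case anything again.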

I know what a few of you are thinking. You are concerned about the performance cost of allocating all those instances. With modern optimizing compilers, you'll find that the cost of these small objects is rarely an issue in practice. The cost that will eat your lunch is the maintenance cost of the software over time.

So the next time you are working in your domain model, and you feel the urge to use an int, double, string, or other primitive type, think again.