Month: October 2011

Software grows organically. One line at a time, one change at a time. These changes soon add up. In an ideal world, they add up to a coherent architecture with an intention revealing design. But sometimes software just grows hairy – full of little details that obscure the underlying logic. What makes software hairy and how can we stop it?

Hairy code

Generally code starts out clean – brand new, shiny code. But each time you make a change that doesn’t quite fit the original design you add a hair – a small, subtle detail. It doesn’t detract from the overall purpose of the code, it just covers a specific detail that wasn’t thought of originally. One hair on its own is fine. But then you add another, and another, and another. Before you know it, your clean, shiny code is covered in little hairs. Eventually code becomes so hairy you can’t even see the underlying design any more.

Let’s face it, we’re all basically maintenance programmers. How many of us actually work on a genuinely greenfield project? And anyway, soon after starting a greenfield project, you’re changing what went before and you’re back into maintenance land. We spend most of our time changing existing code. If we’re not careful, we spend most of our time adding new hairs.

The simplest thing

When changing existing code, there’s a temptation to make the smallest change that could possibly work. Generally, it’s a good approach. Christ, TDD is great at keeping you focused on this. Write a test, make it pass. Write a test, make it pass. Do the simplest thing that could possibly work. But, you have to do the refactor step. “Red, green, refactor“, people. If you’re not refactoring, your code’s getting hairy. If you’re not refactoring, what you just added is a kludge. Sure, it’s a well tested, beautifully written kludge; but it’s still a kludge.

The trouble is, it’s easy to forgive yourself.

But it’s just a little if statement

It’s just one little change. In this specific case we want to do something subtly different. It may not look like it, but it’s a kludge. You’ve described the logic of the change but not the reason. You’ve described how the behaviour is different, but not why. Congratulations, you just grew a new hair.

An example

Perhaps an example would help right about now. Let’s imagine we work for an online retailer. When we fulfill an order, we take each item and attempt to ship it. For those that are out of stock, we add to a queue to ship as soon as we get new stock.

As happens with online retailers, we’re slowly taking over the universe: now we’re expanding into shipping digital items as well as physical stuff. This means that some orders will be for items that don’t need physical shipment. Each item knows whether it’s a digital product or a physical product; the rights management team have created an electronic shipment management system (email to you and me) – so all we need to do is make sure we don’t try and post digital items but email them instead. Well, the simplest thing that could possibly work is:

This is all fine and dandy, until in UAT we realise that we’re showing delivery in 3 days for digital items. That’s not right, so we get a request to show immediate delivery for digital items. There’s a method on Item that calculates estimated delivery date:

After all, it’s just one more if, right? Where’s the harm? But little by little the code is getting hairier and hairier.

The trouble is you get lots of little related hairs smeared across the code. You get a hair here, another one over there. You know they’re related – they were done as part of the same set of changes. But will someone else looking at this code in 6 months time? What if we need to make a change so users can select electronic and/or physical delivery for items that support both? Now I need to find all the places that were affected by our original change and make more changes. But, they’re not grouped together, they’ve been spread all over. Sure, I can be methodical and find them. But maybe if I’d built it better in the first place it would be easier?

A better way

This all started with a little boolean flag – that was the first smell. Then we find ourselves checking the state of the flag and switching behaviour based on it. It’s almost like there was a new domain concept here of a delivery method. Say, instead I create a DeliveryMethod interface – so each Item can have a DeliveryMethod.

Now all the logic about how different delivery methods work is local to the DeliveryMethod classes. This groups related changes together; if we later need to make a change to delivery rules we know exactly where they’ll be.

Discipline

Ultimately writing clean code is all about discipline. TDD is a great discipline – it keeps you focused on the task at hand, only adding code that is needed right now; all the while ensuring you have near complete test coverage.

However, avoiding hairy code needs yet more discipline. We need to remember to describe the intention of our change, not just the implementation. Code is primarily to be read by humans so expressing the reason the code does what it does is much more important than expressing the logic. The tests only ensure your logic is correct, you also need to make sure your code reveals it’s reasoning.

Advertisements

Share this:

Like this:

Software architecture is hard. Creating a simple, consistent, flexible environment in which we can solve the customer’s ever-changing problems is no easy task. Keeping it that way is harder still. Striking the right balance between all the competing demands takes real skill – so what does it take to create a good architecture? How much architecture is enough?

Software Architecture

First, I’m drawing a distinction between software architecture and enterprise architecture. By software architecture I mean the largest patterns and structures in the code you write – the highest level of design detail. I do not mean what is often called enterprise architecture: what messaging middleware to use, how are services clustered, what database platforms to support. Software architecture is the stuff we write that forms the building blocks of our solution.

The Over Architect

I’m sure we’ve all worked with him: the guy who could over think hello world. When faced with a customer requirement his immediate response is:

we need to build a framework

Obviously the customer’s problem is too simple for this genius. Instead, we can solve this whole class of problems. Once we’ve built the framework, we just need to plug the right values into the simple 400 line XML configuration file and hey presto! customer problem solved.

Sure, we’ve only been asked to do a one-time CSV import of some customer data. But think of the long-term, what will they ask for next? What about the next customer? We should write a generic data import framework that could take data in CSV, XML or JSON; communicating over HTTP, FTP or email. We can build rich, configurable validation logic. We can write the data to any number of database platforms using a variety of standard ORM frameworks. Man, this is awesome, we could be busy for months with this!

Whatever. You Ain’t Gonna Need It!

But sometimes, the lure of solving a problem where you’re the customer, is much more intellectually stimulating than solving the boring old customer’s problems. You know, the guy who’s paying the bills.

The Over Architect generalises from a sample size of one. Every problem is an opportunity to build a more general solution, despite having no evidence for what other cases might need to be solved. Every problem is an opportunity to bring in the latest and greatest technology – whether or not its a good fit, whether or not the company’s going to be left supporting some byzantine third party library that’s over kill for their simple use. An architect fully versed in CV++

The Under Architect

On the other hand, the Under Architect looks at every customer problem and thinks:

we could re-use what we did for feature X

Where by “re-use” he means copy & paste, change as necessary. There’s no real architecture, just patterns repeated ad infinitum. Every new requirement is an opportunity to write more, new code. Heaven forbid we go back and change any of that crufty old shit. No, we’ll just build shiny, brand new legacy code.

We’re building a web application: so we’ll need some Controllers, some Views and some Models. There we go, MVC – that counts as an architecture, right? Oh, we need a bit more. Well, we’ve got some DAOs here for interacting with the database. And the business logic? Well, the stuff that’s not wrapped up in the controllers we can put in FooManager classes. Sure, these things look like monolithic god classes – but its the best way to aggregate all the related functionality together.

Lather, rinse, repeat and before you know it you have a massive application with minimal structure. The trouble is, these patterns become self-perpetuating. It’s hard to start pulling out an architecture when all you have is a naming convention.

The Many Architects

The challenge in many software teams is everyone thinks it’s their job to come up with new architecture or start building a new framework. The code ends up littered with half-finished, half-forgotten frameworks. Changing anything becomes a nightmare: was all this functionality used? We have three different ways of importing data, via three different hand-rolled frameworks – which ones are used? How much of each one is used? Can I refactor them down into one? Two? What about the incompatibilities and subtle differences?

Without a clear vision changing the code becomes like archeology. As you delve down through the layers you uncover increasingly crufty old code that nobody dares touch any more. It becomes less of a software architecture and more of a taxonomy problem – like Darwin trying to identify a million different species by their class structure.

The Answer

What’s the answer? Well I’m sorry, but I just don’t buy this agile bullshit about “emergent architecture”. Architecture doesn’t emerge, it has to be imposed, often onto unwilling code.

Architecture requires a vision: somebody needs to have a clear idea about where the software is headed. Architecture needs patience: as we learn more about the problem and the solution, the architecture will have to adapt. Architecture needs consistency: if the guy calling the shots changes every year or two, you’ll be back to the Many Architects problem.

Above all, I think good architecture needs a dictator. Some, single person – taking responsibility for the architecture. They don’t need to be right, they just need to have a clear vision of where the architecture should head. If the team are on board with that vision then the whole team are pulling in the same direction, guided by one individual taking the long view.

Central Architecture Group

This sounds like I’m advocating a central architecture group? Hell no. The architect needs to be involved in the code, hands-on, day-to-day, so he can see the consequences of his decisions. He needs the feedback from how the product evolves and how our understanding of the problem evolves. The last thing you need is a group of ivory tower architects pontificating about whether an Enterprise Service Bus is going to solve all our problems. Hint: it won’t, but firing the central architecture group might.

Conclusion

Getting software architecture right is a hard problem. If you keep your code DRY and SOLID, you’re heading in the right direction. If someone has the vision for where the code should head and the team work towards that, relentlessly cleaning up old code – then maybe, just maybe you’ve got a chance.

Share this:

Like this:

The anaemic domain model is a really common anti-pattern. In the world of ORM & DI frameworks we naturally seem to find ourselves with an ORM-managed “domain” that is all data and no behaviour; coupled with helper classes that are all behaviour and no data, helpfully injected in by our DI framework. In this article I’ll look at one possible approach for implementing a rich domain model – this example uses Guice, although I’m sure Spring etc would have different ways of achieving the same thing.

The problem

All the source code can be found on github. The “master” branch shows the original, badly factored code. The “rich-domain” branch shows the solution I describe.

Anaemic domain model

First, our anaemic domain model – TradeOrder.java. This class, as is traditional with Hibernate, has a load of annotations describing the data model, fields for all the data, accessors and mutators to get at the data and nothing else of any interest. I assume, in this domain, that TradeOrders do things. Maybe we place the order or cancel the order. Somewhere along the line, the key objects in our domain should probably have some behaviour.

Helper class

In this really simple example, we need to figure out the value of the order – i.e. the number of shares we want to buy (or sell) and the price per share we’re paying. Unfortunately, because this involves dependencies the behaviour doesn’t reside in the class it relates to, instead its been moved into an oh-so-helpful helper class.

Take a look at FiguresFactory.java. This class only has one public method – buildFrom. The goal of this method is to create a Figures from a TradeOrder.

Besides the “effective date” (whatever that might be), the only input this method takes is the TradeOrder. Using the copious number of getters on TradeOrder it asks for data to operate on, instead of telling the TradeOrder what it needs. In an ideal, object-oriented system, this would have been a method on TradeOrder called createFigures.

Why did we end up here? It’s all the dependency injection framework’s fault! Because the process of creating a Figures object requires us to resolve prices and currencies, we need to go and lookup this data – using injectable dependencies. Our anaemic domain can’t have dependencies injected, so instead we inject them into this little helper class.

What we end up with is a classic anaemic domain model. The TradeOrder has the data; while numerous helper classes, like FiguresFactory, contain the behaviour that operate on that data. It’s all very un-OO.

A better way

Data record

The first step is to create a simple value object to map rows from the database – I’ve called this TradeOrderRecord.java. This looks much like the original domain object, except I’ve removed the accessors & mutators to make it clear that this is a value object with no behaviour.

To make constructing these record objects easier, I’ve used karg, a library written by a colleague of mine – this requires us to declare the set of arguments we can use to construct the record with, and a constructor that accepts a list of arguments. This greatly simplifies construction and avoids us having a constructor which takes 27 strings (for example).

The rich domain

Our goal is to make TradeOrder a rich domain object – this should have all the behaviour and data associated with the domain concept of a “TradeOrder”.

Data

The first thing TradeOrder will need is, internally, to store all the data associated with a TradeOrder (at least as a starting point, the unused fields hint that we might be able to simplify this further).

We make the data immutable. Immutable state is generally a good thing – and here it forces us to be clear that this is a fully populated TradeOrder, and since it has an id, it is always associated with a row in the database. By making TradeOrder immutable the obvious question is – how do I update an order? Well, there are numerous ways we could choose to do that – but that is a different story for a different time.

We also do not need accessors. Since we plan on putting all the behaviour that relates to TradeOrder on the TradeOrder class itself, other classes should not need to ask for information, they should only need to tell us what they want to achieve.

Note: there is one (now deprecated) accessor – that hints at a further behaviour that ought to be moved.

Dependencies

Besides the fields to store data, TradeOrder will also have fields representing injectable dependencies.

Some people will find this offensive – mixing dependencies with data. However, personally, I think the trade-off is worth it – the benefit of being able to define our behaviours on the class they relate to is worth it.

Behaviour

Now we have the data and the dependencies all in one place, it is relatively easy to move across the methods from FiguresFactory:

Construction

The last thing we need to tackle is how to create instances of TradeOrder. Since the fields for data and dependencies are all marked as final, the constructor must initialise them all. This means we need a constructor that takes the dependencies and a TradeOrderRecord (i.e. the value object we read from the database):

This isn’t particularly pretty, but the key thing to note is the @Assisted annotation. This allows us to tell Guice that the other dependencies are injected normally, whereas the TradeOrderRecord should be passed through from a factory method. The factory interface itself looks like this:

We don’t need to implement this interface, Guice provides it automatically. TradeOrder.Factory becomes an injectable dependency we can use from elsewhere when we need to create an instance of TradeOrder. Guice will initialise the injectable dependencies as normal, and the assisted dependency – TradeOrderRecord – is passed through from the factory. So our calling code doesn’t need to worry that our rich domain needs injectable dependencies.

Conclusion

By mixing dependencies and data together into a rich domain model we are able to define a class with the right behaviours. The obvious code smell in TradeOrder now is that the detailed mechanics of creating a Figures is probably a separate concern and should be broken out. That’s ok, we can introduce a new dependency to manage that – as long as the TradeOrder is still the starting point for constructing the Figures object.

By having the behaviours in a single place our domain model is easier to work with, easier to reason about and it’s easier to spot cases of duplication or similarity. We can then refactor sensibly, using a good object model, instead of introducing arbitrary distinctions into helper classes that are function libraries, not participants in the domain.