“Software entities (classes, modules, functions, etc.) should be open for extension but closed for modification.” [APPP]

When the requirements of an application change, if the application conforms to OCP, we can extend the existing modules with new behaviours to satisfy the changes (open for extension). Extending the behaviour of the existing modules does not require changes to their source code (closed for modification). Other modules that depend on the extended modules are not affected by the extension, so we don’t need to recompile and retest them after the change. The scope of the change is localised and much easier to implement.

The key to OCP is to place useful abstractions (abstract classes/interfaces) in the code for future extensions. However, it is not always obvious which abstractions are necessary, and adding abstractions blindly can lead to overcomplicated software. I found Robert C. Martin’s “Fool me once” attitude very useful. I start my code with a minimal number of abstractions. When a change of requirements takes place, I modify the code to add an abstraction that protects me from future changes of a similar kind.

I recently implemented a simple module that sends messages and made a series of changes to it afterward. I feel it is a good example of OCP to share.

At the beginning, I created a MessageSender that is responsible for converting an object message to a byte array and sending it through a transport.

After the code was deployed to production, we found that we were sending messages faster than the transport could handle. However, the transport was optimised for handling large messages, so I modified the MessageSender to send messages in batches of ten.
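A minimal sketch of that quick fix might look like this. The Transport interface, the newline-delimited wire format and the field names are my assumptions for illustration; they are not the production code.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical transport API; the real one is not shown in this post.
interface Transport {
    void send(byte[] payload);
}

class MessageSender {
    // The batching policy is baked into the sender.
    private static final int BATCH_SIZE = 10;

    private final Transport transport;
    private final List<Object> pending = new ArrayList<>();

    MessageSender(Transport transport) {
        this.transport = transport;
    }

    void send(Object message) {
        pending.add(message);
        if (pending.size() == BATCH_SIZE) {
            transport.send(toBytes(pending));
            pending.clear();
        }
    }

    // Assumed wire format: one UTF-8 encoded line per message.
    private static byte[] toBytes(List<Object> messages) {
        StringBuilder payload = new StringBuilder();
        for (Object message : messages) {
            payload.append(message).append('\n');
        }
        return payload.toString().getBytes(StandardCharsets.UTF_8);
    }
}
```

Note how the sender now decides both how to encode messages and when a batch is full.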

The solution was simple but I hesitated to commit to it. There were two reasons:

The MessageSender class would need to be modified again if we changed how messages are batched in the future. It violated the Open-Closed Principle.

MessageSender had a secondary responsibility of batching messages in addition to its responsibility of converting and delegating messages. It violated the Single Responsibility Principle.

Therefore I created a BatchingStrategy abstraction that was solely responsible for deciding how messages are batched together. It can be extended by different implementations if the batching strategy changes in the future. In other words, the module was open for extension with different batching strategies. The MessageSender kept its single responsibility of converting and delegating messages, which means it does not need to be modified if similar changes happen in the future. The module was closed for modification.
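The refactoring can be sketched roughly as follows. Only the names MessageSender and BatchingStrategy come from the text above; the method signature of the strategy, the FixedSizeBatchingStrategy class, the Transport interface and the wire format are assumptions made for this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical transport API.
interface Transport {
    void send(byte[] payload);
}

// Assumed contract: accept one message and return a completed batch,
// or an empty list while the current batch is still filling up.
interface BatchingStrategy {
    List<Object> add(Object message);
}

class FixedSizeBatchingStrategy implements BatchingStrategy {
    private final int batchSize;
    private final List<Object> pending = new ArrayList<>();

    FixedSizeBatchingStrategy(int batchSize) {
        this.batchSize = batchSize;
    }

    @Override
    public List<Object> add(Object message) {
        pending.add(message);
        if (pending.size() < batchSize) {
            return Collections.emptyList();
        }
        List<Object> batch = new ArrayList<>(pending);
        pending.clear();
        return batch;
    }
}

// The sender only converts and delegates; it no longer knows how batches form.
class MessageSender {
    private final Transport transport;
    private final BatchingStrategy batchingStrategy;

    MessageSender(Transport transport, BatchingStrategy batchingStrategy) {
        this.transport = transport;
        this.batchingStrategy = batchingStrategy;
    }

    void send(Object message) {
        List<Object> batch = batchingStrategy.add(message);
        if (!batch.isEmpty()) {
            transport.send(toBytes(batch));
        }
    }

    // Assumed wire format: one UTF-8 encoded line per message.
    private static byte[] toBytes(List<Object> messages) {
        StringBuilder payload = new StringBuilder();
        for (Object message : messages) {
            payload.append(message).append('\n');
        }
        return payload.toString().getBytes(StandardCharsets.UTF_8);
    }
}
```

The original behaviour is recovered by wiring the sender with new FixedSizeBatchingStrategy(10).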

The patch was successful, but two weeks later we figured out that we could batch the messages together in time slices and overwrite outdated messages with newer versions in the same time slice. The solution was specific to our business domain of publishing market data.

More importantly, the OCP showed its benefits when we implemented the change. We only needed to add a different implementation of the existing BatchingStrategy interface. We didn’t change a single line of existing code, only the Spring configuration file.
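A time-slice implementation could be sketched like this. The strategy signature, the class name TimeSliceBatchingStrategy and the injected clock are assumptions for illustration. This simplified version only closes a slice when the next message arrives; a production version would more likely flush from a timer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.LongSupplier;

// Assumed contract: return a completed batch, or an empty list if none is ready.
interface BatchingStrategy {
    List<Object> add(Object message);
}

class TimeSliceBatchingStrategy implements BatchingStrategy {
    private final long sliceMillis;
    private final LongSupplier clock; // injected so tests can control time
    private final List<Object> pending = new ArrayList<>();
    private long sliceStart;

    TimeSliceBatchingStrategy(long sliceMillis, LongSupplier clock) {
        this.sliceMillis = sliceMillis;
        this.clock = clock;
        this.sliceStart = clock.getAsLong();
    }

    @Override
    public List<Object> add(Object message) {
        long now = clock.getAsLong();
        List<Object> completed = Collections.emptyList();
        if (now - sliceStart >= sliceMillis) {
            // The slice has elapsed: hand over what was collected and start a new slice.
            if (!pending.isEmpty()) {
                completed = new ArrayList<>(pending);
                pending.clear();
            }
            sliceStart = now;
        }
        pending.add(message);
        return completed;
    }
}
```

In production the clock would be System::currentTimeMillis, and the new class would be swapped in through configuration without touching MessageSender.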

* For the sake of simplicity, I left the message coalescing logic out of the example.

Conclusion:
The Open-Closed Principle serves as useful guidance for writing a good quality module that is easy to change and maintain. We need to be careful not to create too many abstractions prematurely. It is worth deferring the creation of abstractions until a change of requirements actually happens. However, when the change strikes, don’t hesitate to create an abstraction and make the module conform to OCP. There is a good chance that a similar change is at your doorstep.

The Abstract Factory pattern is an important building block for domain modelling. It hides the complexity of creating a domain object from the caller of the factory. It also enables us to create domain objects that have complex dependencies without worrying about when and how to inject those dependencies.

It is easier to explain the idea with a concrete example. I used to work on a project to build a simple online booking system for a health club. The health club provides one-to-many training sessions. The club administrators schedule training sessions in advance and publish them on a web page. Each training session has limited space. Members can reserve space for one or many people in a training session as long as there are enough slots available. Members may cancel their reservations at any time.

I am going to ignore the process of creating and displaying training sessions, but concentrate on the reservation process here.

When a member fills in the number of people and clicks the “Reserve” button on a web page, the web servlet invokes ReservationService.reserve(), which simply delegates the request to TrainingSession. The TrainingSession creates a Reservation instance and remembers it for availability checking.

If a member wants to cancel a particular reservation, the system calls ReservationService.cancel(). The ReservationService then finds the right Reservation instance and delegates the cancellation to it.
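The flow so far can be sketched as follows. The class names come from the description above; the method signatures, the capacity check and the way cancel() receives the Reservation directly (rather than looking it up by an identifier) are simplifying assumptions.

```java
import java.util.ArrayList;
import java.util.List;

class Reservation {
    private final int numberOfPeople;
    private boolean cancelled;

    Reservation(int numberOfPeople) {
        this.numberOfPeople = numberOfPeople;
    }

    // A cancelled reservation no longer occupies any slots.
    int occupiedSlots() {
        return cancelled ? 0 : numberOfPeople;
    }

    void cancel() {
        cancelled = true;
    }
}

class TrainingSession {
    private final int capacity;
    private final List<Reservation> reservations = new ArrayList<>();

    TrainingSession(int capacity) {
        this.capacity = capacity;
    }

    Reservation reserve(int numberOfPeople) {
        if (numberOfPeople > availableSlots()) {
            throw new IllegalStateException("Not enough slots available");
        }
        Reservation reservation = new Reservation(numberOfPeople);
        reservations.add(reservation); // remembered for availability checking
        return reservation;
    }

    int availableSlots() {
        int used = 0;
        for (Reservation reservation : reservations) {
            used += reservation.occupiedSlots();
        }
        return capacity - used;
    }
}

// Thin service layer: it only delegates to the domain objects.
class ReservationService {
    Reservation reserve(TrainingSession session, int numberOfPeople) {
        return session.reserve(numberOfPeople);
    }

    void cancel(Reservation reservation) {
        reservation.cancel();
    }
}
```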

Nice and simple. We are going to add more of a challenge by asking the system to send an email to a member when they make a reservation or cancel one.

It is easy to add a reference to MailSender in ReservationService because the ReservationService is effectively a singleton. If I use Spring to wire up my services, this may be as simple as adding a line of XML to my Spring config file.

However, this approach moves a part of the business logic into ReservationService. The domain logic becomes fragmented across the domain layer and the service layer. I much prefer to keep all business logic together in the domain model.

A better solution:

The Reservation class knows about the completion of reservation and cancellation. It is a good candidate to host the email logic.

However, the TrainingSession is no longer able to create a Reservation instance because it cannot provide the MailSender reference. The TrainingSession does not use MailSender directly, and I don’t want it to carry a reference to MailSender around for the sole purpose of passing it to Reservation’s constructor.

The Abstract Factory pattern solves my problem. Instead of instantiating a Reservation directly, the TrainingSession can use a ReservationFactory to create an instance of Reservation, passing in only the relevant business information. The actual implementation of ReservationFactory holds a reference to MailSender, which the factory uses to construct Reservation instances.

The factory is also a good place to generate a unique id for a new Reservation. In the example, the factory implementation uses an IdAllocator to create new ids based on a sequence table in the database.
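Putting the pieces together, the factory arrangement could be sketched as follows. ReservationFactory, MailSender, IdAllocator, TrainingSession and Reservation are named in the text; DefaultReservationFactory, all method signatures and the email wording are assumptions, and the capacity check is omitted to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical infrastructure interfaces.
interface MailSender {
    void send(String to, String body);
}

interface IdAllocator {
    long nextId();
}

class Reservation {
    private final long id;
    private final String memberEmail;
    private final int numberOfPeople;
    private final MailSender mailSender;

    Reservation(long id, String memberEmail, int numberOfPeople, MailSender mailSender) {
        this.id = id;
        this.memberEmail = memberEmail;
        this.numberOfPeople = numberOfPeople;
        this.mailSender = mailSender;
    }

    long id() {
        return id;
    }

    // The email logic lives with the domain object that knows about these events.
    void confirm() {
        mailSender.send(memberEmail, "Reservation " + id + " for " + numberOfPeople + " confirmed");
    }

    void cancel() {
        mailSender.send(memberEmail, "Reservation " + id + " cancelled");
    }
}

// Part of the domain model: callers pass business information only.
interface ReservationFactory {
    Reservation create(String memberEmail, int numberOfPeople);
}

class DefaultReservationFactory implements ReservationFactory {
    private final IdAllocator idAllocator;
    private final MailSender mailSender;

    DefaultReservationFactory(IdAllocator idAllocator, MailSender mailSender) {
        this.idAllocator = idAllocator;
        this.mailSender = mailSender;
    }

    @Override
    public Reservation create(String memberEmail, int numberOfPeople) {
        return new Reservation(idAllocator.nextId(), memberEmail, numberOfPeople, mailSender);
    }
}

class TrainingSession {
    private final ReservationFactory reservationFactory;
    private final List<Reservation> reservations = new ArrayList<>();

    TrainingSession(ReservationFactory reservationFactory) {
        this.reservationFactory = reservationFactory;
    }

    Reservation reserve(String memberEmail, int numberOfPeople) {
        Reservation reservation = reservationFactory.create(memberEmail, numberOfPeople);
        reservations.add(reservation);
        reservation.confirm();
        return reservation;
    }
}
```

TrainingSession never sees MailSender or IdAllocator; it depends only on the factory interface.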

The factory is an interface, which makes it easier to mock when unit testing domain objects. The factory should be treated as a part of the domain model, so it is safe to let other domain objects depend on it.

The factory also decouples the caller from the actual type of the factory product. If we expand the use case further to distinguish cancellable and non-cancellable reservations, the abstract factory can instantiate different subclasses of the Reservation for different scenarios, and hide all the details from the caller at the same time.

Conclusion:

The Abstract Factory plays an important role in Domain modelling. The key benefits are:

It hides the details of creating a complex domain object.

It enables one domain object to create another without worrying about its dependencies.

It can produce instances of different classes for different use cases.

The factory interface belongs to the domain model and is used by other domain objects. It is worth considering even just for dependency injection and id generation.

Data Access Object (DAO) is a commonly used pattern to persist domain objects into a database. The most common form of a DAO pattern is a class that contains CRUD methods for a particular domain entity type.
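A minimal sketch of such a DAO for an Account entity follows. The save, update and delete methods keyed by userName match the description later in this article; the get method, the Account fields and the in-memory implementation are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class Account {
    private final String userName;
    private String email;

    Account(String userName, String email) {
        this.userName = userName;
        this.email = email;
    }

    String getUserName() {
        return userName;
    }

    String getEmail() {
        return email;
    }

    void setEmail(String email) {
        this.email = email;
    }
}

// One CRUD method per operation, keyed by the account's userName.
interface AccountDAO {
    Account get(String userName);

    void save(Account account);

    void update(Account account);

    void delete(String userName);
}

// In-memory stand-in for an O/R mapper or SQL backed implementation.
class InMemoryAccountDAO implements AccountDAO {
    private final Map<String, Account> rows = new HashMap<>();

    @Override
    public Account get(String userName) {
        return rows.get(userName);
    }

    @Override
    public void save(Account account) {
        rows.put(account.getUserName(), account);
    }

    @Override
    public void update(Account account) {
        rows.put(account.getUserName(), account);
    }

    @Override
    public void delete(String userName) {
        rows.remove(userName);
    }
}
```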

The AccountDAO interface may have multiple implementations, which use some kind of O/R mapper or execute plain SQL queries.

The pattern has these advantages:

It separates the domain logic that uses it from any particular persistence mechanism or API.

The interface method signatures are independent of the content of the Account class. When you add a telephone number field to the Account, you don’t need to change the AccountDAO interface or its callers.

However, the pattern leaves many questions unanswered. What if I need to query a list of accounts with a specific last name? Am I allowed to add a method that updates only the email field of an account? What if I change to a long id instead of userName? What exactly is a DAO responsible for?

The problem with the DAO pattern is that its responsibility is not well defined. Many people think of it as a gateway to the database and add methods to it whenever they find new ways they’d like to talk to the database. Hence it is not uncommon to see a DAO become bloated.
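A bloated DAO of that kind could look like the sketch below. Only the name BloatAccountDAO comes from this article; the Account fields and the names of the extra query and update methods are hypothetical.

```java
import java.util.List;

class Account {
    String userName;
    String lastName;
    String email;
    String password;
}

interface BloatAccountDAO {
    // The original CRUD methods.
    Account get(String userName);

    void save(Account account);

    void update(Account account);

    void delete(String userName);

    // Query methods added for individual use cases (hypothetical names).
    List<Account> findByLastName(String lastName);

    List<Account> findByEmail(String email);

    // Partial updates added for two more use cases (hypothetical names).
    void updateEmail(String userName, String email);

    void updatePassword(String userName, String password);
}
```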

In the BloatAccountDAO, I added two query methods to look up Accounts with different parameters. If I had more fields and more use cases that query the account differently, I might end up writing even more query methods. The consequences are:

Mocking the DAO interface becomes harder in unit tests. I need to implement more methods in the mock even when my particular test scenario only uses one of them.

The DAO interface becomes more coupled to the fields of the Account object. I have to change the interface and all its implementations if I change the type of the fields stored in Account.

To make things even worse, I added two additional update methods to the DAO as well. They are the direct result of two new use cases that update different subsets of the fields of an account. They seem like harmless optimisations and fit into the AccountDAO interface if I naively treat the interface as a gateway to the persistence store. Again, the DAO pattern and its class name “AccountDAO” are too loosely defined to stop me from doing this.

I end up with a fat DAO interface, and I am sure it will only encourage my colleagues to add even more methods to it in the future. One year later I will have a DAO class with 20+ methods, and I can only blame myself for choosing this weakly defined pattern.

Repository Pattern:

A better pattern is Repository. Eric Evans gave it a precise description in his book [DDD], “A Repository represents all objects of a certain type as a conceptual set. It acts like a collection, except with more elaborate querying capability.”
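For the Account example, a repository along these lines could be sketched as follows. The add, remove, update and query method names and the “specified” predicate are the ones discussed in this article; the AccountSpecification type name, the exact signatures and the in-memory implementation are assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class Account {
    private final String userName;
    private final String email;

    Account(String userName, String email) {
        this.userName = userName;
        this.email = email;
    }

    String getUserName() {
        return userName;
    }

    String getEmail() {
        return email;
    }
}

// Criterion handed to query(); implementations decide whether an account matches.
interface AccountSpecification {
    boolean specified(Account account);
}

// Collection-like contract: no identifier type leaks into the interface.
interface AccountRepository {
    void add(Account account);

    void remove(Account account);

    void update(Account account);

    List<Account> query(AccountSpecification specification);
}

// Backed by an in-memory map, so a query simply iterates and filters.
class InMemoryAccountRepository implements AccountRepository {
    private final Map<String, Account> accounts = new LinkedHashMap<>();

    @Override
    public void add(Account account) {
        accounts.put(account.getUserName(), account);
    }

    @Override
    public void remove(Account account) {
        accounts.remove(account.getUserName());
    }

    @Override
    public void update(Account account) {
        accounts.put(account.getUserName(), account);
    }

    @Override
    public List<Account> query(AccountSpecification specification) {
        List<Account> matches = new ArrayList<>();
        for (Account account : accounts.values()) {
            if (specification.specified(account)) {
                matches.add(account);
            }
        }
        return matches;
    }
}
```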

The “add” and “update” methods look identical to the save and update methods of my original AccountDAO. The “remove” method differs from the DAO’s delete method by taking an Account object rather than the userName (the Account’s identifier). If you think of the Repository as a Collection, this change makes a lot of sense. You avoid exposing the type of the Account’s identity in the Repository interface. It makes my life easier if I’d like to use long values to identify the accounts.

If you ever wonder about the contracts of the add/remove/update methods, just think about the Collection metaphor. If you ever consider adding another update method to the Repository, ask whether it would make sense to add an extra update method to a Collection.

The “query” method is special however. I wouldn’t expect to see a query method in a Collection class. What does it do?

The Repository differs from a Collection when we consider its querying ability. With an in-memory collection, it is simple to iterate through it and find the object I am interested in. A repository deals with a large set of objects that are typically not in memory when the query is performed. It is not feasible to load all the instances of Account from the database if all I want is an Account with a particular user name. Instead, I pass a criterion to the Repository and let the repository find the object or objects that satisfy my criterion in its own way. The Repository may decide to generate SQL against the database if it is backed by a database table, or it may simply iterate through its collection if it is backed by a collection in memory.

One common implementation of a criterion is the Specification pattern. A specification is a simple predicate that takes a domain object and returns a boolean.

A plain SQL backed repository can work with a specification that also produces partial SQL clauses, which the repository uses to perform the database query. If I use a Hibernate backed repository, I may use the HibernateSpecification interface instead, which generates a Hibernate Criteria when invoked.
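A specification that serves both an in-memory and a plain SQL backed repository might be sketched like this. The “specified” method name is the one used in this article; the SqlSpecification interface, its two methods, the AccountSpecificationByUserName class and the user_name column are hypothetical names chosen for the sketch.

```java
import java.util.Collections;
import java.util.List;

class Account {
    private final String userName;

    Account(String userName) {
        this.userName = userName;
    }

    String getUserName() {
        return userName;
    }
}

// The basic predicate form of a specification.
interface AccountSpecification {
    boolean specified(Account account);
}

// Hypothetical SQL flavour: contributes a WHERE clause fragment plus its bind parameters.
interface SqlSpecification extends AccountSpecification {
    String toSqlClause();

    List<Object> parameters();
}

class AccountSpecificationByUserName implements SqlSpecification {
    private final String userName;

    AccountSpecificationByUserName(String userName) {
        this.userName = userName;
    }

    // Used by in-memory, stub or caching repositories.
    @Override
    public boolean specified(Account account) {
        return userName.equals(account.getUserName());
    }

    // Used by a SQL backed repository; the column name is an assumption.
    @Override
    public String toSqlClause() {
        return "user_name = ?";
    }

    @Override
    public List<Object> parameters() {
        return Collections.<Object>singletonList(userName);
    }
}
```

Returning a placeholder and a separate parameter list lets the repository bind values through a prepared statement instead of concatenating them into the SQL text.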

The SQL and Hibernate backed repositories do not use the “specified” method, but I found it very beneficial to implement it in all cases. That way I can use the same specification classes with a stub AccountRepository for testing, and also with a caching implementation of the repository that answers the query before it hits the real one.

We can even go a step further and compose Specifications with ConjunctionSpecification and DisjunctionSpecification to perform more complicated queries. However, I feel that is out of the scope of this article. You can find more details and examples in Evans’s book [DDD] if you are interested.

The DAO pattern offers only a loosely defined contract. It is prone to misuse and bloated implementations. The Repository pattern uses the metaphor of a Collection. This metaphor gives the pattern a tight contract and makes it easier for your colleagues to understand.