Thursday, 22 January 2015

Configuration Can Be Simpler

Every time a developer creates an XML-based IoC container config file I’m sure not just one kitten, but an entire litter of kittens, dies. Not just them but any build managers [1] and support staff too. I’ll admit that I have yet to work on a project where I’ve seen any clear reason to even use an IoC container. I’m sure such reasons exist; it’s just that where I’ve seen containers used it has been as a workaround for some other problem, e.g. a weak build and deployment pipeline [2].

If that is the case, then files with hundreds or thousands of lines of XML are not my idea of a pleasant UI to work with. One knee-jerk reaction might be to manage the complexity by adding an extra level of indirection and generating the XML config file(s) from a much simpler DSL written with, say, a scripting language. To me this just appears to be an attempt to overcome the XML “angle-bracket tax” instead of getting to the heart of the problem, which is using an overly flexible and complex tool to solve what might be a much simpler problem.

Example: Configuring a Chain of Responsibility

To try and explain my position further I’m going to use a real-world example that has come up. Within a web API I worked on there was a component that handled the authentication aspects of an internal system. This component used the Chain of Responsibility design pattern to achieve its goals, so that a client request could be fired into the head of the chain and somewhere along it one of the providers would answer the question of who the principal was, even if the eventual answer was “unknown”.

The section of an IoC container XML config file needed to describe this chain was not pretty - it was littered with fully qualified type names (assembly, namespace and class/interface). Whilst I grant you that it is possible to configure a Directed Acyclic Graph (DAG) of objects using the tool and its config file, I would suggest that it is probably not desirable to do so. The example - a chain of responsibility - does not call for that amount of power (an arbitrary DAG) and I think we can do something much simpler and with very little coding effort.

Key/Value Configuration

I’ve always found the old-fashioned .ini file style of configuration file one of the easiest to read and maintain (from a multi-environment perspective). One of the reasons for this, I believe, is that the configuration is generally flattened to a simple set of key/value pairs. Having one entry per line makes them really easy to process with traditional tools, and the constraint of one-setting-per-line forces you to think a bit harder about how best to break down and represent each “setting”. If the configuration value cannot be distilled down to a simple value then perhaps a richer structure might be needed (indicated with a simple setting like a filename), but only for that one setting; it does not invite unnecessary complexity elsewhere.

Consequently I would look to try and express the structure of our example “chain” in a single key/value entry, albeit in this case as a .Net style “App Setting”. This is how I would express it:
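Something along these lines, where the key name, the comma separator and every provider name other than “Default” are assumptions made purely for illustration:

```xml
<appSettings>
  <!-- Hypothetical key and provider names; the head of the chain comes first -->
  <add key="Security.AuthenticationChain" value="Token,Header,Default" />
</appSettings>
```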

You’ll immediately notice that the name of the setting describes the structure we are trying to create - a chain - you don’t have to infer it from the structure of the XML or read an XML comment. You’ll also see that we haven’t included any implementation details in the setting’s value - each term is a shorthand (an abstraction if you will) for the component that rests in that position within the chain. This makes code refactoring a breeze, as no implementation details have leaked outside the code to somewhere they would be hidden from a refactoring tool.

The Factory Method

If you’re thinking that all the complexity can’t have vanished, then you’d be right. But much of the accidental complexity has gone away simply by choosing to use a simpler and more restrictive representation. Parsing this representation only requires a handful of lines of code:
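As a minimal sketch of such a factory method, the following builds the chain from the tail backwards so that each provider can be handed its successor. Only the names DefaultProvider, nextProvider and “Default” come from this article; the IProvider interface, the Identify method, the other two providers, the setting name, the comma separator and the use of a plain dictionary as the configuration object are all assumptions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction for one link in the chain.
public interface IProvider
{
    string Identify(string request);
}

// Hypothetical provider: answers requests of the form "token:<name>".
public sealed class TokenProvider : IProvider
{
    private readonly IProvider _next;
    public TokenProvider(IProvider next) { _next = next; }

    public string Identify(string request) =>
        request.StartsWith("token:", StringComparison.Ordinal)
            ? request.Substring("token:".Length)
            : _next.Identify(request);
}

// Hypothetical provider: answers requests of the form "header:<name>".
public sealed class HeaderProvider : IProvider
{
    private readonly IProvider _next;
    public HeaderProvider(IProvider next) { _next = next; }

    public string Identify(string request) =>
        request.StartsWith("header:", StringComparison.Ordinal)
            ? request.Substring("header:".Length)
            : _next.Identify(request);
}

// The provider of last resort: it always answers and never delegates.
public sealed class DefaultProvider : IProvider
{
    public DefaultProvider(IProvider next) { }
    public string Identify(string request) => "unknown";
}

public static class ProviderFactory
{
    // Assumed setting name; a dictionary stands in for the configuration object.
    public const string SettingName = "Security.AuthenticationChain";

    public static IProvider CreateChain(IDictionary<string, string> settings)
    {
        string[] names = settings[SettingName].Split(',');

        // Walk the names from last to first so each provider receives its successor.
        IProvider nextProvider = null;
        for (int i = names.Length - 1; i >= 0; --i)
            nextProvider = CreateProvider(names[i].Trim(), nextProvider);

        return nextProvider;
    }

    private static IProvider CreateProvider(string name, IProvider nextProvider)
    {
        switch (name)
        {
            case "Token": return new TokenProvider(nextProvider);
            case "Header": return new HeaderProvider(nextProvider);
            case "Default": return new DefaultProvider(nextProvider);
            default: throw new ArgumentException("Unknown provider: " + name);
        }
    }
}
```

The mapping from shorthand name to concrete class lives in one switch statement, so renaming or replacing a provider class is an ordinary code change that the compiler and refactoring tools can see.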

If all the dependencies in this chain are purely functional (and they almost certainly should be for a Chain of Responsibility) then the factory and chain can be easily tested using component-level tests [3]. The only mock it requires is for the configuration object, which is trivial to knock up, even as a manual mock.

What About Adding New Providers?

One of the allures of IoC containers is the apparent ease with which you can “re-configure” your system and add support for a new component. Whilst that may be true for the deployment configuration step, I know that the development and testing cost of any new component is easily going to dwarf the minuscule amount of change required to the Factory Method shown above to support such a change. Unless the new component is going to be implemented in an entirely new assembly and there is a strict requirement not to touch the existing codebase, then you’ve already got a code change and re-deployment on your hands anyway.

Configuration Validation

One of the things in particular I like about manual factory methods is that they provide a suitable place to put validation code for your configuration. For example, in our authentication chain the final provider is the default one. This provider is “a provider of last resort”: no matter what happens before it, it can always answer our query and return an answer, even if that answer is “no-one knows who they are”. The fact that this provider should always appear at the end of the chain means that there is the possibility to misconfigure the system by accident. Imagine what would happen if it were the first in the chain - every other provider would essentially be ignored.

One approach to this problem is to punt and let the documentation (if any exists) provide a warning; if the sysadmins then mess it up that’s their problem. Personally I’d like to give them a fighting chance, especially when I know that I can add some trivial validation (and even unit test its behaviour):

    case "Default":
    {
        if (nextProvider != null)
        {
            throw new ConfigurationException("The 'Default' provider can only be used at the end of the chain");
        }

        return new DefaultProvider(...);
    }

If you look closely you’ll see that I’ve also removed the final argument from the call to the DefaultProvider constructor. The great thing about expressing your design in code (for a statically compiled language [4]) is that once I decided to remove the final parameter from the DefaultProvider constructor I could lean on the compiler and fix up the call sites as appropriate. If one of the call sites wouldn’t have been usable this way I’d have spotted it immediately rather than having to debug a curious runtime issue some time later.
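The unit test of this validation could be sketched as below. Everything other than DefaultProvider, nextProvider, “Default” and the error message is hypothetical: the IProvider interface, the TokenProvider, the comma-separated value passed straight to the factory, and the local ConfigurationException class that stands in for whichever exception type the real code uses.

```csharp
using System;

// Stand-in for the exception type used by the real code (an assumption).
public sealed class ConfigurationException : Exception
{
    public ConfigurationException(string message) : base(message) { }
}

// Hypothetical abstraction for one link in the chain.
public interface IProvider
{
    string Identify(string request);
}

// Hypothetical provider: answers requests of the form "token:<name>".
public sealed class TokenProvider : IProvider
{
    private readonly IProvider _next;
    public TokenProvider(IProvider next) { _next = next; }

    public string Identify(string request) =>
        request.StartsWith("token:", StringComparison.Ordinal)
            ? request.Substring("token:".Length)
            : _next.Identify(request);
}

// The provider of last resort no longer accepts a successor at all.
public sealed class DefaultProvider : IProvider
{
    public string Identify(string request) => "unknown";
}

public static class ProviderFactory
{
    // Takes the raw setting value, e.g. "Token,Default" (assumed format).
    public static IProvider CreateChain(string settingValue)
    {
        string[] names = settingValue.Split(',');
        IProvider nextProvider = null;
        for (int i = names.Length - 1; i >= 0; --i)
            nextProvider = CreateProvider(names[i].Trim(), nextProvider);
        return nextProvider;
    }

    private static IProvider CreateProvider(string name, IProvider nextProvider)
    {
        switch (name)
        {
            case "Token":
                return new TokenProvider(nextProvider);
            case "Default":
                if (nextProvider != null)
                    throw new ConfigurationException(
                        "The 'Default' provider can only be used at the end of the chain");
                return new DefaultProvider();
            default:
                throw new ConfigurationException("Unknown provider: " + name);
        }
    }
}

public static class ChainValidationTests
{
    // True when a chain with 'Default' ahead of another provider is rejected.
    public static bool RejectsDefaultBeforeEnd()
    {
        try
        {
            ProviderFactory.CreateChain("Default,Token");
            return false;
        }
        catch (ConfigurationException)
        {
            return true;
        }
    }
}
```

In a real test suite the check in ChainValidationTests would be written with the project's test framework; a plain method is used here only to keep the sketch free of external dependencies.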

Test-Driven Design

I’ve heard it said that TDD done properly leads you naturally towards using an IoC container. Frankly I’m bemused by that statement because when I do TDD it leads me exactly towards what I describe above. I genuinely cannot see how the simplest step can be to go from a hardcoded implementation (i.e. the DefaultProvider in our example) to using a highly-configurable tool to define an arbitrary DAG - there is always something much simpler in between.

Even if some of the steps below are not actually coded up and just become a thought exercise, I’d still not end up reaching for the big guns by the end:

1. Identify a need to do authentication
2. Create the default provider with TDD
3. Hard-code it as the only implementation
4. Identify a need for other authentication mechanisms
5. Create them with TDD
6. Hard-code the initial chain
7. Identify a need to make the chain order configurable
8. Introduce a factory (method) to house the creation logic, using TDD

As I said right back at the beginning, with a fluid development process and one eye on good design principles, the term “hard-coded” is no longer quite the pejorative it once was.

[1] Not a role I had come across until recently as the build pipeline and deployment process has always been maintained by the developers.

[2] When you can make a change in isolation, build your artefacts, run an extensive test suite and deploy changes to production quickly, many needs for components to be configurable through config files just evaporate.

[3] The difference between unit-level tests and component-level tests is really just the size of your “unit”. I chose not to say “unit tests” to make it clear that I am aware there would be more than one class under test in certain scenarios.

[4] Or a non-statically compiled language with some form of type annotations.