Blog Archive

Thursday, 30 March 2017

I've recently been in discussion with a colleague who thinks that dependency injection (DI) is over-used and, in cases where the dependency is a concrete class, unnecessary (in the latter case, he advocates simply creating new objects on the fly). I raised the point that dependency injection allows you to pass in a mock object while unit testing, but he dismissed that as irrelevant and not a valid argument as to why dependency injection is useful. My colleague also stated that type hinting dependencies results in tight coupling, and he would prefer it if we simply abandoned type hints, allowing the developer to pass in whatever they like.

In my opinion, this line of thinking is misguided, but he sent through some links to pages that he felt supported his point of view (including Tony Marston'srant on DI, and the Laravel documentation about 'facades' - which are actually used as an alternative syntax for the service locator [anti-]pattern). I genuinely wanted to understand the reasoning behind his point of view, as it flies in the face of just about everything I have ever read regarding best practice in PHP development. After reading those resources he sent though, I began to notice some misconceptions about what unit testing actually is, as well as confusion about the difference between code that is "strongly typed" (usually good) and "tightly coupled" (usually bad), and also a tendency to blame the wrong thing when problems arise.

Not All Automated Tests Are Unit Tests

There are different types of automated test that can be performed, and sometimes these different types get erroneously conflated. A unit test is used to verify that a particular class behaves correctly in isolation. When you write a unit test, you are only testing that one class - not any of its dependencies (they have their own tests).

That's not to say you can't write a test that covers a number of classes at once, but if you do, it is not a unit test - it is either an integration test (if it deals with the integration between classes), or a functional test (if it tests an entire function regardless of how many classes are involved). Even if you use a unit testing framework (like PHPUnit) to write the test, it is only a unit test if it tests a single unit.

Using Mocks to Test in Isolation

The reason we try to test units in isolation is that we want the test to live with the class, not the implementation of its dependencies. The implementation can change by passing in different dependencies, but we don't care about that, as those dependencies will have their own tests - we only care that this particular class is behaving correctly. If our test fails, it should be because there is something wrong with that particular class (or with the test itself), not because something changed in a dependency. This helps with debugging and provides reassurance that our classes are behaving correctly regardless of external implementation (integration and functional tests are useful for different reasons).

When writing a unit test, it is common to inject mock objects instead of real objects for the dependencies. This has a number of advantages, for example, it allows you to run tests without activating things you don't want (sending e-mails, making database connections, API calls, etc.), it allows you to force the test down paths that would otherwise be difficult to reach, and it allows you to isolate the class you are testing. That's not to say that all dependencies are always mocked. It might be that a concrete class provides what the test needs, and does no harm, so it might be quicker and easier to use that (in which case it is still a unit test, as the dependency is not part of the test subject).

If you create dependencies on the fly though, you no longer have an option - it is not possible to supply a mock object, nor to extend that dependency to tweak the implementation without changing the class that uses it. This is the very definition of tight coupling - it doesn't get any tighter than that. It is no longer possible to unit test that class without also taking the concrete dependency along for the ride.

What, Never Ever Create Objects on the Fly?

There are occasions when it is perfectly acceptable to create objects on the fly, but we need to fully understand the consequences of doing so, and be able to justify it. I find it a good rule of thumb is that the 'new' keyword should only be used in dependency injection containers and factory classes, but (as with all best practice principles) that is just a rule of thumb, not a law. Some occasions where it might be harmless to create objects are:

Where your entities form a hierarchical structure, and a parent entity needs to create and initialise a child entity for one of its properties.

Where a new value object is needed - a value object represents a value with no identity separate from its properties, and is essentially a data structure (albeit possibly with methods). In some cases I might justify creating a value object in the same way as creating a new PHP DateTimeImmutable object (which is itself a value object).

When using a helper class which acts as an extension to the language (for example a string or maths helper class that provides general convenience functions not provided by PHP).

Whenever we do one of these things though, we must bear in mind that the code is now tightly coupled to that class, and be willing to live with the consequences of that. If we ever want to re-use that code in a different situation, will the class that object is based on still be available? Is it possible we would ever want to swap it out for something else? Would we ever want to mock it for unit testing purposes?

What About Those Laravel Facades?

In Laravel, you can make a static call to what they refer to as a facade (not a true facade, more of a proxy), which calls a service locator to find the requested dependency. As long as the service locator is properly implemented, it is possible to populate it with mock objects that can be used in unit testing. This is certainly better than just creating new objects on the fly, as the dependencies are not tightly coupled and can be swapped out relatively easily.

The service locator does still need to be populated though, so there is not a great deal of benefit in terms of setup when using a service locator - populate a service locator, or populate a dependency injection container; it is essentially the same thing, the difference is in how you use it.

Actually consuming a service locator is arguably easier than dependency injection though, because you don't have to declare your dependencies with constructor or setter parameters. All you need is the service locator (and in the case of Laravel, not even that - you can just make static calls wherever you like). That ease of use is what makes it so alluring to some. There are dangers here though...

Hidden Dependencies and Other Dangers

A service locator, whether used explicitly, or via a pseudo facade, hides dependencies from view. This alone has earned it a reputation as an anti-pattern, and a poor choice for enterprise web development. With dependency injection, you can tell from the constructor signature what the dependencies are, and with type hints, you can ensure that only valid values are accepted. With a service locator, you have to examine every line of code in the class, or consult some documentation (and hope that it is up-to-date and accurate) to work out what the class needs in order to run. To my mind, that is a lot more painful than just being explicit up-front with DI.

In addition, code that proxies to a service locator is not very portable. If you use Laravel facades to make static calls to a service locator, you cannot re-use your class in any non-Laravel project without stripping them all out first. Whilst some tight coupling to your framework might be difficult to avoid, there is no need to tie yourself down with facades when dependency injection provides a virtually painless alternative.

Strongly Typed is not Tightly Coupled

In PHP 4, it was not possible to type hint arguments in functions - every argument was 'mixed' and you could pass in anything you like. With PHP 5 came the ability to type hint on a class name, interface, array, or callable, which allowed developers to specify what type of data they expected. In PHP 7, this is taken even further, and we can type hint with scalar types (either strictly, or coercively), and even type hint return values. If, as my colleague suggests, strongly typed arguments lead to tight coupling, why this seemingly unstoppable march toward stronger typing?

It could be argued that it is just fashionable, and that there can be drawbacks to strong typing as it is implemented in PHP (unlike statically typed languages, if you override a method, you cannot change the signature - unless it is the constructor). However, there are a few reasons why I think type-hinted code is much easier to work with. Type hints are self-documenting, giving you lots of useful information on what the argument is for and how it is being used. I like the fact that my IDE can offer code completion and navigation based on type hints (docblocks are OK, but they can get out of sync as development progresses - type hints are always up-to-date). I like the fact that I can catch errors early if the wrong type of argument is passed in, an argument is accidentally missed, or sent in the wrong order - rather than accepting anything only to have the code fail in mysterious and difficult-to-debug ways later on, or have to write my own instanceof checks.

None of that causes tight coupling. Type hinting on concrete classes is more tightly coupled than type hinting on base classes or interfaces, but even with a concrete class (it is not always possible or even desirable to avoid them) it is still possible to extend and mock, so type-hinted dependency injection is always going to be more loosely coupled than having your classes create their own dependencies on the fly.

Another Reason?

Aside from possible misunderstandings like those mentioned above, it appears to me that often, developers who want more 'flexibility' than commonly accepted best practice allows, are actually struggling to understand how to implement best practice. A poor design will result in difficulties implementing best practice, and it is tempting to blame SOLID (or other principles) as being wrong or inappropriate rather than to recognise flaws in one's own design. In my experience, learning OOP is a slow and sometimes painful process. It is not easy to see how and why SOLID principles are necessary. Things like dependency injection, unit testing, using interfaces, and design patterns, are confusing and can appear to be unnecessary or irrelevant. Why go to all that trouble when it works just fine if I do it the way I know?

Exacerbating things, we have popular projects and frameworks (such as Laravel) which are engineered primarily with rapid application development in mind, being used in potentially inappropriate ways for enterprise development. Following SOLID and other best practice principles is less important when your focus is on creating software with a short lifecycle (that may be discarded in a few months' time). Active record (breaking the single responsibility principle), auto-wired dependencies (relying on magic and hiding what is going on), static methods (tightly coupled and hiding dependencies) make development easier, but the resulting software harder to maintain. The phenomenal success of Laravel has much more to do with the fact that it is easy to understand and quick to get started than the use of best practice (that's not to say that you can't use best practice with Laravel if you are disciplined enough).

I think you cannot really appreciate the benefits of some of these things until you get into the habit of unit testing. As the pieces gradually start coming together, SOLID principles become more natural, your code becomes more robust and the whole process of development becomes more of a pleasure. The learning process never stops though.

My conclusion is that SOLID is an invaluable guide to doing things in a way that leads to cleaner, more maintainable, more extendable, more testable code. If I feel tempted to violate any of those principles (or discover that I already have done so without realising it), I need to double check that my approach is not flawed and be certain I understand the consequences.

Since you're correcting misconceptions, I have to correct you on one very common misconception: unit tests in OOP are, yes, usually, but not always a test for a single class. It would have been called a "class test" then - the term "unit" does not mean "class" or "function", it means the smallest testable unit of code, which is sometimes more than one class.

Take this class for example:

https://github.com/mindplay-dk/timber/blob/master/src/Router.php

It internally uses a second class (Route) to maintain its internal table of routes. It's an implementation detail, as is proven by the fact that it could have been refactored to internally use a different class or arrays or anything else that solves the problem, as long as it's public API remains unaffected.

This comes back to the dependency injection question. In this case, you can't say that Route is a dependency - it's an implementation detail. Dependency injection is for dependencies, and Route objects are not.

When you're testing the Router, that's a unit test - it doesn't involve integration with any kind of external dependency, service, class, etc. which would make it an integration test. The Router class can stand alone, so it is a single unit from the clients perspective: I don't need to supply it with any dependencies for it to function, and that's the real distinction, as I understand it. (The exception being code that internally uses facades or singletons etc. to hide its dependencies!)

Enjoyed your post, just wanted to pick up on one point:"...creating software with a short lifecycle (that may be discarded in a few months' time)" I have found that all quick and dirty throw away projects soon become a company bedrock and the pain point of developers for years to come.

I agree completely, but the main argument against using best practice that I have heard is that it is slower and requires more work. So I try to acknowledge that there is a cost/benefit calculation to be done.

So I too have a long-running aversion to DI and, by referencing SOLID principles to defend it, you have enlightened me as to a framework I can finally hang my nagging feeling of discontent upon.

The 'O' of SOLID is the "open/closed" principle -- cf. Wikipedia: "software entities … should be open for extension, but closed for modification".

Furthermore, a key principle of OO software design is that the implementation should be separated from the interface.

But to my mind, DI fails to accomplish both of these. In my experience, it seems that method signatures are very often changed to add new parameters when the underlying implementation changes, when the underlying implementation suddenly requires a new piece of information to depend on. That can lead to a cascade of changes throughout your codebase as additional dependencies must be passed through a whole chain of classes in order to satisfy the new interface spec.

Secondly, and concomitantly, DI requires that consumers of a service know exactly and all services upon which that service depends, and must supply them all. Again, as the implementation changes, the interface must also change.

The workarounds for this, in the form of the service locator (which often supplies many dependencies for the desired service by magic at runtime, or in the form of cramming all the dependencies into the class constructor in order to minimize interface-failure-cascade, both seem to me to be unsatisfactory solutions -- but I am at a loss for what the correct solution would be.

I think the main point of the open/closed principle is that you try not to change the underlying implementation (closed for modification). If you need a different implementation, you extend - which does not break any of your existing code. If your default implementation must change though, you only have to update the dependency injection container, and/or any factories. This is one of the reasons I advocate limiting the use of the new keyword to those places as much as possible.

The problem of needing to supply all the dependencies (and dependencies of dependencies) before being able to use a service, is easily solved with a dependency injection container. A DI container and a service locator are essentially the same thing, just used slightly differently - the DI container does not hide the dependencies, but it can still resolve them for you (whereas when a container is used as a service locator, it both hides and resolves dependencies).

> If your default implementation must change though, you only have to update the dependency injection container, and/or any factories.

... whereas in my world, when I change the implementation of a class, I don't have to change the interface signature at all. I maintain my interface contracts. That's the core OO principle of encapsulation.

The use of DI containers to pass in necessary dependencies is just silly:* Either those dependencies are the same every time -- in which case they could simply be instantiated in the constructor...* Or they are different every time -- in which case, a DI container cannot prefill them for you.

Using DI does not mean your interfaces have to change. That would defeat the purpose of using an interface! Injection of dependencies is not relevant to interfaces (it would not make sense to have a constructor defined in an interface for example).

Also, your parting dichotomy is not valid - a DI container can provide different dependencies to the same class to create different implementations. The different implementations would just have different keys in the container. Even if the dependencies are the same every time, there is no harm in using DI, and it brings benefits outlined in the article (loose coupling, use of mocks, and self-documenting).

Separately, it's always been possible to do unit testing on PHP without DI, by ringfencing the "new" keyword in its own class method that the unit-under-test can call upon, allowing it to be mocked. I agree with your original correspondant that "because unit testing" is not a good enough reason alone for DI, when other alternatives are available. A better language than PHP would surely allow language keywords to be overridden for testing purposes.

DI is not just 'because unit testing' - it decouples classes, and facilitates code re-use even when implementation details differ. By advertising what the dependencies are and requiring the mandatory ones to be passed in, the code also becomes self-documenting, less brittle, easier to extend, and easier to debug (and yes, easier to unit test).

> it decouples classes, and facilitates code re-use even when implementation details differ.

... but DI is *not* the only way of instigating weak coupling between classes, nor is it the only way of facilitating code re-use. It's entirely appropriate to use inheritance and composition to achieve this.

> By advertising what the dependencies are and requiring the mandatory ones to be passed in, the code also becomes self-documenting

I disagree on the "self-documenting" nature of DI. I find DI-structured code far less readable, because it entails significant boilerplate and unnecessary class variables.

As Tony Marston said in his post, DI is great for the situation where you actually need to inject a class conforming to a particular interface into your class to do work, where the implementing object is likely to be different every time. My experience with DI is that, containers or no, 90% of objects passed through DI are unchanging config or addresses of services that themselves are singletons. There is really no benefit to juggling these Factories and Services around through dependency injection.

And it's certainly waaaaaaaaaaay easier to debug code that doesn't pass through the multiple additional levels of indirection that DI requires.

I didn't say DI was the only way to decouple or re-use code. Clearly it isn't - there is nearly always more than one way to do things. So no disagreement there.

I agree that DI will often require you to write more code, which in a rapid application development scenario might not be worthwhile. However, in this article I am referring to enterprise development of code that needs to be supported for many years to come. The benefits of DI nearly always outweigh the extra legwork in that scenario. So still not necessarily any disagreement there.

The example you gave (passing a factory into an entity to create a child property) is exactly what I said in the article was an example of when DI is not appropriate and it is perfectly acceptable to create a new object on the fly. No disagreement there either.