Next requirement — learning file hashes

This works as required but we can improve performance by reducing the number of times we calculate a file’s hash. Assuming each file is static — i.e., its MP3 frames will never change — then we only need to calculate its hash once.

Rather than do this every time, we could save the result somewhere — a database perhaps. Which means we are introducing new behaviour.

When producing a hash, first check whether it already exists
If it does exist, then return it
Otherwise generate it and store it for next time

But now our interface is expanding, this has a habit of getting out of control and you can end up with ten constructor arguments. This is commonly known as the “Too Many Dicks on the Dance Floor” problem.

So what’s wrong with it?

Mp3HashController is no longer composing its behaviour from one layer of abstraction. This notion of learning return values has changed that. It is now exposed to details it shouldn’t have any knowledge of let alone dependency on.

Mp3HashController is now more difficult to test, there are more paths that have to exercised though the same interface

There is new conditional behaviour here that clearly belongs on its own — clients should not even know this is happening

It is much easier to test the learning behaviour on its own — I shouldn’t have to probe a controller

I shouldn’t have to describe an object’s behaviour in terms of another objects interface, I should be able to use that object directly

I would likely have to suppress some behaviour(s) while testing others. This will be manifest itself as complicated stubbing

Lots of dependent stubbing. Each of those collaborators are actually collaborators themselves. I think this is a smell. I shouldn’t have to consider this when unit testing Mp3HashController.

There is an opportunity to introduce an abstraction here. If we consider that all Mp3HashController requires is something to get a hash, then we can actually reduce it to:

The LearninghHashRepository has the responsibility of storing any hashes that don’t exist.

This could probably be condensed even further. TrackFileFinder and StreamHasher represent the concept of obtaining the hash of a file given a track identifier, so they can be combined. This reduces LearningHashRepository to a sort of write-through cache.