Unit testing .NET Storm applications

When you're building Storm apps, the functionality lends itself to the usual suspects of automated testing: unit tests to ensure your bolts behave correctly when they get the expected input, and integration tests to ensure your bolts and spouts connect up correctly and the topology as a whole works.

This post is about unit testing Storm components built in .NET, to be run on HDInsight.

Storm topologies are intended to process an unbounded stream of events, but a new style of programming doesn't mean you can afford to lose trusted quality-control tools. If the business logic in your bolts isn't tested, you postpone finding errors until your topology is processing thousands of events per second. If the connections between components aren't tested, you postpone finding errors until deployment time, when your whole topology will break.

For .NET Storm applications, you can test components in isolation using the LocalContext class from the Microsoft.SCP.Net.SDK NuGet package. The local context is the object you use to build the tuples you test your component with.

The context and the component are closely coupled. When you instantiate a component with a local context, the context sets itself up for that component, using the input and output schemas the component defines and any custom serializers or deserializers.

That means your unit tests will be a little bit integration test-y.

To pre-load the context with tuples for the component under test, you first need to set the context up for the component which emits the tuples.

In my topology the SectorTimeBolt receives its tuples from the TimingEventBolt. So if I want to unit test the SectorTimeBolt, I first need to set up LocalContext for the TimingEventBolt, so I can load it with tuples that look like they came from the TimingEventBolt.

But you don't need to actually use the emitting bolt. You can emit directly to the context, so your test isn't dependent on the behaviour of another component, just its configuration. Once the context has been configured from the emitter, you can write tuples to it directly with Emit() and then persist the contents of the stream with WriteMsgQueueToFile().

One of my test setups emits 100 tuples to the local context in exactly this way.

In the test execution, you set up the context to use the component you want to test (SectorTimeBolt), then read in the populated tuples with ReadFromFileToMsgQueue() and collect them into a local List<SCPTuple> with RecvFromMsgQueue().
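Here's a minimal sketch of that round trip, from configuring the context with the emitter through to reading the tuples back. The two bolt classes are cut-down stand-ins I've made up for illustration: the schemas, the field values and the file name are all assumptions, not the real topology's. It needs the Microsoft.SCP.Net.SDK package to compile, and depending on the SDK version you may need to initialize the SCP runtime before using the local context.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.SCP;

// Hypothetical emitter: only its schema declaration matters for the test.
public class TimingEventBolt : ISCPBolt
{
    public TimingEventBolt(Context ctx)
    {
        var input = new Dictionary<string, List<Type>>
        {
            { Constants.DEFAULT_STREAM_ID, new List<Type> { typeof(string) } }
        };
        // Assumed output shape: (sector name, elapsed milliseconds).
        var output = new Dictionary<string, List<Type>>
        {
            { Constants.DEFAULT_STREAM_ID, new List<Type> { typeof(string), typeof(long) } }
        };
        ctx.DeclareComponentSchema(new ComponentStreamSchema(input, output));
    }

    public void Execute(SCPTuple tuple) { /* not exercised by this sketch */ }
}

// Hypothetical component under test: its input schema matches the emitter's output.
public class SectorTimeBolt : ISCPBolt
{
    public SectorTimeBolt(Context ctx)
    {
        var input = new Dictionary<string, List<Type>>
        {
            { Constants.DEFAULT_STREAM_ID, new List<Type> { typeof(string), typeof(long) } }
        };
        var output = new Dictionary<string, List<Type>>
        {
            { Constants.DEFAULT_STREAM_ID, new List<Type> { typeof(string), typeof(long) } }
        };
        ctx.DeclareComponentSchema(new ComponentStreamSchema(input, output));
    }

    public void Execute(SCPTuple tuple) { /* business logic would go here */ }
}

public static class LocalContextSketch
{
    public static List<SCPTuple> RoundTrip(string path)
    {
        // 1. Configure a context from the emitter, so the tuples get its output schema.
        var emitterContext = LocalContext.Get();
        new TimingEventBolt(emitterContext);

        // 2. Emit straight to the context (the emitter's logic never runs), then persist.
        for (var i = 0; i < 100; i++)
        {
            emitterContext.Emit(new Values("sector-" + (i % 3), (long)i));
        }
        emitterContext.WriteMsgQueueToFile(path);

        // 3. Configure a fresh context for the component under test and read the tuples back.
        var boltContext = LocalContext.Get();
        new SectorTimeBolt(boltContext);
        boltContext.ReadFromFileToMsgQueue(path);
        return boltContext.RecvFromMsgQueue();
    }
}
```

Calling LocalContextSketch.RoundTrip("timing-events.txt") from a test fixture gives you the tuples to pass to the bolt's Execute() method.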

Now you have a list of tuples which are correctly formatted to match the expected input, and you can dispense with LocalContext, re-creating your component with a mock context that you can verify against.

In this case, my bolt is a batching component, and it should only emit tuples when the tick stream fires, which is why I verify the bolt never calls Emit(). That's a common pattern, and you can test the tick scenario entirely with a mock context and a StormTuple object.
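To show the shape of those two assertions without depending on any particular mocking library, here's a self-contained sketch using a hand-rolled recording context. None of these types come from SCP.NET: IEmitContext, RecordingContext and BatchingBolt are hypothetical stand-ins, and the "__tick" constant is just this sketch's name for the tick stream. A real test would pass a mocked SCP context to the real bolt instead.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the slice of the context the bolt uses.
public interface IEmitContext
{
    void Emit(List<object> values);
}

// Hand-rolled fake that records every Emit() call so the test can verify against it.
public sealed class RecordingContext : IEmitContext
{
    public List<List<object>> Emitted { get; } = new List<List<object>>();

    public void Emit(List<object> values) => Emitted.Add(values);
}

// Hypothetical batching bolt: accumulates per-sector totals, emits only on a tick.
public sealed class BatchingBolt
{
    public const string TickStream = "__tick"; // the sketch's own constant

    private readonly IEmitContext _context;
    private readonly Dictionary<string, long> _totals = new Dictionary<string, long>();

    public BatchingBolt(IEmitContext context)
    {
        _context = context;
    }

    public void Execute(string streamId, string sector, long elapsedMs)
    {
        if (streamId == TickStream)
        {
            Flush();
            return;
        }

        // Data tuple: update the running total, emit nothing.
        _totals.TryGetValue(sector, out var current);
        _totals[sector] = current + elapsedMs;
    }

    private void Flush()
    {
        foreach (var pair in _totals)
        {
            _context.Emit(new List<object> { pair.Key, pair.Value });
        }
        _totals.Clear();
    }
}

public static class BatchingBoltSketch
{
    // Returns the number of Emit() calls seen before and after the tick.
    public static (int beforeTick, int afterTick) Run()
    {
        var context = new RecordingContext();
        var bolt = new BatchingBolt(context);

        bolt.Execute("default", "sector-1", 100);
        bolt.Execute("default", "sector-1", 50);
        bolt.Execute("default", "sector-2", 70);
        var beforeTick = context.Emitted.Count;

        bolt.Execute(BatchingBolt.TickStream, null, 0);
        return (beforeTick, context.Emitted.Count);
    }
}
```

BatchingBoltSketch.Run() returns (0, 2): nothing is emitted while data tuples arrive, and the tick flushes one tuple per sector. With a mocking library the first assertion becomes a "never called" verification on Emit(), and the second a verification of the expected call count.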

The joins between components in Storm are what actually make the application, so integration tests at that level are just as important as unit tests. I'll be posting soon on how to integration test those joins, and the generation of the topology spec.