Apache MRUnit 0.9.0-incubating has been released!

We (the Apache MRUnit team) have just released Apache MRUnit 0.9.0-incubating (tarball, nexus, javadoc). Apache MRUnit is a Java library, developed as an Apache Incubator project, that helps developers unit test Apache Hadoop MapReduce jobs. Unit testing is a technique for improving project quality and reducing overall costs by writing a small amount of code that automatically verifies that the software you write performs as intended. It is considered a best practice in software development because it helps identify defects early, before they are deployed to a production system.

The MRUnit project is quite active: 0.9.0 is our fourth release since entering the incubator, and we have added 4 new committers beyond the project's initial charter! We are very interested in having new contributors and committers join the project. Please join our mailing list to find out how you can help!

The MRUnit build process has changed to produce mrunit-0.9.0-hadoop1.jar and mrunit-0.9.0-hadoop2.jar instead of mrunit-0.9.0-hadoop020.jar, mrunit-0.9.0-hadoop100.jar and mrunit-0.9.0-hadoop023.jar. The hadoop1 classifier is for all Apache Hadoop versions based off the 0.20.X line including 1.0.X. The hadoop2 classifier is for all Apache Hadoop versions based off the 0.23.X line including the unreleased 2.0.X.

This release contains 2 new features, 15 improvements and 6 bug fixes. I will highlight a few below:

Support custom counter checking

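Previously, verifying a custom counter meant reading the counter values back after the run and asserting on them by hand, but this is quite tedious. As such, Jarek Jarcec Cecho (our second newest committer) added this feature directly to the drivers: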

.withCounter(CustomMapper.CustomCounter.Name,2);

runTest() should optionally ignore output order

Before this change, MRUnit required Mapper/Reducer classes to output key-value pairs in the order specified in the test. A well-defined output order is common, but not strictly universal. Dave Beech (our newest committer) contributed a patch so that you can optionally turn this ordering requirement off by using:

.runTest(false)

instead of

.runTest()
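To make the difference between the two modes concrete, here is a minimal sketch of ordered versus order-insensitive comparison of output pairs. It uses only the JDK and is not MRUnit's actual implementation; the class OutputComparison and its methods pair and matches are hypothetical names chosen for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical helper, NOT part of MRUnit: compares expected and actual output pairs.
final class OutputComparison {
    private OutputComparison() {}

    // Convenience factory for a key-value pair.
    static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new SimpleEntry<>(key, value);
    }

    static <K, V> boolean matches(List<Map.Entry<K, V>> expected,
                                  List<Map.Entry<K, V>> actual,
                                  boolean orderMatters) {
        if (orderMatters) {
            // Strict mode: same pairs in the same positions.
            return expected.equals(actual);
        }
        // Relaxed mode: treat both lists as multisets. Remove each actual pair
        // from a copy of the expected list; duplicates are matched one-for-one.
        List<Map.Entry<K, V>> remaining = new ArrayList<>(expected);
        for (Map.Entry<K, V> p : actual) {
            if (!remaining.remove(p)) {
                return false; // a pair that was not expected (or one too many)
            }
        }
        return remaining.isEmpty(); // nothing expected may be left over
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> expected = Arrays.asList(pair("a", 1), pair("b", 2));
        List<Map.Entry<String, Integer>> actual = Arrays.asList(pair("b", 2), pair("a", 1));
        System.out.println(matches(expected, actual, true));  // false
        System.out.println(matches(expected, actual, false)); // true
    }
}
```

The relaxed mode still checks that every expected pair appears exactly as many times as specified; only the position is ignored.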

Driver.runTest throws RuntimeException; should it throw AssertionError?

Previous versions of MRUnit threw a RuntimeException when a test failed. This worked well, but it meant that testing frameworks reported the test as having errored, not failed. We have changed this to AssertionError so that testing frameworks report such tests as failed. The distinction is small but important.
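The distinction can be illustrated with a miniature test runner. This is a sketch only, loosely modeled on how JUnit-style frameworks separate failures from errors; MiniRunner and Outcome are hypothetical names, and no real framework's code is reproduced here.

```java
// Hypothetical miniature runner, for illustration only.
final class MiniRunner {
    enum Outcome { PASSED, FAILED, ERRORED }

    private MiniRunner() {}

    static Outcome run(Runnable test) {
        try {
            test.run();
            return Outcome.PASSED;
        } catch (AssertionError e) {
            // An assertion did not hold: the code under test gave a wrong result.
            return Outcome.FAILED;
        } catch (RuntimeException e) {
            // Something unexpected broke while the test was running.
            return Outcome.ERRORED;
        }
    }

    public static void main(String[] args) {
        // Old behaviour: an output mismatch surfaced as a RuntimeException.
        System.out.println(run(() -> { throw new RuntimeException("output mismatch"); })); // ERRORED
        // New behaviour: the same mismatch surfaces as an AssertionError.
        System.out.println(run(() -> { throw new AssertionError("output mismatch"); }));   // FAILED
    }
}
```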

o.a.h.mrunit.mapreduce.MapReduceDriver should support a combiner

Previously, MRUnit supported a combiner only in the mapred MapReduceDriver class. The mapreduce MapReduceDriver now supports one as well, via:

MapReduceDriver.newMapReduceDriver(mapper,reducer,combiner)

or

.withCombiner(combiner) or .setCombiner(combiner)

Better support for other serializations besides Writable

Previous versions of MRUnit did not support JavaSerialization, Avro, or other serialization frameworks well. We improved alternative serialization support by no longer forcing K2 in MapReduceDriver to be Comparable and by supporting serializations that cannot clone into an existing object or that do not have default constructors.

Better error messages from validate, null checking and forgetting to set mappers and reducers

We have improved the checking of parameters passed to MRUnit and the error messages produced when those parameters are invalid. MRUnit now throws a NullPointerException immediately when it receives a null value, and throws an IllegalStateException, instead of a NullPointerException, when no mapper or reducer class is provided.
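The fail-fast pattern described above can be sketched as follows. SketchDriver is a hypothetical stand-in rather than an MRUnit class, and it uses a java.util.function.Function in place of a real Mapper so that the example runs with the JDK alone.

```java
import java.util.Objects;
import java.util.function.Function;

// Hypothetical stand-in for a test driver; NOT an MRUnit class.
final class SketchDriver<I, O> {
    private Function<I, O> mapper;
    private I input;

    SketchDriver<I, O> withMapper(Function<I, O> mapper) {
        // Reject null at the call site, where the mistake is easy to locate.
        this.mapper = Objects.requireNonNull(mapper, "mapper must not be null");
        return this;
    }

    SketchDriver<I, O> withInput(I input) {
        this.input = Objects.requireNonNull(input, "input must not be null");
        return this;
    }

    O run() {
        // A missing mapper is a configuration problem, so report it as such
        // instead of letting a NullPointerException surface later.
        if (mapper == null) {
            throw new IllegalStateException("No mapper was provided; call withMapper(...) first");
        }
        if (input == null) {
            throw new IllegalStateException("No input was provided; call withInput(...) first");
        }
        return mapper.apply(input);
    }

    public static void main(String[] args) {
        try {
            new SketchDriver<String, Integer>().withInput("abc").run();
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Throwing at the point of misuse keeps the stack trace close to the line in the test that caused the problem.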

Add static convenience methods to PipelineMapReduceDriver class

We added static convenience factory methods similar to those in the other driver classes:

PipelineMapReduceDriver.newPipelineMapReduceDriver()

or

PipelineMapReduceDriver.newPipelineMapReduceDriver(list of Pair<Mapper,Reducer>)

The OutputFromString and InputFromString methods are now deprecated because they required Text inputs or outputs, with no way to enforce that the inputs or outputs of a mapper or reducer were actually Text. These methods also provided little convenience, since a user can simply wrap the intended string with new Text(string).