
Documentation

NoSQLUnit Core

Overview

Unit testing is a method by which the smallest testable part of an
application is validated. Unit tests must follow the FIRST rules:
Fast, Isolated, Repeatable, Self-Validating and Timely.

It is hard to imagine a JEE application without a persistence layer
(typically a relational database or, more recently, a NoSQL database),
so it is worth writing unit tests for the persistence layer too. When
writing unit tests for the persistence layer we should take care not to
break two of the FIRST rules: fast and isolated.

Our tests will be fast if they do not access the network or the
filesystem, and in persistence systems the network and the filesystem
are the most heavily used resources. For RDBMS ( SQL ), many Java
in-memory databases exist, such as Apache Derby , H2 or HSQLDB . As
their name suggests, these databases are embedded in your program and
store data in memory, so your tests remain fast. NoSQL systems are a
bigger problem because of their heterogeneity: some follow a document
approach (like MongoDb ), others a column approach (like Hbase ), and
others a graph approach (like Neo4J ). For this reason an in-memory mode
must be provided by each vendor; there is no generic solution.

Our tests must be isolated from each other. It is not acceptable that
one test method modifies the result of another test method. In
persistence tests this scenario occurs when one test method inserts an
entry into the database and the next test method sees that change. So
before each test is executed, the database should be in a known state.
Note that if your tests find the database in a known state, they are
also repeatable; if a test assertion depends on a previous test's
execution, each run will be unique. For homogeneous systems like
RDBMS , DBUnit exists to keep the database in a known state before each
execution, but there is no DBUnit-like framework for heterogeneous
NoSQL systems.

NoSQLUnit solves this problem by providing a JUnit extension that
helps us manage the lifecycle of NoSQL systems and also takes care of
keeping databases in a known state.

Requirements

To run NoSQLUnit , JUnit 4.10 or later is required. This is because
NoSQLUnit uses Rules , whose API changed in version 4.10.

Although it should work with JDK 5 , the jars are compiled with JDK 6 .

NoSQLUnit

NoSQLUnit is a JUnit extension that makes it easier to write unit and
integration tests for applications that use a NoSQL backend. It is
composed of two sets of Rules and a group of annotations.

The first set of Rules are those responsible for managing the database
lifecycle; there are two for each supported backend.

The first one (where possible) is the in-memory mode. This mode
takes care of starting and stopping the database system in
" in-memory " mode and will typically be used during unit testing.

The second one is the managed mode. This mode is in charge of
starting the NoSQL server as a separate process (on the local machine)
and stopping it. It will typically be used during integration testing.

You can add them to test suites and/or test classes; NoSQLUnit takes care of starting each database only once.

The second set of Rules are those responsible for keeping the database
in a known state. Each supported backend has its own, which can be
understood as a connection to the configured database used to execute
the operations required to keep the system stable.

Note that because NoSQL databases are heterogeneous, each system will
require its own implementation.

Seeding Database

@UsingDataSet is used to seed the database with a defined data set. In
brief, data sets are files that contain all the data to be inserted into
the configured database. To seed your database, use the @UsingDataSet
annotation; you can define it either on the test method itself or at the
class level. If it is defined at both, the test-level annotation takes
precedence. This annotation has two attributes, locations and
loadStrategy .

With the locations attribute you can specify the classpath location of
the datasets. Locations are resolved relative to the test class. Note
that more than one dataset can be specified.

The withSelectiveLocations attribute can also be used to specify dataset locations. See the Advanced Usage chapter for more information.

If files are not specified explicitly, the following strategy is
applied:

First, a file is searched for on the classpath, in the same package as
the test class, with the name [test class name]#[test method
name].[format]
(only if the annotation is present at the test method).

If the first rule is not met, or the annotation is defined at class
scope, the file [test class name].[default format] is searched for on
the classpath in the same package as the test class.

Warning

Datasets must reside on the classpath, and their format depends on the
NoSQL vendor.

The loadStrategy attribute accepts one of the following strategies:

INSERT Inserts the defined datasets before executing any test method.
DELETE_ALL Deletes all elements of the database before executing any test method.
CLEAN_INSERT This is the most commonly used strategy. It deletes all elements of the database and then inserts the defined datasets before executing any test method.
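As a sketch of how the annotation is used (the annotation and enum come from NoSQLUnit; the dataset file name and test class are made-up examples):

```java
import org.junit.Test;

import com.lordofthejars.nosqlunit.annotation.UsingDataSet;
import com.lordofthejars.nosqlunit.core.LoadStrategyEnum;

public class WhenPersonsAreSearched {

    // Before this method runs, the configured database is emptied
    // and seeded with the contents of initialData.json.
    @Test
    @UsingDataSet(locations = "initialData.json",
                  loadStrategy = LoadStrategyEnum.CLEAN_INSERT)
    public void person_should_be_found_by_name() {
        // exercise your persistence layer here
    }
}
```

A corresponding NoSQLUnit rule (MongoDbRule, Neo4jRule, ...) must be registered in the class so the annotation is processed.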

Verifying Database

Asserting database state directly from testing code can imply a huge
amount of work. By using @ShouldMatchDataSet on a test method,
NoSQLUnit will check whether the database contains the expected
entries after test execution. As with the @UsingDataSet annotation, you
can define a classpath file location, or use withSelectiveMatcher (see
the Advanced Usage chapter for more information).
If no dataset is supplied, the following convention is used:

First, a file is searched for on the classpath, in the same package as
the test class, with the name [test class name]#[test method
name]-expected.[format]
(only if the annotation is present at the test method).

If the first rule is not met, or the annotation is defined at class
scope, the file [test class name]-expected.[default format] is searched
for on the classpath in the same package as the test class.

Warning

Datasets must reside on the classpath, and their format depends on the
NoSQL vendor.

An example of usage:

@ShouldMatchDataSet(location="my_expected_data_set.json")

MongoDB Engine

MongoDB

MongoDB is a NoSQL database that stores structured data as JSON-like
documents with dynamic schemas.

Notice that if attribute values are integers, double quotes are not
required.
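A minimal MongoDB dataset sketch (the collection and field names are made up; each top-level key names a collection and its array holds the documents to insert):

```json
{
    "book": [
        { "title": "The Hobbit", "numberOfPages": 293 },
        { "title": "The Lord of the Rings" }
    ]
}
```

Note that numberOfPages is an integer and therefore written without double quotes.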

If you want to use the ISODate function or any other JavaScript function, you should check how the MongoDB Java Driver deals with it. For example, in the case of ISODate:

"bornAt":{ "$date" : "2011-01-05T10:09:15.210Z"}

With the latest versions of MongoDB, index support is also implemented, allowing developers to define indexes on document properties. For more information visit the MongoDB documentation. For this case the dataset format has been extended to let us define indexes too.

Note that we define the collection name and then two subdocuments. The first one defines an array of indexes, all of them related to the defined collection, stating which fields are to be indexed (in the same document form as defined in the MongoDB index specification). Then comes the data property, where we define all the documents that go into the collection under test.
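A dataset sketch with indexes (the indexes and data property names follow the description above; the collection and field names are made up):

```json
{
    "book": {
        "indexes": [
            { "index": { "title": 1 } }
        ],
        "data": [
            { "title": "The Hobbit", "numberOfPages": 293 },
            { "title": "The Lord of the Rings" }
        ]
    }
}
```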

Getting Started

Lifecycle Management Strategy

The first step is defining which lifecycle management strategy is
required for your tests. Depending on the kind of test you are
implementing (unit test, integration test, deployment test, ...) you
will require an in-memory, managed or remote approach.

To configure the in-memory approach you only need to instantiate the
following rule :
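A sketch of the in-memory rule (the builder class and method names are assumptions based on the NoSQLUnit MongoDB module):

```java
import org.junit.ClassRule;

import com.lordofthejars.nosqlunit.mongodb.InMemoryMongoDb;

// builder name assumed from the NoSQLUnit MongoDB module
import static com.lordofthejars.nosqlunit.mongodb.InMemoryMongoDb.InMemoryMongoRuleBuilder.newInMemoryMongoDbRule;

public class WhenUnitTestsAreExecuted {

    // Starts an in-memory MongoDB instance once for the whole test class
    // and stops it after all tests have run.
    @ClassRule
    public static InMemoryMongoDb inMemoryMongoDb = newInMemoryMongoDbRule().build();
}
```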

In the example we are overriding the MONGO_HOME variable (in case it
has been set) and setting the Mongo home to /opt/mongo . Moreover we
are appending a single argument to the MongoDB executable, in this
case setting the log level to 3 (-vvv). You can also append
property=value arguments using the
appendCommandLineArguments(String argumentName, String
argumentValue)
method.
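The managed-mode rule described above might look like this (the mongodPath and appendSingleCommandLineArguments method names are taken from the NoSQLUnit MongoDB module; the builder import is an assumption):

```java
import org.junit.ClassRule;

import com.lordofthejars.nosqlunit.mongodb.ManagedMongoDb;

// builder name assumed from the NoSQLUnit MongoDB module
import static com.lordofthejars.nosqlunit.mongodb.ManagedMongoDb.MongoServerRuleBuilder.newManagedMongoDbRule;

public class WhenIntegrationTestsAreExecuted {

    // Starts a local mongod process from /opt/mongo with log level -vvv
    // and stops it when all tests of the class have run.
    @ClassRule
    public static ManagedMongoDb managedMongoDb = newManagedMongoDbRule()
            .mongodPath("/opt/mongo")
            .appendSingleCommandLineArguments("-vvv")
            .build();
}
```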

Warning

When you are specifying command line arguments, remember to add a dash
(-) or double dash (--) where necessary.

Configuring the remote approach does not require any special rule
because you (or a system like Maven ) are responsible for starting and
stopping the server. This mode is used in deployment tests, where you
are testing your application in a real environment.

Configuring MongoDB Connection

The next step is configuring the MongoDB rule, which is in charge of
keeping the MongoDB database in a known state by inserting and
deleting the defined datasets. You must register the MongoDbRule JUnit
rule class, which requires a configuration parameter with information
such as host, port or database name.

To make the developer's life easier and the code more readable, a
fluent interface can be used to create these configuration objects. Two
different kinds of configuration builders exist.

The first one is for configuring a connection to the in-memory Fongo
server. For almost all cases the default parameters are enough.
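A sketch of both builders (the defaultEmbeddedMongoDb and defaultManagedMongoDb method names are assumptions based on the NoSQLUnit fluent API; "test" is a made-up database name):

```java
import org.junit.Rule;

import com.lordofthejars.nosqlunit.mongodb.MongoDbRule;

// builder name assumed from the NoSQLUnit MongoDB module
import static com.lordofthejars.nosqlunit.mongodb.MongoDbRule.MongoDbRuleBuilder.newMongoDbRule;

public class WhenDataIsSeeded {

    // Connection to the in-memory (Fongo) server with default parameters.
    @Rule
    public MongoDbRule embeddedMongoDbRule = newMongoDbRule().defaultEmbeddedMongoDb("test");

    // For a managed/remote server the second builder would be used instead
    // (only one rule is normally registered per test class):
    // @Rule
    // public MongoDbRule managedMongoDbRule = newMongoDbRule().defaultManagedMongoDb("test");
}
```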

In the previous test we defined that
MongoDB will be managed by the test, starting an instance of the
server located at /opt/mongo . Moreover we set an
initial dataset in the file initialData.json ,
located on the classpath at
com/lordofthejars/nosqlunit/demo/mongodb/initialData.json ,
and an expected dataset called expectedData.json .

Replica Set

Introduction

Database replication in MongoDB adds redundancy and high availability of the data.
Instead of a traditional master-slave architecture, MongoDB implements the Replica Set architecture,
which can be understood as a more sophisticated master-slave replication. For more information about
Replica Sets, read the MongoDB documentation.

Set up and Start Replica Set architecture

In NoSQLUnit we can define a replica set architecture and start it up, so our tests are executed against the replica set servers instead of a single server. Due to the nature of replica sets, we can only create a replica set of managed servers.

So let's see how to define an architecture and start all related servers. The main class is ReplicaSetManagedMongoDb, which manages the lifecycle of all servers involved in the replica set. To build a ReplicaSetManagedMongoDb instance, the ReplicaSetBuilder builder class is provided; it allows us to define the replica set architecture. Using it we can set the eligible servers (those that can be primary or secondary), secondary-only servers, arbiters and hidden servers, and configure each of them with attributes like priority or votes, or set tags.

So let's see an example where we define two eligible servers and one arbiter in a replica set called rs-test.
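A sketch of such a definition (the replicaSet and newManagedMongoDbLifecycle builder methods are assumptions based on the NoSQLUnit replica set API; ports and paths are made up):

```java
import org.junit.ClassRule;

import com.lordofthejars.nosqlunit.mongodb.replicaset.ReplicaSetManagedMongoDb;

// builder names assumed from the NoSQLUnit MongoDB module
import static com.lordofthejars.nosqlunit.mongodb.replicaset.ReplicaSetBuilder.replicaSet;
import static com.lordofthejars.nosqlunit.mongodb.ManagedMongoDbLifecycleManagerBuilder.newManagedMongoDbLifecycle;

public class WhenReplicaSetIsRequired {

    // Two eligible servers and one arbiter in a replica set called rs-test.
    // Each server gets its own port and database path.
    @ClassRule
    public static ReplicaSetManagedMongoDb replicaSetManagedMongoDb = replicaSet("rs-test")
            .eligible(newManagedMongoDbLifecycle()
                    .port(27017).dbRelativePath("rs-0").logRelativePath("log-0").get())
            .eligible(newManagedMongoDbLifecycle()
                    .port(27018).dbRelativePath("rs-1").logRelativePath("log-1").get())
            .arbiter(newManagedMongoDbLifecycle()
                    .port(27019).dbRelativePath("rs-2").logRelativePath("log-2").get())
            .get();
}
```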

Notice that you must define a different port and a different database path for each server. Also note that ReplicaSetManagedMongoDb won't let tests start executing until the whole replica set becomes stable (this can take some minutes).

Then we only have to create a MongoDbRule as usual, which will populate the defined data into the replica set servers. For this case a new configuration builder is provided that allows us to define the Mongo server locations and the write concern used during the seeding phase. By default the Acknowledged write concern is used.
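A sketch of the seeding rule for a replica set (the replicationMongoDbConfiguration builder name is an assumption; the host, ports and database name are made up):

```java
import org.junit.Rule;

import com.lordofthejars.nosqlunit.mongodb.MongoDbRule;

// builder name assumed from the NoSQLUnit MongoDB module
import static com.lordofthejars.nosqlunit.mongodb.replicaset.ReplicationMongoDbConfigurationBuilder.replicationMongoDbConfiguration;

public class WhenReplicaSetIsSeeded {

    // Populates datasets through the replica set members, using the
    // default (Acknowledged) write concern during the seeding phase.
    @Rule
    public MongoDbRule mongoDbRule = new MongoDbRule(
            replicationMongoDbConfiguration()
                    .databaseName("test")
                    .seed("localhost", 27017)
                    .seed("localhost", 27018)
                    .configure());
}
```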

Now we have configured and deployed a replica set and populated it with the dataset.

NoSQLUnit also provides a utility method to cause server failures. It is as easy as calling the shutdownServer method.

replicaSetManagedMongoDb.shutdownServer(27017);

Keep in mind two aspects of using this method:

Because @ClassRule is used, we are responsible for restarting the server by calling startServer.

The system may become unstable, the Mongo driver can throw many exceptions (that's normal, because of the MonitorThread), and some tests may even fail. If you want to wait until all servers become stable again (in real life you won't have this possibility), you can use the following call:

replicaSetManagedMongoDb.waitUntilReplicaSetBecomesStable();

You can also use NoSQLUnit to test your replica set deployment on remote servers, using MongoDbCommands to retrieve the replica set configuration.

Sharding

Introduction

Sharding is another way of replication, but in this case we are scaling horizontally: MongoDB partitions a collection and stores the different portions on different machines. From a logical point of view the client sees only one single database, but internally a cluster of machines is used, with the data spread across the whole system.

To run sharding we must set up a sharded cluster. A sharded cluster is
composed of the following elements:

shards, which are mongod instances that hold a portion of the database collections.

config servers, which store metadata about the cluster.

mongos servers, which determine the location of the required data on the shards.

Apart from setting up a sharding architecture, we also have to register each shard, enable sharding for the database, enable sharding for each collection we want to partition, and define which element of the document is used to calculate the shard key.

Set up and Start Sharding

In NoSQLUnit we can define a sharding architecture and start it up, so our tests are executed against it instead of a single server. Due to the nature of sharding, we can only create sharding for managed servers.

So let's see how to define an architecture and start all related servers. The main class is ShardedManagedMongoDb, which manages the lifecycle of all servers involved in sharding (shards, config servers and mongos). To build a ShardedManagedMongoDb instance, the ShardedGroupBuilder builder class is provided; it allows us to define each server involved in the sharded cluster.

Let's see an example of how to set up and start a system with two shards, one config server and one mongos.
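A sketch of that set-up (the shardedGroup, shard, config and mongos builder methods are assumptions based on the NoSQLUnit sharding API; ports and paths are made up):

```java
import org.junit.ClassRule;

import com.lordofthejars.nosqlunit.mongodb.shard.ShardedManagedMongoDb;

// builder names assumed from the NoSQLUnit MongoDB module
import static com.lordofthejars.nosqlunit.mongodb.shard.ShardedGroupBuilder.shardedGroup;
import static com.lordofthejars.nosqlunit.mongodb.ManagedMongoDbLifecycleManagerBuilder.newManagedMongoDbLifecycle;
import static com.lordofthejars.nosqlunit.mongodb.ManagedMongosLifecycleManagerBuilder.newManagedMongosLifecycle;

public class WhenShardedClusterIsRequired {

    // Two shards, one config server and one mongos router.
    // The mongos only needs the config server port, not a database path.
    @ClassRule
    public static ShardedManagedMongoDb shardedManagedMongoDb = shardedGroup()
            .shard(newManagedMongoDbLifecycle()
                    .port(27017).dbRelativePath("shard-0").logRelativePath("log-0").get())
            .shard(newManagedMongoDbLifecycle()
                    .port(27018).dbRelativePath("shard-1").logRelativePath("log-1").get())
            .config(newManagedMongoDbLifecycle()
                    .port(27020).dbRelativePath("config-0").logRelativePath("log-c0").get())
            .mongos(newManagedMongosLifecycle().configServer(27020).get())
            .get();
}
```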

Notice that you must define a different port and a different database path for each server. Also note that for mongos you must set the config server port, and it is not necessary to set up the database path.

And finally we only have to create a MongoDbRule as usual, which will populate the defined data into the sharding servers. For this case we must use the same builder used for replica sets, but enabling sharding. Keep in mind that we only have to register the mongos instances, not the shards or config servers.

For each collection you define which attributes are used to calculate the shard key by using the shard-key-pattern attribute, and finally, using the data attribute, we set the documents that will be inserted into the collection.
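A dataset sketch with a shard key (the collection and field names are made up; the shard-key-pattern and data attributes follow the description above):

```json
{
    "book": {
        "shard-key-pattern": [ "title" ],
        "data": [
            { "title": "The Hobbit", "numberOfPages": 293 },
            { "title": "The Lord of the Rings", "numberOfPages": 1178 }
        ]
    }
}
```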

If we use this dataset as an expected dataset, shard-key-pattern is ignored and only the data documents are used for comparison.

Replicated Sharded Cluster

Introduction

The third way of replication is a hybrid: each shard contains an n-member replica set, and, as in sharding, at least one config server and one mongos server are required.

Set up and Start a Replicated Sharded Cluster

In NoSQLUnit we can define a replicated sharded cluster architecture and start it up, so our tests are executed against it instead of a single server. Due to the nature of a replicated sharded cluster, we can only create sharding for managed servers.

So let's see how to define an architecture and start all related servers. The main class is ShardedManagedMongoDb, which manages the lifecycle of all servers involved in sharding (shards, config servers and mongos). To build a ShardedManagedMongoDb instance, the ShardedGroupBuilder builder class is provided; it allows us to define each server involved in sharding, but in contrast with plain sharding, we need to add a replica set instead of a shard. For this reason ReplicaSetManagedMongoDb is also used.

Let's see an example of how to set up two replicated sharded clusters with one member in each replica set (of course, in a production environment you would have more), one config server and one mongos.

key : description of graph element properties. You must define whether
the property type is for nodes or relationships, its name, and the
type of the element. In our case string, int, long, float, double and
boolean are supported.

graph : the beginning of the graph representation. In our case only
one level of graph is supported; inner graphs will be ignored.

node : the beginning of a vertex representation. Please note that id 0
is reserved for the reference node, so it cannot be used as an id.

edge : the beginning of an edge representation. The source and target
attributes are filled with node ids. If you want to link to the
reference node, use 0, which is the id of the root node. Note that the
label attribute is not defined in the standard GraphML specification;
GraphML supports adding new attributes to all GraphML elements, and
the label attribute has been added to facilitate the creation of edge
labels.

data : the key/value data associated with a graph element. The data
value will be validated against the type defined in the key element.

attr.autoindexName : this attribute is optional and can only be set in
the key element. It creates an index with the given name for
properties of that type for all nodes or edges.

index : this tag is optional and creates an index with the given name,
key and value in the node or edge where it is declared.
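Putting the elements above together, a dataset sketch might look like this (the key ids, property names and values are made up; note the non-standard label attribute on the edge):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
    <key id="name" for="node" attr.name="name" attr.type="string"/>
    <key id="weight" for="edge" attr.name="weight" attr.type="float"/>
    <graph id="G" edgedefault="directed">
        <node id="1">
            <data key="name">Neo</data>
        </node>
        <node id="2">
            <data key="name">Morpheus</data>
        </node>
        <!-- label is a NoSQLUnit extension used as the relationship type -->
        <edge id="0" source="1" target="2" label="KNOWS">
            <data key="weight">0.8</data>
        </edge>
    </graph>
</graphml>
```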

Getting Started

Lifecycle Management Strategy

The first step is defining which lifecycle management strategy is
required for your tests. Depending on the kind of test you are
implementing (unit test, integration test, deployment test, ...) you
will require an in-memory, embedded, managed or remote approach.

In-memory Lifecycle

To configure the in-memory approach you only need to instantiate the
following rule :

Target path The directory where the Neo4j server is started; target/neo4j-temp .

: Default Embedded Values

Managed Lifecycle

To configure the managed way, two possible approaches can be used:

The first one is using an embedded database wrapped by a server .
This is a way to give an embedded database visibility through the
network (internally a WrappingNeoServerBootstrapper instance is created) :
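A sketch of the wrapped-server rule (the class and builder names are assumptions based on the NoSQLUnit Neo4j module; the port is made up):

```java
import org.junit.ClassRule;

import com.lordofthejars.nosqlunit.neo4j.ManagedWrappingNeoServer;

// builder name assumed from the NoSQLUnit Neo4j module
import static com.lordofthejars.nosqlunit.neo4j.ManagedWrappingNeoServer.ManagedWrappingNeoServerRuleBuilder.newWrappingNeoServerNeo4jRule;

public class WhenWrappedServerIsRequired {

    // Wraps an embedded Neo4j database with a server listening on port 8888,
    // giving it visibility through the network for the test class.
    @ClassRule
    public static ManagedWrappingNeoServer managedWrappingNeoServer =
            newWrappingNeoServerNeo4jRule().port(8888).build();
}
```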

By default the managed Neo4j rule uses the following default values,
but they can be configured programmatically as shown in the previous
example :

Target path The directory where the Neo4j process will be started; by default target/neo4j-temp .
Port The port where the server listens for incoming messages; 7474.
Neo4jPath The Neo4j installation directory, which by default is retrieved from the NEO4J_HOME system environment variable.

: Default Managed Values

Warning

In versions prior to Neo4j 1.8 the port cannot be configured from the
command line; it must be changed manually in
conf/neo4j-server.properties . Despite this restriction, if you have
configured Neo4j to run on a different port, that port must also be
specified in the ManagedNeoServer rule.

Remote Lifecycle

Configuring the remote approach does not require any special rule
because you (or a system like Maven ) are responsible for starting and
stopping the server. This mode is used in deployment tests, where you
are testing your application in a real environment.

Configuring Neo4j Connection

The next step is configuring the Neo4j rule, which is in charge of
keeping the Neo4j graph in a known state by inserting and deleting the
defined datasets. You must register the Neo4jRule JUnit rule class,
which requires a configuration parameter with information such as
host, port, uri or target directory.

To make the developer's life easier and the code more readable, a
fluent interface can be used to create these configuration objects. Two
different kinds of configuration builders exist.

In-Memory/Embedded Connection

The first one is for configuring a connection to an in-memory/embedded
Neo4j .

If you are only registering one embedded Neo4j instance, as in the
previous example , calling build is enough. If you are using more than
one embedded Neo4j connection, as explained in the Simultaneous
Engine section, the targetPath shall be provided using the
buildFromTargetPath method.

If you are mixing the in-memory approach with the embedded approach,
the target path for the in-memory instance can be found in the
InMemoryNeo4j.INMEMORY_NEO4J_TARGET_PATH variable.
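A sketch of both variants (the builder and method names are assumptions based on the NoSQLUnit Neo4j module):

```java
import org.junit.Rule;

import com.lordofthejars.nosqlunit.neo4j.InMemoryNeo4j;
import com.lordofthejars.nosqlunit.neo4j.Neo4jRule;

// builder name assumed from the NoSQLUnit Neo4j module
import static com.lordofthejars.nosqlunit.neo4j.Neo4jRule.Neo4jRuleBuilder.newNeo4jRule;

public class WhenEmbeddedNeo4jIsSeeded {

    // Single embedded instance: the default build is enough.
    @Rule
    public Neo4jRule neo4jRule = newNeo4jRule().defaultEmbeddedNeo4j();

    // Mixed in-memory/embedded set-up: point the rule at the in-memory
    // instance's target path instead (only one rule per connection):
    // @Rule
    // public Neo4jRule inMemoryRule = newNeo4jRule()
    //         .buildFromTargetPath(InMemoryNeo4j.INMEMORY_NEO4J_TARGET_PATH);
}
```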

Managed/Remote Connection

The second one is for configuring a connection to a remote Neo4j
server (it is irrelevant at this level whether it is wrapped or not).
The default values are:

Spring Connection

If you are planning to use Spring Data Neo4j, you may need to use the GraphDatabaseService defined within the Spring application context, mostly because you are defining an embedded connection using the Spring namespace:

<neo4j:config storeDirectory="target/config-test"/>

In these cases you should use a special method which gets the GraphDatabaseService instance instead of creating a new one.

Note that you need to autowire the application context, so NoSQLUnit can inject the instance defined within the application context into the Neo4jRule.

Verifying Graph

@ShouldMatchDataSet is also supported for Neo4j graphs, but we should
keep some considerations in mind.

To compare two graphs, the stored graph is exported into GraphML
format and then compared with the expected GraphML using the XmlUnit
framework. This approach implies two aspects to consider. The first is
that even if your graph contains no connection to the reference node,
the reference node will still appear, in the form
( <node id="0"></node> ). The other is that ids are Neo4j's internal
ids, so when you write the expected file, remember to follow the same
id strategy used by Neo4j , so that the id attribute of each node can
be matched correctly against the generated output. Inserted nodes' ids
start from 1 (0 is reserved for the reference node), while edges' ids
start from 0.

The way graphs are compared may change in the future (although this
strategy will always be supported).

As I have noted in the verification section, I find that using
@ShouldMatchDataSet is a bad approach during testing because test
readability is affected negatively. So, as a general guide, my advice
is to avoid using @ShouldMatchDataSet in your tests as much as
possible.

Full Example

To show how to use NoSQLUnit with Neo4j , we are going to create a
very simple application that counts Neo's friends.

MatrixManager is the business class responsible for inserting new
friends and counting the number of Neo's friends.

Getting Started

Lifecycle Management Strategy

The first step is defining which lifecycle management strategy is
required for your tests. Depending on the kind of test you are
implementing (unit test, integration test, deployment test, ...) you
will require an embedded, managed or remote approach.

Embedded Lifecycle

To configure the embedded approach you only need to instantiate the
following rule :

Target path The directory where the Cassandra server is started; target/cassandra-temp .
Cassandra Configuration File The location of the yaml configuration file. By default a configuration file with correct default parameters is provided.
Host localhost
Port By default 9171. The port cannot be configured, and cannot be changed even if you provide an alternative Cassandra configuration file.

Managed Lifecycle

By default the managed Cassandra rule uses the following default
values, but they can be configured programmatically:

Target path The directory where the Cassandra server is started; target/cassandra-temp .
CassandraPath The Cassandra installation directory, which by default is retrieved from the CASSANDRA_HOME system environment variable.
Port By default 9160. If the port is changed in the Cassandra configuration file, it should be configured here too.

: Default Managed Values

Warning

To start Cassandra , java.home must be set. Normally this variable is
already configured, so you would need to do nothing.

Remote Lifecycle

Configuring the remote approach does not require any special rule
because you (or a system like Maven ) are responsible for starting and
stopping the server. This mode is used in deployment tests, where you
are testing your application in a real environment.

Configuring Cassandra Connection

The next step is configuring the Cassandra rule, which is in charge of
keeping the Cassandra data in a known state by inserting and deleting
the defined datasets. You must register the CassandraRule JUnit rule
class, which requires a configuration parameter with information such
as host, port or cluster name.

To make the developer's life easier and the code more readable, a
fluent interface can be used to create these configuration objects.
Three different kinds of configuration builders exist.

The port parameter is already configured with the default value of the
managed lifecycle. If the port is changed, this class provides a
method to set it. Note that the host parameter must be specified in
this case.

Verifying Data

@ShouldMatchDataSet is also supported for Cassandra data, but we
should keep some considerations in mind.

Warning

In NoSQLUnit , expectations can only be used over data, not over
configuration parameters, so fields set in the dataset file such as
compactionStrategy, gcGraceSeconds or maxCompactionThreshold are not
used. They may be supported in the future, but for now only data
(keyspace, column family name, columns, supercolumns, ...) is
supported.

Full Example

To show how to use NoSQLUnit with Cassandra , we are going to
create a very simple application.

PersonManager is the business class responsible for getting and
updating a person's car.

The root element must be called data , and then, depending on the kind
of structured data we need to store, one or more of the following
elements should appear. Note that the key field is used to set the key
of the element, and the value field is used to set a value.

simple : in case we want to store simple key/value elements. This
element contains an array of key/value entries.

list : in case we want to store a key with a list of values. This
element contains a key field for the key name and a values field
with an array of values.

set : in case we want to store a key with a set of values (no
duplicates allowed). The structure is the same as for the list
element.

sortset : in case we want to store a key with a sorted set. This
element contains the key and an array of values, each of which, apart
from the value field, also contains a score field of type Number
that sets the order within the sorted set.

hash : in case we want to store a key with a map of field/value
pairs. In this case the field element sets the field name and value
sets the value of that field.
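Putting the elements above together, a dataset sketch might look like this (the keys and values are made up; the structure follows the element descriptions above):

```json
{
    "data": [
        { "simple": [
            { "key": "greeting", "value": "hello" }
        ]},
        { "list": [
            { "key": "colors", "values": [ { "value": "red" }, { "value": "blue" } ] }
        ]},
        { "sortset": [
            { "key": "ranking", "values": [
                { "score": 1, "value": "first" },
                { "score": 2, "value": "second" }
            ]}
        ]},
        { "hash": [
            { "key": "user:1", "values": [
                { "field": "name", "value": "alex" }
            ]}
        ]}
    ]
}
```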

Getting Started

Lifecycle Management Strategy

The first step is defining which lifecycle management strategy is
required for your tests. Depending on the kind of test you are
implementing (unit test, integration test, deployment test, ...) you
will require an embedded, managed or remote approach.

Embedded Lifecycle

To configure the embedded approach you only need to instantiate the
following rule :

Managed Lifecycle

By default the managed Redis rule uses the following default values,
but they can be configured programmatically:

Target path The directory where the Redis server is started; target/redis-temp .
RedisPath The Redis installation directory, which by default is retrieved from the REDIS_HOME system environment variable.
Port By default 6379. If the port is changed in the Redis configuration file, it should be configured here too.
Configuration File By default Redis can work without a configuration file, using default values; but if you need to start Redis with a specific configuration file located in any directory, the file path should be set.

: Default Managed Values

Remote Lifecycle

Configuring the remote approach does not require any special rule
because you (or a system like Maven ) are responsible for starting and
stopping the server. This mode is used in deployment tests, where you
are testing your application in a real environment.

Configuring Redis Connection

The next step is configuring the Redis rule, which is in charge of
keeping the Redis store in a known state by inserting and deleting the
defined datasets. You must register the RedisRule JUnit rule class,
which requires a configuration parameter with information such as
host, port or cluster name.

To make the developer's life easier and the code more readable, a
fluent interface can be used to create these configuration objects.
Three different kinds of configuration builders exist.

Embedded Connection

The first one is for configuring a connection to an embedded Redis .

Dataset Format

The default dataset file format in the HBase module is json. The HBase
dataset is the same as the one used by Cassandra-Unit , but not all
fields are supported: only the fields available in the HBase TSV
application can be set in a dataset.

Getting Started

Lifecycle Management Strategy

The first step is defining which lifecycle management strategy is
required for your tests. Depending on the kind of test you are
implementing (unit test, integration test, deployment test, ...) you
will require an embedded, managed or remote approach.

Embedded Lifecycle

To configure the embedded approach you only need to instantiate the
following rule :

Target path The directory where HBase stores data; target/data .
Host localhost
Port By default 60000.
File Permissions Depending on your umask configuration, HBaseTestingUtility will create some directories that will not be accessible at runtime. By default this value is set to 775, but depending on your OS you may require a different value.

Managed Lifecycle

By default the managed HBase rule uses the following default values,
but they can be configured programmatically:

Target path The directory where the HBase server is started; target/hbase-temp .
HBasePath The HBase installation directory, which by default is retrieved from the HBASE_HOME system environment variable.
Port By default 60000. If the port is changed in the HBase configuration file, it should be configured here too.

: Default Managed Values

Warning

To start HBase , JAVA_HOME must be set. Normally this variable is
already configured, so you would need to do nothing.

Remote Lifecycle

Configuring the remote approach does not require any special rule
because you (or a system like Maven ) are responsible for starting and
stopping the server. This mode is used in deployment tests, where you
are testing your application in a real environment.

Configuring HBase Connection

The next step is configuring the HBase rule, which is in charge of
keeping HBase columns in a known state by inserting and deleting the
defined datasets. You must register the HBaseRule JUnit rule class,
which requires a configuration parameter with some information.

To make the developer's life easier and the code more readable, a
fluent interface can be used to create these configuration objects.
Three different kinds of configuration builders exist.

By default the configuration used is the one loaded by calling the
HBaseConfiguration.create() method, which uses the hbase-site.xml and
hbase-default.xml classpath files.

A setProperty method is also provided to modify any parameter of the
generated configuration object.
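A sketch of overriding a parameter (the builder name and the property key are illustrative assumptions):

```java
import org.junit.Rule;

import com.lordofthejars.nosqlunit.hbase.HBaseRule;

// builder name assumed from the NoSQLUnit HBase module
import static com.lordofthejars.nosqlunit.hbase.EmbeddedHBaseConfigurationBuilder.newEmbeddedHBaseConfiguration;

public class WhenHBaseColumnsAreSeeded {

    // Starts from the HBaseConfiguration.create() defaults and
    // overrides a single parameter of the generated configuration.
    @Rule
    public HBaseRule hBaseRule = new HBaseRule(
            newEmbeddedHBaseConfiguration()
                    .setProperty("hbase.zookeeper.property.clientPort", "2181")
                    .build());
}
```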

Remote Connection

Configuring a connection to a remote HBase uses the same approach as
the ManagedHBase configuration object, but using the
com.lordofthejars.nosqlunit.hbase.RemoteHBaseConfigurationBuilder class
instead of
com.lordofthejars.nosqlunit.hbase.ManagedHBaseConfigurationBuilder.

Warning

Working with Apache HBase requires a bit of knowledge about how it
works. For example, your /etc/hosts file cannot contain a reference to
your host name with the IP 127.0.1.1.

Moreover, NoSQLUnit uses HBase 0.94.1, and this version should also
be installed on your computer to work with the managed or remote
approach. If you install another version, you should exclude these
artifacts from the NoSQLUnit dependencies and add the new ones
manually to your pom file.

Verifying Data

@ShouldMatchDataSet is also supported for HBase data, but we should
keep some considerations in mind.

If you plan to verify data with @ShouldMatchDataSet in the managed and
remote approaches, you should enable the Aggregate coprocessor by
editing the hbase-site.xml file and adding the following lines:
Dataset Format

Notice that if attribute values are integers, double quotes are not
required.

Getting Started

Lifecycle Management Strategy

The first step is defining which lifecycle management strategy is required
for your tests. Depending on the kind of test you are implementing (unit
test, integration test, deployment test, ...) you will require a
managed or remote approach.

There is no in-memory CouchDB instance, so only the managed or remote
lifecycle can be used.

To configure the managed way, you should use the ManagedCouchDb rule,
which may require some configuration
parameters.

CouchDB installation directory is retrieved from COUCHDB_HOME
system environment variable.

Target path, that is the directory where CouchDB server is
started, is target/couchdb-temp .

Port where CouchDB will be started. Note that this parameter is
used only as information; if you change the port in the configuration file,
you should change this parameter too. By default the CouchDB server is
started at port 5984.
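
Putting those parameters together, a managed rule declaration could be sketched as follows; the builder and method names (newManagedCouchDbRule, couchDbPath, port) are assumptions modeled on the other managed rules, and the installation path is a placeholder:

```java
import static com.lordofthejars.nosqlunit.couchdb.ManagedCouchDb.ManagedCouchDbRuleBuilder.newManagedCouchDbRule;

// Sketch: start a CouchDB installed under /usr/local before the test
// class runs. couchDbPath overrides the COUCHDB_HOME lookup.
@ClassRule
public static ManagedCouchDb managedCouchDb = newManagedCouchDbRule()
        .couchDbPath("/usr/local")
        .port(5984)
        .build();
```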

Configuring the remote approach does not require any special rule
because you (or a system like Maven) are responsible for starting and
stopping the server. This mode is used in deployment tests where you are
testing your application in a real environment.

Configuring CouchDB Connection

The next step is configuring the CouchDB rule, which is in charge of maintaining
the CouchDB database in a known state by inserting and deleting defined
datasets. You must register the CouchDbRule JUnit rule class, which
requires a configuration parameter with information like host, port or
database name.

To make developer's life easier and code more readable, a fluent
interface can be used to create these configuration objects.
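
A sketch of such a fluent configuration; the builder name is an assumption and the database name is a placeholder:

```java
import static com.lordofthejars.nosqlunit.couchdb.CouchDbConfigurationBuilder.newCouchDbConfiguration;

// Sketch: connect the rule to the "books" database of a locally
// running CouchDB server.
@Rule
public CouchDbRule couchDbRule = new CouchDbRule(newCouchDbConfiguration()
        .databaseName("books")
        .build());
```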

Dataset Format

The default dataset file format in the Infinispan module is json. With this dataset you can define the key and the value that will be inserted into Infinispan. A value can be a simple type like Integer, String, ..., a collection type, like set and list implementations, or an object (using default Jackson rules, no annotations required).

Note that the first key is inserting an object. You should set its implementation, and set the object properties in json format so Jackson can create the required object. The User object only contains getters and setters for its properties.
The second key is a simple key, in this case an integer.
The third one is a set of strings. Note that we must provide the implementation of the collection, or an ArrayList will be used by default. You can also define objects instead of simple types.
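
A dataset matching that description could be sketched like this; the exact property names, in particular implementation and collection, and the User class are assumptions based on the description above:

```json
{
  "data": [
    { "key": "key1", "value": {
        "implementation": "com.example.User",
        "username": "alex",
        "age": 32
    }},
    { "key": "key2", "value": 1 },
    { "key": "key3", "value": {
        "implementation": "java.util.HashSet",
        "collection": [ "value1", "value2" ]
    }}
  ]
}
```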

Getting Started

Lifecycle Management Strategy

The first step is defining which lifecycle management strategy is required
for your tests. Depending on the kind of test you are implementing (unit
test, integration test, deployment test, ...) you will require an
embedded, managed or remote approach.

Embedded Lifecycle

To configure the embedded approach you should only instantiate the following
rule:

Target path This is the directory used for starting embedded Infinispan, and by default is target/infinispan-test-data/impermanent-db .
Configuration File Configuration file used by Infinispan for configuring the grid. By default no configuration file is provided and the default Infinispan internal values are used.
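
A sketch of that rule declaration; the builder name is assumed by analogy with the other embedded rules:

```java
import static com.lordofthejars.nosqlunit.infinispan.EmbeddedInfinispan.EmbeddedInfinispanRuleBuilder.newEmbeddedInfinispanRule;

// Sketch: start an in-process Infinispan cache manager with the
// default target path and no explicit configuration file.
@ClassRule
public static EmbeddedInfinispan embeddedInfinispan = newEmbeddedInfinispanRule().build();
```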

By default the managed Infinispan rule uses the following default values, but
they can be configured programmatically:

Target path This is the directory where Infinispan server is started and is target/infinispan-temp .
InfinispanPath Infinispan installation directory which by default is retrieved from INFINISPAN_HOME system environment variable.
Port By default port used is 11222.
Protocol By default hotrod is used, and internally NoSQLUnit uses hotrod too, so it is advisable not to change it.
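
For example, overriding the port programmatically might look like this; the builder and method names are assumptions modeled on the other managed rules:

```java
import static com.lordofthejars.nosqlunit.infinispan.ManagedInfinispan.ManagedInfinispanRuleBuilder.newManagedInfinispanRule;

// Sketch: start a locally installed Infinispan server on a
// non-default port, keeping the other default values.
@ClassRule
public static ManagedInfinispan managedInfinispan = newManagedInfinispanRule()
        .port(11223)
        .build();
```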

Remote Lifecycle

Configuring the remote approach does not require any special rule because
you (or a system like Maven) are responsible for starting and stopping
the server. This mode is used in deployment tests where you are testing
your application in a real environment.

Configuring Infinispan Connection

The next step is configuring the Infinispan rule, which is in charge of maintaining
the Infinispan cache in a known state by inserting and deleting defined datasets. You
must register the InfinispanRule JUnit rule class, which requires a
configuration parameter with some information.

To make developer's life easier and code more readable, a fluent
interface can be used to create these configuration objects. Three
different kinds of configuration builders exist.

Embedded Connection

The first one is for configuring a connection to embedded Infinispan.

Embedded Infinispan does not require any special parameter, but you can use the com.lordofthejars.nosqlunit.infinispan.EmbeddedInfinispanConfigurationBuilder class to create a custom configuration object for setting the cache name.

By default port 11222 is used, along with the default configuration provided by Infinispan. You can also set the configuration properties (used by the hotrod client) and the cache name.
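
A sketch of registering the rule against the embedded instance; the defaultEmbeddedInfinispan shortcut is assumed by analogy with the other modules:

```java
import static com.lordofthejars.nosqlunit.infinispan.InfinispanRule.InfinispanRuleBuilder.newInfinispanRule;

// Sketch: maintain the embedded Infinispan cache in a known state
// using default connection parameters.
@Rule
public InfinispanRule infinispanRule = newInfinispanRule().defaultEmbeddedInfinispan();
```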

Remote Connection

Configuring a connection to remote Infinispan uses the same approach as the
ManagedInfinispan configuration object, but uses the
com.lordofthejars.nosqlunit.infinispan.RemoteInfinispanConfigurationBuilder class.

Verifying Data

@ShouldMatchDataSet is also supported for Infinispan data but we should
keep in mind some considerations.

If you plan to verify data with @ShouldMatchDataSet, the equals method of your POJOs is used, so implement it accordingly.

Full Example

To show how to use NoSQLUnit with Infinispan, we are going to create a
very simple application.

UserManager is the business class
responsible for getting and adding a user to the system.

Notice that if attribute values are integers, double quotes are not
required. You can also define as many index subdocuments as required, but only one data document, which will be inserted into Elasticsearch.
Moreover, the indexId property is only mandatory if you want the inserted data to be validated with @ShouldMatchDataSet. The document index is used to run comparisons faster than retrieving all data. If you are not planning to use the expectations capability of NoSQLUnit, you are not required to set the indexId property and Elasticsearch will provide one for you.
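
A dataset sketch showing one index subdocument plus its data document; the index and field names are placeholders, and the structure follows the NoSQLUnit Elasticsearch format as far as the author recalls:

```json
{
  "documents": [{
    "document": [
      { "index": { "indexName": "books", "indexType": "book", "indexId": "1" } },
      { "data": { "title": "The Hobbit", "numberOfPages": 293 } }
    ]
  }]
}
```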

Getting Started

Lifecycle Management Strategy

The first step is defining which lifecycle management strategy is required
for your tests. Depending on the kind of test you are implementing (unit
test, integration test, deployment test, ...) you will require an
embedded, managed or remote approach.

By default the Elasticsearch Node is started with the property local set to true, but this property and all other supported properties can be configured, or you can still use the default configuration approach provided by Elasticsearch by creating an elasticsearch.yml file in the classpath.

Data will be stored in target/elasticsearch-test-data/impermanent-db directory.
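
A sketch of the embedded rule declaration; the builder name is assumed by analogy with the other embedded rules:

```java
import static com.lordofthejars.nosqlunit.elasticsearch.EmbeddedElasticsearch.EmbeddedElasticsearchRuleBuilder.newEmbeddedElasticsearchRule;

// Sketch: start an in-JVM Elasticsearch Node for the whole test class,
// storing data under the default target directory.
@ClassRule
public static EmbeddedElasticsearch embeddedElasticsearch = newEmbeddedElasticsearchRule().build();
```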

Managed

To configure the managed way, you should use the ManagedElasticsearch rule,
which may require some configuration parameters.

Elasticsearch installation directory is retrieved from ES_HOME
system environment variable.

Target path, that is the directory where the Elasticsearch server is
started, is target/elasticsearch-temp .

A ManagedElasticsearch rule can be created from scratch, but to make life easier,
a DSL is provided using the ManagedElasticsearchRuleBuilder class, as seen in the previous example.
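
A sketch of that DSL; the builder entry point and the elasticsearchPath method are assumptions, and the installation path is a placeholder:

```java
import static com.lordofthejars.nosqlunit.elasticsearch.ManagedElasticsearch.ManagedElasticsearchRuleBuilder.newManagedElasticsearchRule;

// Sketch: start an Elasticsearch server installed at /opt/elasticsearch,
// overriding the ES_HOME lookup.
@ClassRule
public static ManagedElasticsearch managedElasticsearch = newManagedElasticsearchRule()
        .elasticsearchPath("/opt/elasticsearch")
        .build();
```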

Remote

Configuring the remote approach does not require any special rule
because you (or a system like Maven) are responsible for starting and
stopping the server. This mode is used in deployment tests where you are
testing your application in a real environment.

Configuring Elasticsearch Connection

The next step is configuring the Elasticsearch rule, which is in charge of maintaining
the Elasticsearch database in a known state by inserting and deleting defined
datasets. You must register the ElasticsearchRule JUnit rule class, which
requires a configuration parameter with information like host, port or
database name.

To make developer's life easier and code more readable, a fluent
interface can be used to create these configuration objects. Three
different kinds of configuration builders exist.

Embedded

The first one is for configuring a connection to an embedded Node instance. For almost all cases the default parameters are enough.
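
A sketch of registering the rule with default embedded parameters; the defaultEmbeddedElasticsearch shortcut is assumed by analogy with the other modules:

```java
import static com.lordofthejars.nosqlunit.elasticsearch.ElasticsearchRule.ElasticsearchRuleBuilder.newElasticsearchRule;

// Sketch: maintain the embedded Elasticsearch Node in a known state.
@Rule
public ElasticsearchRule elasticsearchRule = newElasticsearchRule().defaultEmbeddedElasticsearch();
```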

Advanced Usage

Customizing Insertion and Comparison strategy

NoSQLUnit provides a default dataset format; for example in the case of Neo4j
we provide the GraphML format, and in the case of Cassandra we offer the cassandra-unit format.
But because you may have already written datasets in another format, or because you feel more comfortable with another format,
NoSQLUnit provides a way to extend the behaviour of the insertion and comparison actions.

To create an extension, each engine offers two interfaces (one for insertion and one for comparison). They are named with the form:

<engine>ComparisonStrategy and <engine>InsertionStrategy. For example CassandraComparisonStrategy or Neo4jInsertionStrategy.

They provide a method to insert/compare data given an input stream of the defined dataset file, and a callback interface with all connection objects.

Apart from that, each engine has a default implementation, Default<engine>ComparisonStrategy and Default<engine>InsertionStrategy, that can be used as a guide for developing your own extensions.

To register each strategy we must use @CustomInsertionStrategy and @CustomComparisonStrategy annotations.

Let's see a very simple example where we define an alternative insertion strategy for the Redis system by using a properties file instead of json.

Custom annotations are only valid at type scope. The custom strategy will be applied to the whole test.

When using custom strategies for inserting and comparing data, the location attribute of @UsingDataSet and @ShouldMatchDataSet must be specified.
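
A sketch of how such a registration could look; the strategy class names are hypothetical, and the exact annotation attribute syntax is an assumption:

```java
// Sketch: register hypothetical properties-file based strategies for Redis.
// RedisPropertiesInsertionStrategy / RedisPropertiesComparisonStrategy are
// illustrative names, not classes shipped with NoSQLUnit.
@CustomInsertionStrategy(insertionStrategy = RedisPropertiesInsertionStrategy.class)
@CustomComparisonStrategy(comparisonStrategy = RedisPropertiesComparisonStrategy.class)
public class WhenPropertiesDatasetIsUsed {

    @Test
    @UsingDataSet(locations = "data.properties", loadStrategy = LoadStrategyEnum.CLEAN_INSERT)
    @ShouldMatchDataSet(location = "expected-data.properties")
    public void properties_dataset_should_be_inserted_and_verified() {
        // test body
    }
}
```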

Embedded In-Memory Redis

When you are writing unit tests you should keep in mind that they must
run fast, which implies, among other things, no interaction with IO
subsystem (disk, network, ...). To avoid this interaction in database
unit tests, there are embedded in-memory databases like H2 , HSQLDB
, Derby or in case of NoSQL , engines like Neo4j or Cassandra
have their own implementation. But Redis does not have any way to
create an embedded in-memory instance in Java. For this reason I have
written an embedded in-memory Redis implementation based on Jedis
project.

If you are using NoSQLUnit you only have to register the embedded
Redis rule as described here, and
internally NoSQLUnit will create the instance for you, and you will be
able to inject the instance into your code.

But it can also be used outside the umbrella of NoSQLUnit, by
instantiating it manually, as described in the next
example:
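
A sketch of such manual instantiation; the EmbeddedRedisBuilder factory method name follows the NoSQLUnit documentation as far as the author recalls:

```java
import redis.clients.jedis.Jedis;
import com.lordofthejars.nosqlunit.redis.EmbeddedRedisBuilder;

// Sketch: create an in-memory Jedis-compatible instance without any rule.
Jedis jedis = new EmbeddedRedisBuilder().createEmbeddedJedis();

// The returned object behaves like a normal Jedis connection.
jedis.set("key", "value");
jedis.get("key");
```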

All the other operations, including flushing, expiration control, and
each operation of every datatype, are supported in the same way Jedis
supports them. Note that expiration management is also implemented as
described in the Redis manual.

Warning

This implementation of Redis is provided for testing purposes, not as a
substitute for Redis. Feel free to report any issue with this
implementation so it can be fixed.

Managing lifecycle of multiple instances

Sometimes your test will require more than one instance of the same
database server (running on different ports) to be started, for example
when testing database sharding. In the next
example we see how to configure
NoSQLUnit to manage the lifecycle of multiple instances.

Note that the target path should be set to a different value for each
instance; otherwise some started processes may not be shut down.
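
For instance, two managed Redis servers could be sketched as follows; the paths and ports are placeholders, and the builder methods are assumptions modeled on the managed Redis rule:

```java
import static com.lordofthejars.nosqlunit.redis.ManagedRedis.ManagedRedisRuleBuilder.newManagedRedisRule;

// Sketch: two server instances on different ports, each with its own
// target path so both processes can be shut down cleanly.
@ClassRule
public static ManagedRedis managedRedisOne = newManagedRedisRule()
        .redisPath("/opt/redis")
        .targetPath("target/redis1")
        .port(6379)
        .build();

@ClassRule
public static ManagedRedis managedRedisTwo = newManagedRedisRule()
        .redisPath("/opt/redis")
        .targetPath("target/redis2")
        .port(6380)
        .build();
```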

Fast Way

When you instantiate a Rule for maintaining the database in a known state (
MongoDbRule, Neo4jRule, ...) NoSQLUnit requires you to set a
configuration object with properties like host, port, database name, ...
but although most of the time default values are enough, we still need
to create the configuration object, which makes our code harder
to read.

We can avoid this by using an inner builder inside each rule, which
creates for us a Rule with default parameters set. For example for
Neo4jRule:
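
A sketch of the inner-builder shortcut for Neo4jRule; the method names follow the pattern used across NoSQLUnit modules:

```java
import static com.lordofthejars.nosqlunit.neo4j.Neo4jRule.Neo4jRuleBuilder.newNeo4jRule;

// Sketch: a Neo4jRule wired to the default embedded instance,
// with no explicit configuration object needed.
@Rule
public Neo4jRule neo4jRule = newNeo4jRule().defaultEmbeddedNeo4j();
```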

Simultaneous engines

Sometimes applications will contain more than one NoSQL engine; for
example some parts of your model may be expressed better as a graph (
Neo4j for example), while other parts may be more natural in a column way
(for example using Cassandra). NoSQLUnit supports this kind of
scenario by providing, in integration tests, a way to not load all
datasets into one system, but to choose which datasets are stored in each
backend.

For declaring more than one engine, you must give a name to each
database Rule using the connectionIdentifier() method in the configuration
instance.
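
For example, two MongoDb rules could be named like this; the database names and identifiers are placeholders, and the mongoDb() builder name is an assumption:

```java
import static com.lordofthejars.nosqlunit.mongodb.MongoDbConfigurationBuilder.mongoDb;

// Sketch: two connections to the same server, distinguished by identifier.
@Rule
public MongoDbRule mongoRuleOne = new MongoDbRule(mongoDb()
        .databaseName("test1").connectionIdentifier("one").build());

@Rule
public MongoDbRule mongoRuleTwo = new MongoDbRule(mongoDb()
        .databaseName("test2").connectionIdentifier("two").build());
```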

When you use more than one engine at a time you should take the
following rules into consideration:

If the location attribute is set, it will be used and the
withSelectiveMatcher attribute data will be ignored. Location data is
populated through all registered systems.

If location is not set, then the system tries to insert the data defined
in the withSelectiveMatcher attribute into each backend.

If the withSelectiveMatcher attribute is not set, then the default
strategy (explained in a previous section) is taken. Note that the
default strategy will replicate all datasets to all defined engines.

You can also use the same approach for inserting data into same engine
but in different databases. If you have one MongoDb instance with two
databases, you can also write tests for both databases at one time. For
example:
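
A sketch of a test using selective locations to target each database; the annotation attribute names follow the @Selective API as far as the author recalls, and the dataset names are placeholders:

```java
// Sketch: load a different dataset into each identified connection.
@Test
@UsingDataSet(withSelectiveLocations = {
        @Selective(identifier = "one", locations = "dataset-db1.json"),
        @Selective(identifier = "two", locations = "dataset-db2.json") },
        loadStrategy = LoadStrategyEnum.CLEAN_INSERT)
public void data_should_be_inserted_into_both_databases() {
    // test body
}
```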

Support for JSR-330

During test execution you may need to access the underlying class used to
load and assert data, in order to execute extra operations against the backend.
NoSQLUnit will inspect the @Inject annotations of test fields, and try
to set its own driver on the attribute. For example, in the case of MongoDb,
a com.mongodb.Mongo instance will be injected.

Note that in the example we are setting this as the
second parameter to the Rule. This is only required in versions of JUnit prior to 4.11; in newer versions passing the this parameter is no longer required.

But if you are using more than one engine at the same time (see
chapter) you need a way to
distinguish each connection. To fix this problem, you must use the
@Named annotation with the identifier given in the configuration
instance. For example:
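
A sketch of such named injection; the identifiers must match those registered via connectionIdentifier():

```java
// Sketch: inject the driver of each named connection.
@Named("one")
@Inject
private Mongo mongoOne;

@Named("two")
@Inject
private Mongo mongoTwo;
```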

There are some situations (mostly if using Arquillian) where you want to inject the value managed by the container instead of the one managed by NoSQLUnit. To avoid an injection conflict, NoSQLUnit provides a special annotation called @ByContainer. By using it, the injection processor will leave the field untouched.

@Inject
@ByContainer
private Mongo mongo2;

Spring Data

With NoSQLUnit you can also write tests for Spring Data project. You can