So you want a NoSQL database, and you have chosen MongoDB. Here is a tutorial on how easily MongoDB can be integrated with a Java application, along with some advice on how to isolate your code from the database as much as possible, for encapsulation purposes.

The API

Many aspects of the MongoDB API are not language specific, like the terms and the formats used for communicating with the persistence mechanism (JSON on the client side and BSON for transport and storage).

You can download a single JAR containing the Java bindings of the API, where the com.mongodb package is the main one that you will import classes from.

The setup is pretty simple: a hierarchy of Mongo (connection), DB and Collection objects, each contained in the previous one. A Collection loosely corresponds to a relational table, but since embedded objects are allowed and arbitrary fields can be added to the contained objects, it is more similar to an Aggregate class.

Special string keys (starting with $, such as $gt or $in) allow matching on conditions other than equality. Check out MongoDB's Java tutorial for more about queries and indexes.
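To make the shape of such a query concrete, here is a minimal sketch. It uses plain java.util maps as a stand-in for the driver's document type so that it runs without the MongoDB JAR; with the real driver the same nesting would be built from DBObject implementations such as BasicDBObject. The class and method names (QueryShapes, condition, equalTo) are made up for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: plain maps stand in for the driver's document type,
// so this compiles and runs without the MongoDB JAR on the classpath.
class QueryShapes {

    // Builds { field: value }, a plain equality match.
    static Map<String, Object> equalTo(String field, Object value) {
        Map<String, Object> query = new LinkedHashMap<>();
        query.put(field, value);
        return query;
    }

    // Builds { field: { operator: value } }, for example { "age": { "$gt": 18 } }.
    // The operator is one of the special keys starting with $.
    static Map<String, Object> condition(String field, String operator, Object value) {
        Map<String, Object> inner = new LinkedHashMap<>();
        inner.put(operator, value);
        Map<String, Object> query = new LinkedHashMap<>();
        query.put(field, inner);
        return query;
    }

    public static void main(String[] args) {
        // Equality: the value sits directly under the field name.
        System.out.println(equalTo("name", "alice"));
        // Non-equality: the value is wrapped in a nested document keyed by the operator.
        System.out.println(condition("age", "$gt", 18));
    }
}
```

The point to notice is that a non-equality condition is not a different kind of call: it is the same document structure with one more level of nesting.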

Isolation

I don't want com.mongodb.DBObject hanging around in arbitrary places in my applications, most of which have nothing to do with MongoDB or persistence: this persistence agnosticism tells me to try hiding the database from most of my code.

Even if DBObjects and their implementations are the format that must be used to talk with MongoDB, we are not forced to use them inside the application. We can:

Convert JSON objects (instances of a class of yours) to DBObjects for storage.

Convert DBObjects returned after querying into JSON objects.

JSON objects can be just JSON strings (a rare choice), a manipulable JSON representation (there are plenty of Java libraries for that), or domain objects. The com.mongodb.util.JSON class has two static methods, parse and serialize, for going back and forth between a String and a DBObject, so everything that you can serialize to a JSON string can be used inside the application. The impedance mismatch between object aggregates and MongoDB's model is minimal.

The natural choice for hiding the translation between domain and Mongo-based format is the Repository pattern, which in fact models a Collection of objects. If the conversion is really complex, a Data Mapper like Morphia can be used internally.
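As a rough sketch of what this looks like in code, consider the interface below. All the names (User, UserRepository, InMemoryUserRepository) are hypothetical and chosen only for illustration. The in-memory implementation is there so the sketch runs without the driver; a Mongo-backed implementation would expose the same methods and keep the DBObject conversion to itself.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical domain object: it knows nothing about MongoDB.
class User {
    private final String name;
    private final int age;

    User(String name, int age) {
        this.name = name;
        this.age = age;
    }

    String getName() { return name; }
    int getAge() { return age; }
}

// Hypothetical Repository interface: the only persistence-related type
// that the rest of the application needs to see.
interface UserRepository {
    void add(User user);
    User findByName(String name); // returns null when no user matches
    List<User> findAll();
}

// In-memory stand-in, useful for code that only depends on the interface.
// A Mongo-backed class would implement the same interface and translate
// between User and the driver's document type internally.
class InMemoryUserRepository implements UserRepository {
    private final Map<String, User> usersByName = new LinkedHashMap<>();

    @Override
    public void add(User user) {
        usersByName.put(user.getName(), user);
    }

    @Override
    public User findByName(String name) {
        return usersByName.get(name);
    }

    @Override
    public List<User> findAll() {
        return new ArrayList<>(usersByName.values());
    }
}
```

Client code written against UserRepository never imports anything from com.mongodb, which is the isolation this section is after.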

But for starting out, producing a JSON string representation of an object manually may be enough. In any case, the persistence strategy decision is hidden inside the particular Repository implementation, so you'll be able to change it later if the code grows.
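A hand-written serialization can be as small as the sketch below. The helper name (UserJsonWriter) and the two fields (name, age) are assumptions made for the example, not part of the original code.

```java
// Hypothetical helper: hand-writes the JSON for a user with two assumed fields.
class UserJsonWriter {

    // Produces a compact JSON object such as {"name":"alice","age":30}.
    static String toJson(String name, int age) {
        return "{\"name\":\"" + escape(name) + "\",\"age\":" + age + "}";
    }

    // Escapes the characters that would otherwise break a JSON string literal.
    static String escape(String value) {
        StringBuilder out = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters become 4-digit hex escapes.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toJson("alice", 30));
    }
}
```

Inside a Mongo-backed Repository, a string like this could then be handed to the com.mongodb.util.JSON class mentioned earlier to obtain a DBObject. Once objects gain nested fields or collections, writing the escaping and nesting by hand gets tedious, which is the point at which a JSON library or a mapper becomes worth it.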

Writing tests

There is no Fake equivalent for the Mongo connection object: SQLite can be used as a fast, in-memory test database for relational solutions, but there aren't alternate implementations of MongoDB yet.

So when tests involve persistence, we must use the real database. The mongod service must be started separately (usually as a Unix daemon or a Windows service). On Linux platforms, check that the dbpath option in mongodb.conf points to a folder where the mongodb user can write.

At the end of each test, resetting the database is necessary. In end-to-end tests, and in the tests of the Repository itself, you will then use a brand new instance of the database. There is no schema to create, nor constraints to be satisfied: it's very simple, since even if the database does not exist MongoDB will create an empty one for you.

Here is an example, the test for my Repository containing JSONObject instances (representing Users here):