Sunday, 4 August 2013

If you're entirely new to MongoDB - this is just a very gentle introduction that gets you up and running with a sample database in just a few easy steps.

It's really easy to get started. The steps are:

1. Install MongoDB
2. Create a Database
3. Query the Database

1. Install MongoDB

The mongoDB site has simple, clear instructions for downloading and installing. Just click on the link below and follow the instructions for your particular operating system (much clearer that trying to explain them again here!):

(Make sure you have the have started the 'mongod' server process as explained in these instructions - you should have a mongod server up and running before moving on to Step 2 - like in this screenshot below)

2. Create A Database

Now we have a server up and running - the next step is to create a database and pop some data in. For our example we have a very simple Library database, with a single 'book' table (or 'collection' in mongo-speak).

Instead of running SQL against mongo - you use Javascript - an example build file is included below (you could enter the commands in the interactive console window - shown later - but for speed and ease of use - you can also run them all together from a script file - which is the approach we use here)

Copy the javascript code from the GIST below, and save it to a file called 'library_build_script.js' build script on your machine

Open a new console window, navigate to the 'bin' folder in the extracted mongo installation files, and use the mongo command to load the script - just type:

mongo SCRIPT_LOCATION/library_build_script.js'

(Note: if you set the PATH up on your machine, you can of course run the mongo executables from anywhere)

this will run the build script to create the database, collection and indexes, and prepopulate with some test data.

Note - we are using getSiblingDB() to create and work with a sibling database. This is slightly more complex than the default, but if you create multiple databases, this helps organise them better (as you build databases in a tree like structure). It keeps it easier to manage/view multiple databases on the same server. Have a play around with different options when you get up and running to see what works best for you.

3. Query the Data

It's as simple as that to have a populated database up and running on your machine - we can now use the interactive mongo command line tool to query the data

Start up the interactive mongo console. Still in the 'bin' folder, type

mongo

Find All Books Query - list all books in the database

db.getSiblingDB("library").book.find()

Find All Books With Type 'Fantasy' Only

db.getSiblingDB("library").book.find({'type':'Fantasy'})

As you can see - it's really easy to get up and running in just a few minutes. Of course there's much more to cover - but this should be a good starting point to start playing.

I'll hopefully be expanding on the Library database in some future posts - exploring more advanced features and functions.

Addendum: Using A GUI Client

You can also use a GUI client to explore your local MongoDB database if you prefer that to using the command line.

Check out UMongo - you can download it for free from the website here:

Working Example on Github

Running the Demo

See the README file over on GitHub for details of running the demo. Essentially - it's just running the Unit Tests, with the usual maven build and test results output to the console - example below. This is the result of running the DBUnit test, which inserts Book data into the HSQL database using JPA, and then uses Lucene to query the data, testing that the expected Books are returned (i.e. only those int he SCI-FI category, containing the word 'Space', and ensuring that any with 'Space' in the title appear before those with 'Space' only in the description.

The Book Entity

Our simple example stores Books. The Book entity class below is a standard JPA Entity with a few additional annotations to identify it to Lucene:

@Indexed - this identifies that the class will be added to the Lucene index. You can define a specific index by adding the 'index' attribute to the annotation. We're just choosing the simplest, minimal configuration for this example.

In addition to this - you also need to specify which properties on the entity are to be indexed, and how they are to be indexed. For our example we are again going for the default option by just adding an @Field annotation with no extra parameters. We are adding one other annotation to the 'title' field - @Boost - this is just telling Lucene to give more weight to search term matches that appear in this field (than the same term appearing in the description field).

This example is purposefully kept minimal in terms of the ins-and-outs of Lucene (I may cover that in a later post) - we're really just concentrating on the integration with JPA and Spring for now.

The Book Manager

The BookManager class acts as a simple service layer for the Book operations - used for adding books and searching books. As you can see, the JPA database resources are autowired in by Spring from the application-context.xml. We are just using an in-memory hsql database in this example.

application-context.xml

This is the Spring configuration file. You can see in the JPA Entity Manager configuration the key for 'hibernate.search.default.indexBase' is added to the jpaPropertyMap to tell Lucene where to create the index. We have also externalised the database login credentials to a properties file (as you may wish to change these for different environments), for example by updating the propertyConfigurer to look for and use a different external properties if it finds one on the file system).

Testing Using DBUnit

In the project is an example of using DBUnit with Spring to test adding and searching against the database using DBUnit to populate the database with test data, exercise the Book Manager search operations and then clean the database down. This is a great way to test database functionality and can be easily integrated into maven and continuous build environments.

Because DBUnit bypasses the standard JPA insertion calls - the data does not get automatically added to the Lucene index. We have a method exposed on the service interface to update the Full Text index 'updateFullTextIndex()' - calling this causes Lucene to update the index with the current data in the database. This can be useful when you are adding search to pre-populated databases to index the existing content.