What is CouchDB and Why Should I Care?

CouchDB is one of what many are calling NoSQL solutions. Specifically, CouchDB is a document-oriented database, and within each document fields are stored as key-value maps. A field can be a simple key/value pair, a list, or a map.

Each document that is stored in the database is given a document-level unique identifier (_id) as well as a revision (_rev) number for each change that is made and saved to the database.
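To make the document model concrete, here is a minimal plain-Java sketch of what such a document might look like. Only _id and _rev are CouchDB conventions; the other field names, the values, and the revision string are assumptions made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java picture of a CouchDB document: a map of named fields.
// Everything except the _id and _rev field names is hypothetical.
class DocumentShape {
    static Map<String, Object> sampleEvent() {
        Map<String, Object> location = new HashMap<>();
        location.put("lat", 32.22);
        location.put("lon", -110.97);

        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("_id", "event-001");                    // document-level unique identifier
        doc.put("_rev", "1-abc");                       // revision, assigned by CouchDB on save
        doc.put("title", "Open mic night");             // simple key/value pair
        doc.put("tags", Arrays.asList("music", "free")); // list
        doc.put("location", location);                  // nested map
        return doc;
    }

    public static void main(String[] args) {
        System.out.println(sampleEvent());
    }
}
```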

NoSQL databases represent a shift away from traditional relational databases and can offer many benefits, along with challenges of their own. CouchDB offers us these features:

An excellent tool for deciding which data store is right for you can be found in the Visual Guide To NoSQL Systems. This guide describes the three areas of concern that you can use to pick a database (be it NoSQL or relational in nature). For our project we used the guide to hunt for a database with the following features:

Availability

Consistency

Partition Tolerance

CouchDB fell into the AP camp (Availability and Partition Tolerance), which was what we were looking for given our own data concerns (not to mention the ability to replicate data on either a continuous or ad-hoc basis). As a comparison, MongoDB falls into the CP camp (Consistency and Partition Tolerance), and some databases, like Neo4j, offer a unique graph-oriented structure.

Another great tool to use is this blog post which compares Cassandra, MongoDB, CouchDB, Redis, Riak, HBase, and Membase.

It is highly conceivable that you may have more than one tool for a given project - in other words, you need to determine your needs and find the right tool to fit those needs.

How are we going to use CouchDB?

We are going to build a simple local events database to store some events as well as the venues at which they’ll take place. We will be splitting this up into two documents and wiring them together using their document ids. These two documents are:

Event

Place

(We will get into creating the Java classes for these two documents a bit later in this article.)

Jcouchdb

We are going to use jcouchdb to interface with our CouchDB database. This is an extremely well-tested and easy-to-use Java library that automatically serializes Java objects into CouchDB documents and deserializes them back again. Another reason we chose jcouchdb is how closely it follows the actual API of CouchDB itself.

What alternatives to jcouchdb are there?

If you find that you don’t like jcouchdb or would like to try a different library, there are quite a few to choose from:

A few of these haven’t been updated in quite a while, so be sure to plan some time for programming spikes if you need to do some testing.

Getting Started

Where to get started? We are going to be using Maven 3 to build our sample project. You won’t need to know Maven in order to understand the code, but you will need to have it installed in order to build and run the sample project. You can find Maven 3 on the Maven website.

This part of the tutorial assumes some Maven 3 knowledge. If you don’t know Maven, you can download the pom.xml file from our repository and use it as-is.

We’re going to skip the initial part of POM creation, but you can download the file from our GitHub repository (https://github.com/r351574nc3/spring-couch-intro/blob/master/pom.xml) if you need the nitty-gritty details of creating a POM or just want to get started coding. The first order of business is to specify the jcouchdb and Spring components we will need.

Getting CouchDB set up

Both Cloudant and Iris Couch offer free accounts and are perfect for getting our database set up so we can start developing.

Fig. 1 - Cloudant home screen

Fig. 2 - Cloudant’s Futon screen

Fig. 3 - The signup screen for Iris Couch

Fig. 4 - Iris Couch’s Futon screen

The other option we have is to install CouchDB on a local machine (or host). We won’t walk you through installing on your specific operating system but there are some excellent instructions on CouchDB’s wiki.

Once you have your account created (or CouchDB up and running), we will need to create a database to play with. For our application we chose couchspring as the database name. Feel free to choose your own but you’ll need to change it when we begin to configure our setup.

To create a database in Cloudant, use their databases screen (Fig. 1). For Iris Couch, you can do this directly in Futon (the user interface for managing your CouchDB instance). More information on Futon can be found on the CouchDB wiki. We won’t be doing much with Futon in this article, but it is a great tool for playing around with views.

Fig. 5 - Create database in Futon Step 1

Fig. 6 - Create database in Futon Step 2

Configuring jcouchdb, Spring and our POJOs

Now that we have a new database set up, we need to:

Create our base POJO objects

Provide a JSON configuration mapping, which will automatically convert between the Java objects and the JSON objects that CouchDB uses

Configure Spring

First, let’s create some objects!

POJOs with some custom annotations

What are the base objects we’ll need to create, then, for our event system?

Event - to store events either from outside sources (like Eventful.com) or using a web interface

Place - to store venues where events are held

We have a few other objects that will be used in conjunction (and do some additional data processing while pulling in data from external sources):

AppDocument - base object used by the JSON mapping utility to define a document-type differentiator field

Description - used for formatting and filtering out the event’s description

Location - used to record the latitude and longitude of a given place/venue

The AppDocument object extends jcouchdb’s own BaseDocument object and provides a way to differentiate between different document types. CouchDB doesn’t have a default way to handle this and leaves it up to you, the developer, to implement on your own. We’ve chosen to use the class name as our differentiator; for example, Event objects will output docType as Event and Place objects will output Place.
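The class-name differentiator can be sketched in a few lines. This is a self-contained illustration, not the project's actual source: BaseDocumentStandIn is a hypothetical placeholder for jcouchdb's BaseDocument so that the example compiles without the library, and the JSON property annotations that the real class would need are left out.

```java
class DocTypeDemo {
    // Hypothetical stand-in for jcouchdb's BaseDocument.
    static class BaseDocumentStandIn {
        private String id;
        public String getId() { return id; }
        public void setId(String id) { this.id = id; }
    }

    // Base class for every application document. The docType value is
    // derived from the concrete class name, so subclasses add no extra code.
    static class AppDocument extends BaseDocumentStandIn {
        public String getDocType() {
            return getClass().getSimpleName();
        }
    }

    static class Event extends AppDocument { }
    static class Place extends AppDocument { }

    public static void main(String[] args) {
        System.out.println(new Event().getDocType());
        System.out.println(new Place().getDocType());
    }
}
```

Deriving the value from the class name keeps the differentiator in one place, at the cost of tying the stored docType to the Java class name: renaming a class would also change what gets written to the database.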

Next we need to create our Event class.

Event.java (we have abbreviated some of the fields and methods for brevity)

There are a few things of interest going on here. First is the fact that we’re storing the venueId instead of the venue in our object. Why do we do this?

Because CouchDB isn’t a relational database, there isn’t a direct way to define a relationship between two different documents, so we store the id of the venue in the Event object. We could store the venue object embedded in our event object, but it makes more sense to store these separately, especially since you could have multiple events at a given venue. So, instead of storing the relationship, we will provide a dynamic getter that will retrieve the venue object only when we need it. We’ll describe how to do this in the Querying for documents section.
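The idea of keeping only the venue id and resolving the venue on demand can be sketched as follows. This is an assumption-laden illustration rather than the article's Event class: the placeLoader function is a hypothetical hook that, in a real application, would be backed by a database lookup, and here it is backed by an in-memory map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class VenueLookupDemo {
    static class Place {
        final String id;
        final String name;
        Place(String id, String name) { this.id = id; this.name = name; }
    }

    static class Event {
        private final String venueId;                      // only the id is stored with the event
        private final Function<String, Place> placeLoader; // hypothetical lookup hook
        private Place cachedVenue;

        Event(String venueId, Function<String, Place> placeLoader) {
            this.venueId = venueId;
            this.placeLoader = placeLoader;
        }

        String getVenueId() { return venueId; }

        // Resolves the venue on first use and remembers it afterwards.
        Place getVenue() {
            if (cachedVenue == null) {
                cachedVenue = placeLoader.apply(venueId);
            }
            return cachedVenue;
        }
    }

    public static void main(String[] args) {
        Map<String, Place> places = new HashMap<>();
        places.put("place-1", new Place("place-1", "Example Hall"));
        Event event = new Event("place-1", places::get);
        System.out.println(event.getVenue().name);
    }
}
```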

We won’t detail the other helper objects Description or Location, as they are fairly simple. If you’re interested, you can check them out from the GitHub repository.

Configuring jcouchdb and the JsonConfigFactory

Before we configure, we need to create a few classes we’ll be using: JsonConfigFactory, for mapping between the JSON data (CouchDB) and the Java classes, and CouchDbServerFactory, for creating a new instance of the server we will be connecting to.

This class creates a generator for converting a Java class (Event or Place) to its JSON equivalent; the parser reverses the process. There are a few key things to look at in the typeMapper (used in both generator and parser), specifically the base type and the discriminator field. typeMapper.setEnforcedBaseType(AppDocument.class) will only convert docs that inherit from the AppDocument class. typeMapper.setDiscriminatorField("docType") will use our docType field and value to discriminate between different types of documents. Feel free to change this field to some other name, but you’ll need to change the method and JSON mapping in the AppDocument class. To refresh your memory, here is the method we’re referring to:

The final item to look at is typeMapper.setPathMatcher(new SubtypeMatcher(AppDocument.class)) which will automatically look at sub-types to make sure that we’re converting between objects that inherit from AppDocument. It is possible to supply your own parser for several of the jcouchdb method calls for retrieving or querying the database, but we won’t be investigating those in this tutorial.
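The role of a discriminator field on the parsing side can be illustrated with a small self-contained sketch. This is not how jcouchdb or its JSON library implements type mapping; the registry, the resolve method, and the empty stand-in classes are all hypothetical and only show the idea of choosing a Java class from the docType value while restricting the result to subclasses of a base type.

```java
import java.util.HashMap;
import java.util.Map;

class DiscriminatorDemo {
    static class AppDocument { }
    static class Event extends AppDocument { }
    static class Place extends AppDocument { }

    // Registry from docType value to Java class. The generic bound plays
    // the part of the enforced base type: only AppDocument subclasses fit.
    private static final Map<String, Class<? extends AppDocument>> TYPES = new HashMap<>();
    static {
        TYPES.put("Event", Event.class);
        TYPES.put("Place", Place.class);
    }

    // Picks the target class for a parsed JSON object by reading its docType field.
    static Class<? extends AppDocument> resolve(Map<String, Object> json) {
        Object docType = json.get("docType");
        Class<? extends AppDocument> type = TYPES.get(docType);
        if (type == null) {
            throw new IllegalArgumentException("Unknown docType: " + docType);
        }
        return type;
    }

    public static void main(String[] args) {
        Map<String, Object> json = new HashMap<>();
        json.put("docType", "Event");
        System.out.println(resolve(json).getSimpleName());
    }
}
```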

Now that we have the classes we need, it’s time to configure our Spring context. We’ve separated out our CouchDB-specific points into couchdb-config.xml.

The first thing we need to do is set up our annotations with <context:annotation-config />, which enables annotation processing in the Spring context. The next two sections set up the jsonConfigFactory and get it ready to use in our server instance. Finally, we create our serverFactory, which we use to create an instance of our couchDbServer. That server is then fed into the jcouchdb database instance along with our jsonConfig and the name of the database we want to connect to. All of our properties (username, password and url) are currently passed in through the command line, but you could just as easily provide a specific property file.

Now that we’ve got everything configured it’s time to write some tests.

Create, Save, Retrieve, Update, and Delete

Before we dive into creating views, let’s start with some basics like creating, updating, retrieving and deleting. All of our tests share the same setup. Here’s the class definition for CouchSaveTest, but it is the same for the other tests as well.

The first annotation, @RunWith, tells JUnit to use the SpringJUnit4ClassRunner to run this test (as opposed to a standard JUnit class runner). This allows our next annotation, @ContextConfiguration("/root-context.xml"), to start up a Spring context for this test. This context loads all of our CouchDB beans, our POJOs with their JSON annotations, and our CouchDBUpdater, which automatically pushes our views to the CouchDB server. We will cover this last one in the Views section below.

Finally, we tell Spring to autowire our database into the test class so that we can use it.

Document creation

One of the first steps in any kind of DB storage system is the ability to create a new record (or in our case, a document). How do we do this using jcouchdb’s API?

Here we create a new Event object and then call the database.createDocument() method and pass in the new event. Our JsonConfigFactory will then map our fields into a CouchDB document.

This method actually tests two things for us. The first is retrieving a document by calling Event document = database.getDocument(Event.class, "2875977125"); and passing in its document id (“2875977125” in this case). The second is the update method database.createOrUpdateDocument(document); which, as its name suggests, will either create a new document or update an existing one (meaning that if it already has an id that matches a document in the database, it will update).

These two methods test calling the delete() method first on a document that does exist and second on one that doesn’t (which will throw a NotFoundException).
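The create-or-update and delete semantics described above can be mimicked with a tiny in-memory stand-in. This sketch does not use jcouchdb at all: the store class, its method names, the integer revision counter, and the use of NoSuchElementException in place of jcouchdb's NotFoundException are all assumptions made so the example runs on its own. A real CouchDB server additionally rejects updates that carry a stale revision, which this sketch leaves out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

class InMemoryStoreDemo {
    static class Doc {
        String id;
        int rev;       // simplified revision counter, bumped on every save
        String title;
        Doc(String id, String title) { this.id = id; this.title = title; }
    }

    private final Map<String, Doc> docs = new HashMap<>();

    // Creates the document if the id is new, otherwise replaces the stored copy.
    void createOrUpdate(Doc doc) {
        Doc existing = docs.get(doc.id);
        doc.rev = (existing == null) ? 1 : existing.rev + 1;
        docs.put(doc.id, doc);
    }

    // Returns the stored document or fails if the id is unknown.
    Doc get(String id) {
        Doc doc = docs.get(id);
        if (doc == null) {
            throw new NoSuchElementException("No document with id " + id);
        }
        return doc;
    }

    // Deleting a document that does not exist is reported as an error.
    void delete(String id) {
        if (docs.remove(id) == null) {
            throw new NoSuchElementException("No document with id " + id);
        }
    }
}
```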

Querying for documents

Now that we have the basic CRUD operations complete, we need to get into something a bit more complex: querying our database by more than just the id of the document we’re looking for. For this article we’re just going to delve into views a little bit, as they can be very complex. More on views can be found on the CouchDB wiki as well as in the online version of CouchDB: The Definitive Guide.

With that being said, let’s get started writing some views!

Introduction to Views

First, what exactly are CouchDB views and how do they work?

Views are a way to filter or query the data in your database. Views are typically written in JavaScript; it is possible to write them in other languages, but that is a different topic we won't be covering here. Each view maps keys to values derived from a document. View indexes are not updated until a view is queried, but you can change this behavior with an external script if you wish. All views in a single design document get updated when one of the views in that design document gets queried.

Design documents

Before we look at creating views we should discuss how our application automatically uploads the views (and keeps them up to date). All views are tied to a design document. We will have two design documents in this instance:

event

place

These two design documents will be created automatically by the org.jcouchdb.util.CouchDBUpdater class. This class is configured in the couchdb-services.xml file.

The CouchDBUpdater listens for changes in our designdocs directory and automatically pushes those changes up to the configured CouchDB database. What does the designdocs directory actually contain then?

Our first view

Here, then, is a simple view that looks for all documents that are “event” documents:

    function(doc) {
        // Only index documents whose docType marks them as events
        if (doc.docType === 'Event') {
            // CouchDB stores the document identifier in the _id field
            emit(doc._id, doc);
        }
    }

This view emits one row for every document that has a docType field matching the value Event. Let’s examine this a bit to see what it is doing. The first line is a JavaScript function definition which accepts a doc as its sole parameter. We can then examine values stored inside the documents themselves (doc.docType in our case). Finally, we have the emit function, which takes two arguments, key and value, where value can be null. Our key in this case is the doc._id field and our value is the full document.

The emit function is what we will actually be using to query our database in the next few view examples. The other key thing to understand about emit is that it will order the returned documents by their key value.
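How emit and key ordering fit together can be simulated in plain Java. This sketch only imitates what a view does; it is not CouchDB or jcouchdb code. The Row class, the runView method, and the doc helper are hypothetical names, and Java's natural string ordering stands in for CouchDB's own collation rules.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class EmitDemo {
    static class Row {
        final String key;
        final Map<String, Object> value;
        Row(String key, Map<String, Object> value) { this.key = key; this.value = value; }
    }

    // Java stand-in for the map function: one row per Event document.
    static List<Row> runView(List<Map<String, Object>> docs) {
        List<Row> rows = new ArrayList<>();
        for (Map<String, Object> doc : docs) {
            if ("Event".equals(doc.get("docType"))) {
                rows.add(new Row((String) doc.get("_id"), doc)); // emit(key, value)
            }
        }
        // Rows are returned ordered by their key.
        rows.sort(Comparator.comparing((Row r) -> r.key));
        return rows;
    }

    // Helper that builds a minimal document with an id and a docType.
    static Map<String, Object> doc(String id, String docType) {
        Map<String, Object> d = new HashMap<>();
        d.put("_id", id);
        d.put("docType", docType);
        return d;
    }
}
```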

Retrieving events through venue id

One of the first views that will be handy to use will be to retrieve a given set of events by their associated venueId. To do this we will need to write a view that emits the venueId as its key and the document as its value (although not strictly needed with jcouchdb’s functions). So, what does the view look like, then?

One of the key differences here in how we are calling the view is that we are using the queryViewAndDocumentsByKeys() method to pass in the viewName, the mapping class Event and the keys we are querying on (in this case just one key, the venueId).

Retrieving events by date

Both of those views were relatively simple. How do we do something a bit more complex like querying by date? First, we need to define our view.

We have a new object here called Options, which allows us to specify which query options we wish to pass in to our view. In this instance we are providing a startKey and an endKey to retrieve a set of objects. One thing to be aware of is that the keys you pass in must be of the same type as the keys the view emits. In our case we are dealing with ints, so we must pass int values in our keys. Order is also important: we are passing in year, day, month to match against the year, day and month in the view.

Now, what is this endKey? Together with startKey, the endKey parameter allows us to specify a range for our query. In this instance we've chosen the same date for both, but we could easily have chosen different values to get more or fewer documents back. CouchDB compares each element of the key in turn and returns the set of documents whose keys fall within that range.
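The element-by-element comparison behind a startKey/endKey range can be sketched in plain Java. This is a simulation, not CouchDB or jcouchdb code: the class and method names are hypothetical, and the sample keys assume arrays of the form [year, month, day], which must be replaced by whatever order your view actually emits. Both ends of the range are treated as inclusive, which matches CouchDB's default behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DateRangeDemo {
    // Compares two array keys element by element, left to right.
    static int compareKeys(int[] a, int[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            if (a[i] != b[i]) {
                return Integer.compare(a[i], b[i]);
            }
        }
        return Integer.compare(a.length, b.length); // a shorter array sorts first
    }

    // Returns the keys that fall inside [startKey, endKey], both ends included.
    static List<int[]> range(List<int[]> keys, int[] startKey, int[] endKey) {
        List<int[]> hits = new ArrayList<>();
        for (int[] key : keys) {
            if (compareKeys(key, startKey) >= 0 && compareKeys(key, endKey) <= 0) {
                hits.add(key);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<int[]> keys = Arrays.asList(
                new int[] {2011, 10, 31},
                new int[] {2011, 11, 1},
                new int[] {2011, 11, 2});
        for (int[] hit : range(keys, new int[] {2011, 11, 1}, new int[] {2011, 11, 2})) {
            System.out.println(Arrays.toString(hit));
        }
    }
}
```

Passing the same array as both startKey and endKey selects exactly one date, which is the case described above.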

Dynamic query for retrieving venue from an event

What we're doing here is simply applying the same logic that we did for queryByVenueId, except for places by event id.

You just need to write another view similar to the allByVenueId for the place document and that's it.

Where can we go from here?

The view (or map) is just the first part of the map/reduce functionality that CouchDB provides. So, what is the reduce (and re-reduce) functionality and what can we do with it?

Reduce allows us to take a set of results from a previous map and perform additional operations on it to reduce the results into a more compact form.
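As a rough illustration of what a reduce step does, the following plain-Java sketch takes the keys that a map step might have emitted (one venueId per event) and collapses them into a count per venue. It is a simulation under stated assumptions, not CouchDB code: the class name, method name, and sample keys are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ReduceDemo {
    // Input: the keys emitted by a map step, one per event.
    // Output: the number of rows per key, similar to a counting reduce grouped by key.
    static Map<String, Integer> countByKey(List<String> emittedKeys) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String key : emittedKeys) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countByKey(Arrays.asList("venue-1", "venue-2", "venue-1")));
    }
}
```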

We will leave reduce and re-reduce for you to explore on your own, but you can do some very interesting things with them. Explore, and have fun with CouchDB!

About the Authors

Leo Pryzbylski is a pillar of technical innovation at ClearBox Media. He wields a giant mallet of creative problem solving in one hand and an enchanted claymore of software architecture experience in melee combat against the horde of software irrelevance. Leo has a broad skillset attributed to his experience as a game developer, QA engineer, release manager, configuration manager, development manager, computer scientist, network intrusion specialist, embedded software engineer, software architect, scientific programmer, and system administrator.

Warner Onstine started his career in the tech industry doing technical support for Intuit in the early 90s. While there, he learned how to develop web applications and left to pursue a career as a software engineer. Since then, he’s worked at a variety of places including Intalio and the University of Arizona, and he now works as a lead developer at rSmart. The seed for ClearBox Media started when Warner learned about ARGs and started playing them for fun a few years ago. They are appealing as they are a good blend of playing in the real world, but within a fictional environment. Warner and others see the potential for ARGs to change our society for the better, which is one of the guiding principles of “The Human Mosaic Project”: Have Fun. Do Good.