TomEllis.io

In Part 1 of this blog series, "Adventures in NoSQL", I deployed a single
instance of MongoDB and used Python's tweetstream module to fill a collection with a data feed from Twitter.

In the real world you wouldn't ever use a single instance of MongoDB (or Twitter data :-) ) as there is no redundancy:
if the instance fails, all your data is gone, or at best you need to take some time to restore it from a backup.

However, we can harness the power of a private Eucalyptus IaaS Cloud as our infrastructure. This means we can
quickly scale out resources using direct EC2 API calls, the euca2ools command line utilities or the Eucalyptus Web interface.

In this post, I'll explore using Replication to spread your data across multiple
MongoDB servers for redundancy.
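
To make the idea concrete, here is a minimal Python sketch of the configuration document that describes a replica set: a set name plus a list of members, each with a numeric `_id` and a `host`. The helper name `build_replica_set_config`, the set name `rs0` and the hostnames are my own placeholders for illustration, not values from this deployment.

```python
def build_replica_set_config(set_name, hosts):
    """Build a replica set config document: a set name plus numbered members."""
    if not hosts:
        raise ValueError("a replica set needs at least one member")
    # Each member gets a unique integer _id and a "hostname:port" string.
    members = [{"_id": index, "host": host} for index, host in enumerate(hosts)]
    return {"_id": set_name, "members": members}


# Hypothetical set name and hostnames, for illustration only.
config = build_replica_set_config(
    "rs0",
    [
        "mongo1.example.internal:27017",
        "mongo2.example.internal:27017",
        "mongo3.example.internal:27017",
    ],
)
print(config)
```

Assuming each mongod was started with a matching `--replSet rs0` option, a document like this is what you would hand to the `replSetInitiate` admin command (or to `rs.initiate()` in the mongo shell) to bring the set up.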

You've deployed and set up a private Cloud platform, but now what? You need an application!

I've been experimenting with a number of technologies to generate workloads and give some demos to
prospective Eucalyptus customers. A NoSQL database seems like a great use case to demo, as the technology is
designed for scale-out workloads, which happens to be exactly what an IaaS Cloud does best.

There is an abundance of NoSQL implementations (Cassandra, MongoDB, Couchbase, Neo4j...), written in different programming
languages and with slightly different takes on which two of the three CAP theorem guarantees (Consistency, Availability,
Partition Tolerance) they prioritise, and on how they store and present data.

For this post I'm going to be using MongoDB, which is in the "CP" camp: it favours Consistency and Partition Tolerance
over Availability (not every request is guaranteed a response), although MongoDB still provides some great availability
options.