Introduction

I don't know how many of you work with SQL. Loads, I'd guess. I certainly do.
As we know, SQL databases are relational: we store data in tables using
relational data types such as INT, CHAR and NVARCHAR. I am sure you all know
what I am talking about and have used relational databases plenty in the past.

Do you think there are other sorts of databases out there? No? Well, actually
there are several types of database other than relational ones,
such as

Flat file

Object

NoSQL / Document / KeyValue pair

Now I do not profess to know much about flat file or object databases per se,
but I have spent some time evaluating and getting to know some of the newer
document databases. In this article I look at 3 different document databases,
each with a demo in the code attached to this article. Before we talk about
each of them and how to get started, let's spend a bit of time on why document
databases are gaining popularity.

So I have stated that this article will talk about document databases, but
what are these document databases, and why might you want to use one?

The reason to use a document database may come from any number of
requirements, such as

You may be able to express a more accurate representation of your business
model if you ditch the relational model

RESTful API (though most users would try to find a native client for
their language of choice)

Schema changes do not matter as much as they would in a
relational database; ad-hoc changes to the schema are supported

So those are some of the reasons, how about looking at some of the features
that a typical document database might minimally provide:

HTTP-enabled server, which is capable of handling standard HTTP requests
for data (think PUT/GET/POST/DELETE), so if there was no driver available
for your language of choice you could always just use standard HTTP
requests.

Documents are typically stored in some sort of serializable format (most
typically this appears to be JSON or BSON)

The ability to store an entire document. By document I mean plain data, as
opposed to a rich object with methods, an inheritance hierarchy and so on.
Those are not present in a document database; code is not part of the
database.

Sharding

Replication
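To make the "entire document as serialized data" point a little more concrete, here is a minimal sketch of turning an object graph into a single JSON document and back. The BlogPost and Comment types are hypothetical and are not taken from the demo code, and the sketch assumes a runtime where System.Text.Json is available (.NET Core 3.0 or later). None of the three databases is involved here.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical model types, used only for illustration.
public class Comment
{
    public string Author { get; set; }
    public string Text { get; set; }
}

public class BlogPost
{
    public string Id { get; set; }
    public string Title { get; set; }
    public List<string> Tags { get; set; } = new List<string>();
    public List<Comment> Comments { get; set; } = new List<Comment>();
}

public static class DocumentExample
{
    // Serializes the whole object graph (post, tags, comments) into one JSON document.
    public static string ToDocument(BlogPost post)
    {
        return JsonSerializer.Serialize(post);
    }

    // Rebuilds the object from the stored JSON; only data comes back, no behaviour.
    public static BlogPost FromDocument(string json)
    {
        return JsonSerializer.Deserialize<BlogPost>(json);
    }
}
```

The post, its tags and its comments travel together as one unit, which is roughly what a document store persists in place of several joined rows.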

One thing of note when working with document databases is the idea
of "eventual consistency", which is not what we are used to seeing, by and
large, in the relational world.

What is meant by this term? It sounds scary, right? Well, I guess it is a bit.
A common approach among document databases is to let you push updates/inserts
into the document store without guaranteeing that those changes will
necessarily be seen by all readers of the data straight away. They will of
course be written to all the sources that readers use eventually, but not
immediately, which means we may occasionally see
inconsistencies.
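The behaviour described above can be sketched with a toy model. This is not how any of the databases in this article implement replication; it is a deliberately simplified, single-threaded illustration in which the class name and its methods are all made up for the purpose.

```csharp
using System;
using System.Collections.Generic;

// A toy model: one primary that accepts writes, one replica that serves reads.
// Replication is deferred, so the replica can briefly return stale data.
public class EventuallyConsistentStore
{
    private readonly Dictionary<string, string> primary = new Dictionary<string, string>();
    private readonly Dictionary<string, string> replica = new Dictionary<string, string>();
    private readonly Queue<KeyValuePair<string, string>> pending =
        new Queue<KeyValuePair<string, string>>();

    // Writes are acknowledged as soon as the primary has them.
    public void Write(string key, string value)
    {
        primary[key] = value;
        pending.Enqueue(new KeyValuePair<string, string>(key, value));
    }

    // Reads go to the replica, which may not have seen recent writes yet.
    public string ReadFromReplica(string key)
    {
        string value;
        return replica.TryGetValue(key, out value) ? value : null;
    }

    // Stands in for background replication catching up.
    public void Replicate()
    {
        while (pending.Count > 0)
        {
            var change = pending.Dequeue();
            replica[change.Key] = change.Value;
        }
    }
}
```

A reader that calls ReadFromReplica between Write and Replicate gets nothing back for the key; once Replicate has run, the new value is visible. That window is the inconsistency the term refers to.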

Why is this, and how has it come about? It's all about scaling and
availability. If you only have a single source of data, as you typically do
in the relational database world, then you must lock reads while you are
writing. It's a simple model and it remains totally consistent, but it will not
scale that well. To scale, some sort of sharding must be used, which is not
that common in relational databases. In fact, having never done that in an
RDBMS, I am not sure whether it is even possible.

Anyway, that is one area I thought I would alert you to straight away. I
found these links to be highly informative on this subject, should you want to
know more

You may find that this alone is reason enough that document databases may not
be a good fit for you, but that is a decision only you can make

That said, you do not need to be too concerned with these issues, as this
article focuses on elementary usage of document databases, such as
how to perform simple CRUD operations. I just thought it was worth mentioning
up front so you knew about it. There we go, you have been warned.

Now there are loads and loads of document databases out there, far too many
for me to go through. For my initial evaluations I chose to look at a few based
on the attributes I cared about most, such as

Features

Ease of use

Reputation

.NET driver availability (I am a .NET developer these days, so this
article is about using .NET with the relevant document database)

With that list of attributes, I ended up with quite a large list, which I
whittled down further to end up with 3 document databases, which I will talk
about in this article.

Redis

Raven

Mongo

Do not expect to reach the end of this article and be an expert in document
databases, but I would hope by the end of reading this article you will be able
to understand how they work (kinda) and would be able to carry on using them and
finding any remaining answers you need by yourselves.

Prerequisites

Before we start you will need to make sure you have downloaded the relevant NoSQL server and .NET client APIs.
I would have liked to have uploaded them with this article but unfortunately they
are just too big for CodeProject's limits, so this task must fall to you.
Shown below are the components you will need to download

Redis

For Redis you will need to download the following 3 items, and ensure the
correct .NET portions are referenced correctly

Once you have downloaded this ensure you fix all the references within
the DocumentDB.Mongo project

IMPORTANT NOTE: Once you have downloaded these items and put them
somewhere sensible, you will need to do the following

Ensure each of the 3 projects within the downloadable demo code attached
to this article has its references fixed to point to where you downloaded
the .NET client API DLLs, which I am hoping you did when you followed
the steps above. If not, make sure to do that now

Each of the 3 projects within the downloadable demo code attached
to this article contains a simple server wrapper which simply spawns the
actual NoSQL server process. This is more for convenience than anything else,
and you will need to change the path to the actual NoSQL server location to
match where you downloaded it to. This is done by changing a string value
within each of the XXXXServer.cs classes in the 3 demo projects within the
downloadable demo code attached to this article.
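For readers who have not opened the demo code yet, a server wrapper of the kind described above might look roughly like the sketch below. The class name NoSqlServerProcess and its members are hypothetical; the real XXXXServer.cs classes in the download may be shaped quite differently. The only thing it assumes is that the server is an executable at a path you supply.

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Hypothetical wrapper, loosely modelled on the XXXXServer.cs idea described above.
public class NoSqlServerProcess : IDisposable
{
    private readonly string serverExePath;
    private Process process;

    // serverExePath is the string value you would change to match your download location.
    public NoSqlServerProcess(string serverExePath)
    {
        this.serverExePath = serverExePath;
    }

    // Builds the start information; kept separate so it can be inspected without launching anything.
    public ProcessStartInfo CreateStartInfo()
    {
        return new ProcessStartInfo
        {
            FileName = serverExePath,
            WorkingDirectory = Path.GetDirectoryName(serverExePath) ?? string.Empty,
            UseShellExecute = false
        };
    }

    // Spawns the server process, failing early with a clear message if the path is wrong.
    public void Start()
    {
        if (!File.Exists(serverExePath))
            throw new FileNotFoundException("Server executable not found; fix the path.", serverExePath);
        process = Process.Start(CreateStartInfo());
    }

    // Stops the server if this wrapper started it and it is still running.
    public void Dispose()
    {
        if (process != null && !process.HasExited)
            process.Kill();
        if (process != null)
            process.Dispose();
    }
}
```

Wrapping an instance in a using block around the demo run would start the server for the duration of the demo and shut it down afterwards.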

Just for the record, when I was developing the code for
this article I put the NoSQL servers/.NET clients alongside the downloadable solution,
something like shown below, but put them wherever you want; just make sure
to do the 2 steps above for all 3 projects in the demo code

Redis Document Database Usage

In this section I will discuss using
Redis. Redis commands are at the
core of Redis, and a full list of commands can be found at
http://redis.io/commands. Here is a screen
shot to show you the sort of thing I am talking about

This screen shot shows only a small portion of the available Redis commands.

Now even though this is how Redis works internally, there is not really a need
for you to get to know these commands, as any Redis client will already be
calling them on your behalf when it talks to the server. I just
thought it might be useful to show you how Redis works under the hood.

One other very important aspect of getting to know Redis is the fact that it is
designed to work with extremely dynamic datasets; as such it is really intended
that your entire dataset should fit into memory. This may sound mad, but it
really depends on the type of application you need to write.

Although Redis operates in memory it does have multiple disk persistence modes, i.e. journalling and/or entire snapshots. See this link for more information
http://redis.io/topics/persistence

If you want years and years of storage that you could bring back into memory at any time then
Redis may not be a good fit for you. On the other hand, if you have some very
dynamic, fast moving data that you could live with expiring after x amount of time, then
Redis would be a very good fit.

The Server

The Redis server (available for download here:
http://redis.io/download) is written in C and can be run using
the "redis-server.exe" process. In fact, when you have downloaded
Redis server
you should see something like this

There are a number of different processes that can be used for managing
Redis server.

The .NET Client

There are quite a few Redis clients for all sorts of languages. This article's
demo code uses .NET, so I obviously had to pick a .NET client, but which one?
Well, that was down to me doing a bit of research and picking one, and
ServiceStack seemed to be quite popular, so that is the one I chose. It can
be found here:
https://github.com/ServiceStack/ServiceStack.Redis

It would be
pretty much impossible for me to outline every feature of Redis, but I shall
outline what I think are the most important parts when getting started with
Redis.

1st Steps

You must have the actual Redis server running. I have tried to make this
easy for you by creating a helper class called "RedisServer", which
you should modify to point to your Redis server download. Once the actual Redis
server is running we need to create a RedisClient in software (which
connects to the actual Redis server instance).

Some skeleton code is shown below, that all the Redis code uses in the
attached demo code

This RedisClient is then used by the various classes in
the demo code, so you can expect to see a RedisClient
object in use throughout the attached demo code

Basic CRUD Using Typed Objects

To use my chosen Redis client (Service Stack), all
you really need to know is how to use an instance of a RedisClient,
which would typically be used as follows to obtain an IRedisTypedClient

An IMPORTANT note here is that the Save() method is not quite what it seems. It actually does a foreground/synchronous snapshot save to disk, which you generally never want to do in Production. You would be better off doing a BGSAVE (background save) in Production environments

Linq Support

As I have already stated on numerous occasions, Redis works with in-memory
datasets, that is, the entire dataset MUST fit into memory. As such it exposes
numerous collection classes for managing the in-memory store. Within the Service
Stack Redis client these collections are typically managed using standard .NET
collection classes, so any of the standard LINQ extension methods may be
applied straight to the collections. For example, here is where I get a List<Blog> of items and filter it using LINQ

It should of course be noted that since Redis works with an in-memory dataset, it is not using a true IQueryProvider; it is really just using LINQ to Objects, and there is no lazy loading occurring in the database. So just be mindful of this fact
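As a minimal sketch of what "just LINQ to Objects" means in practice, the filter below runs entirely on the client over a list that has already been loaded. The Blog type here is a hypothetical stand-in for the demo's model, and how the list gets populated from Redis is deliberately left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model; the demo's real Blog type may differ.
public class Blog
{
    public long Id { get; set; }
    public string Name { get; set; }
    public List<string> Tags { get; set; } = new List<string>();
}

public static class BlogQueries
{
    // The list is already fully loaded in memory, so this is plain LINQ to Objects:
    // every element is examined by the predicate on the client side.
    public static List<Blog> WithTag(List<Blog> allBlogs, string tag)
    {
        return allBlogs
            .Where(b => b.Tags.Contains(tag))
            .OrderBy(b => b.Name)
            .ToList();
    }
}
```

The practical consequence is that the cost of the query is proportional to the number of items fetched, not to the number of items that match, so it pays to fetch only what you need before filtering.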

Transaction Support

Within the Service Stack Redis client, transactions are managed by obtaining
a new IRedisTransaction from the IRedisClient by using
the CreateTransaction() method. Once you have an IRedisTransaction you can use the following methods to work with
it

QueueCommand(..) : Will enlist the contained command in the
transaction
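The queue-then-execute idea behind QueueCommand can be illustrated without a Redis server. The sketch below is a toy, not the ServiceStack implementation: QueuedTransaction is a made-up class, and its Commit and Rollback methods are named on the assumption that the real IRedisTransaction exposes similarly named operations, so check the client's own documentation before relying on that.

```csharp
using System;
using System.Collections.Generic;

// Toy illustration of queue-then-commit semantics; not the ServiceStack implementation.
public class QueuedTransaction
{
    private readonly List<Action> queued = new List<Action>();

    // Nothing runs yet; the command is only recorded.
    public void QueueCommand(Action command)
    {
        queued.Add(command);
    }

    // Runs every queued command in order, then clears the queue.
    public void Commit()
    {
        foreach (var command in queued)
            command();
        queued.Clear();
    }

    // Discards the queued commands without running them.
    public void Rollback()
    {
        queued.Clear();
    }
}
```

The point to take away is that queued commands have no effect until the commit step, which is why reading a value back inside the same transaction will not show your pending changes.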

Cache Expiry

Sorry to harp on and on about this, but the point that makes Redis so fast is
that it works with the dataset entirely in memory. So what happens to data you
no longer need? Is there a way to delete it? Well yes, we could do it programmatically,
but is there a way of using Redis as some sort of Most Recently Used (MRU) cache
where data can expire on its own?

Turns out this is possible, and what we need to do is use one of the standard
Service Stack Redis typed client methods, ExpireIn, which has a
method signature that looks like this

bool ExpireIn(object id, TimeSpan expiresAt);

To see this in action I have provided a bit of code in the demo which is as follows

So that is how you might manage fast moving data that you only want to live for x amount of time.
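If you want to reason about expiry without a running server, the idea can be mimicked in memory. The sketch below is only an illustration: ExpiringStore is a hypothetical class, its ExpireIn merely mirrors the shape of the signature shown above, and the clock is injected so the behaviour can be checked deterministically. Real Redis expiry is handled by the server, not by the client.

```csharp
using System;
using System.Collections.Generic;

// In-memory stand-in for time-based expiry; the clock is injected so behaviour is deterministic.
public class ExpiringStore<T>
{
    private readonly Dictionary<object, T> items = new Dictionary<object, T>();
    private readonly Dictionary<object, DateTime> expiry = new Dictionary<object, DateTime>();
    private readonly Func<DateTime> now;

    public ExpiringStore(Func<DateTime> now)
    {
        this.now = now;
    }

    public void Store(object id, T item)
    {
        items[id] = item;
        expiry.Remove(id); // a fresh store has no expiry until one is set
    }

    // Mirrors the shape of ExpireIn(object id, TimeSpan expiresAt): false if the id is unknown.
    public bool ExpireIn(object id, TimeSpan expiresIn)
    {
        if (!items.ContainsKey(id)) return false;
        expiry[id] = now() + expiresIn;
        return true;
    }

    // Returns false once the entry has expired, removing it lazily on access.
    public bool TryGet(object id, out T item)
    {
        DateTime deadline;
        if (expiry.TryGetValue(id, out deadline) && now() >= deadline)
        {
            items.Remove(id);
            expiry.Remove(id);
        }
        return items.TryGetValue(id, out item);
    }
}
```

Passing a fake clock, as in new ExpiringStore<string>(() => fakeNow), lets you advance time in a test instead of sleeping.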

Pooling

Service Stack (the Redis client I chose to use) does offer connection pooling
via the use of 2 classes: PooledRedisClientManager and any class of
your own that implements the Service Stack interface IRedisClientFactory

After following these steps all that is left to do is spin up and use a new PooledRedisClientManager. We get access to a client via the
PooledRedisClientManager. In the demo code I spin up new clients in new threads to simulate concurrent access to the connection pool. Here
is the relevant code

    // Requires: System, System.Collections.Generic, System.Threading,
    // ServiceStack.Redis and ServiceStack.Text (for TypeSerializer).

    /// <summary>
    /// Use the PooledRedisClientManager to gain access to n-many clients
    /// </summary>
    public void Start()
    {
        Thread t = new Thread((state) =>
        {
            const int noOfConcurrentClients = 5; // WaitHandle.WaitAll limit is <= 64
            var clientUsageMap = new Dictionary<string, int>();
            var clientAsyncResults = new List<IAsyncResult>();
            using (var manager = CreateAndStartManager())
            {
                for (var i = 0; i < noOfConcurrentClients; i++)
                {
                    var clientNo = i; // copy so each closure captures its own value
                    var action = (Action)(() => UseClient(manager, clientNo, clientUsageMap));
                    clientAsyncResults.Add(action.BeginInvoke(null, null));
                }
                // Wait inside the using block so the manager is not disposed
                // while clients are still being handed out.
                WaitHandle.WaitAll(clientAsyncResults.ConvertAll(x => x.AsyncWaitHandle).ToArray());
            }
            Console.WriteLine(TypeSerializer.SerializeToString(clientUsageMap));
            // Total number of clients handed out across all hosts
            var hostCount = 0;
            foreach (var entry in clientUsageMap)
            {
                hostCount += entry.Value;
            }
        });
        t.SetApartmentState(ApartmentState.MTA);
        t.Start();
    }

    private static void UseClient(IRedisClientsManager manager, int clientNo,
        Dictionary<string, int> hostCountMap)
    {
        using (IRedisClient client = manager.GetClient())
        {
            // The map is shared between threads, so guard every update
            lock (hostCountMap)
            {
                int hostCount;
                if (!hostCountMap.TryGetValue(client.Host, out hostCount))
                {
                    hostCount = 0;
                }
                hostCountMap[client.Host] = ++hostCount;
            }
            Console.WriteLine("Client '{0}' is using '{1}'", clientNo, client.Host);
            // YOU COULD USE THE SPECIFIC CLIENT HERE, YOU MAY HAVE TO TEST THE HOST TO SEE IF IT'S THE ACTUAL ONE YOU WANT
        }
    }

It is slightly backward in its usage, in that you must ask the PooledRedisClientManager
for an IRedisClient. It will use the IRedisClientFactory you
provided and just give you an IRedisClient which you can use, but
which IRedisClient you get is up to the PooledRedisClientManager.
So if you are relying on it being a specific IRedisClient, think
again: you will need to check which IRedisClient has been dished
out by the PooledRedisClientManager.

Raven Document Database Usage

There are a couple of points worth noting before we start to look at
using Raven, so let's
give them a quick bit of discussion right now:

Raven is written entirely in .NET; yes, even the server is .NET. I have
seen plenty of chat/internet noise about this, with people saying that it
would not be fast enough for high volume dataset demands. To be frank I am
not in a position to say for sure whether this is the case, as I was
just in evaluation mode whilst looking at various document
databases. What I can say is that during my evaluations I found no
issues at all

Raven has a concept of denying unbounded result sets, so if you try to
bring back too much data Raven will step in and not allow that. This is a
setting that can be changed, but doing so is not encouraged

Raven borrows ideas from other well known frameworks, mainly NHibernate,
so when you see an IDocumentSession it should seem pretty
familiar, and it is almost the same to use as ISession was/is in
NHibernate.

The commercial version of Raven is not free, but it's not that expensive if it
fits your needs
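The point above about unbounded result sets is easiest to see with paging. The sketch below is plain LINQ over an in-memory sequence, not Raven's query API, and the cap of 128 is an assumption chosen for illustration; the actual limit depends on your Raven version and configuration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BoundedQuery
{
    // Assumed cap for illustration; the real limit is a server/client setting.
    public const int MaxPageSize = 128;

    // Returns one page of results, never more than MaxPageSize items,
    // no matter how many the caller asks for.
    public static List<T> Page<T>(IEnumerable<T> source, int pageIndex, int requestedSize)
    {
        if (pageIndex < 0) throw new ArgumentOutOfRangeException(nameof(pageIndex));
        var size = Math.Min(Math.Max(requestedSize, 0), MaxPageSize);
        return source.Skip(pageIndex * size).Take(size).ToList();
    }
}
```

Designing callers around explicit pages like this means the cap never comes as a surprise, which is presumably the behaviour Raven is nudging you towards.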

The Server

The Raven server (available for download here:
http://ravendb.net/download) is written in C# and can be run using
the "Raven.Server.exe" process. In fact, when you have downloaded
Raven server
you should see something like this

There are a number of different processes that can be used for managing
Raven server.

The .NET Client

It would be
pretty much impossible for me to outline every feature of Raven, but I shall
outline what I think are the most important parts when getting started with
Raven.

1st Steps

You must have the actual Raven server running. I have tried to make this
easy for you by creating a helper class called "RavenServer", which
you should modify to point to your Raven server download. Once the actual Raven
server is running we need to create a DocumentStore in software (which
connects to the actual Raven server instance).

Some skeleton code is shown below, that all the Raven code uses in the
attached demo code

This DocumentStore is then used by the various classes in
the demo code, so you can expect to see a DocumentStore
object in use throughout the attached demo code

Basic CRUD Using Typed Objects

To use my chosen Raven client, all
you really need to know is how to use an instance of an IDocumentSession,
which would typically be used as follows, where a Query is run using a
generic type of the Document that you would like to obtain data for

Transaction Support

Raven supports transactions wholeheartedly, and as Raven is all written in
.NET you even get to use familiar transaction classes such as TransactionScope, which I have to say does make life easier. Here is an
example of how to use Transactions with Raven.
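For anyone unfamiliar with TransactionScope itself, the general pattern looks like the sketch below. It contains no Raven calls at all: the work to be done is passed in as a delegate, and TransactionalWork is a hypothetical helper. With Raven you would presumably store documents and save the session's changes inside that delegate, but treat that as an assumption and check the demo code.

```csharp
using System;
using System.Transactions;

public static class TransactionalWork
{
    // Runs the work inside an ambient transaction and votes to commit only if it succeeds.
    public static bool TryRun(Action work)
    {
        try
        {
            using (var scope = new TransactionScope())
            {
                work();           // e.g. store documents and save the session's changes
                scope.Complete(); // without this call the transaction rolls back on Dispose
            }
            return true;
        }
        catch (Exception)
        {
            // An exception skipped Complete(), so disposing the scope rolled everything back.
            return false;
        }
    }
}
```

The important detail is that forgetting to call Complete() is not an error; the transaction simply rolls back when the scope is disposed.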

Direct Database Operations

Sometimes you just need direct access to the underlying Raven database
commands. In Raven this is done using the DocumentStore.DatabaseCommands
property, which will give you an instance of an IDatabaseCommands
that allows you to carry out the various tasks. Shown below is the IDatabaseCommands
interface definition straight from Raven, which shows you what sort of things you can do with an
IDatabaseCommands instance

Full Text Search

One of the more interesting things that Raven allows you to do is full
text searches. To do this we would typically create an Index where we supply a
Map LINQ Expression to build the Index. Here is an example
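As a rough illustration of what an index built from a "map" buys you, the sketch below builds a tiny inverted index with plain LINQ. It is not Raven's index API, and it does none of the analysis a real full text engine performs; the Article type and FullTextIndex class are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Article
{
    public string Id { get; set; }
    public string Body { get; set; }
}

public static class FullTextIndex
{
    private static readonly char[] Separators =
        { ' ', ',', '.', ';', ':', '!', '?', '\n', '\r', '\t' };

    // "Map" step: emit one (term, document id) pair per distinct word in each document,
    // then group the pairs by term to form an inverted index.
    public static Dictionary<string, HashSet<string>> Build(IEnumerable<Article> articles)
    {
        var entries =
            from article in articles
            from term in article.Body.ToLowerInvariant()
                                     .Split(Separators, StringSplitOptions.RemoveEmptyEntries)
                                     .Distinct()
            select new { Term = term, article.Id };

        return entries
            .GroupBy(e => e.Term)
            .ToDictionary(g => g.Key, g => new HashSet<string>(g.Select(e => e.Id)));
    }

    // Lookup is a dictionary hit rather than a scan of every document.
    public static IEnumerable<string> Search(Dictionary<string, HashSet<string>> index, string term)
    {
        HashSet<string> ids;
        return index.TryGetValue(term.ToLowerInvariant(), out ids) ? ids : Enumerable.Empty<string>();
    }
}
```

The map runs once per document when the index is built, so searches afterwards do not need to touch the documents themselves.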

Batch Operations

This section will outline how to carry out batch operations using Raven.

Bulk Updates

Sometimes you may need to update/delete a whole batch worth of data. Raven
supports these types of operations by using what it calls "set based
operations". To do "set based operations" in Raven you have to use its "patching
API". We will now see some examples of how to use Raven's "patching API".

At the heart of Raven's patching API are 2 objects that you will need to get
familiar with, which are

This code relies on a "UsersByName" index being in place (Raven builds indexes from LINQ queries), which the
PatchRequest can use when doing its bulk update to locate the correct documents.
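Conceptually, a set based operation selects documents with a query and applies the same change to each of them, without loading them one at a time into your own code. The sketch below imitates that idea over in-memory dictionaries. It is not Raven's patching API; SetBasedPatch and its parameters are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SetBasedPatch
{
    // Applies the same field assignment to every document the selector matches,
    // returning how many documents were changed.
    public static int Apply(
        IEnumerable<Dictionary<string, object>> documents,
        Func<Dictionary<string, object>, bool> selector,
        string field,
        object newValue)
    {
        // Materialise the matches first so the selector sees the unpatched state.
        var matches = documents.Where(selector).ToList();
        foreach (var doc in matches)
            doc[field] = newValue;
        return matches.Count;
    }
}
```

In this picture the selector plays the role of the index query and the field/value pair plays the role of the patch instruction.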

MongoDB Document Database Usage

The Server

The MongoDB server (available for download here:
http://www.mongodb.org/downloads) is written in C++ and can be run using
the "mongod.exe" process. In fact, when you have downloaded MongoDB
you should see something like this

There are a number of different processes that can be used for managing
MongoDB.

The .NET Client

The .NET client that this article uses is the official (supported) .NET
client, which can be downloaded from
https://github.com/mongodb/mongo-csharp-driver/downloads. It would be
pretty much impossible for me to outline every feature of MongoDB, but I shall
outline what I think are the most important parts when getting started with
MongoDB.

1st Steps

You must have the actual MongoDB server running. I have tried to make this
easy for you by creating a helper class called "MongoDBServer", which
you should modify to point to your MongoDB download. Once the actual MongoDB
server is running we need to create a MongoServer in software (which
connects to the actual MongoDB server instance).

Some skeleton code is shown below, that all the MongoDB code uses in the
attached demo code

It can be seen that when working with MongoDB we are working closely with
MongoCollection<T>, which we can use to store BsonDocument objects. But these BsonDocuments don't seem
that handy: we don't have any typing at all there, and they are nothing more than a
dictionary of key/values. Hmm, not that useful. Surely there is a way we can
store our own object structures.

Well yes, as it turns out there is. Let's see that.

Serializing Your Own Types

OK, so we have seen how to deal with BsonDocument objects and found them lacking, and we now want to serialize
our own objects, so how do we do that? Well, it's simply a question of doing the following:

That we use the BsonElementAttribute to mark our members as
being serializable

That the Ids are using ObjectId, which is a MongoDB type.
You can use long/int but if you use ObjectId you will get this Id automatically filled in by MongoDB

I have designed my models to use links to Ids for foreign keys. This is
for 2 reasons

I don't want a massive graph brought back when I get a whole
document stored, I want to decide when to fetch this linked data. So I
just store the Ids of other data I may be interested in, if I find I
need it, I'll look it up later

BSON/JSON doesn't support circular references, so it's just easier to
store Ids, and have no cycles
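The two reasons above can be sketched as follows. The models are hypothetical, Guid stands in for MongoDB's ObjectId so the example has no driver dependency, and System.Text.Json is used in place of the BSON serializer (which assumes .NET Core 3.0 or later). The shape of the idea is the same: the post holds the author's Id, not the author.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical models; Guid stands in for MongoDB's ObjectId to keep the sketch dependency-free.
public class User
{
    public Guid Id { get; set; }
    public string Name { get; set; }
}

public class Post
{
    public Guid Id { get; set; }
    public string Title { get; set; }
    public Guid AuthorId { get; set; } // a link, not an embedded User
}

public static class LinkedDocuments
{
    // The post serializes without dragging the author along, and no cycle is possible.
    public static string Serialize(Post post)
    {
        return JsonSerializer.Serialize(post);
    }

    // Follow the link only when the caller actually needs the author.
    public static User ResolveAuthor(Post post, IDictionary<Guid, User> users)
    {
        User author;
        return users.TryGetValue(post.AuthorId, out author) ? author : null;
    }
}
```

The dictionary passed to ResolveAuthor is a placeholder for a second lookup against the store; the cost of that extra round trip is the trade-off for keeping each stored document small.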

Having this arrangement will allow these types of objects to be serialized to
BSON (Binary JSON)

I have to say I am not sure I like this too much, as it's putting persistence
concerns in my model. I guess WCF does that with its DataContract/DataMember
attributes so it's not so different, but WCF objects are more DTOs to me. But hey,
you can decide if you like it or not.

Basic CRUD Using Typed Objects

I think the best way to show you some examples of basic CRUD operations using
strongly typed models is to literally show you a complete listing of the demo
code. So here goes

Transaction Support

At the time of writing, this is not something that MongoDB offers.

Comparisons

Whilst evaluating the 3 document databases that I chose, these were my takeaway
points from them. As I say, these are my own opinions and only I can be blamed
for them. I am also talking more from a beginner's point of view; the
links I include below this comparison may help you if you are looking for deeper
answers, such as what sort of replication mechanisms are supported, but from a
purely zero to simple CRUD usability stance this is what I thought.

Redis:

As Redis uses an approach where the entire dataset must fit into
memory, it is fast. Lightning fast in fact.

Supports transactions

Supports LINQ

Good support for different types of collections (Sets/Lists etc.)

Values can expire

Replication

Raven:

Very feature rich

Web based tool for inspecting/managing database store

Supports transactions

Supports LINQ

Map/Reduce

Full text search

Bulk operations/Patching API

Supports sharding

Embedded mode

In-memory mode (for testing)

Nowhere near as fast as Redis (due to Redis's everything-in-memory approach)

Replication

Mongo:

Mature, well used API

NO support for transactions

Supports LINQ (though there is some strangeness with the LINQ API)

Connection sharing

Map/Reduce

Bulk operations

Nowhere near as fast as Redis (due to Redis's everything-in-memory approach)

In terms of which document database I would choose to use, it would really
depend on what I was trying to do. For example, if I had very quickly changing
data such as FX rate ticks/tweets which would quickly expire, I would use Redis (I
also hear very good things about Cassandra for this sort of scenario, but I did
not look at that particular database).

If I was dealing with more standard storage requirements (scalability being
what really drew us to look into a document database), I would use
RavenDB, as I found it to be much richer and better thought out than Mongo. Mongo
seemed like it had been written a while ago and was showing signs of needing a
breath of fresh air. In fact, I read somewhere that the guys behind Mongo are
starting over again on a new project, which I imagine will be great. However, at
the time of writing this article, out of the document databases I looked at for
this scenario (i.e. not very fast moving data requirements) I would choose RavenDB.

That's It

That's all I wanted to say in this article. I hope you have got something out of it and enjoyed my rantings on this subject,
and can see that there may be times when a NoSQL based solution may be exactly what you need
