Thursday, 10 April 2008

Getting started with distributed Erlang - Mnesia table relocation

Mnesia is a distributed database that ships as part of the standard Erlang/OTP distribution. One feature that I think is potentially powerful is transparent table relocation across machines. With Mnesia, you can replicate tables to any nodes in your network, and Mnesia takes care of all the back-end bits for you. By "transparent", I mean that you don't need to do anything in your clients to make them "aware" of the new tables. Reads that were being served from a table on one machine can now be distributed across multiple nodes (where the nodes reside on one or several machines).

I wanted to see how difficult it is to achieve this. For the setup, I installed two virtual Ubuntu 7.10 machines using VMware Player. You can get images for most Ubuntu distros at http://isv-image.ubuntu.com/vmware/. FYI, the username and password for these images is ubuntu:ubuntu. I named the two nodes

node1.21ccw.blogspot.com and node2.21ccw.blogspot.com

You'll need to edit the network configurations with the IP addresses if you want to reproduce this experiment. If you need some help, post a question as a comment :)

I now had two machines that could ping each other using their full names, and a warm and fuzzy feeling inside.

The next step was to start up an Erlang node on each machine. There's a catch here though. I ran into problems using erl -sname, probably because of the way I set up the hostnames of the machines. So I had to specify the fully qualified names manually, using erl -name instead.
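As a rough sketch, starting the two nodes with long names might look like the following. The name part before the @ and the cookie value are assumptions for illustration; both nodes need the same cookie before they can connect.

```erlang
%% Started from the OS shell on the first machine with (assumed names and cookie):
%%   erl -name node1@node1.21ccw.blogspot.com -setcookie mnesia_demo
%% and on the second machine with:
%%   erl -name node2@node2.21ccw.blogspot.com -setcookie mnesia_demo

%% Inside each Erlang shell:
node().    % the full name of this node, as an atom
nodes().   % the other nodes this node knows about; [] right after start-up
```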

Notice the output of the nodes() command. It returns a list of the other Erlang nodes that this node is aware of. Initially there's no awareness. To let a node know of another node, you can use net_adm:ping/1 to ping the other node. Both nodes will then become aware of each other.
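A minimal sketch of the ping step, run in the shell on node1. The node name is the assumed one from this setup, so substitute your own.

```erlang
%% On node1: ping node2 (assumed node name).
%% net_adm:ping/1 returns pong if the node was reached and pang if not.
net_adm:ping('node2@node2.21ccw.blogspot.com').

%% After a successful ping, each node lists the other.
nodes().
```

If you get pang, the usual suspects are mismatched cookies or a hostname that doesn't resolve.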

Cool. Now the nodes know of each other. To get Mnesia started, you have to create a schema on each node. A schema is located on the file system, in the same location where the actual disc copies of tables will reside. [node()|nodes()] creates a list of the current node and all the other connected nodes, which is the list you pass to mnesia:create_schema/1. ls() shows the directory that Mnesia has created for the database.
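The schema step could be sketched as follows, assuming both nodes are already connected and Mnesia has not yet been started on either of them.

```erlang
%% On node1: create the schema on this node and on every connected node.
%% Mnesia must not be running on any of them at this point.
Nodes = [node() | nodes()].
mnesia:create_schema(Nodes).   % ok on success, {error, Reason} otherwise

%% Then, in the shell of EACH node:
mnesia:start().

%% List the current directory; by default the database directory is
%% named "Mnesia." followed by the node name.
ls().
```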

Next we'll create an actual database table and populate it with some data. We define a record using rd(), then create a table on node1, write a record to it, and then read the record again. With the disc_copies option the table resides in RAM and also has a disc copy; without it, Mnesia defaults to a RAM-only copy on the local node. The primary key of the table is the first field of the record, i.e. the name.
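Here is a hedged sketch of those steps. The email field, the sample values, and the choice of disc_copies are assumptions for illustration; only the person table and its name key come from the post.

```erlang
%% On node1: define the record in the shell (the email field is made up).
rd(person, {name, email}).

%% Create the table with a RAM + disc copy on this node.
mnesia:create_table(person, [{attributes, [name, email]},
                             {disc_copies, [node()]}]).      % {atomic, ok}

%% Write one record inside a transaction.
mnesia:transaction(fun() ->
    mnesia:write(#person{name = "alice", email = "alice@example.com"})
end).                                                         % {atomic, ok}

%% On node2: there is no local copy of person, so Mnesia fetches
%% the record from node1. The result has the form {atomic, [Record]}.
mnesia:transaction(fun() -> mnesia:read({person, "alice"}) end).
```

Note that node2 will print the record as a plain tuple unless you run the same rd() there as well.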

Nice. Mnesia has transparently read the record from a table that's on another machine :)

Now we decide to copy the table to node2. This requires a single command. Mnesia copies the actual data to the other machine for you, and when you look at the file system on node2, there will now be a "person.DCD" file, which is the disc copy of the table.
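The single command in question would look something like this, again using the assumed node name for node2.

```erlang
%% Add a RAM + disc replica of person on node2 (assumed node name).
mnesia:add_table_copy(person, 'node2@node2.21ccw.blogspot.com', disc_copies).
%% {atomic, ok} on success

%% Check which nodes now hold a disc_copies replica of the table.
mnesia:table_info(person, disc_copies).
```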

What I've shown is how to start up an Erlang/Mnesia node on two machines that are networked together, create tables on either node, and move the tables to other nodes by copying and then deleting them. Mnesia can configure tables to be RAM only, RAM and disc, or disc only, which gives you lots of power for optimisation. Couple this with the fact that you can change your configuration dynamically and you have a powerful, dynamically configurable distributed database!
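To round off the "copy, then delete" move and the dynamic reconfiguration, a sketch under the same assumed node names might be:

```erlang
%% Finish the move: drop the original replica on node1 now that node2 has one.
mnesia:del_table_copy(person, 'node1@node1.21ccw.blogspot.com').

%% Change the storage type of the replica on node2 at runtime,
%% here from disc_copies to RAM only.
mnesia:change_table_copy_type(person, 'node2@node2.21ccw.blogspot.com',
                              ram_copies).
```

Both calls return {atomic, ok} on success and {aborted, Reason} otherwise. The valid storage types are ram_copies, disc_copies and disc_only_copies.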

My bad. I just read the last part and assumed that you just replicated the table created on the first node to the second node. And hence I was wondering how one would split data across multiple nodes (each running 1 mnesia instance).

But I think you answered my question right above, where you demonstrated that you could retrieve a row from a table on a different node transparently over the network.

The performance degradation will definitely be application-specific, so it's impossible to say at which N that would be. But yes, you're right, there will be some point where the synchronisation overhead starts to trump the benefits of doing distributed reads. I think you'll have to start partitioning your data at that point.

P.S. Since writing this I've come to really like CouchDB (implemented in Erlang), you should have a look at it if you're interested in databases...

Hi Benjamin, I looked into CouchDB, but I can't escape the highly relational structure of the data I have to deal with. Also, CouchDB performs poorly when you have the requirement to support ad-hoc queries.

Perhaps a marriage between CouchDB and Mnesia is possible. I.e., use Mnesia to store relational structures (in essence only integers would be stored) and store the rest (strings, text fields, blobs) in CouchDB. Ad-hoc queries would query Mnesia for the relational stuff and CouchDB via an inverted index doing text searches only (the latter should perform well for ad-hoc text queries).