There are plenty of things this post will overlook, so if you want the real introduction, head over to the Riak wiki.

Basho released Riak to the world just over a week ago. Some think that the docs are good enough that further explanation is not necessary. But, if you’re looking for a little more introduction, read on.

The short version of this post: Riak is as simple as downloading it and hitting the HTTP interface.

Aside: Some have found it difficult to stop Riak, so I’ll throw the tip out here: killall heart. You may also have to kill the erl process afterward, but if you don’t kill heart first, Riak will just come right back.

That’s all it takes to use Riak. You can stop reading now and happily speak REST to it for the rest of your application’s lifetime.

If you’re still hanging around, though, maybe you’d be interested in a few more features that Jiak (the layer behind Riak’s HTTP interface) has to offer, like link-walking and field validation. To demonstrate those, I’ll spend the rest of this post describing the creation of an application on Riak.

There will be two kinds of objects in my system: notes and groups. The properties of a note will be the text of the note, the color, the position on the screen, and the stacking order. The properties of a group will be the name of the group and the list of notes in the group.

I’ll start by whipping up two modules to manage those buckets for me. These modules tell jiak_resource how to validate the structure of objects in these buckets (jiak_resource is the Webmachine resource that runs Riak’s HTTP interface). They’ll follow the basic structure of jiak_example.erl, which ships with Riak.

The groups module ensures that every group has a name field (through implementation of allowed_fields/0, required_fields/0, read_fields/0, and write_fields/0), and that the value of that field is a string (using check_write/4 and check_diff/2).

The notes module ensures that all notes have a text field that is a string; x, y, and z fields that are integers; and a color field that is one of a specific list of strings (using the same functions as the groups module used).
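As a sketch of the logic only, here is roughly what the notes checks amount to, written in Python (the field names follow the descriptions above; the color palette is an invented example, and the real implementation is Erlang callbacks in a Jiak bucket module):

```python
# Illustrative sketch of the checks the notes module performs; the real
# code is Erlang callbacks (allowed_fields/0, required_fields/0,
# check_write/4, ...) in a Jiak bucket module.

NOTE_ALLOWED = {"text", "x", "y", "z", "color"}
NOTE_REQUIRED = {"text", "x", "y", "z", "color"}
NOTE_COLORS = {"yellow", "pink", "green", "blue"}  # assumed palette

def check_note(obj):
    """Return a list of validation errors (empty list means the write is OK)."""
    errors = []
    for field in set(obj) - NOTE_ALLOWED:
        errors.append("unexpected field: %s" % field)
    for field in NOTE_REQUIRED - set(obj):
        errors.append("missing field: %s" % field)
    if not isinstance(obj.get("text"), str):
        errors.append("text must be a string")
    for field in ("x", "y", "z"):
        if not isinstance(obj.get(field), int):
            errors.append("%s must be an integer" % field)
    if obj.get("color") not in NOTE_COLORS:
        errors.append("color must be one of %s" % sorted(NOTE_COLORS))
    return errors

good = {"text": "finish blog post", "x": 10, "y": 20, "z": 1, "color": "yellow"}
bad = {"text": 42, "x": "ten"}
```

The groups module would look the same, with only a required string-valued name field.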

One more interesting thing happens in the notes module. If you look at after_write/4, you’ll see that it fetches the groups object that the note links to, and adds the note to that group’s links. Jiak calls after_write/4 after the note has been stored in Riak, so what I’ve written here is effectively an automatic back-link monitor. We’ll return to the concept of links in a moment.
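Sketched in Python, that back-link step amounts to something like this (Jiak links are [bucket, key, tag] triples, per the examples later in the post; the function name is mine):

```python
def add_backlink(group_doc, note_key):
    """After a note linking to a group is stored, make sure the group
    links back to the note.  Mirrors the behavior described for the
    notes module's after_write/4; the tag ("note") follows the example
    link ["notes","blog","note"] in the post."""
    backlink = ["notes", note_key, "note"]
    if backlink not in group_doc["links"]:
        group_doc["links"].append(backlink)
    return group_doc

group = {"bucket": "groups", "key": "todos", "object": {"name": "todo"}, "links": []}
add_backlink(group, "blog")
add_backlink(group, "blog")  # idempotent: the link is only added once
```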

If I put the notes and groups modules in place, I can fire up Riak and immediately begin sending HTTP requests to its Jiak interface to create and modify groups and notes. For example:

These two requests would create a group named “todo” with a note labeled “finish blog post”.
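As a sketch of the data involved (the object fields follow the note and group descriptions above, with invented position values; the envelope shape, with bucket, key, object, and links fields, is my assumption about Jiak’s document format):

```python
import json

# Sketch of the two JSON documents PUT to /jiak/groups/todos and
# /jiak/notes/blog.  Envelope fields (bucket, key, object, links) and
# the exact object values are assumptions based on the post.
group_doc = {
    "bucket": "groups",
    "key": "todos",
    "object": {"name": "todo"},
    "links": [],
}
note_doc = {
    "bucket": "notes",
    "key": "blog",
    "object": {"text": "finish blog post", "color": "yellow",
               "x": 10, "y": 10, "z": 1},  # positions invented
    "links": [["groups", "todos", "open"]],  # link to the group, tagged "open"
}

# Each document round-trips through JSON, which is what goes on the wire.
payloads = [json.dumps(group_doc), json.dumps(note_doc)]
```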

Now, about those links. See the ["groups","todos","open"] item in the links field of that notes object? That’s a link to the groups object named todos, and I’ve tagged it open. The after_write/4 function in the notes module will add a link to the groups object of the form ["notes","blog","note"].

What does this get me? More than just record-keeping: I can now use the Jiak utility jaywalker to get all of the notes in the todo group with a single query:

$ curl http://127.0.0.1:8098/jiak/groups/todos/notes,_,_

You’ll recognize the first part, through /jiak/groups/todos/. The segment after that is a link query. This one says “all notes objects, with any tag.” The links are structured as {bucket},{tag},{accumulate} segments, with an underscore meaning “any.”

The example query will return an object with a results field that is a list of lists of results. That is, if I were to store the data returned from that query in a variable called data, my list of notes would be at data.results[0].
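Here is an illustrative Python sketch of the path grammar and the result nesting (the helper is mine, not part of jiak.js):

```python
def walk_path(base, *segments):
    """Build a Jiak link-walking URL.  Each segment is a
    (bucket, tag, accumulate) triple; None renders as "_", meaning
    "any" (or "default" for the accumulate flag)."""
    parts = [",".join("_" if s is None else str(s) for s in seg)
             for seg in segments]
    return base + "/" + "/".join(parts)

url = walk_path("http://127.0.0.1:8098/jiak/groups/todos", ("notes", None, None))
# A multi-hop query just appends another segment, e.g.
# walk_path(base, ("notes", None, None), ("person", "author", None))

# The response wraps results in a list of lists: one inner list per
# accumulating step.  With one step, the notes are at data["results"][0].
data = {"results": [[{"bucket": "notes", "key": "blog"}]]}
notes = data["results"][0]
```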

I’m not limited to one hop, either. If there were also person objects in my system, linked from notes objects as authors, I might get all of the authors of the notes in a group by appending a second link segment to the same URL.

The curl commands and the HTTP requests they represent are pretty simple, but I’m going to be hitting these resources from JavaScript running in a browser. So, I’ll wrap it all up in a nice utility class, jiak.js.

Okay, now I have my backend and frontend – I’d better fill in the middle. Any intermediary webserver could work, but because Riak comes with Webmachine included, I’ll just set up a quick Webmachine app.

Two configuration pieces matter here: one puts the stickynotes ebin in Riak’s code path, so Jiak can reach the notes and groups modules I just wrote; the other does nothing but force the atoms 'notes' and 'groups' into the Riak node so Jiak can use list_to_existing_atom/1.

To my new Webmachine app, I’ll add four things:

- the notes and groups modules I just wrote
- the static files from Beebole’s example application (HTML, CSS, JS) … modified a bit to load and use the jiak.js utility from above
- a simple static resource server
- a simple proxy resource to pass requests through to Jiak (the couchdb_proxy from my wmexamples repo will do fine)

Once I have everything aligned (including dispatch set up correctly), I just start Riak and my Webmachine node:

…then point my browser at http://localhost:8000/, and I see the app’s UI with an empty group ready to store some notes. I recommend opening Firebug to watch the requests fly by.

I realize that I’ve glossed over a bit of the Webmachine application stuff, but that’s because it’s mostly rehash of older posts. The better way to cover all of that material is for me to tell you to open up the demo/stickynotes directory in the Riak repo you just cloned, and read the code written there yourself. 🙂

21 comments so far

Hey – thanks, B. We’re pretty happy with Riak. There’s a list of improvements we’re anxious to get to, but it works now, and we’re excited to show it off.

We’re seeing quite a bit of splash privately. Lots of people have discussed their Riak experiments with us, but those conversations haven’t bubbled into public channels yet. Being the Rage of Reddit isn’t Riak’s primary goal at the moment anyway. 😉

Great post. Riak looks like a great tool. Any thoughts on how a two-way link would be maintained? For example, if you updated a note with new groups, the old groups would still point to the note. Is there a way to find out the links from the old note and update them accordingly?

Hi, Dan. Two-way links can be tricky, partially for the reason you brought up: grooming the other side.

For your use case, yes, there is a way to handle this. If you look at notes:check_write/4, as implemented in stickynotes, the first thing it does is:

{ObjDiffs,_} = Context:diff()

The second half of that tuple, the bit ignored by the underscore, is actually a links diff list. That diff will tell you which links were added, and which were removed. Using this diff list, it would be possible to extend notes:after_write/4 to comb through the list of removed links, and alter the links in each group that the note has just been unlinked from (that is, “remove the stale backlinks”).
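Sketched in Python (the diff representation here is simplified; the real Context:diff() returns Erlang structures):

```python
def link_diff(old_links, new_links):
    """Compute which [bucket, key, tag] links were added and removed,
    analogous to the links half of Context:diff()."""
    added = [l for l in new_links if l not in old_links]
    removed = [l for l in old_links if l not in new_links]
    return added, removed

def prune_stale_backlinks(note_key, removed, groups):
    """For each removed group link, drop that group's backlink to this
    note.  `groups` maps group key -> document; a sketch of what an
    extended after_write/4 would do."""
    for bucket, key, _tag in removed:
        if bucket == "groups" and key in groups:
            backlink = ["notes", note_key, "note"]
            links = groups[key]["links"]
            if backlink in links:
                links.remove(backlink)

old = [["groups", "todos", "open"]]
new = [["groups", "done", "open"]]
added, removed = link_diff(old, new)
groups = {"todos": {"links": [["notes", "blog", "note"]]}}
prune_stale_backlinks("blog", removed, groups)  # "todos" no longer links back
```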

And in config/riak-demo.erlenv, the $RIAK was replaced by the actual path to Riak.

And config/riak.erlenv has two places to take care of:
riak_dets_backend and riak_heart_command.
After I created a new store folder for the riak_dets_backend, I was able to reproduce your steps 5 to 8.
Now it gets interesting (to me at least): dets means persistence, unlike the ets in riak-demo.erlenv. So I did:

And trying again:
14. curl http://localhost:8098/jiak/foo
> 503 Service Unavailable: The server is currently unable to handle the request due to a temporary overloading or maintenance of the server. (mochiweb+webmachine web server)

A) This blog post is out of date in some spots now (hooray for fast iterations). You did the right thing by switching to config/riak-demo.erlenv, or editing config/riak.erlenv.

B) The reason your line 14 didn’t work but line 17 did has to do with how you restarted Riak on line 13. In order to retrieve the foo/baz document, Riak needed to look up the schema for the foo bucket. Bucket schemas are stored in the bucket properties, which are stored in the ring state. Using start-fresh.sh creates a fresh, empty ring state. What you wanted was “./start-restart.sh config/riak.erlenv”; start-restart.sh would have loaded the ring state left on disk by the Riak node you killed earlier.

Bryan –
after deleting one key, the list of keys is still reported the same for several seconds, as if the deletion did not happen. If I try to pull the non-existent value during that period, I get an error. How will I be “in control” here?

Taken out of your “doc/basic-client.txt”:
… I have that key / value for “mine”:
(cli@127.0.0.1)30> Cli:list_keys("groceries").
{ok,["mine"]}

… all fine, let me delete “mine”:
(cli@127.0.0.1)31> Cli:delete("groceries","mine",1).
ok
… “ok”! I like it so far. Now let us repeat the same instruction, list_keys, continuously:

… 10 (some) seconds later:
(cli@127.0.0.1)48> Cli:list_keys("groceries").
{ok,[]}
… now it seems deleted.
I am guessing it is about the gossip interval etc., but in a “normal” DB you would see the desired result right after that deletion “ok”. Any hints?

This delete lag was standard behavior in an earlier version of Riak, but it was tightened considerably in a recent release. Pull the latest, and you will see this 10-second lag disappear.

However, list_keys is a “special” operation. It’s not supported by most distributed key-value stores, and it’s something that takes a little finesse to make work with them. Riak’s list_keys skirts basically all the logic around vclocks and such to provide something in the form of a key-list, at the expense of 100% consistency.

And yet, there is one way you can guarantee the behavior you desire: set the RW parameter for the delete/3 function call to the N-value for the bucket you’re deleting from. This will ensure that all N nodes storing replicas of the object have removed that object before delete/3 returns success. In a non-degraded cluster, on a backend that conforms to spec, the key should not appear in a list_keys after such a successful delete.

Where you have used RW=1 in your example delete call, only one node need have responded with success by the time your delete/3 call returned. The other two nodes may still respond with the “deleted” key in a list_keys call immediately afterward.
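The difference between RW=1 and RW=N can be modeled with a toy sketch (purely illustrative; real Riak coordinates this across nodes asynchronously):

```python
def delete_with_rw(replicas, rw):
    """Toy model: delete returns success once `rw` of the replicas have
    acknowledged the removal.  Replicas beyond that may still hold the
    key for a moment, which is what makes an immediate list_keys
    inconsistent."""
    acked = 0
    still_holding = []
    for node in replicas:
        if acked < rw:
            node["deleted"] = True      # these acks are awaited
            acked += 1
        else:
            still_holding.append(node)  # lagging replicas, not yet deleted
    return acked >= rw, still_holding

n = 3
replicas = [{"node": i, "deleted": False} for i in range(n)]
ok, lagging = delete_with_rw(replicas, rw=1)     # RW=1: success, 2 laggards
replicas2 = [{"node": i, "deleted": False} for i in range(n)]
ok2, lagging2 = delete_with_rw(replicas2, rw=n)  # RW=N: no laggards remain
```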

Thank you for the update – release 0.6 is certainly different: I am back “in control”, as my deleted key no longer appears in the list.
About N and RW.
I’ve got a feeling that I am working at a bit too low a level here, and that the initial N should be used all over the place for consistency. If I hit it with RESTful HTTP requests, the resources you have written are supposed to take care of such things.
Also I miss the “delete_bucket” or something like stickydb:reset type of function to “start over” if I am on the dets-based storage. Maybe I just did not dig deep enough for that.

Being able to control N/R/W is one of the powerful benefits of Riak, and you should work to understand them before making arbitrary decisions about them.

For example, there may be cases where you want to demand full consistency by always using N=R=W. However, you need to be aware that doing so is accepting a tradeoff in availability. Reads fail if R nodes don’t respond successfully. Specifying R=N means that you demand all nodes storing replicas of an object to be alive, reachable, and ready.

The RESTful HTTP interface exposes R/W through query parameters (?r=R&w=W). By default this interface uses R=W=2 with N=3 to provide a somewhat standard CAP tradeoff (read-your-writes consistency, allow one node down, …), but since no single N/R/W choice is correct for all applications, it’s not possible to just “take care of such things.”
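The rule of thumb behind that default can be written down directly (standard quorum arithmetic, not Riak-specific code):

```python
def read_your_writes(n, r, w):
    """A read quorum of r and a write quorum of w out of n replicas must
    overlap in at least one replica for reads to see the latest write."""
    return r + w > n

# Riak's HTTP default: N=3, R=W=2 -> overlapping quorums, one node may be down
assert read_your_writes(3, 2, 2)
# R=W=1 with N=3 favors availability over consistency
assert not read_your_writes(3, 1, 1)
```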

There is no “delete_bucket” function for Riak. To clear out a bucket, delete each of the keys in the bucket, or stop Riak, clear out the disk storage directory, and start a new cluster.

Yes, I have read about the great R&lt;N feature of “always available, consistency maybe later” in the original. I have seen your reference and the Karger et al. references on it; that is the starting point of all things here.

My question was rather about that later moment. When all nodes are back up again, or the connection gets restored, I strongly believe the consistency will be back for deleted keys as well. The test (experiment) will certainly demonstrate it.
And you are right – the reset won't need to delete the bucket, clearing all keys will suffice for my purposes.

The last (is there such a thing?) question: is dets_backend capable of holding more than that notorious 2GB? This seems critical for persistent storage selection and I could not find any definite answer yet.

BUT, you must realize that Riak, by its distributed nature, mostly skirts the space limitations of dets.

Each Riak vnode using the dets_backend will open its own table. The number of vnodes is determined by the ring_creation_size parameter in the configuration file. This is often set to something like 16 in developer setups, or 1024 or more in production deployments. 16*2GB or 1024*2GB is a different ballgame.
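In rough numbers (the 2GB figure is dets’s historical per-table limit, as discussed above):

```python
# Rough per-cluster ceiling with the dets backend: each vnode gets its
# own dets table, each capped at roughly 2GB.
DETS_TABLE_LIMIT_GB = 2

def dets_ceiling_gb(ring_creation_size):
    return ring_creation_size * DETS_TABLE_LIMIT_GB

dev = dets_ceiling_gb(16)      # typical developer setting
prod = dets_ceiling_gb(1024)   # production-scale ring
```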

Furthermore, it is *not* a requirement of Riak that the entire dataset fit on one node. In fact, data stored with an N-value less than the number of nodes in your cluster will not be stored on all nodes. Therefore, the “maximum capacity” of a given cluster configuration is not as simple as the “maximum capacity” of any given node or backend.

I was close to that in my assumptions, but not that definite, of course. To summarize (please correct me if I am wrong): the “small” DB for _corporate_ use can be set up as
- one box with
- one node on it
- with 16 (say) vnodes assigned to it
and it will be able to hold 16×2GB = _lotsa_ textual data, provided the BLOBs are placed directly into the file system and only static links go into dets instead.

The last straw of fear before committing: under improbable conditions, one of the vnodes could get overfull way earlier than the others. Are there any triggers implemented, or in mind, to face that? Or is it totally unlikely?

One vnode overflowing before all others is very unlikely, unless you happen to store some single object big enough to fill an entire vnode on its own. Otherwise, the consistent hashing function and ring partition claim strategies are well-distributed enough that all vnodes should fill approximately evenly.
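The even fill follows from hashing keys uniformly onto partitions: with many keys, every vnode sees about the same share. A quick illustration (SHA-1 stands in here for Riak’s actual ring hashing):

```python
import hashlib

PARTITIONS = 16

def partition_for(key):
    """Map a key to one of PARTITIONS vnodes via a uniform hash
    (SHA-1 here; Riak's ring uses its own consistent-hashing scheme)."""
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % PARTITIONS

counts = [0] * PARTITIONS
for i in range(16000):
    counts[partition_for("note-%d" % i)] += 1

# With 16000 keys over 16 partitions, each should hold about 1000
spread = max(counts) - min(counts)
```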

There is currently nothing in the dets_backend watching for a potential overflow, but limit-checking code could be added.

Your first curl command creates the “todos” item in the “groups” bucket.

Your second curl command creates the “blog” item in the “notes” bucket. You’ve added a link in the “notes/blog” item that points to the “groups/todos” item.

Your third curl command asks Riak to start at the “groups/todos” item and follow all “notes” links it has. The “groups/todos” item has no links at all, so Riak finds nothing to give you.

Your final curl command asks Riak to start at the “notes/blog” item and follow all “groups” links it has. It has one link (the one you added in the second curl command), so it follows that one, finds the “groups/todos” item and returns it to you.

Aha! The important paragraph was the one just before the bit you quoted – the part where it talks about the after_write/4 function in the notes module.

When using the stickynotes app, instead of curl, the notes bucket is marked as being handled by the ‘notes’ module. When objects are stored in the notes bucket, functions from this module are called, and it is one of those functions that adds the reverse link to the groups object.

If you wanted to do the same with just curl, you’d need to execute something like:

Assuming you’re using the stickynotes Riak config file, and therefore have the stickynotes ebin in your code path and the notes and groups atoms exist, Riak will make the calls you expect, and notes:after_write/4 will add that reverse link for you.

HOWEVER, the curl command described requires that you pull the latest Riak (as of this evening) – I just patched a bug while writing this reply. If it’s inconvenient for you to update Riak, you can get the same effect by typing the following two lines at the Erlang shell running Riak (assuming you used the debug-fresh.sh script):