This is simple: we create tweet objects and insert them in a loop. So how does our Cassandra data look after we run this program?

Tweets DB View

As we can see, for each key (UUID) we have stored two columns: the username and the tweet data.
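To make the row layout concrete, here is a minimal sketch that models it with plain Java maps instead of a real Cassandra client. The class name `TweetRowsSketch`, the method `saveTweet`, and the column names `username` and `tweet` are assumptions chosen for illustration, not the names used by the program above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.UUID;

public class TweetRowsSketch {
    // Row key (UUID) -> column name -> column value.
    // Plain maps stand in for a Cassandra column family (assumption).
    static final Map<UUID, Map<String, String>> tweets = new LinkedHashMap<>();

    // Stores one tweet under a freshly generated UUID row key and returns that key.
    static UUID saveTweet(String username, String tweet) {
        UUID key = UUID.randomUUID();
        Map<String, String> columns = new TreeMap<>();
        columns.put("username", username); // hypothetical column name
        columns.put("tweet", tweet);       // hypothetical column name
        tweets.put(key, columns);
        return key;
    }

    public static void main(String[] args) {
        // Insert a few tweets in a loop, as the example does.
        for (int i = 0; i < 3; i++) {
            saveTweet("user" + i, "hello from user" + i);
        }
        // Each row prints as: <uuid> => {tweet=..., username=...}
        tweets.forEach((key, columns) -> System.out.println(key + " => " + columns));
    }
}
```

Each row holds exactly two columns, which matches what the DB view shows.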

Example 2: Saving User Action Log

This example is very similar to our first example. Here we are storing userid, action and the URL in the DB.

Example 3: Saving Logs per Hour

In this example we want to save logs per hour so that we can analyze them easily. I chose to use a SuperColumn for this, with the day and the hour as the keys. There are other ways to get the same functionality. The idea is to have the following structure for the logs:

Log Storage

For each day, we will store logs per hour.

The Log POJO just has a string message to be saved. A real-world scenario can be more sophisticated.

The calls are essentially the same as before, but we add more keys: the tag is the day key (YYYYMMDD) and LOGS is the name of the SuperColumn. Inside the SuperColumn, we add each log message with a unique id.
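The nesting can be sketched with plain Java maps, without a Cassandra client. This is only one reading of the structure, assuming the day (YYYYMMDD) is the outer key, the hour is the next level, and each message sits under a unique id. The names `HourlyLogSketch`, `saveLog`, and `countForHour` are made up for this illustration.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;
import java.util.UUID;

public class HourlyLogSketch {
    private static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyyyMMdd");
    private static final DateTimeFormatter HOUR = DateTimeFormatter.ofPattern("HH");

    // day key (YYYYMMDD) -> hour (00..23) -> unique id -> log message.
    // Nested maps stand in for the SuperColumn layout (assumption).
    static final Map<String, Map<String, Map<String, String>>> logs = new TreeMap<>();

    // Files one message under the day and hour taken from its timestamp.
    static void saveLog(LocalDateTime when, String message) {
        String day = when.format(DAY);
        String hour = when.format(HOUR);
        logs.computeIfAbsent(day, d -> new TreeMap<>())
            .computeIfAbsent(hour, h -> new TreeMap<>())
            .put(UUID.randomUUID().toString(), message);
    }

    // Returns how many messages are stored for one hour of one day.
    static int countForHour(String day, String hour) {
        Map<String, Map<String, String>> hours =
                logs.getOrDefault(day, Collections.emptyMap());
        return hours.getOrDefault(hour, Collections.emptyMap()).size();
    }

    public static void main(String[] args) {
        saveLog(LocalDateTime.of(2012, 1, 15, 9, 5), "service started");
        saveLog(LocalDateTime.of(2012, 1, 15, 9, 40), "cache warmed");
        saveLog(LocalDateTime.of(2012, 1, 15, 14, 2), "user login");
        System.out.println(countForHour("20120115", "09")); // prints 2
        System.out.println(countForHour("20120115", "14")); // prints 1
    }
}
```

With this shape, reading all logs for a given hour is a single lookup by day and then by hour, which is the point of grouping them this way.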

Here is how it looks when stored in Cassandra.

Logs DB View

The Cassandra data model is slightly tricky to understand in the beginning. There are some really wonderful posts out there explaining it. Take some time to read about the data model, tweak the examples, and have fun.

That may not be a great data schema for the logs. The data is partitioned between your Cassandra nodes based on the row key. Since your row key is the same for a whole day, this would mean all logs for each day get sent to only one of your nodes, which may or may not be what you want. See http://www.datastax.com/docs/1.0/cluster_architecture/partitioning

It may be better to use a UUID for the key so that the logs are distributed between servers.
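A rough way to see the difference is the sketch below. It is a simplified stand-in for hash-based partitioning (MD5 of the row key, modulo the number of nodes), not Cassandra's actual token ring, and `PartitionSketch`, `nodeFor`, and the other names are made up for this illustration.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

public class PartitionSketch {
    // Simplified placement rule (assumption): MD5 of the row key, modulo node count.
    static int nodeFor(String rowKey, int nodeCount) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(rowKey.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, digest).mod(BigInteger.valueOf(nodeCount)).intValue();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Every write of the day uses the same row key, so it always maps to the same node.
    static Set<Integer> nodesUsedByDayKey(String day, int writes, int nodeCount) {
        Set<Integer> nodes = new HashSet<>();
        for (int i = 0; i < writes; i++) {
            nodes.add(nodeFor(day, nodeCount));
        }
        return nodes;
    }

    // Every write gets its own random UUID row key, so writes spread across nodes.
    static Set<Integer> nodesUsedByUuidKeys(int writes, int nodeCount) {
        Set<Integer> nodes = new HashSet<>();
        for (int i = 0; i < writes; i++) {
            nodes.add(nodeFor(UUID.randomUUID().toString(), nodeCount));
        }
        return nodes;
    }

    public static void main(String[] args) {
        int nodeCount = 3;
        System.out.println("day key   -> nodes used: "
                + nodesUsedByDayKey("20120115", 1000, nodeCount).size());
        System.out.println("UUID keys -> nodes used: "
                + nodesUsedByUuidKeys(1000, nodeCount).size());
    }
}
```

The day key always lands on a single node in this model, while random UUID keys end up spread over the nodes.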