Installing Apache Cassandra

Introduction to Cassandra

Cassandra is often mentioned in books and on several NoSQL blogs, but this very interesting NoSQL database still does not get enough attention. Cassandra is extremely scalable and can deliver continuous availability. It is very good at managing large amounts of data in a cluster that spans multiple data centers and the cloud(s).
Did I mention scalability? Again, Cassandra scales linearly!
OK, you say, what about maintenance? Apache Cassandra promises operational simplicity, zero configuration and a self-balancing architecture. It also promises some degree of hardware agnosticism and can run on commodity servers and even on consumer hardware. And of course there is no single point of failure by design (but you can get one if you treat it wrong ;) ).
So why does this incredible database still get so little attention, even though its access times outperform MongoDB (especially for writes)?
I don't know exactly, but I think it has to do with a relatively steep learning curve compared to some competitors in the field.

In this article I try to show that one can get familiar with Cassandra pretty quickly.

Installation of Cassandra

I tried it on a current Xubuntu (13.04.2) with java version "1.7.0_25" (OpenJDK 64-Bit). This tutorial explains how to install Apache Cassandra as a Debian package.

Step 2: update package information and install cassandra

apt-get update
apt-get install cassandra

You may be warned that 'cassandra' is an untrusted package. When you hit "Y", Cassandra will be installed and started for the first time, showing you the initial start parameters. Pretty easy so far, isn't it? I like this zero-conf approach; that is the way scalable software should be. If you look at the default configuration, you will find that it follows Linux standards.
Cassandra's start script even tries to guess good JVM parameters based on your hardware, which is a very nice feature. This self-configuration is very useful in large clustered environments.
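
To give an idea of what such a guess can look like, here is a minimal sketch of the heap-size rule as I understand it from the cassandra-env.sh of Cassandra releases from this period: take half of the RAM capped at 1024 MB and a quarter of the RAM capped at 8192 MB, then use the larger of the two. The function name suggest_max_heap_mb is my own invention and the exact rule may differ in your version, so treat this as an illustration and check the cassandra-env.sh that ships with your installation.

```shell
#!/bin/sh
# Sketch of a heap-size heuristic, all values in MB:
#   max( min(RAM / 2, 1024), min(RAM / 4, 8192) )
# suggest_max_heap_mb is a hypothetical helper, not part of Cassandra.
suggest_max_heap_mb() {
    ram_mb=$1
    half=$((ram_mb / 2))
    quarter=$((ram_mb / 4))

    # Cap the two candidates.
    if [ "$half" -gt 1024 ]; then
        half=1024
    fi
    if [ "$quarter" -gt 8192 ]; then
        quarter=8192
    fi

    # Print the larger candidate.
    if [ "$half" -gt "$quarter" ]; then
        echo "$half"
    else
        echo "$quarter"
    fi
}
```

For example, suggest_max_heap_mb 4096 prints 1024, and suggest_max_heap_mb 65536 prints 8192.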

Cassandra files will be installed in the following directories:

/var/lib/cassandra (data)

/var/log/cassandra (logs)

/var/run/cassandra (runtime files)

/usr/share/cassandra (environment settings)

/usr/share/cassandra/lib (JAR files)

/usr/bin (binaries)

/usr/sbin (binaries)

/etc/cassandra (configuration files)

/etc/init.d (service startup script)

/etc/security/limits.d (cassandra user limits)

/etc/default (additional startup config)
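
If you want to verify quickly that the package really created the Cassandra-specific directories from the list above, a small loop is enough. This is only a sketch: check_dirs is a helper name I made up, and the paths are simply copied from the list, so adjust them if your package version lays things out differently.

```shell
#!/bin/sh
# check_dirs is a hypothetical helper: for every path given as an argument
# it prints "ok" if the directory exists and "missing" otherwise.
check_dirs() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            echo "ok      $dir"
        else
            echo "missing $dir"
        fi
    done
}

# Paths taken from the directory list in this article.
check_dirs /var/lib/cassandra /var/log/cassandra /var/run/cassandra \
           /usr/share/cassandra /usr/share/cassandra/lib /etc/cassandra
```

Note that /var/run/cassandra holds runtime files, so it may only show up while Cassandra is running.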

First Installation Tests

This installation has created an init.d script in /etc/init.d/. You can use this script to control the Cassandra database.
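
As a sketch of how this can be used, the following small wrapper passes an action to the init script. I assume here that the script is called /etc/init.d/cassandra and that it understands the usual start, stop, restart and status actions. The wrapper name cassandra_ctl and the INIT_SCRIPT variable are my own, not something the package provides.

```shell
#!/bin/sh
# cassandra_ctl is a hypothetical wrapper around the init script.
# INIT_SCRIPT defaults to /etc/init.d/cassandra (assumed script name).
INIT_SCRIPT="${INIT_SCRIPT:-/etc/init.d/cassandra}"

cassandra_ctl() {
    action=$1

    # Accept only the common init script actions.
    case "$action" in
        start|stop|restart|status) ;;
        *)
            echo "usage: cassandra_ctl start|stop|restart|status" >&2
            return 2
            ;;
    esac

    # Fail with a clear message if the init script is not there.
    if [ ! -x "$INIT_SCRIPT" ]; then
        echo "init script not found: $INIT_SCRIPT" >&2
        return 1
    fi

    "$INIT_SCRIPT" "$action"
}
```

Run as root, cassandra_ctl status tells you whether the database is running, and cassandra_ctl restart stops it and starts it again.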