The steps below require only an AWS EC2 account and do not use the ec2snitch.

Optimizing Volume Performance for a Transient Cluster

Depending on the instance size and on how many EBS volumes are attached, most EC2 nodes will have several independent volumes.

How should the Cassandra config be modified to take advantage of multiple attached volumes?
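One possible approach, offered as a minimal sketch rather than a tested recommendation, is to list one data directory per volume and keep the commit log on a volume of its own. The sketch below assumes a Cassandra version configured through cassandra.yaml, where `data_file_directories` and `commitlog_directory` are the relevant settings. The mount points and the helper name `cassandra_storage_settings` are hypothetical.

```python
from pathlib import PurePosixPath


def cassandra_storage_settings(mount_points):
    """Return a cassandra.yaml fragment that spreads storage across volumes.

    Assumption: the first mount point is reserved for the commit log and
    every remaining mount point holds one data directory.
    """
    if len(mount_points) < 2:
        raise ValueError("need at least two volumes: one for the commit log, one for data")
    commitlog, *data = [PurePosixPath(m) for m in mount_points]
    lines = ["data_file_directories:"]
    lines += [f"    - {d / 'cassandra' / 'data'}" for d in data]
    lines.append(f"commitlog_directory: {commitlog / 'cassandra' / 'commitlog'}")
    return "\n".join(lines)


# Hypothetical mount points; substitute wherever your volumes are mounted.
print(cassandra_storage_settings(["/mnt/vol0", "/mnt/vol1", "/mnt/vol2"]))
```

Keeping the commit log on a separate volume means its sequential writes do not compete with reads and compaction on the data volumes. Whether that pays off on a given instance type is something to measure.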

What are the tradeoffs of EBS versus local drives as the backing store for a persistent cluster?

Cloud clusters are often transient: when all jobs are finished, the nodes are terminated. Data held on EBS volumes remains until the next restart, whereas data on the local scratch disks disappears forever. When the cluster restarts, each node will have a new IP address and identity.

For a non-persistent cluster, can Cassandra take advantage of the scratch disks (assume they are fast but could disappear at once across the whole cluster at any time)?

Monitoring Cassandra in the Cloud

Cloudkick offers an out-of-the-box solution for monitoring Cassandra in the cloud. See their Cassandra Checks page to learn how to monitor Cassandra with that service.

Note on using Cloudkick for monitoring Cassandra running on Debian: Set the "Path" argument to "/usr/".

Step-By-Step Guide to Installing Cassandra on EC2 & Debian

Assumptions

We will assume that the goal is to install Cassandra in a multi-Availability Zone configuration. However, all nodes must be in one Region because the nodes will talk to each other over their private IP addresses.

We will also set up Security Groups so that the Cassandra nodes can talk to one another and so that other nodes can talk to Cassandra.

In the course of this document, we reference 'lwp-request' and 'ec2_signer.pl'. These are simple Perl programs: lwp-request sends HTTP requests, and ec2_signer.pl constructs a signed URL from the parameters given.
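The source of ec2_signer.pl is not shown here, so the following is only a sketch of what a URL signer of this kind might do. It assumes the EC2 Query API with AWS Signature Version 2 (HMAC-SHA256). The function name `sign_ec2_url`, the credentials, and the API version string are placeholders, not values taken from this guide.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def sign_ec2_url(params, access_key, secret_key, timestamp,
                 host="ec2.amazonaws.com", path="/"):
    """Build a GET URL signed with AWS Signature Version 2 (HMAC-SHA256)."""
    query = dict(params)
    query.update({
        "AWSAccessKeyId": access_key,
        "SignatureMethod": "HmacSHA256",
        "SignatureVersion": "2",
        "Timestamp": timestamp,
    })
    # Canonical query string: keys in byte order, RFC 3986 percent-encoding.
    canonical = "&".join(
        f"{quote(k, safe='-_.~')}={quote(str(v), safe='-_.~')}"
        for k, v in sorted(query.items())
    )
    # The string to sign is the HTTP verb, host, path, and canonical query.
    string_to_sign = "\n".join(["GET", host, path, canonical])
    digest = hmac.new(secret_key.encode(), string_to_sign.encode(),
                      hashlib.sha256).digest()
    signature = quote(base64.b64encode(digest).decode(), safe="-_.~")
    return f"https://{host}{path}?{canonical}&Signature={signature}"


# Placeholder credentials and API version, for illustration only.
url = sign_ec2_url(
    {"Action": "DescribeAvailabilityZones", "Version": "2010-08-31"},
    access_key="AKIDEXAMPLE",
    secret_key="not-a-real-secret",
    timestamp="2010-01-01T00:00:00Z",
)
print(url)
```

The resulting URL can then be fetched with any HTTP client, which is the role lwp-request plays in this guide.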

Steps

Step 1. Set up the "talk to Cassandra" Security Group

This group will contain any machine that needs to communicate with Cassandra.
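As a rough illustration, the parameters for such a request might be assembled as below. This assumes the EC2 Query API's CreateSecurityGroup action with its GroupName and GroupDescription parameters. The group name and description are hypothetical, and the request still has to be signed (for example with ec2_signer.pl) before it is sent.

```python
def create_security_group_params(name, description):
    """Return the unsigned Query API parameters for CreateSecurityGroup."""
    return {
        "Action": "CreateSecurityGroup",
        "GroupName": name,
        "GroupDescription": description,
    }


# Hypothetical group name and description; choose your own.
params = create_security_group_params(
    "cassandra-clients", "Machines allowed to talk to Cassandra"
)
for key, value in sorted(params.items()):
    print(f"{key}={value}")
```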

Step 9. Select the Availability Zone you want.

For us-east-1, there are 4 AZs:

us-east-1a

us-east-1b

us-east-1c

us-east-1d

Step 10. Start up your SEED instance

NOTE: the KeyName must be one for which you have the private key (from the KeyPair). In the example below, we use the key name 'dviner' and the AZ us-east-1a. Replace these with your own key name and AZ selection.