Creating a Keyspace and Column Families

We need to create a keyspace and some column families to work with. There
are two good ways to do this: using cassandra-cli, or using pycassaShell. Both
are documented below.

Using cassandra-cli

The cassandra-cli utility is included with Cassandra. It allows you to create
and modify the schema, explore or modify data, and examine a few things about
your cluster. Here's how to create the keyspace and column family we need
for this tutorial:

Using pycassaShell

:ref:`pycassa-shell` is an interactive Python shell that is included
with pycassa. Upon starting, it sets up many of the objects that
you typically work with when using pycassa. It provides most of the
functionality that cassandra-cli does, but also gives you a full Python
environment to work with.

Here's how to create the keyspace and column family::

    user@~ $ pycassaShell
    ----------------------------------
    Cassandra Interactive Python Shell
    ----------------------------------
    Keyspace: None
    Host: localhost:9160
    ColumnFamily instances are only available if a keyspace is specified with -k/--keyspace
    Schema definition tools and cluster information are available through SYSTEM_MANAGER.
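
Once the shell is running, the schema can be created through ``SYSTEM_MANAGER``.
The following is a minimal sketch rather than an exact session transcript. It
assumes ``SYSTEM_MANAGER`` behaves like a
:class:`pycassa.system_manager.SystemManager`; the helper name
``create_tutorial_schema`` and the names ``Keyspace1`` and ``ColumnFamily1``
are placeholders chosen for illustration.

```python
def create_tutorial_schema(sys_mgr, keyspace="Keyspace1",
                           column_family="ColumnFamily1"):
    """Create one keyspace and one column family.

    ``sys_mgr`` is assumed to expose ``create_keyspace`` and
    ``create_column_family`` the way pycassa's SystemManager does.
    The default names are placeholders, not required values.
    """
    # A replication factor of 1 only makes sense on a single-node test cluster.
    sys_mgr.create_keyspace(keyspace,
                            strategy_options={"replication_factor": "1"})
    # Create the column family with default settings inside that keyspace.
    sys_mgr.create_column_family(keyspace, column_family)
    return keyspace, column_family
```

Inside pycassaShell this would be called as
``create_tutorial_schema(SYSTEM_MANAGER)``.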

Getting a ColumnFamily

A column family is a collection of rows and columns in Cassandra,
and can be thought of as roughly the equivalent of a table in a
relational database. We'll use one of the column families that
are included in the default schema file:

Without any other arguments, :meth:`~pycassa.columnfamily.ColumnFamily.get()`
returns every column in the row (up to column_count, which defaults to 100).
If you only want a few of the columns and you know them by name, you can
specify them using a columns argument:

We may also get a slice (or subrange) of the columns in a row. To do this,
use the column_start and column_finish parameters. One or both of these may
be left empty to allow the slice to extend to one or both ends.
Note that column_finish is inclusive.

Sometimes you want to get columns in reverse sorted order. A common
example of this is getting the last N columns from a row that
represents a timeline. To do this, set column_reversed to True.
If you think of the columns as being sorted from left to right, when
column_reversed is True, column_start will determine the right
end of the range while column_finish will determine the left.
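
The slicing rules above can be easier to follow with a small model. The
sketch below is plain Python and does not talk to Cassandra: the function
``slice_columns`` and the sample ``timeline`` row are hypothetical, and the
code only imitates how ``column_start``, ``column_finish``,
``column_reversed``, and ``column_count`` interact, assuming both ends of the
range are inclusive and an empty string means "no bound".

```python
from collections import OrderedDict


def slice_columns(row, column_start="", column_finish="",
                  column_reversed=False, column_count=100):
    """Model a column slice over one row (a dict of column name -> value)."""
    # Walk the names left to right, or right to left when reversed.
    ordered_names = sorted(row, reverse=column_reversed)
    result = OrderedDict()
    for name in ordered_names:
        if len(result) >= column_count:
            break
        if column_start != "":
            # Skip columns that come before the start in walking order.
            if column_reversed:
                before_start = name > column_start
            else:
                before_start = name < column_start
            if before_start:
                continue
        if column_finish != "":
            # Stop once the walk has moved past the (inclusive) finish.
            if column_reversed:
                past_finish = name < column_finish
            else:
                past_finish = name > column_finish
            if past_finish:
                break
        result[name] = row[name]
    return result


# Hypothetical row whose column names sort chronologically.
timeline = {"t1": "a", "t2": "b", "t3": "c", "t4": "d", "t5": "e"}

middle = slice_columns(timeline, column_start="t2", column_finish="t4")
last_two = slice_columns(timeline, column_reversed=True, column_count=2)
reversed_range = slice_columns(timeline, column_start="t4",
                               column_finish="t2", column_reversed=True)
```

Here ``middle`` holds ``t2`` through ``t4``, ``last_two`` holds ``t5`` then
``t4``, and ``reversed_range`` holds ``t4``, ``t3``, ``t2`` in that order.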

Typed Column Names and Values

Within a column family, column names have a specified comparator type
which controls how they are sorted. Column values and row keys may also
have a validation class, which checks that inserted values are of
the correct type.

The different types available include ASCII strings, integers, dates,
UTF8, raw bytes, UUIDs, and more. See :mod:`pycassa.types` for a full
list.

Cassandra requires you to pack column names and values into a format it can
understand by using something like :func:`struct.pack`. Fortunately,
when pycassa sees that a column family has a particular comparator type
or validation class, it knows to pack and unpack these data types automatically
for you. So, if we want to write to the StandardInt column family, which has
an IntegerType comparator, we can do the following:
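
To make the manual alternative concrete, here is a rough sketch of what
packing by hand with :func:`struct.pack` might look like. It uses a fixed
8-byte, big-endian signed layout purely as an assumption for illustration;
the exact bytes pycassa produces for an IntegerType comparator may differ.
The helper names ``pack_long`` and ``unpack_long`` are hypothetical.

```python
import struct


def pack_long(value):
    """Pack an int into 8 big-endian bytes (signed)."""
    return struct.pack(">q", value)


def unpack_long(data):
    """Reverse pack_long: turn 8 big-endian bytes back into an int."""
    return struct.unpack(">q", data)[0]


# Without automatic packing, column names and values must both be bytes
# before they are sent, and must be unpacked again after they are read.
packed_columns = {pack_long(3): pack_long(1500)}
unpacked_columns = {
    unpack_long(name): unpack_long(value)
    for name, value in packed_columns.items()
}
```

With automatic packing, this bookkeeping disappears and plain Python
integers can be used as column names directly.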

As mentioned above, Cassandra also offers validators on column values and keys
with the same set of types. Column value validators can be set for an entire
column family, for individual columns, or both. pycassa knows to pack these
column values automatically too. Suppose we have a Users column family with
two columns, name and age, with types UTF8Type and IntegerType:
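
The idea of per-column validators can be sketched in plain Python. This is
only a model of the concept, not pycassa's implementation: the helpers
``pack_value``, ``unpack_value``, ``pack_row``, and ``unpack_row`` are
hypothetical, the sample row is made up, and the integer encoding shown
(variable-length, big-endian, two's complement) is an assumption about what
an IntegerType value looks like on the wire.

```python
def pack_value(type_name, value):
    """Convert a Python value to bytes according to its validator type."""
    if type_name == "UTF8Type":
        return value.encode("utf-8")
    if type_name == "IntegerType":
        # Use just enough bytes to hold the value plus a sign bit.
        length = (value.bit_length() + 8) // 8
        return value.to_bytes(length, "big", signed=True)
    raise ValueError("unsupported type: %s" % type_name)


def unpack_value(type_name, data):
    """Convert stored bytes back to a Python value."""
    if type_name == "UTF8Type":
        return data.decode("utf-8")
    if type_name == "IntegerType":
        return int.from_bytes(data, "big", signed=True)
    raise ValueError("unsupported type: %s" % type_name)


def pack_row(columns, validators):
    """Pack every column value using the validator for that column."""
    return {name: pack_value(validators[name], value)
            for name, value in columns.items()}


def unpack_row(columns, validators):
    """Unpack every column value using the validator for that column."""
    return {name: unpack_value(validators[name], data)
            for name, data in columns.items()}


# Validators for the Users column family described above.
USER_VALIDATORS = {"name": "UTF8Type", "age": "IntegerType"}

stored = pack_row({"name": "Alice", "age": 30}, USER_VALIDATORS)
loaded = unpack_row(stored, USER_VALIDATORS)
```

After the round trip, ``loaded`` contains a ``str`` for ``name`` and an
``int`` for ``age``, which mirrors the convenience automatic packing provides.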

Automatic retries (or "failover") happen by default with ConnectionPools.
This means that if any operation fails, it will be transparently retried
on other servers until it succeeds or a maximum number of failures is reached.
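
The retry behavior can be pictured with a small stand-alone model. This is a
sketch of the general idea and not pycassa's actual implementation: the
function ``call_with_failover``, the exception ``MaximumRetriesExceeded``,
the ``max_retries`` default, and the server names are all hypothetical.

```python
class MaximumRetriesExceeded(Exception):
    """Hypothetical error raised once every allowed attempt has failed."""


def call_with_failover(operation, servers, max_retries=5):
    """Try ``operation`` on successive servers until one succeeds."""
    last_error = None
    # One initial attempt plus up to ``max_retries`` retries.
    for attempt in range(max_retries + 1):
        # Rotate through the server list so each retry uses another node.
        server = servers[attempt % len(servers)]
        try:
            return operation(server)
        except ConnectionError as exc:
            last_error = exc
    raise MaximumRetriesExceeded(
        "gave up after %d attempts" % (max_retries + 1)) from last_error


attempted = []


def flaky_get(server):
    """Pretend read that only succeeds on the third node."""
    attempted.append(server)
    if server != "node3:9160":
        raise ConnectionError("%s is unreachable" % server)
    return "row from %s" % server


result = call_with_failover(
    flaky_get, ["node1:9160", "node2:9160", "node3:9160"])
```

The caller only sees the final result; the two failed attempts on the first
two nodes are absorbed by the retry loop.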