-column slices can be retrieved by range, are always sorted, can be reversed, etc.

-if a target key a TimeUUID you get both grouping AND sorting by timestamp. Good for inboxes, feeds, logs, etc

– This is the best option when you need to combine groups, sort and search, such as a friends list, inbox, etc.

But…what about 2 people with the same last name? (non-unique keys)

In the docs, the first answer you might find is SuperColumns – lets you have your col name have >1 col value. So, 2 row keys for people with 1 last name

Use SuperColumns with caution.

– Not officially deprecated, but they’re not highly recommended either.

– sorts only on the supercolumn, not the subcolumn

– some performance issues

– Cannot do more nesting, can only have 1 level of subcolumns

– Many projects have moved away from supercolumns b/c of these limitations.

So, let’s revisit regular CF’s — what happens with >1 person with the same last name? you can’t have 2 cols with the same column name. You can do a composite column name:

“User_Keys_by_last_name”: {

(“alden”, 1): “e5d”,

(“adams”, 1): “et3”,

(“anderson”, 1): “e5f”,

(“anderson”, 2): “e71..”,

(“doe”, 1): “a4f”,

(“franks”, 1): “f4e”,

Composite column names

Comparator = “CompositeType” or “DynamicCompositeType”

– you don’t lose the sort capability – sorts by component values using each component type’s sort order

2 types of composites, static and dynamic

column_families:

name: My_Composite_Index_CF

compare_with: CompositeType(UTF8Type, UUIDType)

— note in static composite types, fixed # and order of columns in the definition

name: My_Dynamic_Composite_Index_CF

compare_with: DynamicCompositeType(s=>UTF8Type, u=>UUIDType)

— Any # and order of types at runtime, the definitions are just for convenience and smaller serialized component names.

The main difference is whether you need to create one CF per index s, or one CF for all indexes with one row per index

How does this work?

Queries are easy – just regular column slice ops

Updates are harder – need to remove old value and insert the new value — this is why they recommend starting with the built-in native secondary indexes. You have to know how to remove old values then insert new values, which involves a read before write.

Example – Users by Location

Use 3 CFs, not 2 for safe concurrent access

First 2 CF’s are natural:

Users

Indexes

We also need a 3rd:

User_Index_Entries

Users = {

“username”: “…”

“location”: <location>

}

Indexes = {

Users_By_Location” : {

{<location>, <user_key>, <ts>} : …, …: …,

}

}

Users_Index_Entries = {

<user_key>: {

{“location”, <ts 1>}: <location 1>,

{“location”, <ts 2>}: <location 2>,

{“location”, <ts N>}: <location N>,

Allows you to read the previous index value from Users_Index_Entries CF and delete the previous one.

Read from Users_I_E WHERE KEY=<user_key>;

DELETE FROM Users_Index_Entries

DELETE FROM Users_By_Location

UPDATE Users_Index_Entries

UPDATE Users_By_Location

B/c there’s a timestamp, doing it >1 time has no consequence, so eventually consistent.

What if something goes wrong? Repeat batch operation until it completes.

False positive? possible, so if it’s a problem, filter on the reads.

This approach is VERY common — with some variations. So use this *idea*, but not necessary to be an exact copy of this example.

At least now, composite indexing is now standard.

Can do derived indexes — create a “last_name, first_name” index from a “fullname” column. Can also unroll a JSON object to construct deep indexes of serialized JSON structures.

– Include additional denormalized values in the index for faster lookups

– use composites for column values, too — not just column names.

custom secondary indexes

Note: no official alternate index “way”. Everything talked about here is using an official Cassandra feature/property.