Cassandra 2.0 and timeseries

At this meetup Patrick McFadin, Solutions Architect at DataStax, will be discussing the most recently added features in Apache Cassandra 2.0, including: Lightweight transactions, eager retries, improved compaction, triggers, and CQL cursors. He'll also be touching on time series data with Apache Cassandra.

32.
Time Series Taming the beast
• Peter Higgs and François Englert: Nobel Prize in Physics
• Theorized the existence of the Higgs boson
• Found using ATLAS
• Data stored in P-BEAST
• Time series running on Cassandra
Friday, October 11, 13

39.
Time Series Further partitioning
• With a reading stored every minute, a single storage row eventually runs out of room
• Limit: 2 billion columns per storage row
• Data partitioned by weather station ID and time
• Use the partition key to split things up
CREATE TABLE temperature_by_day (
    weatherstation_id text,
    date text,
    event_time timestamp,
    temperature text,
    PRIMARY KEY ((weatherstation_id, date), event_time)
);
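
With this table the client has to supply the date bucket on every write, so the partition key is usually derived from the event timestamp. A minimal sketch of that step, assuming buckets are UTC calendar days formatted as YYYY-MM-DD (the helper name `partition_key` is hypothetical and not part of the talk):

```python
from datetime import datetime, timezone


def partition_key(weatherstation_id: str, event_time: datetime) -> tuple[str, str]:
    """Return the (weatherstation_id, date) pair used as the composite partition key."""
    # Assumption: one bucket per UTC calendar day, formatted as YYYY-MM-DD.
    day = event_time.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return (weatherstation_id, day)


# Example: a reading taken at 07:01 UTC on 3 April 2013.
reading_time = datetime(2013, 4, 3, 7, 1, tzinfo=timezone.utc)
print(partition_key("1234ABCD", reading_time))
```

Every reading from the same station on the same UTC day maps to the same pair, so it lands in the same partition, clustered by event_time.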

40.
Time Series Further Partitioning
• Still easy to insert
• Still easy to query
INSERT INTO temperature_by_day (weatherstation_id, date, event_time, temperature)
VALUES ('1234ABCD', '2013-04-03', '2013-04-03 07:01:00', '72F');

SELECT temperature
FROM temperature_by_day
WHERE weatherstation_id = '1234ABCD'
  AND date = '2013-04-03'
  AND event_time > '2013-04-03 07:01:00'
  AND event_time < '2013-04-03 07:04:00';
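
The query above stays inside one day. Because the date is part of the partition key, a time range that crosses midnight touches several partitions, and the client has to know which date values to ask for. A rough sketch of that bookkeeping, assuming UTC day buckets in YYYY-MM-DD form (the helper name `day_buckets` is hypothetical):

```python
from datetime import datetime, timedelta, timezone


def day_buckets(start: datetime, end: datetime) -> list[str]:
    """List every YYYY-MM-DD bucket touched by the time range from start to end."""
    # Assumption: buckets are UTC calendar days, matching the `date` column.
    first = start.astimezone(timezone.utc).date()
    last = end.astimezone(timezone.utc).date()
    span = (last - first).days
    # An inverted range (end before start) yields an empty list.
    return [(first + timedelta(days=n)).isoformat() for n in range(span + 1)]


start = datetime(2013, 4, 2, 22, 0, tzinfo=timezone.utc)
end = datetime(2013, 4, 4, 1, 0, tzinfo=timezone.utc)
for bucket in day_buckets(start, end):
    # One SELECT per bucket, each with date = bucket and the event_time bounds.
    print(bucket)
```

Each returned value would be used as the date in its own SELECT, with the same event_time bounds applied to every query.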