Data Sheet: Apache Cassandra

Apache Cassandra™ Backgrounder

What is Apache Cassandra™?

Apache Cassandra™, an Apache Software Foundation project, is an open-source NoSQL distributed database management system. Cassandra is designed to handle Big Data workloads across multiple data centers with no single point of failure, providing enterprises with extremely high database performance and availability. Cassandra was initially developed at Facebook, and is used by companies such as Twitter, Netflix, and Cisco.

Cassandra’s peer-to-peer “cluster ring” architecture, which can include hundreds to thousands of identical nodes, protects against data loss and business disruption because there is no single point of failure. Cassandra is architected to deliver high performance for both read and write workloads, and is designed so that adding new nodes to an existing cluster increases performance linearly.

What are the benefits of Apache Cassandra?

Massively scalable ring architecture: Drawing on the designs of Amazon Dynamo and Google Bigtable, Cassandra’s peer-to-peer architecture overcomes the limitations of master-slave designs and allows for both high availability and massive scalability.

Linear scale performance: Nodes added to a Cassandra cluster (all done online) increase the throughput of your database in a predictable, linear fashion for both read and write operations.

No single point of failure: Data is replicated to multiple nodes to protect against loss during node failure, and new machines can be added incrementally while online to increase the capacity and data protection of your Cassandra cluster.

Transparent fault detection and recovery: Cassandra was designed with the understanding that hardware failures can and do occur; therefore, failover is handled transparently.
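
The ring and replication ideas described above can be illustrated with a small, self-contained Python sketch. This is a toy model, not Cassandra's implementation: the Ring class, its method names, and the node names are hypothetical, each node owns a single ring position, and MD5 is used as the hash purely for convenience. The sketch only shows the general idea of placing a key on a ring of peer nodes and copying it to the next few nodes, so that a read can still be served when one node is down.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a ring position (MD5 is an illustrative choice only)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """Toy peer-to-peer ring: every node is identical and owns one token."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        # Sorted (token, node) pairs form the ring.
        self._entries = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._entries]

    def add_node(self, node):
        """Add a node to the ring without rebuilding it from scratch."""
        bisect.insort(self._entries, (token(node), node))
        self._tokens = [t for t, _ in self._entries]

    def replicas(self, key):
        """Return the nodes holding `key`, walking clockwise from its token."""
        size = len(self._entries)
        count = min(self.replication_factor, size)
        # First node whose token is greater than the key's token, wrapping around.
        start = bisect.bisect_right(self._tokens, token(key)) % size
        return [self._entries[(start + i) % size][1] for i in range(count)]

    def live_replicas(self, key, down=()):
        """Replicas of `key` that are not in the `down` collection."""
        return [n for n in self.replicas(key) if n not in down]


if __name__ == "__main__":
    ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
    owners = ring.replicas("user:42")
    print("replicas:", owners)
    # Simulate the first replica failing: the remaining copies can still serve reads.
    print("still available on:", ring.live_replicas("user:42", down={owners[0]}))
    ring.add_node("node-e")
    print("replicas after adding node-e:", ring.replicas("user:42"))
```

Because no node in this sketch plays a special coordinating role, any node can be removed or added without a dedicated master, which is the property the benefits above rely on. A production system adds much more (for example, data movement when the ring changes), which this sketch deliberately leaves out.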