
What is Apache Cassandra™?

Apache Cassandra™, a top-level Apache project born at Facebook and built on ideas from Amazon’s Dynamo and Google’s BigTable, is a distributed database for managing large amounts of structured data across many commodity servers while providing highly available service with no single point of failure. Apache Cassandra™ offers a combination of capabilities that relational databases and many other NoSQL databases struggle to match: continuous availability, linear-scale performance, operational simplicity, and easy data distribution across multiple data centers and cloud availability zones.

Apache Cassandra™’s architecture is responsible for its ability to scale, perform, and offer continuous uptime. Rather than using a legacy master-slave architecture or a manual, difficult-to-maintain sharded architecture, Apache Cassandra™ has a masterless “ring” design that is elegant, easy to set up, and easy to maintain.

In Apache Cassandra™, all nodes play an identical role: there is no master node, and all nodes communicate with each other as equals. This built-for-scale architecture can handle large amounts of data and thousands of concurrent users or operations per second, even across multiple data centers, as easily as it can manage much smaller amounts of data and user traffic. It also means that, unlike master-slave or sharded systems, Apache Cassandra™ has no single point of failure and can therefore offer continuous availability and uptime. New nodes can be added to an existing cluster without taking it down.
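The masterless ring described above can be illustrated with a small consistent-hashing sketch. This is a simplified model, not Cassandra's implementation: it assumes one token per node and uses MD5 purely for illustration, whereas Cassandra's default partitioner is Murmur3-based and nodes normally own many virtual-node tokens. The names `Ring`, `token`, and `replicas` are hypothetical.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a ring position (MD5 is an illustrative assumption)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy masterless ring: every node is a peer that owns one token."""

    def __init__(self, nodes):
        self._tokens = []  # sorted ring positions
        self._owners = {}  # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        """Join a node to the ring; no other node has to be stopped."""
        position = token(node)
        bisect.insort(self._tokens, position)
        self._owners[position] = node

    def replicas(self, key: str, replication_factor: int = 3):
        """Walk clockwise from the key's token and collect the owning nodes."""
        start = bisect.bisect_left(self._tokens, token(key))
        count = min(replication_factor, len(self._tokens))
        return [
            self._owners[self._tokens[(start + i) % len(self._tokens)]]
            for i in range(count)
        ]


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
print(ring.replicas("user:42"))  # three peers hold copies of this key
ring.add_node("node-e")          # the cluster grows while it keeps running
print(ring.replicas("user:42"))  # at most one replica changes for this key
```

Because placement depends only on the tokens, any node can work out where a key lives, which is why no coordinating master is needed in this model.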

Many companies have successfully deployed and benefited from Apache Cassandra™, including large companies such as Apple, Comcast, eBay, Instagram, Spotify, Uber, and Netflix. The larger production environments hold petabytes of data across more than 75,000 nodes. Apache Cassandra™ is available under the Apache 2.0 license.


Fast Linear-Scale Performance – Enables millisecond response times with linear scalability (two nodes give twice the throughput of one, four nodes give four times, and so on), so your application keeps the response times your customers have come to expect as it grows.

Flexible Data Model – The Apache Cassandra data model allows new entities or attributes to be added over time, so you are not restricted to a rigid data model that cannot evolve with the needs of the business application. For example, you can introduce a new, complex data structure that is unique to your environment, or add a new column to a column family (table).

Operational and Developmental Simplicity – Because all nodes in a cluster are the same, there are no complex software tiers to manage, so administration duties are greatly simplified. In addition, the Cassandra Query Language (CQL) closely resembles SQL in syntax, which eases the move to Cassandra from an RDBMS.

Strong Developer Community – A rich developer community surrounds Apache Cassandra and supports both developers working on the project and those building applications on the database. Active in the IRC chat room and on the mailing lists, the Cassandra developer community is one of the most active for an open source project.

Transparent Fault Detection and Recovery – Nodes that fail can easily be restored or replaced.

Tunable Data Consistency – Support for strong or eventual data consistency across a widely distributed cluster.
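As a rough illustration of how consistency can be tuned per request, the sketch below applies the common rule of thumb that reads see the latest write when the number of replicas acknowledging a read plus the number acknowledging a write exceeds the replication factor. The level names ONE, TWO, THREE, QUORUM, and ALL are Cassandra consistency levels; the helper functions are hypothetical, and datacenter-aware levels such as LOCAL_QUORUM are left out.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of the replicas."""
    return replication_factor // 2 + 1


def required_acks(level: str, replication_factor: int) -> int:
    """Number of replicas that must respond at a given consistency level."""
    levels = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    acks = levels[level]
    if acks > replication_factor:
        raise ValueError(f"{level} needs more replicas than RF={replication_factor}")
    return acks


def is_strongly_consistent(read_level: str, write_level: str, rf: int) -> bool:
    """True when every read set overlaps every write set (R + W > RF)."""
    return required_acks(read_level, rf) + required_acks(write_level, rf) > rf


for read, write in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
    kind = "strong" if is_strongly_consistent(read, write, rf=3) else "eventual"
    print(f"read={read:<6} write={write:<6} -> {kind}")
```

Lower levels trade consistency for latency and availability, while higher levels do the reverse, and the choice can differ from one query to the next.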

OpsCenter Monitoring/Management Tool – A graphical management and monitoring tool for Cassandra that provides a view of the system from a centralized dashboard. OpsCenter installs seamlessly and gives system operators the flexibility to monitor and manage even the most complex workloads from any web browser.

Runs on Commodity Hardware – Apache Cassandra is built to run on commodity hardware, which keeps infrastructure costs low. Avoid overspending on disaster recovery and high-end hardware, and avoid losing revenue to downtime. Focus your resources on building a great application, not on maintaining an expensive backend.

Mitigate Risks of Downtime – Apache Cassandra’s architecture is built with no single point of failure. If a node, a rack, or an entire data center goes down, other replicas are available to take its place and serve read/write requests without interruption.
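The failover behavior described above comes down to whether enough replicas of a piece of data are still reachable. The following minimal sketch assumes a replication factor of three; the node names and the `can_serve` helper are hypothetical and only model the counting, not Cassandra's actual failure detection.

```python
def can_serve(replicas, down, required_acks):
    """Return True if enough replicas of a row are still up to answer a request."""
    live = [node for node in replicas if node not in down]
    return len(live) >= required_acks


replicas = ["node-a", "node-b", "node-c"]  # hypothetical nodes holding one row
majority = len(replicas) // 2 + 1          # 2 of 3 replicas

# One replica fails: a majority is still reachable, so requests keep flowing.
print(can_serve(replicas, down={"node-b"}, required_acks=majority))

# Two replicas fail: a majority is gone, but a single-replica request still works.
print(can_serve(replicas, down={"node-a", "node-b"}, required_acks=majority))
print(can_serve(replicas, down={"node-a", "node-b"}, required_acks=1))
```

In this model, tolerating the loss of a rack or a data center is the same calculation with the replicas placed in different racks or data centers.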

Faster Time to Market – DataStax goes beyond standard open-source deployments by providing resources that make it easier to deliver Apache Cassandra in a single data center, across multiple data centers, or across clouds.