A Brief Introduction to Apache Cassandra NoSQL

Apache Cassandra is a highly scalable, high-performance distributed database designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. It is a type of NoSQL database.

NoSQL Database

A NoSQL database (sometimes read as "Not Only SQL") is a database that stores and retrieves data using models other than the tabular relations of relational databases. These databases are typically schema-free, support easy replication, expose a simple API, are eventually consistent, and can handle huge amounts of data.

The primary objectives of a NoSQL database are:

simplicity of design,

horizontal scaling, and

finer control over availability.

NoSQL databases use different data structures from relational databases, which makes some operations faster in NoSQL. The suitability of a given NoSQL database depends on the problem it must solve.

Types of NoSQL Databases

There are four general types of NoSQL databases, each with its own specific attributes:

1. Graph database – Based on graph theory, these databases are designed for data that is well represented as a graph: interconnected elements with an undetermined number of relations between them. Examples include Neo4j and Titan.

2. Key-value store – These are some of the least complex NoSQL options. They are designed for storing data in a schema-less way: all of the data consists of an indexed key and a value, hence the name. Examples include Cassandra (which is also commonly classified as a wide-column store), DynamoDB, Azure Table Storage (ATS), Riak, and BerkeleyDB.

3. Column store – Also known as wide-column stores, these databases store data tables as sections of columns rather than as rows. While this simple description sounds like the inverse of a standard database, wide-column stores offer very high performance and a highly scalable architecture. Examples include HBase, BigTable, and HyperTable.

4. Document database – Expands on the basic idea of key-value stores: the values are "documents" that hold more complex data, and each document is assigned a unique key that is used to retrieve it. These databases are designed for storing, retrieving, and managing document-oriented information, also known as semi-structured data. Examples include MongoDB and CouchDB.
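To make the four models above more concrete, here is a minimal sketch that uses plain Python structures as stand-ins for each one. It is not how any of the named databases work internally; every key, name, and value (such as `user:42` or `alice`) and the helper `followers_of` are invented for illustration.

```python
# Illustrative only: plain Python structures standing in for each data model.
# All keys and values below are made up for the example.

# Key-value store: an opaque value looked up by a single key.
key_value = {"user:42": b'{"name": "alice", "city": "Oslo"}'}

# Document store: the value is a structured document the database can inspect.
documents = {"user:42": {"name": "alice", "city": "Oslo", "tags": ["admin"]}}

# Wide-column store: each row key maps to a set of named columns,
# and different rows may carry different columns.
wide_column = {
    "user:42": {"name": "alice", "city": "Oslo"},
    "user:43": {"name": "bob", "last_login": "2015-06-01"},
}

# Graph database: nodes plus explicit edges describing relationships.
nodes = {"alice", "bob"}
edges = [("alice", "FOLLOWS", "bob")]


def followers_of(person):
    """Return everyone with a FOLLOWS edge pointing at `person`."""
    return sorted(src for src, rel, dst in edges
                  if rel == "FOLLOWS" and dst == person)
```

The point of the sketch is the shape of the data: the key-value entry is opaque to the store, the document can be queried by field, the wide-column rows need not share columns, and the graph keeps relationships as first-class data.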

The following table lays out some of the key attributes that should be considered when evaluating NoSQL databases.

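| Data model      | Performance | Scalability     | Flexibility | Complexity | Functionality  |
|-----------------|-------------|-----------------|-------------|------------|----------------|
| Key-value store | High        | High            | High        | None       | Variable (None)|
| Column store    | High        | High            | Moderate    | Low        | Minimal        |
| Document store  | High        | Variable (High) | High        | Low        | Variable (Low) |
| Graph database  | Variable    | Variable        | High        | High       | Graph theory   |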

A NoSQL Example – Apache Cassandra

Apache Cassandra™ is a massively scalable open source NoSQL database that delivers continuous availability, linearly scalable performance, operational simplicity, and easy data distribution across multiple data centers and cloud availability zones. Cassandra was originally developed at Facebook, with a design that combines capabilities from Amazon’s Dynamo and Google’s Bigtable architectures; it was open sourced in 2008.
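As a rough illustration of how data can be spread across nodes with no single point of failure, the sketch below builds a toy hash ring in the Dynamo style: each key is hashed to a position on a ring and stored on the next few nodes clockwise. This is a simplification under stated assumptions, not Cassandra's actual implementation. The `Ring` class, the node names, and the use of MD5 are all choices made for this example.

```python
import bisect
import hashlib


def token(key):
    """Map a string to a position on the ring (MD5 is used only for this toy)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy hash ring: one token per node, replicas chosen by walking clockwise."""

    def __init__(self, nodes, replication_factor=2):
        # Assumes `nodes` is a non-empty list of unique node names.
        self.replication_factor = min(replication_factor, len(nodes))
        # Sort by token so the ring can be walked in order.
        self._ring = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._ring]

    def replicas(self, partition_key):
        """Return the nodes that would hold a copy of this key."""
        # First node whose token is greater than the key's token, wrapping around.
        start = bisect.bisect_right(self._tokens, token(partition_key)) % len(self._ring)
        return [self._ring[(start + i) % len(self._ring)][1]
                for i in range(self.replication_factor)]
```

For example, `Ring(["node-a", "node-b", "node-c"]).replicas("user:42")` returns two distinct node names, and the same key always maps to the same nodes. Because every node can compute this placement on its own, no central coordinator is needed to locate data.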

What Makes Cassandra Ideal for Modern Online Applications

Modern applications that succeed in today’s digital, Internet-driven economy are those that interact intelligently with the end customer in tailored and personalized ways, benefiting both the customer and the underlying business. Cassandra provides a number of key features and benefits that facilitate the development and management of these types of modern online applications.

Top Use Cases

While Cassandra is a general-purpose NoSQL database used for a variety of applications in all industries, there are a number of use cases where it excels over most other options. These include:

Internet of Things (IoT) applications – Cassandra is well suited to consuming and analyzing large volumes of fast-incoming data from devices, sensors, and similar mechanisms that exist in many different locations.

User activity tracking and monitoring – Media, gaming, and entertainment companies use Cassandra to track and monitor users’ interactions with their movies, music, games, websites, and online applications.

Social media analytics and recommendation engines – Online companies, websites, and social media providers use Cassandra to ingest and analyze data and to provide recommendations to their customers.

Other time-series applications – Because of Cassandra’s fast write capabilities, wide-row design, and ability to read only those columns needed to satisfy a given query, it is well suited to most time-series applications.
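The wide-row idea behind the time-series use case can be sketched in a few lines of Python. The example below is an in-memory toy, not Cassandra code: the class `TimeSeriesTable`, its methods, and the sensor names are hypothetical. It groups readings into one partition per sensor and day, keeps each partition sorted by timestamp, and reads back only the requested slice.

```python
import bisect
from collections import defaultdict


class TimeSeriesTable:
    """Toy wide-row layout: one partition per (sensor, day), sorted by timestamp."""

    def __init__(self):
        # partition key -> sorted list of (timestamp, reading) tuples
        self._partitions = defaultdict(list)

    def write(self, sensor_id, day, timestamp, reading):
        # insort keeps the partition ordered even when readings arrive late.
        bisect.insort(self._partitions[(sensor_id, day)], (timestamp, reading))

    def read_range(self, sensor_id, day, start, end):
        """Return readings with start <= timestamp < end from a single partition."""
        rows = self._partitions.get((sensor_id, day), [])
        # A one-element tuple sorts before any (timestamp, reading) with the
        # same timestamp, so these bounds select exactly [start, end).
        lo = bisect.bisect_left(rows, (start,))
        hi = bisect.bisect_left(rows, (end,))
        return rows[lo:hi]


table = TimeSeriesTable()
table.write("sensor-1", "2015-06-01", 905, 21.7)
table.write("sensor-1", "2015-06-01", 900, 21.5)
table.write("sensor-1", "2015-06-01", 910, 22.0)
window = table.read_range("sensor-1", "2015-06-01", 900, 910)
# window == [(900, 21.5), (905, 21.7)]
```

Because all readings for one sensor and day sit together in sorted order, a time-range query touches a single partition and scans only the rows it needs.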
