How to Get the Most Out of Apache Cassandra™

Since it first appeared in 2008, Apache Cassandra™, an open source distributed NoSQL database that began as a Facebook project, has become increasingly popular for enterprise applications that require high availability, high performance, and scalability.

As more and more enterprises deploy Cassandra, the demand for engineers skilled in the new database is increasing, too. Only 8% of respondents to a recent survey believe that there are enough qualified NoSQL experts on hand to meet the needs of today’s enterprises.

Unfortunately, you can’t just migrate to Cassandra and expect your wildest dreams to come true.

Getting the most out of Cassandra requires a well-thought-out game plan and a team of skilled and knowledgeable database administrators and operators who know exactly how to carry it out.

With that in mind, let’s take a look at four tips your organization can employ to ensure your deployment of Cassandra helps you achieve your business goals.

1. Train your team thoroughly

According to Gartner, skills shortages in data science remain a problem for many organizations.

Getting the most out of Cassandra starts with making sure your team knows the ins and outs of the technology and is comfortable using it. The easiest way to do this is to invest adequate resources into training and professional development.

For the best results, begin training your team a few weeks or even months before you roll out Cassandra. That way, they’ll have enough time to become familiar with the new technology before it’s deployed.

2. Give your team access to additional resources

Not everyone on your team will learn at the same pace.

In addition to regular training exercises, direct your team to additional resources they can leverage on their own time to become even more familiar with Cassandra’s functionality.

For example, DataStax Academy is a free resource that engineers can use to train themselves at their own pace. The academy features ad hoc learning opportunities, how-tos, podcasts, and more. There’s also a developer blog that offers tips and tricks, a Slack channel for discussions, and, from time to time, in-person meetups held all over the country.

To sum up: After you’ve trained your team, point them to resources they can use to get up to speed on topics they might not be as comfortable with.

3. Pick the right data model

One of the hardest parts of using Cassandra is picking the right data model.

Generally speaking, your data model should help you achieve two main goals:

Spreading data evenly around the cluster

Minimizing the number of partitions read

To accomplish the first goal, you’ll need to pick a good primary key, because the partition key portion of it determines which node stores each row. To accomplish the second goal, model your data to fit your queries instead of modeling around relations or objects.
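To make these two goals concrete, here is a minimal Python sketch that simulates how a key choice affects data placement and query cost. It is an illustration under stated assumptions, not how Cassandra is implemented: the four-node cluster, the sensor readings, and the helper names (node_for, spread, build_table) are all hypothetical, and MD5 stands in for Cassandra's real partitioner.

```python
import hashlib
from collections import Counter, defaultdict

NODES = 4  # hypothetical cluster size


def node_for(partition_key, nodes=NODES):
    """Map a partition key to a node by hashing it.

    MD5 is only a stand-in here; Cassandra uses its own partitioner.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % nodes


def spread(keys, nodes=NODES):
    """Return how many rows land on each node for the given keys."""
    counts = Counter(node_for(k, nodes) for k in keys)
    return [counts.get(n, 0) for n in range(nodes)]


def build_table(rows, key_fn):
    """Group rows into partitions according to a partition-key function."""
    table = defaultdict(list)
    for row in rows:
        table[key_fn(row)].append(row)
    return table


# Hypothetical sensor readings: (country, sensor_id, day)
readings = [
    ("US" if i % 10 else "DE", f"sensor-{i % 500}", f"2024-01-{(i % 28) + 1:02d}")
    for i in range(10_000)
]

# Goal 1: a low-cardinality key piles rows onto a few nodes,
# while a high-cardinality key spreads them out.
by_country = spread(r[0] for r in readings)
by_sensor_day = spread(f"{r[1]}:{r[2]}" for r in readings)
print("rows per node, keyed by country:   ", by_country)
print("rows per node, keyed by sensor/day:", by_sensor_day)

# Goal 2: if the table is keyed the way the query asks,
# the query reads exactly one small partition.
query_table = build_table(readings, lambda r: (r[1], r[2]))
country_table = build_table(readings, lambda r: r[0])

wanted = ("sensor-42", "2024-01-15")
rows_read_query_model = len(query_table[wanted])
rows_read_country_model = len(country_table["US"])  # must scan, then filter
print("rows read with query-shaped key:", rows_read_query_model)
print("rows read with country key:     ", rows_read_country_model)
```

In this sketch, keying by country concentrates most rows on one or two nodes and forces the query to scan a very large partition, while keying by sensor and day spreads rows across all nodes and answers the same query from a single small partition.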

4. Optimize your Cassandra implementation

As you scale your Cassandra deployment across the enterprise, managing the database can become more costly and increasingly complex without the right approach.

This is why we created DataStax Distribution of Apache Cassandra™, a production-ready implementation of Cassandra that is ready to go out of the box and is completely compatible with open source Cassandra.

The DataStax Distribution of Apache Cassandra also comes with best-in-class support. Choose between 24×7 or 8×5 support, depending on your needs. By leveraging these services, you’ll be able to reduce internal support costs considerably while ensuring SLAs are met.