DSE NodeSync: Operational Simplicity at its Best

Introduction

We’ve got something really special for administrators in DataStax Enterprise (DSE) 6: DSE NodeSync. Designed with operational simplicity in mind, it can virtually eliminate the manual effort required to run repair operations in a DataStax cluster.

NodeSync

To understand NodeSync, let’s talk about how we got here. One of the most important mechanisms for an administrator to run in Apache Cassandra™ is anti-entropy repair. Despite its name, repair is a process that should always be running in a cluster to ensure that data is consistent across nodes.

The fundamentals of repair haven’t changed since it was initially introduced many years ago. It’s designed as a single-process bulk operation that runs continuously for a long time, which means that when a failure occurs, you must begin the repair over again. Repair is also computationally and network intensive, as it creates Merkle trees and streams them between nodes.

The longer classic repair runs, the more failure prone it is.

To help mitigate some of these problems, complex tools were built to orchestrate repair and add some structure and resiliency to it. These tools try to split the repair process into multiple, more manageable pieces in an effort to improve operational simplicity, but in the end, these client-side tools were built to solve issues with a server-side mechanism. There’s only so much that can be done with tooling.

Enter NodeSync: NodeSync is a ground-up rethinking of how we do entropy resolution in a DataStax cluster. Once you install DSE 6, NodeSync automatically starts running in the background. You simply tell it which keyspaces or tables you’d like managed with NodeSync, and it handles the rest. No more compute-intensive tasks, no more complex tooling, just hands-off repair operations.

Enabling NodeSync on a table is as easy as an ALTER TABLE command.
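As a rough illustration of the ALTER TABLE command mentioned above, here is a minimal Python sketch that builds the CQL statement as a string. The `nodesync = {'enabled': 'true'}` table option follows the DSE 6 CQL syntax as we understand it. The helper name `nodesync_statement` and the `demo_ks.users` table are made up for this example, so check the statement against the documentation for your DSE version before running it.

```python
def nodesync_statement(keyspace: str, table: str, enabled: bool = True) -> str:
    """Build the CQL statement that turns NodeSync on or off for one table.

    Hypothetical helper: it only formats a string and does not talk to a cluster.
    """
    flag = "true" if enabled else "false"
    # Doubled braces produce literal { and } in an f-string.
    return f"ALTER TABLE {keyspace}.{table} WITH nodesync = {{'enabled': '{flag}'}};"


# demo_ks.users is a placeholder table name.
print(nodesync_statement("demo_ks", "users"))
print(nodesync_statement("demo_ks", "users", enabled=False))
```

The resulting statements can be pasted into cqlsh or passed to whichever driver you already use.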

NodeSync is designed to be simple and reliable. It divides the work it must complete into small tasks. These tasks are always tracked, so it knows which data has been synchronized and which hasn’t. This tracking also acts as a checkpoint mechanism: if a node goes down, NodeSync knows exactly where to start again. NodeSync is also self-managing in that it prioritizes what to synchronize based on the last time the data was synced and whether that attempt failed.
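To make the idea of tracked, prioritized tasks more concrete, here is a small Python sketch. It is not NodeSync’s actual implementation. The `Segment` record, its fields, and the ordering rule (never-attempted work first, then failed attempts, then the oldest successful ones) are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Segment:
    """Bookkeeping for one small unit of work (all fields are hypothetical)."""
    name: str
    last_attempt: Optional[float]  # epoch seconds of the last attempt, None if never tried
    last_succeeded: bool           # False if the last attempt failed or never happened


def priority_key(seg: Segment) -> Tuple[int, int, float]:
    """Lower tuples sort first: never tried, then failed, then oldest success."""
    never_tried = 0 if seg.last_attempt is None else 1
    failed = 0 if not seg.last_succeeded else 1
    return (never_tried, failed, seg.last_attempt or 0.0)


def next_segments(segments: List[Segment], limit: int) -> List[Segment]:
    """Pick the segments that should be synchronized next."""
    return sorted(segments, key=priority_key)[:limit]


def record_result(seg: Segment, now: float, succeeded: bool) -> None:
    """Checkpoint the outcome so a restart can resume from this state."""
    seg.last_attempt = now
    seg.last_succeeded = succeeded


segments = [
    Segment("a", 100.0, True),
    Segment("b", 300.0, False),
    Segment("c", None, False),
    Segment("d", 50.0, True),
]
print([s.name for s in next_segments(segments, 3)])  # ['c', 'b', 'd']
```

Because every outcome is recorded per segment, a restarted process only needs to re-read these records to know what to pick up next, which is the checkpoint behavior described above.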

Easily enable/disable NodeSync on tables through OpsCenter

While NodeSync is designed to be as hands-off as possible, we know how important it is for administrators to understand what’s happening in the cluster, so we’ve also updated OpsCenter to monitor NodeSync progress for you.

OpsCenter 6.5 lets you monitor NodeSync progress

Conclusion

We know our customers are going to love NodeSync, as it’s designed to make operations simpler with DataStax. Eliminating the need to orchestrate and manage repair means that administrators spend less time managing their DataStax clusters and more time doing other important tasks. To download DSE 6, and to get more information about NodeSync, please check out this page.

Comments

Repair (or the Repair Service in OpsCenter) will not run on tables that have NodeSync enabled. So if you’ve enabled NodeSync for all tables, you can either disable the Repair Service or leave it on. If you have other questions about it, feel free to email me directly at marc.selwan@datastax.com

Thanks Marc.
I’m commenting here instead of emailing, so that other people can see your valuable input too.

A few more questions:
Is NodeSync enabled by default in DSE 6.0?
Can we disable it?
Can it work without OpsCenter?
Is there a way to monitor progress other than OpsCenter?
What are downsides of NodeSync, if any?
One last and most important question, how does it compare/sync data without anti-entropy algorithm?

MS: The NodeSync service is “on” by default; however, you have to enable it on the respective tables. It will not start synchronizing data until you do.

Can we disable it?

You can’t disable the service, but you can stop it from running on the respective keyspaces/tables.

Can it work without OpsCenter?

Yes, you can configure it through the CLI or CQL.

Is there a way to monitor progress other than OpsCenter?

Yes, you can monitor it using the ‘nodetool nodesyncservice status’ command

What are downsides of NodeSync, if any?

None so far – we recommend using NodeSync in place of traditional repair.

One last and most important question, how does it compare/sync data without anti-entropy algorithm?

Our extensive testing shows an overall improvement in synchronization throughput. Because we don’t have to build Merkle trees and stream them (we only stream the digest), CPU and network utilization are lower as well. My favorite part, though, is the ability of NodeSync to automatically handle failures and prioritize which segments of data to synchronize based on the status of the last run and how much time is left until GC_GRACE passes.
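For readers wondering what “only stream the digest” can look like in principle, here is a minimal Python sketch. It is an assumption-laden illustration, not DSE’s code: the row layout, the SHA-256 choice, and the function names `segment_digest` and `replicas_in_sync` are all invented for this example. The idea is that each replica hashes its copy of a segment and only the short hash crosses the network. The full data needs to be exchanged only when the hashes disagree.

```python
import hashlib
from typing import Dict, List, Tuple

# Hypothetical row shape: (partition key, value, write timestamp).
Row = Tuple[str, str, int]


def segment_digest(rows: List[Row]) -> str:
    """Hash one replica's copy of a segment into a short, fixed-size digest."""
    h = hashlib.sha256()
    # Sort so that two replicas holding the same rows produce the same digest
    # regardless of the order in which the rows are listed.
    for key, value, ts in sorted(rows):
        h.update(f"{key}|{value}|{ts}\n".encode("utf-8"))
    return h.hexdigest()


def replicas_in_sync(replicas: Dict[str, List[Row]]) -> bool:
    """True when every replica reports the same digest for the segment."""
    return len({segment_digest(rows) for rows in replicas.values()}) <= 1


healthy = {
    "node1": [("k1", "v1", 10), ("k2", "v2", 11)],
    "node2": [("k2", "v2", 11), ("k1", "v1", 10)],
}
stale = {
    "node1": [("k1", "v1", 10), ("k2", "v2-new", 12)],
    "node2": [("k1", "v1", 10), ("k2", "v2", 11)],
}
print(replicas_in_sync(healthy))  # True
print(replicas_in_sync(stale))    # False
```

In this sketch a mismatch only tells you that a segment needs attention. Reconciling the differing rows would be a separate step.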