DataStax Developer Blog

What’s New in OpsCenter 2.0: Visual Backup Management

Managing backups of a distributed database can be difficult. With Cassandra, the first step in backing up your data is taking a snapshot, which creates a new hard link to every live SSTable. There are several concerns when doing this: the snapshots need to happen simultaneously across the cluster; snapshotting should happen regularly and automatically; cleanup of old snapshots should be automatic; and, finally, you need to ensure that snapshots never use an unsafe amount of a node’s disk space. Custom scripts and cron jobs can handle some of these concerns, but they require admin effort, understanding, and ongoing maintenance. In OpsCenter 2.0, we’ve added a way to manage all of these concerns painlessly.

Scheduling Automatic Snapshots

It’s straightforward to schedule recurring snapshots for any of your keyspaces.

Creating a new recurring snapshot schedule.

It’s also possible to run a cluster-wide snapshot immediately or schedule a non-recurring snapshot for some time in the future.

When it comes time to perform a snapshot, OpsCenter will simultaneously issue a snapshot command to each node in the cluster. The tag for the snapshot will have a name of the form opscenter_<schedule_id>_YYYY_MM_DD_HH_MM_SS_TZ, where the portion after the schedule ID represents the date and time of the snapshot. When looking at snapshot directories, this allows you to easily see which snapshots were taken by OpsCenter and when they occurred.
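If you want to pick these tags apart in your own scripts, the naming scheme above can be parsed with a few lines of Python. This is only a sketch: the function name is hypothetical, the example schedule ID is made up, and it assumes the time zone field is a short abbreviation (such as UTC) that contains no underscores.

```python
from datetime import datetime

PREFIX = "opscenter_"

def parse_snapshot_tag(tag):
    """Split a tag of the form opscenter_<schedule_id>_YYYY_MM_DD_HH_MM_SS_TZ.

    Returns (schedule_id, taken_at, tz). Hypothetical helper, not part of OpsCenter.
    """
    if not tag.startswith(PREFIX):
        raise ValueError("not an OpsCenter snapshot tag: %r" % tag)
    # Split from the right so that a schedule ID containing underscores
    # still parses: the last seven fields are YYYY, MM, DD, HH, MM, SS, TZ.
    parts = tag[len(PREFIX):].rsplit("_", 7)
    if len(parts) != 8:
        raise ValueError("unexpected tag layout: %r" % tag)
    schedule_id = parts[0]
    taken_at = datetime.strptime("_".join(parts[1:7]), "%Y_%m_%d_%H_%M_%S")
    tz = parts[7]  # assumed to be an abbreviation without underscores
    return schedule_id, taken_at, tz

# Example with a made-up schedule ID:
schedule_id, taken_at, tz = parse_snapshot_tag(
    "opscenter_nightly_2012_03_01_04_05_06_UTC")
```

The timestamp is returned as a naive datetime and the time zone as a plain string, since the post does not say how the TZ field should be interpreted.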

Within OpsCenter, you can see the details of all snapshots that exist in the cluster. The date, keyspace, and number of nodes that have the snapshot are all visible. If some nodes are missing the snapshot, hovering over that cell will show you which nodes are missing it.

Details of the last few snapshots.

If OpsCenter was down when a snapshot was scheduled to occur, it will take a snapshot immediately upon starting up.

Cleaning Up Snapshots

Snapshots can end up using a fair amount of disk space as compactions replace the SSTables they are linked to. When creating a new snapshot schedule with OpsCenter, you can select the age at which old snapshots should be deleted automatically. If you have a separate process which will clean up the snapshots, you can also choose to have OpsCenter never clean up snapshots.
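For readers who run their own cleanup process instead, the age-based rule described above might look roughly like the following. It is a sketch under stated assumptions, not how OpsCenter implements it: the function name and the tag-to-timestamp mapping are hypothetical, and actually deleting the selected snapshots (for example with nodetool clearsnapshot) is left out.

```python
from datetime import datetime, timedelta

def expired_snapshots(snapshots, max_age, now):
    """Return the tags of snapshots older than max_age, oldest first.

    snapshots: dict mapping snapshot tag -> datetime the snapshot was taken
    max_age:   timedelta, or None to mean "never clean up"
    now:       the current time, passed in so the function is easy to test
    """
    if max_age is None:
        return []
    cutoff = now - max_age
    old = [(taken_at, tag) for tag, taken_at in snapshots.items()
           if taken_at < cutoff]
    return [tag for taken_at, tag in sorted(old)]

# Example with made-up tags: keep one week of snapshots.
now = datetime(2012, 3, 10)
snapshots = {
    "snap_a": datetime(2012, 3, 1),
    "snap_b": datetime(2012, 3, 5),
    "snap_c": datetime(2012, 3, 9),
}
to_delete = expired_snapshots(snapshots, timedelta(days=7), now)
```

Passing the current time in as an argument keeps the selection logic separate from the clock, which makes it straightforward to check against a fixed set of dates.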

Watching Free Disk Space Levels

It can be difficult to correctly calculate the amount of disk space that snapshots will consume. Data sets fluctuate in size depending on compactions, and because of the nature of hard links, disk space consumption can grow suddenly when a large SSTable that has been snapshotted is replaced during a compaction.
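The hard link behavior behind this is easy to see outside of Cassandra. The sketch below uses ordinary temporary files as stand-ins for an SSTable and its snapshot (the file names are made up) and assumes a filesystem that supports hard links. While both names exist, the data is stored once; once the original name is removed, as happens after a compaction, the snapshot link alone keeps the full size of the file on disk.

```python
import os
import tempfile

def hardlink_demo():
    """Show that a hard-linked 'snapshot' keeps data alive after the original is removed."""
    with tempfile.TemporaryDirectory() as data_dir:
        sstable = os.path.join(data_dir, "example-Data.db")
        snapshot_dir = os.path.join(data_dir, "snapshots")
        os.mkdir(snapshot_dir)
        snapshot = os.path.join(snapshot_dir, "example-Data.db")

        with open(sstable, "wb") as f:
            f.write(b"x" * 4096)

        # Taking the "snapshot": a second name for the same data, no copy made.
        os.link(sstable, snapshot)
        links_while_live = os.stat(sstable).st_nlink

        # "Compaction" replaces the SSTable and removes the original name.
        os.remove(sstable)
        links_after_compaction = os.stat(snapshot).st_nlink
        size_still_on_disk = os.path.getsize(snapshot)

        return links_while_live, links_after_compaction, size_still_on_disk

links_while_live, links_after_compaction, size_still_on_disk = hardlink_demo()
```

The link count drops from two to one, but the bytes are not freed, which is why disk usage attributable to snapshots can jump when a large snapshotted SSTable is compacted away.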

To protect your nodes from running out of space on their data disks, OpsCenter allows you to set a minimum amount of free disk space that a node must have before new snapshots will be performed. At the top of the Backups section, you’ll find a blue “Configure” link that will allow you to set that free disk space percentage.

You should be conservative when setting this limit, as snapshots will prevent older compacted files from being deleted to free disk space. Even if no new snapshots are taken, your disk space consumption may continue to grow for some time.
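As an illustration of the kind of check involved, here is a minimal sketch of a free-space guard in Python. It is not OpsCenter's implementation: the function names and the 20 percent threshold are assumptions chosen for the example, and the path passed in would be the node's data directory.

```python
import shutil

def free_percent(path):
    """Percentage of free space on the filesystem that contains path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total

def snapshot_allowed(free_pct, min_free_pct):
    """True if a node has at least the configured percentage of free space."""
    return free_pct >= min_free_pct

# Example: check the current directory against a hypothetical 20% limit.
MIN_FREE_PCT = 20.0
current_free = free_percent(".")
ok_to_snapshot = snapshot_allowed(current_free, MIN_FREE_PCT)
```

Because existing snapshots can keep growing in effective size after the check passes, a guard like this only limits when new snapshots start; it does not cap the space that existing ones eventually hold.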

Conclusion

We think these new features will make administering a Cassandra cluster easier, and we plan to use this as a starting point for ensuring the safety of your big data. Additional capabilities are sure to come in the future, so stay tuned.