Multi Datacenter Cassandra on Vagrant with Ansible

Here I introduce a project for testing a multi-datacenter Cassandra setup locally. All files discussed are available for download here: cassandra-cross-dc

Goal

Create a local simulated multi-datacenter Cassandra cluster on Vagrant for testing various settings.

It should be easy to set up, tear down and recreate if necessary.

Sub-goals

Transmissions between datacenters should be SSL encrypted and authenticated.

Ansible should handle creation and configuration of SSL keys and certificates.

It should be easy to add nodes to an existing cluster.

Requirements

The included Vagrantfile creates six vagrant Ubuntu guests running Cassandra, so you will need a reasonable amount of memory. I had no problems on a MacBook Pro with 16 GB of memory.

Some Design Decisions

Loose coupling between ansible and vagrant

Vagrant's ansible provisioner can create the complete environment from the vagrant up command alone, with the ansible playbook called from within the Vagrantfile. This is fine for simple environments, but in this case I wanted the flexibility of running ansible on its own.
Invoking ansible directly from the command line lets us run many different kinds of commands: tagged subsets of tasks, ad hoc commands and so on.
Note that when ansible is run from within the Vagrantfile, vagrant automatically creates an ansible inventory file with the connection properties for each ansible host. It is written to .vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory

I used these connection properties in the custom inventory file to allow ansible to seamlessly connect to the vagrant guests.

The custom inventory uses the vagrant connection properties, with the vagrant private key specified for each node individually.
Vagrant-specific configuration is isolated within the inventory file, so adding a non-vagrant inventory is easy: just add another hosts file.

Ansible tags

Use ansible tags to give more fine-grained control over which tasks get run within the cassandra.yml playbook.
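As a rough sketch of how tags are used from the command line: the two wrapper functions below are hypothetical, and they assume the inventory file is called hosts and sits next to cassandra.yml. upload_certs is the only tag named in this post; check the playbook for the others.

```shell
# Hypothetical helper: run only the tasks carrying the given tag(s).
# Assumes an inventory file called "hosts" next to cassandra.yml.
run_tagged() {
    ansible-playbook -i hosts cassandra.yml --tags "$1"
}

# List every tag defined in the playbook without running any task.
list_tags() {
    ansible-playbook -i hosts cassandra.yml --list-tags
}
```

For example, run_tagged upload_certs would run just the certificate upload tasks.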

hosts inventory file

two datacenter groups (dc1, dc2) are defined containing the six vagrant guests

the cassandra group contains these two groups

each host has a listen_address which is the IP Cassandra nodes use to talk with each other

each host has vagrant connection configuration
– ansible_ssh_port: a port on localhost that is forwarded to port 22 of the vagrant guest. This local port is configured within the Vagrantfile.
– ansible_ssh_private_key_file: the vagrant private key file used for the ssh connection
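Put together, an inventory of this shape might look like the sketch below. The host names follow the cassandra-N pattern used later in this post. The IP addresses, forwarded ports, key paths and the ansible_ssh_host / ansible_ssh_user entries are placeholders made up for illustration, so copy the real values from the vagrant-generated inventory. Only two of the six hosts are written out.

```shell
# Write an illustrative inventory to hosts.example (not the project's real file).
# IPs, ports and key paths are placeholders.
cat > hosts.example <<'EOF'
[dc1]
cassandra-1 listen_address=192.168.50.11 ansible_ssh_host=127.0.0.1 ansible_ssh_port=2201 ansible_ssh_user=vagrant ansible_ssh_private_key_file=.vagrant/machines/cassandra-1/virtualbox/private_key

[dc2]
cassandra-4 listen_address=192.168.50.14 ansible_ssh_host=127.0.0.1 ansible_ssh_port=2204 ansible_ssh_user=vagrant ansible_ssh_private_key_file=.vagrant/machines/cassandra-4/virtualbox/private_key

[cassandra:children]
dc1
dc2
EOF
```

With the guests running, ansible -i hosts.example cassandra -m ping is a quick way to check that every connection setting is right.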

templates

cassandra.yaml.j2 is the main configuration for Cassandra.
within this file we have:

cluster_name: '{{ cluster_name }}'
– name for the Cassandra cluster

- seeds: "{{ cassandra_seeds | join(',') }}"
– how the nodes initially connect to each other

listen_address: {{ listen_address }}
– the IP on this node that other nodes connect to

server_encryption_options:
    internode_encryption: dc

– set to use SSL between datacenters
– configure the ssl file locations and passwords
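For orientation, the encryption block of the template might look roughly like this. The option names are standard cassandra.yaml settings. The output file name, the keystore paths and the keystore_password / truststore_password variables are hypothetical stand-ins, not necessarily what the project uses.

```shell
# Write an illustrative template fragment (file name and variables are hypothetical).
cat > server_encryption.example.j2 <<'EOF'
server_encryption_options:
    # "dc" encrypts only traffic that crosses datacenter boundaries
    internode_encryption: dc
    keystore: /etc/cassandra/conf/keystore.jks
    keystore_password: {{ keystore_password }}
    truststore: /etc/cassandra/conf/truststore.jks
    truststore_password: {{ truststore_password }}
    # require peers to present a certificate, so nodes are authenticated too
    require_client_auth: true
EOF
```

Setting internode_encryption to dc leaves traffic inside a datacenter unencrypted, which keeps local overhead low while still protecting the cross-datacenter links.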

files

– contains cert files that are downloaded from the Cassandra nodes into timestamped directories

handlers

– contains handler to restart Cassandra on a configuration change

restart-cassandra.sh

– slow rolling restart for all Cassandra nodes. Useful to restart Cassandra if nodes fail to start up properly etc.
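The script itself is not reproduced here. A rolling restart of this kind could be sketched as below, but everything in it is an assumption: the inventory name hosts, the service name cassandra, the fixed pause between nodes, and the use of ad hoc ansible commands.

```shell
# Sketch of a slow rolling restart; not the project's actual script.
# Assumes an inventory named "hosts" and a system service named "cassandra".
# $1 = seconds to pause between nodes, remaining args = host names
rolling_restart() {
    pause="$1"; shift
    for host in "$@"; do
        echo "restarting $host"
        # flush memtables and stop accepting writes before the restart
        ansible -i hosts "$host" -m command -a "nodetool drain" || return 1
        ansible -i hosts "$host" -b -m service -a "name=cassandra state=restarted" || return 1
        # crude wait for the node to rejoin the ring before moving on
        sleep "$pause"
    done
}
```

A call such as rolling_restart 60 cassandra-1 cassandra-2 cassandra-3 restarts one node at a time, so the cluster as a whole stays available.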

Here we define the Cassandra replication strategies for the system_auth and web keyspaces. system_auth is used internally by Cassandra to handle authentication and authorization for Cassandra users; by default it is not replicated across multiple datacenters. web is our test schema. The keyspace configs specify a replication factor of 3 per datacenter.
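In CQL terms, settings like these correspond to statements of the following shape. This is only a sketch: the datacenter names dc1 and dc2 are taken from the inventory groups and must match the names the snitch reports, and the helper function is made up for illustration.

```shell
# Hypothetical helper: print the CQL for a keyspace replicated 3x in each datacenter.
# $1 = ALTER or CREATE, $2 = keyspace name
keyspace_cql() {
    printf "%s KEYSPACE %s WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 3};\n" "$1" "$2"
}

keyspace_cql ALTER system_auth
keyspace_cql CREATE web
```

The printed statements can be passed to cqlsh with its -e option. After altering the replication of an existing keyspace such as system_auth, a repair is normally needed so that existing data reaches the new replicas.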

Import the new node's certificate into existing nodes

The command for this step uses the --limit option to tell ansible to run on all nodes except cassandra-7. cert_dir=20170630124622 is the directory containing the certificate file for cassandra-7 that we want to import.
We specify --tags upload_certs to tell ansible to run only the tasks that upload and import the certificate file.
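An invocation matching this description could look like the sketch below. The inventory name hosts, the cassandra group pattern and the wrapper function are assumptions; --limit, --tags and --extra-vars are standard ansible-playbook options.

```shell
# Hypothetical wrapper around the certificate import step.
# $1 = node to exclude (the new node), $2 = timestamped cert directory
import_cert() {
    # single quotes keep "!" away from interactive history expansion
    ansible-playbook -i hosts cassandra.yml \
        --limit 'cassandra:!'"$1" \
        --tags upload_certs \
        --extra-vars "cert_dir=$2"
}
```

For the example in this section the call would be import_cert cassandra-7 20170630124622.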

Summary

Use this setup for testing multi-datacenter Cassandra configuration.
It lets you set up a simulated multi-datacenter Cassandra environment locally with little effort. You can extend it further by adding non-local environments, as explained in the section “Note on environment configurations”. If you read this far, thanks for reading and good luck!