Configure a Cassandra HA cluster

Tip

A new guide is available in the latest release of API Gateway that includes information on Apache Cassandra best practices and tuning, setting up high availability, and backup and restore. Much of the information in this guide also applies to API Gateway 7.5.3. See the API Gateway 7.6.2 Apache Cassandra Administrator Guide.

This topic describes how to set up an Apache Cassandra database cluster for high availability of your API Gateway system. It describes the necessary configuration steps and provides examples from a production environment.

Cassandra HA in a production environment

To tolerate the loss of one Cassandra node and to ensure 100% data consistency, API Gateway requires the following cluster configuration in an HA production environment:

Three Cassandra nodes (with one seed node)

QUORUM consistency to ensure that you are reading from a quorum of Cassandra nodes (two) every time

Replication factor set to 3 so each node holds 100% of the data and you can tolerate the loss of one node

If one Cassandra node fails, the cluster continues with two nodes and remains highly available, consistent, and able to serve both reads and writes. With QUORUM consistency, the cluster is not available if only one node remains. This configuration applies in all supported use cases (for example, API Manager and API Gateway custom KPS, OAuth, and client registry data).
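The arithmetic behind these requirements can be illustrated with a short sketch. It assumes the standard Cassandra definition of a quorum (more than half of the replicas); the function names are hypothetical and are not part of API Gateway or Cassandra.

```python
def quorum_size(replication_factor: int) -> int:
    """Number of replicas that must respond to a QUORUM read or write."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Number of replicas that can be lost while QUORUM can still be met."""
    return replication_factor - quorum_size(replication_factor)


def quorum_available(live_replicas: int, replication_factor: int) -> bool:
    """True if enough replicas are alive to satisfy QUORUM."""
    return live_replicas >= quorum_size(replication_factor)


if __name__ == "__main__":
    rf = 3  # replication factor required for HA in this topic
    print("quorum size:", quorum_size(rf))
    print("tolerated failures:", tolerated_failures(rf))
    print("available with two nodes:", quorum_available(2, rf))
    print("available with one node:", quorum_available(1, rf))
```

With a replication factor of 3, a quorum is two replicas, so one node can fail without losing availability, and a single surviving node cannot satisfy QUORUM.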

Note

Eventual consistency is not supported in a production environment (due to a risk of stale and incomplete data).

Upgrade from previous API Gateway version

When upgrading from a previous API Gateway version, you need only one Cassandra node in the cluster to receive upgraded data. After upgrade, you can then add more nodes to this cluster to provide high availability (HA), and configure TLS security. For more details, see the API Gateway Upgrade Guide.

Cassandra HA configuration

Use the following tools to configure Cassandra and the API Gateway Cassandra client:

Tool

Description

nodetool

Located in CASSANDRA_HOME/bin. This tool is required to run most Cassandra administration operations. By default, nodetool runs locally against a Cassandra node.

cqlsh

Located in CASSANDRA_HOME/bin. This tool provides a query language interface to Cassandra. Cassandra Query Language (CQL) is similar in syntax to SQL. You can use tab completion with cqlsh (for example, press Tab to complete keyspace, table, and command names, and so on).

setup-cassandra

Located in GATEWAY_INSTALL_DIR/bin. This script helps with Cassandra configuration and updates the cassandra.yaml configuration file. You can edit this file manually, but this script saves time, helps prevent errors, and creates a backup of the original cassandra.yaml file. setup-cassandra also outputs instructions for resetting the default user name and password.

Policy Studio enables you to configure API Gateway and API Manager as clients of Cassandra. It also enables you to configure KPS table definitions created in back-end storage, if they do not exist (for example, in Cassandra or a relational database). For Cassandra, these tables are created in a group keyspace with an initial replication factor of 1. For more details, see Configure the group keyspace and replication factor in Cassandra.

The following general guidelines apply to configuring Cassandra HA:

Decide on the number of Cassandra nodes and the number of API Gateway nodes (local or remote). Axway recommends configuring a Cassandra HA cluster with three Cassandra nodes and at least two API Gateway instances (local or remote). For details, see Cassandra deployment architectures.

Example Cassandra HA configuration in a production environment

This section describes an example Cassandra HA configuration supported by Axway in a production environment.

Note

In this section, API Gateway and API Manager are both clients of Cassandra, and all API Gateway steps refer to both API Gateway and API Manager. API Manager is named explicitly only when additional API Manager-specific configuration is required.

HA production environment requirements

The following system requirements apply for Cassandra HA in a production environment:

Hardware requirements

Nodes: Three Cassandra nodes (one seed node).

IP address: One IP address per Cassandra node.

Disk space and memory: Depend on how much data you plan to store and how often this data changes:

KPS data and API Manager data take up little space (mostly read-heavy configuration data), and should not be an issue.

OAuth token use can be large, depending on the frequency of token generation and token time-to-live.

Double the amount of estimated storage: Needed for Cassandra to perform automatic compaction of data.

Storage: Cassandra is designed to run on commodity distributed drives. Storage Area Network (SAN) is not recommended or supported in a production environment.

64-bit Cassandra version 2.2.8 or 2.2.5 with 64-bit Oracle JRE on Linux and Windows (OpenJDK is not supported). Cassandra 2.2.8 is recommended. For more details, see Supported Cassandra versions.
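The disk space guidance above (token volume driven by generation frequency and time-to-live, then doubled for compaction) can be turned into a rough sizing calculation. The following is a minimal sketch only: the function names, the assumed bytes per token, and the example rates are hypothetical, and the estimate ignores indexes, commit logs, and other on-disk overhead.

```python
def estimated_token_bytes(tokens_per_second: int, ttl_seconds: int,
                          bytes_per_token: int) -> int:
    """Steady-state token data: tokens alive at once (rate x TTL) times size."""
    return tokens_per_second * ttl_seconds * bytes_per_token


def provisioned_bytes(estimated_bytes: int) -> int:
    """Double the estimate to leave headroom for automatic compaction."""
    return 2 * estimated_bytes


if __name__ == "__main__":
    # Hypothetical workload: 10 tokens/s, 1 hour time-to-live, 1 KiB per token.
    estimate = estimated_token_bytes(10, 3600, 1024)
    print("estimated bytes:", estimate)
    print("bytes to provision per node:", provisioned_bytes(estimate))
```

Because the replication factor of 3 on a three-node cluster means each node holds 100% of the data, the result applies to every node rather than being divided among them.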

Note

You must download a 64-bit Oracle JRE manually on UNIX/Linux when Cassandra is remote to API Gateway. This also applies on Windows when Cassandra is remote or local (see Install a 64-bit Oracle JRE on Windows).

Earlier API Gateway versions used a default port of 9160 to communicate with Cassandra over the Apache Thrift protocol. This protocol is not supported in API Gateway version 7.5.3 or later. However, if necessary, after upgrade, you can configure Cassandra and API Gateway to use port 9160 to communicate over the Cassandra native protocol.

Start with one Cassandra seed node

You must always start with one Cassandra node (non-HA). You can test API Gateway and API Manager functionality and become familiar with Cassandra using one node, before growing the system for HA.

When upgrading from previous API Gateway and API Manager versions (with embedded Cassandra), you must upgrade to one node only. After upgrade, you can then grow the system for HA. You do not need to start with three nodes, or start from scratch to achieve HA. For more details on upgrade, see the API Gateway Upgrade Guide.

Note

Cassandra scales horizontally. This means that each node must have equal resources. Each node must run on the same hardware (CPU, disk, memory, and network) and on the same operating system. This ensures that nodes do not starve or out-compete other nodes, and that you can easily add, remove, and replace nodes, especially in cloud environments. For example, do not run some nodes with less or more memory than other nodes, or some nodes on Windows and some on Linux, or some nodes on SUSE Linux and some on CentOS Linux.

Configure the group keyspace and replication factor in Cassandra

If you have created a KPS collection or set up API Manager (which creates KPS collections), API Gateway creates a Cassandra keyspace and tables for data storage when it connects to a Cassandra node, if these do not exist. This topic assumes API Manager users have already run setup-apimanager in non-HA standalone mode, so the keyspace will exist. For details on configuring API Manager, see the API Manager User Guide.

By default, the created Cassandra keyspace has a name in the form of xDOMAINID_GROUPID. This enables API Gateways in a group to share data and enables a single Cassandra cluster to host data from multiple API Gateway domains (for example, development, test, and staging).

Tip

You can find your DOMAINID and GROUPID as follows:

Open the following file to view the DOMAINID: GATEWAY_INSTALL_DIR/apigateway/groups/topology.json

Run the following command to output the GROUPID: ls -l groups/topologylinks/GroupName
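Once you have the two IDs, the expected keyspace name can be derived with a short helper. This is a sketch under stated assumptions: the function name and the example IDs are hypothetical, and replacing characters that are not valid in a keyspace name (such as hyphens) with underscores is an assumption, so confirm the result against the keyspace name that cqlsh reports. The 48-character limit is Cassandra's limit on keyspace names.

```python
import re

MAX_KEYSPACE_NAME_LENGTH = 48  # Cassandra limit on keyspace name length


def group_keyspace_name(domain_id: str, group_id: str) -> str:
    """Build a keyspace name in the xDOMAINID_GROUPID form."""
    raw = f"x{domain_id}_{group_id}"
    # Assumption: characters outside [A-Za-z0-9_] are replaced with underscores.
    name = re.sub(r"[^A-Za-z0-9_]", "_", raw)
    if len(name) > MAX_KEYSPACE_NAME_LENGTH:
        raise ValueError(f"keyspace name too long: {name!r}")
    return name


if __name__ == "__main__":
    # Made-up example IDs, for illustration only.
    print(group_keyspace_name("9fa0efdc-0dfb-4d09-b16c-a7fa95d0dd7f", "group-2"))
```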

Configure the API Gateway keyspace and replication factor

Initially, the keyspace has a default Replication Factor (RF) of 1. You must increase this for HA configuration. Perform the following steps:

Use cqlsh to verify that the keyspace has been created and to view its replication factor. For example:
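As an illustration of the kind of statements involved, the following sketch generates CQL text that you could paste into cqlsh: one statement to inspect the keyspace and one to raise its replication factor. The helper names and the keyspace name are hypothetical, and the use of SimpleStrategy is an assumption for a single data center cluster; check the strategy shown by DESCRIBE KEYSPACE before altering it.

```python
import re


def _check_keyspace(keyspace: str) -> str:
    """Reject names that are not plain CQL identifiers."""
    if not re.fullmatch(r"[A-Za-z0-9_]{1,48}", keyspace):
        raise ValueError(f"invalid keyspace name: {keyspace!r}")
    return keyspace


def describe_keyspace_cql(keyspace: str) -> str:
    """cqlsh command that prints the keyspace definition, including replication."""
    return f"DESCRIBE KEYSPACE {_check_keyspace(keyspace)};"


def alter_replication_cql(keyspace: str, replication_factor: int) -> str:
    """CQL statement that sets the replication factor (assumes SimpleStrategy)."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return (
        f"ALTER KEYSPACE {_check_keyspace(keyspace)} WITH REPLICATION = "
        f"{{'class': 'SimpleStrategy', 'replication_factor': {replication_factor}}};"
    )


if __name__ == "__main__":
    ks = "x_demo_group_2"  # hypothetical keyspace name
    print(describe_keyspace_cql(ks))
    print(alter_replication_cql(ks, 3))
```

After increasing the replication factor, Cassandra generally needs a repair (for example, nodetool repair on each node) so that existing data is copied to the new replicas.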

Configure the API Gateway Cassandra consistency levels

Ensure that the API Server KPS collection has been created under Environment Configuration > Key Property Stores. This is required to configure Cassandra consistency levels, and is created automatically if you installed the Complete setup type (see Installation options). If you installed the Custom or Standard setup, run one of the following scripts to create the required KPS collections:

Repeat this step for each KPS collection using Cassandra (for example, Key Property Stores > OAuth, or API Portal for API Manager). This also applies to any custom KPS collections that you have created.

If you are using OAuth and Cassandra, you must also configure quorum consistency for all OAuth2 stores under Libraries > OAuth2 Stores:

Configure API Manager Cassandra client settings

To update the Cassandra client configuration for API Manager, perform the following steps:

Ensure the API Gateway and API Manager components have been installed on the API Gateway 1 and API Gateway 2 nodes. These can be local or remote to Cassandra installations. For details, see Install the API Gateway server and Install API Manager.

Ensure an API Gateway domain, group, and instance have been created on the API Gateway 1 node using managedomain. For more details, see the API Gateway Administrator Guide.

Note

This section assumes that you have already run setup-apimanager on the first node in non-HA standalone mode. For more details, see the API Manager User Guide.

On startup, this instance receives the API Manager configuration for the group. It now shares the same KPS and Cassandra configuration and data, and uses the ports specified in the envSettings.props file.

Step 3 – Secure the Cassandra HA configuration and verify

To secure your Cassandra HA configuration, perform the following steps:

nodetool can normally run on any machine against any Cassandra node. For improved security, you might have locked down JMX for localhost access only. In such cases, you could use ssh to access that machine, and then run nodetool.

updateCassandraSettings options

The updateCassandraSettings.py script options are explained as follows:

Option

Description

-f, --file

Enter the API Gateway deployment (.fed) to be updated. The default is INSTALL_DIR/system/conf/templates/FactoryConfiguration-VordelGateway.fed. If you do not specify a .fed file, you must back up this file before running the script.

Enter a comma-separated list of Cassandra host nodes in host:port format. For example, "127.0.0.1:9042,127.0.0.2:9042,127.0.0.3:9042". You can also enter hostnames as environment variables (for example, "${env.CASS.HOST1}:9042,${env.CASS.HOST2}:9042,${env.CASS.HOST3}:9042").

--passphrase=PASSPHRASE

Enter the encryption passphrase for the API Gateway group if required.

--passphrasePrompt

Specify this option to prompt for the encryption passphrase for the API Gateway group. This is disabled by default.
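Before passing a host list to the script, it can help to check that it follows the comma-separated host:port format described in the options above. The following is a minimal validation sketch; the function name is hypothetical and this is not how updateCassandraSettings.py itself parses its input.

```python
from typing import List, Tuple


def parse_cassandra_hosts(value: str) -> List[Tuple[str, int]]:
    """Split a comma-separated host:port list into (host, port) pairs."""
    hosts = []
    for entry in value.split(","):
        entry = entry.strip()
        # Split on the last colon so the host part is kept whole.
        host, sep, port_text = entry.rpartition(":")
        if not sep or not host:
            raise ValueError(f"expected host:port, got {entry!r}")
        if not port_text.isdigit():
            raise ValueError(f"port is not a number in {entry!r}")
        port = int(port_text)
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range in {entry!r}")
        hosts.append((host, port))
    return hosts


if __name__ == "__main__":
    print(parse_cassandra_hosts("127.0.0.1:9042,127.0.0.2:9042,127.0.0.3:9042"))
    # Environment variable placeholders are kept as opaque host strings.
    print(parse_cassandra_hosts("${env.CASS.HOST1}:9042"))
```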