A simple command-line interface for administering a Scylla node. A nodetool command can display a given node’s exposed operations and attributes. Scylla’s nodetool contains a subset of these operations. See Ring Architecture.

Primary Key

In a CQL table definition, the primary key clause specifies the partition key and optional clustering key. These keys uniquely identify each partition and row within a partition. See Ring Architecture.

Partition

A subset of data that is stored on a node and replicated across nodes. There are two ways to consider a partition. In CQL, a partition appears as a group of sorted rows, and is the unit of access for queried data, given that most queries access a single partition. On the physical layer, a partition is a unit of data stored on a node and is identified by a partition key. See Ring Architecture.

Partition Key

The unique identifier for a partition. A partition key may be hashed from the first column in the primary key, or from a set of columns, in which case it is often referred to as a composite partition key. The partition key determines which virtual node gets the first partition replica. See Ring Architecture.

Partitioner

A hash function for computing which data is stored on which node in the cluster. The partitioner takes a partition key as input and returns a ring token as output. By default, Scylla uses the 64-bit MurmurHash3 function, and the hash range is numerically represented as a signed 64-bit integer. See Ring Architecture.
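
The key-to-token mapping can be sketched as follows. This is only an illustration: `toy_token` is a hypothetical name, and it uses SHA-256 from the Python standard library as a stand-in hash, not Scylla's MurmurHash3, so the tokens it produces will not match a real cluster. It also assumes tokens are signed 64-bit integers.

```python
import hashlib

# Assumed token bounds: the signed 64-bit integer range.
TOKEN_MIN = -(2**63)
TOKEN_MAX = 2**63 - 1


def toy_token(partition_key: bytes) -> int:
    """Map a partition key to a signed 64-bit token.

    Stand-in hash: the first 8 bytes of SHA-256, NOT MurmurHash3.
    """
    digest = hashlib.sha256(partition_key).digest()
    return int.from_bytes(digest[:8], byteorder="big", signed=True)


if __name__ == "__main__":
    for key in (b"user:1", b"user:2", b"user:3"):
        print(key, toy_token(key))
```

The property that matters is that the same partition key always yields the same token, so any node can compute where a partition lives without consulting a lookup table.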

Read Amplification

Excessive read requests which require many SSTables. RA is calculated by the number of disk reads per query. High RA occurs when there are many pages to read in order to answer a query. See Compaction Strategies.

Replication

The process of replicating data across nodes in a cluster. See Fault Tolerance.

Replication Factor

The total number of replica nodes across a given cluster. An RF of 1 means that the data exists on only a single node in the cluster and has no fault tolerance. This number is a setting defined for each keyspace. All replicas share equal priority; there are no primary or master replicas. An RF can be defined for each DC. See Fault Tolerance.
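
As a rough illustration of what an RF means for data placement, the sketch below picks RF distinct nodes by walking a token ring clockwise from the node that owns a token. The function name `replicas_for`, the three-node ring, and the small token values are all hypothetical, and the walk ignores datacenters and racks, so it should be read as a simplified model and not as Scylla's actual placement logic.

```python
import bisect


def replicas_for(token, ring, rf):
    """Return `rf` distinct nodes, starting at the owner of `token`.

    `ring` is a list of (node_token, node_name) pairs sorted by token.
    """
    distinct_nodes = {name for _, name in ring}
    if rf > len(distinct_nodes):
        raise ValueError("RF cannot exceed the number of nodes")

    tokens = [t for t, _ in ring]
    i = bisect.bisect_left(tokens, token)  # first node token >= token
    replicas = []
    while len(replicas) < rf:
        name = ring[i % len(ring)][1]  # wrap around the ring
        if name not in replicas:
            replicas.append(name)
        i += 1
    return replicas


# Hypothetical three-node ring with made-up tokens.
ring = [(-6000, "node-a"), (0, "node-b"), (6000, "node-c")]

if __name__ == "__main__":
    print(replicas_for(1, ring, rf=1))  # a single copy, no fault tolerance
    print(replicas_for(1, ring, rf=3))  # every node holds a copy
```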

Size-tiered Compaction Strategy

A compaction strategy that triggers when the system has enough (four by default) similarly sized SSTables. See Compaction Strategies.
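
The trigger condition can be sketched as a bucketing step followed by a threshold check. The helper names below are hypothetical, and the 0.5x to 1.5x similarity window is an assumption chosen for illustration; only the threshold of four comes from the definition above.

```python
def bucket_sstables(sizes, bucket_low=0.5, bucket_high=1.5):
    """Group SSTable sizes into buckets of similarly sized tables.

    A size joins a bucket when it lies within
    [bucket_low * avg, bucket_high * avg] of that bucket's current average.
    """
    buckets = []
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if bucket_low * avg <= size <= bucket_high * avg:
                bucket.append(size)
                break
        else:
            # No existing bucket is similar enough: start a new one.
            buckets.append([size])
    return buckets


def compaction_candidates(sizes, min_threshold=4):
    """Return the buckets that hold enough SSTables to trigger compaction."""
    return [b for b in bucket_sstables(sizes) if len(b) >= min_threshold]


if __name__ == "__main__":
    sizes_mb = [10, 11, 12, 9, 100, 105, 1000]
    print(bucket_sstables(sizes_mb))
    print(compaction_candidates(sizes_mb))
```

In this example only the four small tables form a bucket large enough to compact; the two mid-sized tables and the single large one wait until more tables of similar size accumulate.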

Space Amplification

Excessive disk space usage which requires that the disk be larger than a perfectly-compacted representation of the data (i.e., all the data in one single SSTable). SA is calculated as the ratio of the size of database files on a disk to the actual data size. High SA occurs when there is more disk space being used than the size of the data. See Compaction Strategies.
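
A small worked example of the ratio, using a toy model in which each SSTable is a dictionary from row key to stored size in bytes. The model and the function name `space_amplification` are hypothetical; the "actual data size" is taken to be what would remain if only the newest version of each row were kept in a single SSTable.

```python
def space_amplification(sstables):
    """Ratio of bytes on disk to bytes in a perfectly compacted form.

    `sstables` is a list of dicts (row key -> size in bytes), oldest first.
    """
    on_disk = sum(sum(table.values()) for table in sstables)

    compacted = {}
    for table in sstables:
        compacted.update(table)  # newer versions replace older ones
    actual = sum(compacted.values())

    if actual == 0:
        raise ValueError("no data to measure")
    return on_disk / actual


if __name__ == "__main__":
    # Row "a" was rewritten twice, so three copies of it sit on disk.
    sstables = [{"a": 100, "b": 100}, {"a": 100}, {"a": 100, "c": 100}]
    print(space_amplification(sstables))  # 500 bytes on disk / 300 bytes of data
```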

SSTable

A concept borrowed from Google Bigtable. SSTables, or Sorted String Tables, store a series of immutable rows, where each row is identified by its row key. See Compaction Strategies. The SSTable format is a persistent file format. See Scylla SSTable Format.

Table

A collection of columns fetched by row. Columns are ordered by Clustering Key. See Ring Architecture.

Token

A value in a range, used to identify both nodes and partitions. Each node in a Scylla cluster is given an (initial) token, which defines the end of the range a node handles. See Ring Architecture.
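
Because a node's token marks the end of the range it handles, finding the owner of a partition token amounts to finding the first node token that is greater than or equal to it, wrapping around at the top of the ring. The sketch below shows that lookup; `owner_of`, the node names, and the small token values are hypothetical.

```python
import bisect


def owner_of(token, ring):
    """Return the node whose range contains `token`.

    `ring` is a list of (node_token, node_name) pairs sorted by token.
    Each node token is the inclusive END of that node's range.
    """
    tokens = [t for t, _ in ring]
    idx = bisect.bisect_left(tokens, token)
    return ring[idx % len(ring)][1]  # past the last token, wrap to the first


# Hypothetical three-node ring with made-up tokens.
ring = [(-6000, "node-a"), (0, "node-b"), (6000, "node-c")]

if __name__ == "__main__":
    for t in (-7000, -6000, 1, 7000):
        print(t, owner_of(t, ring))
```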

Token Range

The total range of potential unique identifiers supported by the partitioner. By default, each Scylla node in the cluster handles 256 token ranges. Each token range corresponds to a Vnode. Each range of hashes in turn is a segment of the total range of a given hash function. See Ring Architecture.

Tunable Consistency

The possibility for unique, per-query, Consistency Level settings. These are incremental and override fixed database settings intended to enforce data consistency. Such settings may be set directly from a CQL statement when response speed for a given query or operation is more important. See Fault Tolerance.

Virtual Node

A range of tokens owned by a single Scylla node. Scylla nodes are configurable and support a set of Vnodes. In legacy token selection, each node owns a single token (or token range). With Vnodes, a node can own many tokens or token ranges; within a cluster, these may be selected randomly from a non-contiguous set. In a Vnode configuration, each token falls within a specific token range, which in turn is represented as a Vnode. Each Vnode is then allocated to a physical node in the cluster. See Ring Architecture.
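
The difference from legacy token selection can be illustrated by giving every node many randomly chosen tokens and sorting them onto one ring. This is a sketch only: `assign_vnode_tokens` is a hypothetical name, the random selection is a simplification, the signed 64-bit token range is an assumption, and 256 tokens per node is used as the default.

```python
import random
from collections import Counter


def assign_vnode_tokens(nodes, vnodes_per_node=256, seed=0):
    """Give each node `vnodes_per_node` random tokens; return the sorted ring.

    The ring is a list of (token, node_name) pairs. Each entry stands for one
    Vnode: the token range that ends at that token.
    """
    rng = random.Random(seed)  # seeded so the example is repeatable
    ring = []
    for node in nodes:
        for _ in range(vnodes_per_node):
            token = rng.randint(-(2**63), 2**63 - 1)
            ring.append((token, node))
    ring.sort()
    return ring


if __name__ == "__main__":
    ring = assign_vnode_tokens(["node-a", "node-b", "node-c"])
    print(len(ring), "Vnodes in total")
    print(Counter(name for _, name in ring))
    # Neighbouring ranges usually belong to different physical nodes.
    print([name for _, name in ring[:8]])
```

Since each node's ranges are scattered around the ring, the data a node holds is spread across many non-contiguous segments instead of one large slice.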

Write Amplification

Excessive compaction of the same data. WA is calculated by the ratio of bytes written to storage versus bytes written to the database. High WA occurs when there are more bytes/second written to storage than are actually written to the database. See Compaction Strategies.
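
A worked example of the ratio, under a deliberately simple model: data is written to storage once when it is first flushed, and then written again each time a compaction rewrites it. The function name and the byte counts are hypothetical.

```python
def write_amplification(bytes_written_to_storage, bytes_written_to_database):
    """Ratio of bytes written to storage to bytes written to the database."""
    if bytes_written_to_database <= 0:
        raise ValueError("bytes_written_to_database must be positive")
    return bytes_written_to_storage / bytes_written_to_database


if __name__ == "__main__":
    GIB = 1024**3
    ingested = 1 * GIB               # what the application wrote
    flushed = 1 * GIB                # first write to storage
    compaction_rewrites = 3 * GIB    # the same data rewritten three times
    print(write_amplification(flushed + compaction_rewrites, ingested))
```

In this model, 1 GiB of application writes that is compacted three times costs 4 GiB of storage writes, for a WA of 4.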