Information about configuring DataStax Enterprise, including using virtual nodes; setting up security; storing
and accessing data exclusively from memory; setting up distributed data replication from remote clusters;
running multiple DataStax Enterprise nodes on a single host machine; and automating the movement of data
across different types of storage media.

Syntax

For the properties in each section, the section name starts in the first column with no leading
spaces, and each entry in the section must be indented by at least two spaces. For example, in the
node_health_options section, at least two spaces are required before
refresh_rate_ms and uptime_ramp_up_period_seconds.
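
As a sketch of that layout, the section name sits in the first column and its entries are
indented by two spaces. The refresh_rate_ms value shown here is illustrative and is not a
documented default.

```yaml
# Section name: no leading spaces.
node_health_options:
  # Entries: at least two leading spaces.
  refresh_rate_ms: 60000                # illustrative value
  uptime_ramp_up_period_seconds: 10800  # default listed on this page (3 hours)
```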

Authentication options

Authentication options enable multiple authentication schemes to be used on the same DataStax
Enterprise cluster. Additional configuration is required in the cassandra.yaml file. You must also
grant authorization to the configured schemes; see Configuring authorization and object permissions.

enabled

Controls whether the DSE Unified Authenticator authenticates users. The DSE Unified
Authenticator allows multiple authentication schemes to be used at the same time. The driver
selects which scheme to use during authentication. Set enabled: false to use the direct
equivalent of AllowAllAuthenticator in cassandra.yaml.

default_scheme

Selects which authentication scheme is used if the driver does not request a specific
scheme.

A list of schemes that can be automatically selected for
use by a driver. This option can use the same list of schemes as the
default_scheme.

scheme_permissions

Controls whether roles require permissions for specific
authentication schemes. These permissions can be granted only when the DSE Authorizer is used.

allow_digest_with_kerberos

Controls whether DIGEST-MD5 authentication is also allowed with Kerberos. The
DIGEST-MD5 mechanism is not directly associated with an authentication scheme, but is
used by Kerberos to pass credentials between nodes and jobs. In analytics clusters, set
to true when using Hadoop inter-node authentication with Hadoop and Spark jobs.

plain_text_without_ssl

Controls how the DseAuthenticator responds to plain text authentication requests over
unencrypted client connections. Set to one of the following values:

block - Block the request with an authentication error.

warn - Log a warning about the request but allow it to
continue.

allow - Allow the request without any warning.

transitional_mode

Allows the DseAuthenticator to operate in a temporary transitional mode during setup
of authentication in a cluster. Set to one of the following values:

disabled - Transitional mode is disabled.

permissive - Only a superuser is authenticated and logged in. All
other authentication attempts are logged in as the anonymous user.

normal - If credentials are passed, they are authenticated.

If the authentication is successful, the user is logged in.

If the authentication fails, the user is logged in as anonymous.

If no credentials are passed, the user is logged in as anonymous.

strict - If credentials are passed, they are authenticated.

If the authentication is successful, the user is logged in.

If the authentication fails, then an authentication error is returned.
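
The authentication options above might be combined as in the following sketch. The
authentication_options section name and the scheme value internal are assumptions based on the
usual dse.yaml layout and are not stated on this page; all values are illustrative.

```yaml
# Sketch only: check the section name and scheme values against the
# dse.yaml file shipped with your release.
authentication_options:
  enabled: true
  default_scheme: internal          # assumed scheme name
  scheme_permissions: false
  allow_digest_with_kerberos: true
  plain_text_without_ssl: warn      # block | warn | allow
  transitional_mode: disabled       # disabled | permissive | normal | strict
```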

The keytab file must contain the credentials for both of the fully
resolved principal names, which replace _HOST with the fully qualified domain name
(FQDN) of the host in the service_principal and
http_principal settings. The UNIX user running DSE must also have
read permissions on the keytab.

service_principal

The service_principal that the Cassandra and Hadoop
processes run under must use the form
dse_user/_HOST@REALM, where:

dse_user is one of the following:

Installer-Services and Package installations: cassandra

Installer-No Services and Tarball installations: the name of the UNIX user that starts
the service

_HOST is replaced by the result of a reverse DNS lookup of the broadcast address.

REALM is the name of your Kerberos realm. In the Kerberos
principal, REALM must be uppercase.

The service_principal must be consistent everywhere: in the dse.yaml file, in the keytab,
and in the cqlshrc file (where service_principal is separated into service/hostname).

http_principal

The http_principal is used by the Tomcat application container
to run DSE Search. The Tomcat web server uses the GSS-API mechanism (SPNEGO) to negotiate
the GSSAPI security mechanism (Kerberos). Set REALM to the name of
your Kerberos realm. In the Kerberos principal, REALM must be
uppercase.

qop

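A comma-delimited list of Quality of Protection (QOP) values that
clients and servers can use for each connection. The client can have multiple QOP
values, while the server can have only a single QOP value. The valid values
are:

auth - Authentication only.

auth-int - Authentication plus integrity protection of all transmitted data.

auth-conf - Authentication plus integrity protection and encryption of all
transmitted data.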

Encryption using auth-conf is
separate and independent of whether encryption is done using SSL. If both
auth-conf and SSL are enabled, the transmitted data is encrypted twice. DataStax
recommends choosing only one method and using it for both encryption and
authentication.
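
The Kerberos settings above could be laid out as in this sketch. The kerberos_options section
name, the keytab key, the keytab path, the HTTP/ prefix of http_principal, and the EXAMPLE.COM
realm are assumptions for illustration; substitute the values for your environment.

```yaml
kerberos_options:
  keytab: /path/to/dse.keytab                    # hypothetical path; must be readable by the DSE user
  service_principal: cassandra/_HOST@EXAMPLE.COM # dse_user/_HOST@REALM, realm in uppercase
  http_principal: HTTP/_HOST@EXAMPLE.COM         # assumed form; realm in uppercase
  qop: auth                                      # a single value on the server side
```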

search_dn

The username of the user that is used to search for
other users on the LDAP server. If not present, an anonymous bind is
used for the search.

search_password

The password of the search_dn
user.

use_ssl

Set to true to enable SSL connections
to the LDAP server. If set to true, you might need
to change server_port to the SSL port of the LDAP
server. Default: false

use_tls

Set to true to enable TLS
connections to the LDAP server. If set to true,
change the server_port to the TLS port of the LDAP
server. Default: false

truststore_path

The path to the truststore for SSL
certificates.

truststore_password

The password to access the truststore.

truststore_type

The type of truststore. Default: jks

user_search_base

The search base for your domain, used to look
up users. Set the ou and dc
elements for your LDAP domain. Typically this is set to
ou=users,dc=domain,dc=top_level_domain.
For example,
ou=users,dc=example,dc=com.

user_search_filter

The search filter for looking up user
names. Default: uid={0}

user_memberof_attribute

For use with DataStax Enterprise unified authentication role management. The attribute on the user entry that contains group
membership information.

group_search_type

For use with DataStax Enterprise unified authentication role management. Define how group membership is determined for a user. Choose
from one of the following values:

directory_search - Filters the results by doing a subtree search
of group_search_base to find groups that match the group_search_filter.
(Default)

memberof_search - Get groups from the memberof attribute of the
user. The directory server must have memberof support.

group_search_base

The unique distinguished name (DN) of the group on which
to base the group membership search.

group_search_filter

The filter to apply to the LDAP group search. Default:
(uniquemember={0})

group_name_attribute

The attribute in the group entry that holds the LDAP
group name. Default: cn

credentials_validity_in_ms

The duration, in milliseconds, of
the credential cache. Default: 0

search_validity_in_seconds

The duration, in seconds, of the
search cache. Default: 0

connection_pool

The configuration settings for the
connection pool for making LDAP requests.

max_active - The maximum number of active connections to
the LDAP server.
Default: 8

max_idle - The maximum number of idle connections in
the pool awaiting requests.
Default: 8
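
Put together, the LDAP settings above might look like the following sketch. The ldap_options
section name and the server_host key are assumptions based on the usual dse.yaml layout. The
host, distinguished names, passwords, and the memberof attribute value are hypothetical
placeholders.

```yaml
ldap_options:
  server_host: ldap.example.com            # assumed key; hypothetical host
  server_port: 636                         # conventional LDAPS port, because use_ssl is true
  search_dn: cn=lookup,dc=example,dc=com   # hypothetical DN; omit for an anonymous bind
  search_password: change_me
  use_ssl: true
  use_tls: false
  truststore_path: /path/to/truststore.jks
  truststore_password: change_me
  truststore_type: jks
  user_search_base: ou=users,dc=example,dc=com
  user_search_filter: 'uid={0}'
  user_memberof_attribute: memberof        # illustrative attribute name
  group_search_type: directory_search      # or memberof_search
  group_search_base: ou=groups,dc=example,dc=com
  group_search_filter: '(uniquemember={0})'
  group_name_attribute: cn
  credentials_validity_in_ms: 0
  search_validity_in_seconds: 0
  connection_pool:
    max_active: 8
    max_idle: 8
```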

System encryption settings

DataStax recommends using remote encryption keys from a KMIP key
server when using Transparent Data Encryption (TDE) features. Local key support is
provided when a KMIP key server is not available.

enabled

Set to true to locally encrypt system tables that might contain sensitive information,
including system.batchlog, system.paxos, hint files, and the Cassandra commit log. When you enable
system table encryption on a node with existing data, run nodetool upgradesstables -a on the
listed tables. Default: false

System traces, which might contain sensitive information, are not affected by
this setting. To encrypt traces, configure encryption on tables in the system_traces
keyspace. See configuring encryption per table using
TDE.

cipher_algorithm

Default: AES

secret_key_strength

Default: 128

chunk_length_kb

Default: 64

key_name

Optional. The name of the key file that is created to encrypt system tables. The file is
stored in system_key_directory/system/key_name.
Comment out this option when using key_provider: KmipKeyProviderFactory. Default:
system_table_keytab

key_provider

An alternate key provider for local encryption, for use when a KMIP host is the key
provider. Omit this field if you are not using KmipKeyProviderFactory. Default:
KmipKeyProviderFactory

kmip_host

Used when key_provider is KmipKeyProviderFactory. The kmip_groupname that is defined in the
kmip_hosts section of dse.yaml and that describes the KMIP key server or group of KMIP key
servers.
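
A minimal sketch of these settings with local keys follows. The system_info_encryption section
name is an assumption based on the usual dse.yaml layout; the values are the defaults listed
above, except enabled.

```yaml
system_info_encryption:
  enabled: true
  cipher_algorithm: AES
  secret_key_strength: 128
  chunk_length_kb: 64
  key_name: system_table_keytab
  # With a KMIP key server, comment out key_name and set instead:
  # key_provider: KmipKeyProviderFactory
  # kmip_host: kmip_groupname
```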

Encryption and system key settings

system_key_directory

The directory where global encryption keys, called system keys, are created and
stored. Keys that are used for SSTable encryption must be distributed to all nodes.
DataStax Enterprise must be able to read from and write to this directory, which must have 700
permissions and belong to the dse user. Default: /etc/dse/conf

The name of the system key for encrypting and decrypting stored passwords in the
configuration files. To encrypt keyfiles, use dsetool createsystemkey. When
config_encryption_active is true, a valid key with this name must exist in the
directory that is set by the system_key_directory option.
Default: system_key

KMIP encryption options

Options for KMIP encryption keys and communication between the DataStax Enterprise node and
the KMIP key server or key servers. These options enable DataStax Enterprise encryption features
to use encryption keys that are stored on a server that is not running DataStax
Enterprise.

kmip_groupname

A user-defined name for a group of options to configure a
KMIP server or servers, key settings, and certificates. Configure options for a
kmip_groupname section for each KMIP key server or group of KMIP
key servers. Using separate key server configuration settings allows use of different
key servers to encrypt table data, and eliminates the need to enter key server
configuration information in DDL statements and other configurations. Multiple KMIP
hosts are supported.

hosts

A comma-separated list of hosts[:port] for the KMIP
key server. There is no load balancing. In failover scenarios,
failover occurs in the same order that servers are listed. For
example: hosts: kmip1.yourdomain.com, kmip2.yourdomain.com

keystore_path

The path to a Java keystore that identifies
the DSE node to the KMIP key server. For example:
/path/to/keystore.jks

keystore_type

The type of keystore. Default: jks

keystore_password

The password to access the keystore.

truststore_path

The path to a Java truststore that
identifies the KMIP key server to the DataStax Enterprise node. For
example: /path/to/truststore.jks

truststore_type

The type of truststore.

truststore_password

The password to access the
truststore.

key_cache_millis

Milliseconds to locally cache the
encryption keys that are read from the KMIP hosts. The longer the
encryption keys are cached, the fewer requests are made to the KMIP
key server, but the longer it takes for changes, like revocation, to
propagate to the DataStax Enterprise node. DataStax Enterprise uses
concurrent encryption, so multiple threads fetch the secret key from
the KMIP key server at the same time. Default: 300000. DataStax
recommends using the default value.

timeout

Socket timeout in milliseconds. Default:
1000.
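
One KMIP group might be configured as in this sketch, which uses the kmip_hosts section and the
kmip_groupname placeholder named earlier on this page. The host names and paths are the examples
given above; the passwords are placeholders.

```yaml
kmip_hosts:
  kmip_groupname:                  # user-defined group name
    # Failover follows the order in which the hosts are listed.
    hosts: kmip1.yourdomain.com, kmip2.yourdomain.com
    keystore_path: /path/to/keystore.jks
    keystore_type: jks
    keystore_password: change_me
    truststore_path: /path/to/truststore.jks
    truststore_type: jks
    truststore_password: change_me
    key_cache_millis: 300000       # recommended default
    timeout: 1000                  # socket timeout, in milliseconds
```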

DSE In-Memory options

To use DSE In-Memory, choose one of these options to specify how
much system memory to use for all in-memory tables.

max_memory_to_lock_mb

Specify the maximum amount of memory, in megabytes.

max_memory_to_lock_fraction

Specify a fraction of the system memory.
The default value of 0.20 specifies to use up to 20% of system
memory.
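
For illustration, a configuration that uses the fraction setting and leaves the absolute limit
commented out could look like this. The 10240 value is an arbitrary example, not a documented
default.

```yaml
# Set one of the two options; leave the other commented out.
max_memory_to_lock_fraction: 0.20   # up to 20% of system memory
# max_memory_to_lock_mb: 10240      # illustrative limit, in MB
```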

uptime_ramp_up_period_seconds

The amount of continuous uptime required for the node's uptime score to advance the
node health score from 0 to 1 (full health),
assuming there are no recent dropped mutations. The health score is a composite score
based on dropped mutations and uptime. Tip: If a node is
repairing after a period of downtime, you might want to increase the uptime period to
the expected repair time. Default: 10800 (3 hours)

dropped_mutation_window_minutes

The historic time window over which the rate of dropped mutations affects the node
health score. Default: 30

Scheduler settings for Solr indexes

To ensure that records with TTLs are purged from search indexes when they expire, the
search indexes are periodically checked for expired documents. The
ttl_index_rebuild_options settings control the schedulers in charge
of querying for and removing expired records, and the execution of the checks.

fix_rate_period

How often, in seconds, to check for expired data. Default: 300

initial_delay

The delay, in seconds, before the first TTL check, which speeds up start-up. Default: 20

max_docs_per_batch

Sets the maximum number of documents to check and delete per batch by the TTL rebuild
thread. Default: 4096

thread_pool_size

To manage system resource consumption and prevent many Solr cores
from executing simultaneous TTL deletes, define the maximum number of cores that can
execute TTL cleanup concurrently. Default: 1
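
With the defaults listed above, the ttl_index_rebuild_options section would look roughly like
this sketch:

```yaml
ttl_index_rebuild_options:
  fix_rate_period: 300       # seconds between checks for expired data
  initial_delay: 20          # seconds before the first check
  max_docs_per_batch: 4096   # documents checked and deleted per batch
  thread_pool_size: 1        # cores that can run TTL cleanup concurrently
```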

The TCP listen port. For releases earlier than DataStax Enterprise 5.0, this setting
was mandatory to use the netty transport. It is now used only during the upgrade to 5.0.
After all nodes are running 5.0, requests that are coordinated by this node no
longer contact other nodes on this port. For 5.0 and later, requests use Inter-node messaging
options. Default: 8984

netty_server_acceptor_threads

The number of server acceptor threads. Default: number_of_available_processors

netty_server_worker_threads

The number of server worker threads. Default: number_of_available_processors * 8

netty_client_worker_threads

The number of client worker threads. Default: number_of_available_processors * 8

netty_client_max_connections

The maximum number of client connections. Default: 100

netty_client_request_timeout

The client request timeout is the maximum cumulative time (in milliseconds) that a
distributed Solr request will wait idly for shard responses. Default: 60000

netty_max_frame_length_in_mb

The maximum length of a message frame, in MB. Default: 256

Deprecated. When type: http, define the following http transport settings
to configure http inter-node communication between DSE Search nodes. To avoid blocking
operations, DataStax strongly recommends changing these settings to a finite value. These
settings are valid across Solr cores:

Solr indexing settings

DSE Search implements multi-threaded indexing to improve performance on multi-core
machines. All index updates are internally dispatched to a per-core indexing thread pool and
executed asynchronously, which allows for greater concurrency and parallelism. However,
index requests can return a response before the indexing operation is
executed.

max_solr_concurrency_per_core

Configures the maximum number of concurrent asynchronous indexing threads per Solr
core. If set to 1, DSE Search uses synchronous indexing behavior in a single
thread. To achieve optimal performance, set this value to the number of available CPU
cores divided by the number of Solr cores. For example, with 16 CPU cores and 4 Solr
cores, the suggested value is 4. To prevent writes from overwhelming reads, reduce this
value and adjust parallelDeleteTasks in solrconfig.xml. Also see Configuring and tuning
indexing performance and Increasing indexing throughput.
Default: number_of_available_CPU_cores

Note: Dynamically switching the Solr concurrency level
to 1 is not allowed.

enable_back_pressure_adaptive_nrt_commit

Allows the back pressure system to adapt the maximum auto soft commit time (defined per core in
the solrconfig.xml file) to the actual load. This setting is respected only for NRT (near
real time) cores. When a Solr core uses live
indexing with RT (real time) enabled, adaptive commits are disabled regardless
of this property value. Default: true

back_pressure_threshold_per_core

The total number of queued asynchronous indexing requests per Solr core. When this
number is exceeded, back pressure prevents excessive resource consumption by throttling
new incoming requests. DataStax recommends using 1000 * max_solr_concurrency_per_core.
Default: 2000

flush_max_time_per_core

The maximum time, in minutes, to wait for the flushing of asynchronous index
updates, which occurs at Solr commit time or at Cassandra flush time. Expert level
knowledge is required to change this value. Always set the value reasonably high to
ensure that flushing completes successfully. If the configured value is exceeded,
index updates are only partially committed, and the Cassandra commit log is not
truncated to ensure data durability.

Note: When a timeout occurs, it usually means
this node is being overloaded and cannot flush in a timely manner. Live indexing
increases the time to flush asynchronous index updates.

Default: 5

load_max_time_per_core

The maximum time, in minutes, to wait for each Solr core to load on startup or during
create/reload operations. This advanced option should be changed only if
exceptions happen during core loading. Default: 1 (if not specified)
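
The sizing guidance in this section can be worked through for the example host with 16 CPU
cores and 4 Solr cores. The sketch below assumes these options are top-level entries in
dse.yaml; the two time limits are the defaults listed above.

```yaml
# 16 CPU cores / 4 Solr cores = 4 indexing threads per Solr core
max_solr_concurrency_per_core: 4
enable_back_pressure_adaptive_nrt_commit: true
# Recommended threshold: 1000 * max_solr_concurrency_per_core = 1000 * 4
back_pressure_threshold_per_core: 4000
flush_max_time_per_core: 5   # minutes
load_max_time_per_core: 1    # minutes
```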

The directory to store index data. By default, the Solr
data is saved in cassandra_data_dir/solr.data,
or as specified by the dse.solr.data.dir system
property.

solr_field_cache_enabled

The Lucene field cache is deprecated. Instead, for fields that are sorted, faceted, or
grouped by, set docValues="true" on the field in the schema.xml
file. Then RELOAD the core and reindex. The default value is false.
To override false, set useFieldCache=true in the Solr request.

Solr CQL query options

Available option for CQL Solr
queries.

cql_solr_query_row_timeout: 10000

cql_solr_query_row_timeout

The maximum time in milliseconds to wait for each row to be read from Cassandra during
CQL Solr queries. Default: 10000 milliseconds (10 seconds)

Global Performance Service options

Available options to configure the thread pool that is used by most plug-ins. A dropped
task warning is issued when the performance service requests more tasks than
performance_max_threads + performance_queue_capacity. When a task is dropped, collected
statistics might not be current. Tuning options include disabling or reconfiguring some
services, or increasing the queue size.

performance_queue_capacity

The number of tasks that can be queued in the backlog when all performance_max_threads
are busy. The minimum value is 0.

Default: commented out (32000)
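
As a worked example of the dropped task threshold, consider the sketch below. The
performance_max_threads value is illustrative and is not taken from this page; both options
are assumed to be top-level entries in dse.yaml.

```yaml
performance_max_threads: 32         # illustrative value
performance_queue_capacity: 32000
# With these values, a dropped task warning is issued when the performance
# service requests more than 32 + 32000 = 32032 tasks.
```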

Performance Service options

These settings are used by the Performance Service to configure collection of performance
metrics on Cassandra nodes. Performance metrics are stored in the dse_perf keyspace and can
be queried with CQL using any CQL-based utility, such as cqlsh, DataStax DevCenter, or any
application using a Cassandra CQL driver.

initial_spark_worker_resources

DataStax Enterprise can control the memory and cores offered by particular Spark
Workers in semi-automatic fashion. Specify the fraction of system resources that are
made available to the Spark Worker.

The lowest value that you can assign to Spark Worker memory is 64 MB. The lowest
value that you can assign to Spark Worker cores is 1 core. If the calculated results are lower,
no exception is thrown and the values are automatically raised to these minimums. The range of the
initial_spark_worker_resources value is 0.01 to 1. If the value is
not specified, the default value 0.7 is used.

This mechanism is used by default to set the Spark Worker memory and cores.
To override the default, uncomment and edit one or both SPARK_WORKER_MEMORY
and SPARK_WORKER_CORES options in the
spark-env.sh file.

spark_shared_secret_bit_length

The length of a shared secret used to authenticate Spark components and encrypt the
connections between them. This value is not the strength of the cipher for encrypting
connections. Default: 256

Settings to configure the amount of resources that are designated for Hadoop
tasks.

task_tracker_cores

The maximum number of slots that can be allocated by the Task Tracker for running user
tasks. By default, this value is calculated automatically. Specify the maximum total
number of mappers and reducers that can be simultaneously run by the Task Tracker. The
individual number of mappers or reducers will never be greater than the number of
physical cores minus 1. Default: 2

task_tracker_memory

The maximum amount of memory that can be allocated by the Task Tracker for running
user tasks. By default, this value is calculated automatically. Specify the total memory
to split among particular slots, including the 128m per single slot Java overhead. The
maximum heap size of a single mapper or reducer will be no less than the hard-coded
minimum 256m. Use a suffix to indicate the memory unit: kilobyte (k), megabyte (m),
gigabyte (g), and so on. Default: 4g

spark_daemon_readiness_assertion_interval

The interval, in milliseconds, between successive checks by the Spark plugin for whether the
Spark Master and Worker are ready to start. Default: 1000

spark_encryption_options

Spark encryption can be enabled for Spark client-to-Spark cluster and Spark internode
communication. Spark encryption applies only to these communication protocols in Spark:

Control messages via Akka

File sharing with HTTP or HTTPS

Spark encryption does not apply to RDD data exchange or to the Spark web UI. Encryption
is used to send all configuration settings and all files that are required by Spark
applications, including passwords and tokens. Spark encryption requires truststores to
be defined.

keystore

The keystore for Spark encryption keys. The file path is relative to
the base Spark configuration directory that is defined by the SPARK_CONF_DIR
environment variable. The default Spark configuration directory is
resources/spark/conf. Default: .keystore

keystore_password

The password to access the key store. Default: cassandra

key_password

Default: cassandra

truststore

The truststore for Spark encryption keys. The file path is relative to
the base Spark configuration directory that is defined by the SPARK_CONF_DIR
environment variable. The default Spark configuration directory is
resources/spark/conf.
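
A sketch of the spark_encryption_options section follows. The enabled key and the .truststore
file name are assumptions for illustration. The keystore key is written by analogy with the
truststore key, and the passwords are the defaults listed above.

```yaml
spark_encryption_options:
  enabled: true                 # assumed key
  keystore: .keystore           # relative to SPARK_CONF_DIR
  keystore_password: cassandra
  key_password: cassandra
  truststore: .truststore       # illustrative file name, relative to SPARK_CONF_DIR
```

Replace the default passwords before using a configuration like this outside a test environment.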

The keyspace where the DSEFS metadata is stored. You can optionally configure multiple DSEFS file systems
within a single datacenter by specifying different keyspace names for each cluster.
Default: dsefs

work_dir

The local directory for storing the local node metadata, including the node
identifier. The volume of data stored in this directory is nominal and does not require
configuration for throughput, latency, or capacity. This directory must not be shared by
DSEFS nodes.

public_port

The public port on which DSEFS listens for clients. DataStax
recommends that all nodes in the cluster have the same value. Firewalls must open this port to trusted clients. The service
on this port is bound to the RPC address. Default: 5598

private_port

The private port for DSEFS inter-node communication. Do not open this
port in firewalls; this private port must not be visible from outside of the
cluster. Default: 5599

data_directories

One or more data locations where the DSEFS data is stored.

- dir

Mandatory attribute to identify the set of directories. DataStax recommends
placing these data directories on physical devices other than the devices that
are used for Cassandra. Using multiple directories on JBOD improves performance and
capacity. Default: /var/lib/dsefs

storage_weight

The weighting factor for this location specifies how much data to
place in this directory, relative to other directories in the cluster. This soft
constraint determines how DSEFS distributes the data. For example, a directory with a
value of 3.0 receives about three times more data than a directory with a value of
1.0. Default: 1.0

min_free_space

The space, in bytes, that is reserved and not used for storing file data
blocks. You can use a unit of measure suffix to specify other size units. For example:
terabyte (1tb), gigabyte (10g), and megabyte (5000mb). Default: 5368709120 (5 GiB)
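
The directory settings above can be sketched for a node with two data locations of unequal
weight. The dsefs_options section name and the enabled key are assumptions based on the usual
dse.yaml layout, and all paths are hypothetical.

```yaml
dsefs_options:
  enabled: true                   # assumed key
  work_dir: /var/lib/dsefs        # illustrative path; must not be shared by DSEFS nodes
  public_port: 5598               # open to trusted clients
  private_port: 5599              # keep closed to traffic from outside the cluster
  data_directories:
    - dir: /mnt/disk1/dsefs       # hypothetical path
      storage_weight: 3.0         # receives about three times the data of a 1.0 directory
      min_free_space: 5368709120  # bytes
    - dir: /mnt/disk2/dsefs       # hypothetical path
      storage_weight: 1.0
      min_free_space: 5368709120
```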

Advanced properties for DSEFS

service_startup_timeout_ms

Wait time, in milliseconds, before the DSEFS server times out while waiting for
services to bootstrap. Default: 30000

service_close_timeout_ms

Wait time, in milliseconds, before the DSEFS server times out while waiting for
services to close. Default: 600000

server_close_timeout_ms

Wait time, in milliseconds, that the DSEFS server waits during shutdown before closing
all pending connections.

gossip options

Options to configure DSEFS gossip rounds.

round_delay_ms

The delay, in milliseconds, between gossip rounds. Default: 5000

startup_delay_ms

The delay time, in milliseconds, between registering the location and reading back all
other locations from Cassandra. Default: 5000

shutdown_delay_ms

The delay time, in milliseconds, between announcing shutdown and shutting down the
node. Default: 30000

rest_options

Options to configure DSEFS REST timeouts.

request_timeout_ms

The time, in milliseconds, that the client waits for a response that corresponds to a
given request. Default: 330000

The time, in milliseconds, that the client waits to establish a new connection.
Default: 55000

client_close_timeout_ms

The time, in milliseconds, that the client waits for pending transfer to complete
before closing a connection. Default: 60000

server_request_timeout_ms

The time, in milliseconds, to wait for the server REST call to complete. Default:
300000

idle_connection_timeout_ms

The time, in milliseconds, to wait before closing an idle connection. Commenting out
or a value of 0 disables the idle connection timeout. Default: commented out
(0=disabled)

transaction_options

Options to configure DSEFS transaction times.

transaction_timeout_ms

Transaction run time, in milliseconds, before the transaction is considered for
timeout and rollback. Default: 3000

conflict_retry_delay_ms

Wait time, in milliseconds, before retrying a transaction that was ended due to a
conflict. Default: 200

execution_retry_delay_ms

Wait time, in milliseconds, before retrying a failed transaction payload execution.
Default: 1000

execution_retry_count

The number of payload execution retries before signaling the error to the application.
Default: 3
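
The advanced DSEFS properties might be grouped as in this sketch, which uses the defaults
listed above. Nesting these groups under dsefs_options and the gossip_options section name are
assumptions and are not stated on this page.

```yaml
dsefs_options:
  service_startup_timeout_ms: 30000
  service_close_timeout_ms: 600000
  gossip_options:                     # assumed section name
    round_delay_ms: 5000
    startup_delay_ms: 5000
    shutdown_delay_ms: 30000
  rest_options:
    request_timeout_ms: 330000
    client_close_timeout_ms: 60000
    server_request_timeout_ms: 300000
    # idle_connection_timeout_ms: 0   # commented out or 0 disables the idle timeout
  transaction_options:
    transaction_timeout_ms: 3000
    conflict_retry_delay_ms: 200
    execution_retry_delay_ms: 1000
    execution_retry_count: 3
```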

sync - A query is not executed until the audit event is
successfully written.

async - Audit events are queued for writing to the audit table,
but are not necessarily logged before the query executes. A pool of writer threads
consumes the audit events from the queue, and writes them to the audit table in
batch queries. While this substantially improves performance under load, if there is
a failure between when a query is executed, and its audit event is written to the
table, the audit table might be missing entries for queries that were executed.

batch_size

Available only when mode: async.

Must be greater than 0. The maximum number of
events the writer dequeues before writing them out to the table. If warnings in the
logs reveal that batches are too large, decrease this value or increase the value of
batch_size_warn_threshold_in_kb in
cassandra.yaml. Default: 50

flush_time

Available only when mode: async.

The maximum time, in milliseconds, that an event can wait in the queue before
a writer removes it and writes it out. This flush
time prevents events from waiting too long before being written to the table when
query volume is low. Default: 500

num_writers

Available only when mode: async.

The number of worker threads asynchronously logging
events to the CassandraAuditWriter. Default: 10

queue_size

The size of the queue feeding the asynchronous audit log writer threads. When there
are more events being produced than the writers can write out, the queue fills up, and
newer queries are blocked until there is space on the queue. If a value of 0 is used,
the queue size is unbounded, which can lead to resource exhaustion under heavy query
load. Default: 10000

write_consistency

The consistency level that is used to write audit events. Default: QUORUM

dropped_event_log

The directory to store the log file that reports dropped events. Default:
/var/log/cassandra/dropped_audit_events.log
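
An asynchronous audit writer using the defaults above could be sketched as follows. The
cassandra_audit_writer_options section name and its nesting under audit_logging_options are
assumptions based on the usual dse.yaml layout.

```yaml
audit_logging_options:
  cassandra_audit_writer_options:
    mode: async              # sync | async
    batch_size: 50           # async only; must be greater than 0
    flush_time: 500          # async only; milliseconds
    num_writers: 10          # async only
    queue_size: 10000        # 0 means unbounded
    write_consistency: QUORUM
    dropped_event_log: /var/log/cassandra/dropped_audit_events.log
```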

DSE Tiered Storage options

Options to define one or more disk configurations for DSE Tiered Storage. Specify each
disk configuration as a set of unnamed tiers, where each tier is a collection of paths. List the
tiers in priority order, with the fastest storage media in the top tier. With heterogeneous storage
configurations across the cluster, specify each disk configuration with
config_name:config_settings, and then reference that configuration in CREATE or ALTER TABLE
statements.

Options to configure the smart movement of data across different types of storage
media so that data is matched to the most suitable drive type, according to the
performance and cost characteristics it requires.

strategy1

The first disk configuration strategy. Create a strategy2, strategy3, and so on. In
this example, strategy1 is the configurable name of the tiered
storage configuration strategy.

tiers

The section that defines a storage tier with the paths and file paths
that define the priority order.

- paths

The section of file paths that define the data directories for this
tier of the disk configuration. The tier that is listed first is the top tier that
typically accesses the fastest storage media. These paths are used only to store data
that is configured to use tiered storage. These paths are independent of any settings
in the cassandra.yaml file.

- /filepath

Specific file paths to define the data directories for this tier of the disk
configuration.
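
A two-tier configuration named strategy1 might look like the sketch below. The
tiered_storage_options section name is an assumption based on the usual dse.yaml layout, and
the mount points are hypothetical.

```yaml
tiered_storage_options:
  strategy1:
    tiers:
      - paths:             # top tier: fastest storage media
          - /mnt/ssd1      # hypothetical path
          - /mnt/ssd2      # hypothetical path
      - paths:             # second tier: slower storage media
          - /mnt/hdd1      # hypothetical path
```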

Set enabled: true on an edge node to collect data in the replication
log. Default: false

conf_driver_password_encryption_enabled

Enable or disable encryption of driver passwords. When enabled, the stored driver
password is expected to be encrypted with the system
key. After you create the system key, you must copy the same system key to
every node in the cluster.

security_base_path

The base path to prepend to paths in the Advanced Replication configuration locations,
including the locations of the SSL keystore, SSL truststore, and so on. Default:
/base/path/to/advrep/security/files/

Inter-node messaging options

Configuration for the internal messaging service used by several components of DataStax
Enterprise. For 5.0 and later, all internode messaging requests use this service.

The mandatory port for the inter-node messaging service. Default: 8609

frame_length_in_mb

The maximum message frame length, in MB. Default: 256

server_acceptor_threads

The number of server acceptor threads. Default: the number of available processors.

server_worker_threads

The number of server worker threads. Default: the number of available processors * 8.

client_max_connections

The maximum number of client connections. Default: 100

client_worker_threads

The number of client worker threads. Default: the number of available processors * 8.

handshake_timeout_seconds

Timeout for communication handshake process. Default: 10
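
With the defaults above, the messaging section could be sketched as follows. The
internode_messaging_options section name and the port key are assumptions based on the usual
dse.yaml layout. The thread counts are left commented out so that the processor-based defaults
apply.

```yaml
internode_messaging_options:
  port: 8609
  frame_length_in_mb: 256
  # server_acceptor_threads:   default: number of available processors
  # server_worker_threads:     default: number of available processors * 8
  client_max_connections: 100
  # client_worker_threads:     default: number of available processors * 8
  handshake_timeout_seconds: 10
```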

DSE Multi-Instance server_id

server_id

In DSE Multi-Instance/etc/dse-nodeId/dse.yaml files, the
server_id option is generated to uniquely identify the physical
server on which multiple instances are running. The server_id
default value is the media access control address (MAC address) of the physical
server. You can change server_id when the MAC address is not
unique, such as a virtualized server where the host’s physical MAC is cloned.
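
To decide whether server_id needs to be changed, it can help to compare the MAC
addresses reported by the physical servers. The following is a minimal sketch under
stated assumptions: the helper names format_mac and non_unique_macs are hypothetical,
the colon-separated format is only for comparison and is not necessarily the format DSE
writes to server_id, and uuid.getnode() can return a random value when no hardware
address can be determined.

```python
import uuid
from collections import Counter


def format_mac(node: int) -> str:
    """Format a 48-bit integer as a colon-separated, lowercase MAC address string."""
    if not 0 <= node < 1 << 48:
        raise ValueError("expected a 48-bit value")
    return ":".join(f"{(node >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))


def non_unique_macs(mac_by_host: dict) -> set:
    """Return the MAC addresses that are reported by more than one host."""
    counts = Counter(mac.lower() for mac in mac_by_host.values())
    return {mac for mac, seen in counts.items() if seen > 1}


if __name__ == "__main__":
    # MAC address of the local machine as seen by Python.
    print(format_mac(uuid.getnode()))
```

Any host whose MAC address appears in the result of non_unique_macs is a candidate for
an explicitly assigned server_id.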

DSE Graph system-level options

These graph options are
system-level configuration options and options that are shared between graph instances. Add an
option if it is not present in the provided dse.yaml file.

adjacency_cache_clean_rate

The number of stale rows per second to clean from each graph's adjacency cache.
Default: 1024.

adjacency_cache_max_entry_size_in_mb

The maximum entry size in each graph's adjacency cache. When set to zero, the default
is calculated based on the cache size and the number of CPUs. Entries that exceed this
size are quietly dropped by the cache without producing an explicit error or log
message. Default: 0.

adjacency_cache_size_in_mb

The amount of RAM to allocate to each graph's adjacency (edge and property) cache.
Default: 128.

analytic_evaluation_timeout_in_minutes

Maximum time to wait for an analytic (Spark) traversal to evaluate. Default: 10080 (7
days).

Option names and values expressed in ISO 8601
format used in earlier DSE 5.0 releases are still valid. The ISO 8601 format is
deprecated.

gremlin_server_enabled

Enables or disables Gremlin Server. Default: true.

index_cache_clean_rate

The number of stale entries per second to clean from the index cache. Default:
1024.

index_cache_max_entry_size_in_mb

The maximum entry size in the index cache. When set to zero, the default is calculated
based on the cache size and the number of CPUs. Entries that exceed this size are
quietly dropped by the cache without producing an explicit error or log message.
Value: integer. Default: 0.

index_cache_size_in_mb

The amount of RAM to allocate to the index cache. Default: 128.

max_query_queue

The maximum number of CQL queries that can be queued as a result of Gremlin requests.
Incoming queries are rejected if the queue size exceeds this setting. Default:
10000.

max_query_threads

The maximum number of threads to use for queries to the database. When this option is
not set, the default is:

Maximum time to wait for a real-time traversal to evaluate. Default: 30 seconds.

Option names and values expressed in ISO 8601
format used in earlier DSE 5.0 releases are still valid. The ISO 8601 format is
deprecated.

schema_agreement_timeout_in_ms

Maximum time to wait for Cassandra to agree on schema versions before timing out.
Default: 10000

Option names and values expressed in ISO 8601
format used in earlier DSE 5.0 releases are still valid. The ISO 8601 format is
deprecated.

schema_mode

Controls the way that the schemas are handled. Valid values:

Production = Schema must be created before data insertion. Schema cannot be
changed after data is inserted. Full graph scans are disallowed unless the option
graph.allow_scan is changed to TRUE.

Development = No schema is required to write data to a graph. Schema can be
changed after data is inserted. Full graph scans are allowed unless the option
graph.allow_scan is changed to FALSE.

system_evaluation_timeout_in_seconds

Maximum time to wait for a graph-system request, such as creating a new graph, to
execute. Default: 180 (3 minutes).

Option names and values expressed in ISO 8601
format used in earlier DSE 5.0 releases are still valid. The ISO 8601 format is
deprecated.

window_size

The number of samples to keep when aggregating log events. Only a small subset of
graph log events use this system. Modifying this setting is rarely necessary or helpful.
Default: 100000.

max_query_params

The maximum number of parameters that can be passed on a
graph query request for TinkerPop drivers and drivers using Cassandra native protocol.
Passing very large numbers of parameters on requests is an anti-pattern, because the
script evaluation time increases proportionally. DataStax recommends reducing the number
of parameters to speed up script compilation times. Before you increase this value,
consider alternate methods for parameterizing scripts, like passing a single map. If the
graph query request requires many arguments, pass a list. Default: 256
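
The recommendation to pass a single map can be illustrated without a driver. The
following is a minimal sketch: the helper names check_param_count and as_single_map and
the binding name params are hypothetical, and the sketch assumes that a map passed as
one binding counts as one parameter, as the description above suggests.

```python
def check_param_count(bindings: dict, max_query_params: int = 256) -> bool:
    """Return True when the number of top-level bindings is within the limit."""
    return len(bindings) <= max_query_params


def as_single_map(values: dict, name: str = "params") -> dict:
    """Wrap many values in one map so the request carries a single parameter."""
    return {name: dict(values)}


# 300 separate parameters (p0 through p299) exceed the default limit of 256.
many = {f"p{i}": i for i in range(300)}

# The same values wrapped in one map count as a single top-level parameter.
wrapped = as_single_map(many)

if __name__ == "__main__":
    print(check_param_count(many), check_param_count(wrapped))
```

With this approach the script reads its values from the map instead of from separate
bindings, so the parameter count stays constant as the number of values grows.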

block_renew

The graph standard vertex ID allocator operates on blocks of contiguous IDs. Each
block is allocated using a database lightweight transaction that requires coordination
latency. To hide the cost of allocating a standard ID block, the allocator begins
asynchronously buffering a replacement block whenever the current block is nearly empty.
The block_renew parameter defines "nearly empty" as a floating point number between 0
and 1: the fraction of a standard ID block that can be used before graph starts
asynchronously allocating its replacement. This setting has no effect on custom IDs.
Default: 0.8.

community_reuse

For graphs using standard vertex IDs, if a transaction creates multiple vertices, the
allocator attempts to assign vertex IDs that colocate vertices on the same database
replicas. If an especially large vertex cohort is created, the allocator chunks the
vertex creation and assigns a random target location to avoid load hotspotting. This
setting controls the vertex chunk size and has no effect on custom IDs. Default:
28.

consistency_mode

Must be set to DC_LOCAL or GLOBAL.

DC_LOCAL - The node uses LOCAL_QUORUM when allocating an ID for a graph vertex.
The datacenter_id option must be correctly configured on every node in the cluster.

GLOBAL - (Default) The node uses QUORUM when allocating an ID for a graph vertex.
The datacenter_id option is ignored.

This option must have the same value on every node in the cluster. Its value can
only be changed when the entire cluster is stopped. This setting has no effect on custom
IDs.

datacenter_id

Applies only when consistency_mode is DC_LOCAL. Set to an arbitrary value between 1
and 127, inclusive. This setting has no effect on custom IDs. Default: no explicit
default value.

Warning: Each datacenter in the cluster must have a unique datacenter_id. Violating
this constraint will corrupt the graph database without warning.
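
Because a duplicate datacenter_id corrupts the graph silently, it can be worth checking
the planned values before starting the cluster. The following is a minimal sketch of
such a check, not a DSE tool: the function name validate_id_settings and the mapping of
datacenter names to IDs are hypothetical inputs that you would assemble yourself.

```python
def validate_id_settings(consistency_mode: str, datacenter_ids: dict) -> list:
    """Return a list of problems; an empty list means no problem was found."""
    if consistency_mode not in ("DC_LOCAL", "GLOBAL"):
        return [f"consistency_mode {consistency_mode!r} is not DC_LOCAL or GLOBAL"]
    if consistency_mode == "GLOBAL":
        # datacenter_id is ignored in GLOBAL mode, so there is nothing to check.
        return []

    problems = []
    seen = {}  # datacenter_id -> name of the first datacenter that uses it
    for dc, dc_id in sorted(datacenter_ids.items()):
        if not isinstance(dc_id, int) or not 1 <= dc_id <= 127:
            problems.append(f"{dc}: datacenter_id {dc_id!r} is outside 1..127")
        elif dc_id in seen:
            problems.append(
                f"{dc}: datacenter_id {dc_id} is already used by {seen[dc_id]}"
            )
        else:
            seen[dc_id] = dc
    return problems


if __name__ == "__main__":
    print(validate_id_settings("DC_LOCAL", {"dc1": 1, "dc2": 1}))
```

The check covers only the two documented constraints (the 1 to 127 range and uniqueness
per datacenter); it does not verify that every node in the cluster uses the same
consistency_mode.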

id_hash_modulus

An integer between 1 and 2^24 (both inclusive) that affects maximum ID capacity and
the maximum storage space used by ID allocations. Lower values reduce the storage space
consumed and the lightweight transaction overhead imposed at startup. Lower values also
reduce the total number of IDs that can be allocated over the life of a graph, because
this parameter is proportional to the allocatable ID space. However, the proportion
coefficient is Long.MAX_VALUE (2^63-1), so ID headroom should be sufficient, practically
speaking, even if this is set to 1. This setting has no effect on custom IDs. Default:
20.

member_block_size

The graph standard vertex ID allocator claims uniformly-sized blocks of contiguous IDs
using lightweight transactions on the database. This setting controls the size of each
block. This setting has no effect on custom IDs. Default: 512.
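
The relationship between member_block_size and block_renew can be shown with a small
calculation. The following is a minimal sketch under stated assumptions: the function
name renew_trigger is hypothetical, it assumes that block_renew applies to blocks of
member_block_size IDs, and rounding down is an assumption because the exact rounding
that the allocator applies is not documented here.

```python
import math


def renew_trigger(member_block_size: int = 512, block_renew: float = 0.8) -> int:
    """Estimate how many IDs of a block are used before a replacement is requested."""
    if member_block_size < 1:
        raise ValueError("member_block_size must be at least 1")
    if not 0 <= block_renew <= 1:
        raise ValueError("block_renew must be between 0 and 1")
    # Rounding down is an assumption made for this illustration.
    return math.floor(member_block_size * block_renew)


if __name__ == "__main__":
    print(renew_trigger())
```

With the default values, roughly 409 of the 512 IDs in a block are used before the
allocator starts buffering the next block, which leaves about 100 IDs to absorb the
latency of the lightweight transaction.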

DSE Graph messaging options

Graph messages must be acknowledged within this interval, or else the message is
assumed dropped/failed. Graph retries the message or fails the responsible request if
the retry limit is exceeded. Default: 5000

Option names and values expressed in ISO 8601
format used in earlier DSE 5.0 releases are still valid. The ISO 8601 format is
deprecated.

Options to configure all registered event observers identified by their name.

observer_name

Replace observer_name with a string that identifies the event
observer. The string must begin with a lowercase letter and can be composed of
lowercase letters, numbers, and underscores.
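
The naming rule for observer_name (a lowercase first letter followed by lowercase
letters, numbers, and underscores) can be checked with a regular expression. The
following is a minimal sketch: the function name is_valid_observer_name is
hypothetical, and the sketch assumes ASCII characters only.

```python
import re

# One lowercase letter, then any number of lowercase letters, digits, or underscores.
_OBSERVER_NAME = re.compile(r"[a-z][a-z0-9_]*")


def is_valid_observer_name(name: str) -> bool:
    """Return True when `name` follows the documented observer naming rule."""
    return _OBSERVER_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("slow_requests_1", "Slow", "1abc", "my-observer"):
        print(candidate, is_valid_observer_name(candidate))
```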

*.black_types

The names of event types that are ignored. All event types but those given are
observed. Value: YAML-formatted list of strings. Default: (empty).

observed_graphs

The names of the graphs for which events are observed. Value: YAML-formatted list of
strings. Default: (empty).

*.slow_tx_graphs

The names of the graphs for which slow transactions are monitored. Default:
(empty).

*.slow_threshold_in_ms

Threshold at which slow queries get reported. Default: 300000

Option names and values expressed in ISO 8601
format used in earlier DSE 5.0 releases are still valid. The ISO 8601 format is
deprecated.

*.type

The type of the event observer. Must be one of the following values: slf4j,
slow_request. Default: slf4j.

*.white_types

The names of event types that should be observed. Only those event types are observed
and all others ignored. Value: YAML-formatted list of strings. Default: (empty).
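
The white_types and black_types descriptions can be read as a simple filter. The
following is a minimal sketch of that reading, not the DSE implementation: the function
name is_observed is hypothetical, an empty list is treated as "not set", and giving
white_types precedence when both lists are set is an assumption because the behavior
for that combination is not described here.

```python
def is_observed(event_type: str, white_types=(), black_types=()) -> bool:
    """Decide whether an observer sees `event_type` under the given type lists."""
    if white_types:
        # Only the listed event types are observed; all others are ignored.
        return event_type in white_types
    # All event types except the listed ones are observed.
    return event_type not in black_types


if __name__ == "__main__":
    print(is_observed("type_a", white_types=["type_a"]))
    print(is_observed("type_b", black_types=["type_b"]))
```

The event type names type_a and type_b are placeholders; substitute the event type
names that your DSE version reports.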

DSE Graph shared data options

shared_data:
    refresh_interval_in_ms: 60000

shared_data

Options for shared data in DSE Graph.

refresh_interval_in_ms

The interval, in milliseconds, at which the graph schema is reread from the Cassandra
tables. Schema is also updated immediately when schema changes occur, so this
parameter is a failsafe that polls for schema changes periodically. Default: 60000

Option names and values expressed in ISO 8601
format used in earlier DSE 5.0 releases are still valid. The ISO 8601 format is
deprecated.

The port value identifies the available communications port for
Gremlin Server. Default: 8182

threadPoolWorker

The number of worker threads that handle requests and responses on the Gremlin
Server channel, including routing requests to the right server operations, handling
scheduled jobs on the server, and writing serialized responses back to the client.
Default: 2

gremlinPool

The number of Gremlin threads available to execute actual scripts in a ScriptEngine.
This pool represents the workers available to handle blocking operations in Gremlin
Server. Default: 8

scriptEngines

Section to configure Gremlin Server scripts.

gremlin-groovy

Section for gremlin-groovy scripts.

sandbox_enabled

The sandbox is enabled by default. To disable the gremlin-groovy sandbox entirely, set
to false.

sandbox_rules

Section for sandbox rules.

whitelist_packages

List of packages, one package per line, to whitelist.

- package.name

Retain the hyphen before the fully qualified package name.

whitelist_types

List of types, one type per line, to whitelist.

- fully.qualified.type.name

Retain the hyphen before the fully qualified type name.

whitelist_supers

List of super classes, one class per line, to whitelist. Retain the hyphen before
the fully qualified class name.

- fully.qualified.class.name

Retain the hyphen before the fully qualified class name.

blacklist_packages

List of packages, one package per line, to blacklist.

- package.name

Retain the hyphen before the fully qualified package name.

blacklist_supers

List of super classes, one class per line, to blacklist. Retain the hyphen before
the fully qualified class name.