Clustering

Druid is designed to be deployed as a scalable, fault-tolerant cluster.

In this document, we'll set up a simple cluster and discuss how it can be further configured to meet
your needs. This simple cluster will feature scalable, fault-tolerant servers for Historicals and MiddleManagers, and a single
coordination server to host the Coordinator and Overlord processes. In production, we recommend deploying Coordinators and Overlords in a fault-tolerant
configuration as well.

Select hardware

The Coordinator and Overlord processes can be co-located on a single server that is responsible for handling the metadata and coordination needs of your cluster.
The equivalent of an AWS m3.xlarge is sufficient for most clusters. This
hardware offers:

4 vCPUs

15 GB RAM

80 GB SSD storage

Historicals and MiddleManagers can be co-located on a single server to handle the actual data in your cluster. These servers benefit greatly from CPU, RAM,
and SSDs. The equivalent of an AWS r3.2xlarge is a
good starting point. This hardware offers:

8 vCPUs

61 GB RAM

160 GB SSD storage

Druid Brokers accept queries and farm them out to the rest of the cluster. They also optionally maintain an
in-memory query cache. These servers benefit greatly from CPU and RAM, and can also be deployed on
the equivalent of an AWS r3.2xlarge. This hardware
offers:

8 vCPUs

61 GB RAM

160 GB SSD storage

You can consider co-locating any open source UIs or query libraries on the same server that the Broker is running on.

Very large clusters should consider selecting larger servers.

Select OS

We recommend running your favorite Linux distribution. You will also need:

Java 8

Your OS package manager should be able to help with Java. If your Ubuntu-based OS
does not have a recent enough version of Java, WebUpd8 offers packages for those
OSes.
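
To confirm that each server has a suitable JVM before you proceed, a quick sanity check:

java -version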

Download the distribution

First, download and unpack the release archive. It's best to do this on a single machine at first,
since you will be editing the configurations and then copying the modified distribution out to all
of your servers.
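
For example, a typical fetch-and-unpack sequence looks like the sketch below. The version number is a stand-in; substitute the release you are actually deploying, and take the exact download URL from that release's notes:

curl -O http://static.druid.io/artifacts/releases/druid-0.9.2-bin.tar.gz
tar -xzf druid-0.9.2-bin.tar.gz
cd druid-0.9.2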

Configure deep storage

Druid relies on a distributed filesystem or large object (blob) store for data storage. The most
commonly used deep storage implementations are S3 (popular for those on AWS) and HDFS (popular if
you already have a Hadoop deployment).

S3

In conf/druid/_common/common.runtime.properties,

Set druid.extensions.loadList=["druid-s3-extensions"].

Comment out the configurations for local storage under "Deep Storage" and "Indexing service logs".

Uncomment and configure appropriate values in the "For S3" sections of "Deep Storage" and
"Indexing service logs".

HDFS

In conf/druid/_common/common.runtime.properties,

Set druid.extensions.loadList=["druid-hdfs-storage"].

Comment out the configurations for local storage under "Deep Storage" and "Indexing service logs".

Uncomment and configure appropriate values in the "For HDFS" sections of "Deep Storage" and
"Indexing service logs".

Place your Hadoop configuration XMLs (core-site.xml, hdfs-site.xml, yarn-site.xml,
mapred-site.xml) on the classpath of your Druid nodes. You can do this by copying them into
conf/druid/_common/.
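
The corresponding HDFS entries, once uncommented, look roughly like this sketch. The directory values are placeholders for paths on your HDFS cluster:

druid.storage.type=hdfs
druid.storage.storageDirectory=/druid/segments

druid.indexer.logs.type=hdfs
druid.indexer.logs.directory=/druid/indexing-logs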

Configure Tranquility Server (optional)

Data streams can be sent to Druid through a simple HTTP API powered by Tranquility
Server. If you will be using this functionality, then at this point you should configure
Tranquility Server.
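
Once configured, Tranquility Server is typically started from the distribution with a command along these lines; the config path below assumes the bundled conf/tranquility directory layout:

bin/tranquility server -configFile conf/tranquility/server.json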

Configure Tranquility Kafka (optional)

Druid can consume streams from Kafka through Tranquility Kafka. If you will be
using this functionality, then at this point you should
configure Tranquility Kafka.
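
Similarly, once configured, Tranquility Kafka can be started with something like the following, again assuming the bundled conf/tranquility layout:

bin/tranquility kafka -configFile conf/tranquility/kafka.json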

Configure for connecting to Hadoop (optional)

If you will be loading data from a Hadoop cluster, then at this point you should configure Druid to be aware
of your cluster:

Update druid.indexer.task.hadoopWorkingPath in conf/middleManager/runtime.properties to
a path on HDFS that you'd like to use for temporary files required during the indexing process.
druid.indexer.task.hadoopWorkingPath=/tmp/druid-indexing is a common choice.

Place your Hadoop configuration XMLs (core-site.xml, hdfs-site.xml, yarn-site.xml,
mapred-site.xml) on the classpath of your Druid nodes. You can do this by copying them into
conf/druid/_common/core-site.xml, conf/druid/_common/hdfs-site.xml, and so on.
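
For example, assuming your Hadoop client configuration lives in /etc/hadoop/conf (a common location, but check your distribution):

cp /etc/hadoop/conf/*-site.xml conf/druid/_common/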

Note that you don't need to use HDFS deep storage in order to load data from Hadoop. For example, if
your cluster is running on Amazon Web Services, we recommend using S3 for deep storage even if you
are loading data using Hadoop or Elastic MapReduce.

Configure addresses for Druid coordination

In this simple cluster, you will deploy a single Druid Coordinator, a
single Druid Overlord, a single ZooKeeper instance, and an embedded Derby metadata store on the same server.

In conf/druid/_common/common.runtime.properties, replace
"zk.service.host" with the address of the machine that runs your ZK instance:

druid.zk.service.host

In conf/druid/_common/common.runtime.properties, replace
"metadata.storage.*" with the address of the machine that you will use as your metadata store:

druid.metadata.storage.connector.connectURI

druid.metadata.storage.connector.host
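
Put together, the relevant lines look roughly like this sketch, where zk.host.ip and metadata.store.ip are placeholders for your actual addresses; the Derby connectURI shown matches the embedded metadata store used by this simple cluster:

druid.zk.service.host=zk.host.ip
druid.metadata.storage.connector.connectURI=jdbc:derby://metadata.store.ip:1527/var/druid/metadata.db;create=true
druid.metadata.storage.connector.host=metadata.store.ip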

In production, we recommend running two servers, each running a Druid Coordinator
and a Druid Overlord. We also recommend running a ZooKeeper cluster on its own dedicated hardware,
as well as replicated metadata storage
such as MySQL or PostgreSQL on its own dedicated hardware.

Tune Druid processes that serve queries

Druid Historicals and MiddleManagers can be co-located on the same hardware. Both Druid processes benefit greatly from
being tuned to the hardware they run on. If you are running Tranquility Server or Kafka, you can also co-locate Tranquility with these two Druid processes.
If you are using r3.2xlarge
EC2 instances, or similar hardware, the configuration in the distribution is a
reasonable starting point.

If you are using different hardware, we recommend adjusting configurations for your specific
hardware. The most commonly adjusted configurations are listed here, with an illustrative snippet after the list:

-Xmx and -Xms

druid.server.http.numThreads

druid.processing.buffer.sizeBytes

druid.processing.numThreads

druid.query.groupBy.maxIntermediateRows

druid.query.groupBy.maxResults

druid.server.maxSize and druid.segmentCache.locations on Historical Nodes
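
As a sketch only, overrides of this kind go in conf/druid/historical/runtime.properties (heap sizes go in the matching jvm.config); the values below are illustrative for r3.2xlarge-class hardware, not tuned recommendations:

druid.server.http.numThreads=25
druid.processing.buffer.sizeBytes=536870912
druid.processing.numThreads=7
druid.server.maxSize=130000000000
druid.segmentCache.locations=[{"path":"var/druid/segment-cache","maxSize":130000000000}]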

You can add more servers with Druid Historicals and MiddleManagers as needed.

For clusters with complex resource allocation needs, you can break apart Historicals and MiddleManagers and scale the components individually.
This also allows you to take advantage of Druid's built-in MiddleManager
autoscaling facility.

If you are doing push-based stream ingestion with Kafka or over HTTP, you can also start Tranquility Server on the same
hardware that holds MiddleManagers and Historicals. For large scale production, MiddleManagers and Tranquility Server
can still be co-located. If you are running Tranquility (not server) with a stream processor, you can co-locate
Tranquility with the stream processor and not require Tranquility Server.

Tune Druid Brokers

Druid Brokers also benefit greatly from being tuned to the hardware they
run on. If you are using r3.2xlarge EC2 instances,
or similar hardware, the configuration in the distribution is a reasonable starting point.

If you are using different hardware, we recommend adjusting configurations for your specific
hardware. The most commonly adjusted configurations are listed here, with an illustrative snippet after the list:

-Xmx and -Xms

druid.broker.http.numConnections

druid.server.http.numThreads

druid.cache.sizeInBytes

druid.processing.buffer.sizeBytes

druid.processing.numThreads

druid.query.groupBy.maxIntermediateRows

druid.query.groupBy.maxResults
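
As before, this is a sketch only; overrides like these go in conf/druid/broker/runtime.properties, and the values are illustrative rather than recommendations:

druid.broker.http.numConnections=20
druid.server.http.numThreads=25
druid.processing.buffer.sizeBytes=536870912
druid.processing.numThreads=7
druid.cache.type=local
druid.cache.sizeInBytes=2000000000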