Elasticsearch (ES) is a distributed, scalable search and analytics engine that enables fast data retrieval [1]. It exposes a RESTful API, offers Java and Python client libraries, and can be extended with plugins such as the ZooKeeper cluster integration plugin.

In this post I'll deploy a small Elasticsearch cluster consisting of one Nginx load balancer, two ES client nodes, three ES master nodes, and two ES data nodes.

The roles of the three ES node types are:

- Client nodes: act as load balancers, routing queries and indexing requests. The client nodes do not hold any data.
- Data nodes: hold data, merge segments and execute queries. The data nodes are the main workers.
- Master nodes: manage the cluster and elect a master among themselves using unicast discovery. The master nodes hold configuration data and the mappings of all the indexes in the cluster.
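
In configuration terms, the three roles come down to two boolean settings in elasticsearch.yml, node.master and node.data. Below is a minimal Python sketch of that mapping, assuming the setting names used in the configs in this post (newer ES releases may name them differently). ROLE_FLAGS and render_role_settings are hypothetical names made up for illustration, not part of Elasticsearch:

```python
# Hypothetical helper: maps each node role to the two elasticsearch.yml
# flags that define it. Setting names are the ones used in this post.
ROLE_FLAGS = {
    "client": {"node.master": False, "node.data": False},  # routes only
    "data": {"node.master": False, "node.data": True},     # holds shards
    "master": {"node.master": True, "node.data": False},   # manages cluster
}


def render_role_settings(role):
    """Return the elasticsearch.yml lines for the given role."""
    flags = ROLE_FLAGS[role]
    return "\n".join(
        f"{key}: {str(value).lower()}" for key, value in sorted(flags.items())
    )


if __name__ == "__main__":
    for role in ROLE_FLAGS:
        print(f"# {role} node")
        print(render_role_settings(role))
```

A client node is simply a node with both flags turned off, which is why it can route requests without ever holding data or being elected master.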

By default, a single-node deployment configures the ES server as both a master and a data node. To scale further, however, you'll need multiple data nodes, at least three masters (to prevent split-brain scenarios), and two client nodes to route the requests and results.
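
The reason for three masters is the quorum arithmetic: a master can only be elected when a majority of the master-eligible nodes, floor(n / 2) + 1, can see each other. The short Python sketch below works through that formula; the function names are hypothetical and chosen only for this illustration:

```python
def minimum_master_nodes(master_eligible):
    """Quorum for a number of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible):
    """How many master-eligible nodes can be lost while keeping a quorum."""
    return master_eligible - minimum_master_nodes(master_eligible)


if __name__ == "__main__":
    # Columns: master-eligible nodes, quorum, failures tolerated
    for n in (1, 2, 3, 5):
        print(n, minimum_master_nodes(n), tolerated_failures(n))
```

With two masters the quorum is also two, so losing either one leaves the cluster unable to elect a master. Three is the smallest count that survives the loss of one master while still ruling out two independent majorities.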

Installing and configuring an ES cluster is rather simple. First, download the Oracle JRE and install it on all ES nodes.

Next, configure the different types of ES nodes, starting with the three masters:

File: gistfile1.txt
-------------------
[es-master-n01,02,03]$ cat /etc/elasticsearch/elasticsearch.yml
cluster.name: test-cluster # must be the same on all ES nodes part of the cluster
node.name: es-master-n01 # replace this with n02 and n03
node.data: false
node.master: true # this is what defines the node as a master
network.host: 10.176.66.106 # replace this with the IPs of n02 and n03
discovery.zen.ping.unicast.hosts: ["10.176.66.106", "10.176.66.108", "10.176.66.113"] # The IPs of all the ES masters in the cluster
discovery.zen.minimum_master_nodes: 2 # quorum of the 3 master-eligible nodes: (3 / 2) + 1 = 2