Introduction to Elasticsearch

In this post, we’ll cover the basics of Elasticsearch. Elasticsearch is a search server based on Lucene. It provides a distributed, multitenant-capable full-text search engine with a RESTful web interface and schema-free JSON documents. Elasticsearch is developed in Java and is released as open source under the terms of the Apache License.

– What is Elasticsearch?

Elasticsearch is an open source, distributed, RESTful search engine. It uses Lucene under the covers to provide its full-text search capabilities. The relationship between Elasticsearch and Lucene is like that between a car and its engine: Lucene does the low-level indexing and searching, and Elasticsearch wraps it in a distributed system with a REST API. Search comes with multi-language support, a powerful query language, support for geolocation, context-aware did-you-mean suggestions, autocomplete, and search snippets.

Basic Concepts

– Near real time

Elasticsearch is a near real time search platform. This means there is a slight latency (normally one second) between the time you index a document and the time it becomes searchable.
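
To make that delay concrete, here is a minimal sketch of the two REST calls that control it: an explicit refresh, and a change to the refresh interval. It only builds the requests; nothing is sent until you call send() against a running node. The base URL, the index name "customer" used in the usage note, and the helper names are assumptions for illustration, not part of the original post.

```python
import json
import urllib.request

# Assumption: a single node listening on the default HTTP port.
BASE_URL = "http://localhost:9200"


def refresh_request(index):
    """Build a request asking Elasticsearch to refresh one index right now."""
    return urllib.request.Request(f"{BASE_URL}/{index}/_refresh", method="POST")


def refresh_interval_request(index, interval="1s"):
    """Build a request that changes how often the index refreshes on its own."""
    body = {"index": {"refresh_interval": interval}}
    return urllib.request.Request(
        f"{BASE_URL}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def send(request):
    """Send a prepared request and decode the JSON reply (needs a running node)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a node running, send(refresh_request("customer")) should make recently indexed documents searchable without waiting for the next automatic refresh.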

– Cluster

A cluster is a collection of one or more nodes (servers) that together hold your entire data and provide federated indexing and search capabilities across all nodes. A cluster is identified by a unique name, which by default is “elasticsearch”.
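
One way to see the cluster name and node count for yourself is the cluster health endpoint. The sketch below builds that request and summarizes a decoded reply; the base URL and the helper names are assumptions for illustration, and the request is not sent by this code.

```python
import urllib.request

# Assumption: a single node listening on the default HTTP port.
BASE_URL = "http://localhost:9200"


def cluster_health_request():
    """Build a request for the cluster health endpoint."""
    return urllib.request.Request(f"{BASE_URL}/_cluster/health", method="GET")


def summarize_health(health):
    """Turn a decoded cluster health reply (a dict) into a one-line summary."""
    return "{} is {} with {} node(s)".format(
        health["cluster_name"], health["status"], health["number_of_nodes"]
    )
```

The summary reads only the cluster_name, status, and number_of_nodes fields of the reply, so it should work on any reply that carries those three keys.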

– Node

A node is a single server that is part of your cluster, stores your data, and participates in the cluster’s indexing and search capabilities.

– Index

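An index is a collection of documents that have somewhat similar characteristics. For example, you might keep one index for customer data and another for a product catalog.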

– Type

Within an index, you can define one or more types. A type is a logical category or partition of your index whose semantics are completely up to you.

– Document

A document is a basic unit of information that can be indexed.
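
As a rough illustration, a document is just a JSON object sent to an index over the REST interface. The sketch below builds such a request without sending it. The customer fields, the index name, and the helper name are hypothetical, and the "_doc" path segment assumes a recent Elasticsearch release (older releases put a type name in that position).

```python
import json
import urllib.request

# Assumption: a single node listening on the default HTTP port.
BASE_URL = "http://localhost:9200"


def index_document_request(index, doc_id, document):
    """Build a request that stores one JSON document under the given ID."""
    return urllib.request.Request(
        f"{BASE_URL}/{index}/_doc/{doc_id}",
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical document: any JSON object can serve as one.
customer = {"name": "John Doe", "age": 42}
request = index_document_request("customer", 1, customer)
```

Passing the request to urllib.request.urlopen() would send it to a running node; the reply is itself a JSON document describing the result.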

– Shards & Replicas

An index can potentially store a large amount of data that exceeds the hardware limits of a single node. For example, a single index of a billion documents taking up 1TB of disk space may not fit on the disk of a single node, or may be too slow to serve search requests from a single node alone. To solve this problem, Elasticsearch lets you subdivide your index into multiple pieces called shards. When you create an index, you simply define the number of shards that you want. Each shard is in itself a fully functional and independent “index” that can be hosted on any node in the cluster.
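
Defining the shard count at creation time can be sketched as a single REST call. The code below builds that request without sending it; the base URL, the index name used in the usage note, and the helper name are assumptions for illustration.

```python
import json
import urllib.request

# Assumption: a single node listening on the default HTTP port.
BASE_URL = "http://localhost:9200"


def create_index_request(index, number_of_shards):
    """Build a request that creates an index with the given number of shards."""
    if number_of_shards < 1:
        raise ValueError("an index needs at least one shard")
    body = {"settings": {"index": {"number_of_shards": number_of_shards}}}
    return urllib.request.Request(
        f"{BASE_URL}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

For example, create_index_request("customer", 5) prepares an index split into five shards, which the cluster is then free to place on different nodes.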