In a previous post, you saw how to configure one of the built-in analyzers as well as a token filter. Now it’s time to see how we can build our own custom analyzer. We do that by defining which character filters, tokenizer, and token filters the analyzer should consist of, and potentially configuring them. PUT… read more

Elasticsearch ships with a number of built-in analyzers and token filters, some of which can be configured through parameters. In the following example, I will configure the standard analyzer to remove stop words, which causes it to enable the stop token filter. I will create a new index for this purpose and define an analyzer… read more

If you read how analyzers work in Elasticsearch prior to reading this post, then you know how Elasticsearch analyzes text fields. Then you might wonder what actually happens with the results of the analysis process. They must end up being stored somewhere, right, because otherwise what’s the point? The results from the analysis are indeed… read more

In Elasticsearch, the values for text fields are analyzed when adding or updating documents. So what does it mean that text is analyzed? When indexing a document, its full-text fields are run through an analysis process. By full-text fields, I am referring to fields of the type text, and not keyword fields, which are… read more

Installing wkhtmltopdf on Linux can be a bit tricky, especially for people who are not so familiar with *nix operating systems. There are various ways in which you can install wkhtmltopdf: use a package manager such as apt-get, compile from source, or download the pre-compiled binary file. We are going to be doing the last of these… read more

In order to understand how replication works in Elasticsearch, you should already understand how sharding works, so be sure to check that out first. Hardware can fail at any time, and software can be buggy at times. Let’s face it, sometimes things just stop working. The more hardware capacity you add, the higher the risk… read more

Elasticsearch is extremely scalable due to its distributed architecture. One of the reasons for this is something called sharding. If you have worked with other technologies such as relational databases before, then you may have heard of this term. Before getting into what sharding is, let’s first talk about why it… read more

This article is an introduction to the physical architecture of Elasticsearch, that is, how documents are distributed across virtual or physical machines and how machines work together to form what is known as a cluster. Nodes & Clusters: To start things off, we will begin by talking about nodes and clusters, which are at the centre… read more

So far, we haven’t really done anything dynamically yet; we declared and initialized variables and output their values. That’s about to change, because now we are going to be working a bit with the basic math operators that Python provides. Python supports all of the math operations that you would expect. The basic ones are… read more

The code that you have seen so far has been pretty easy to understand. But imagine that you write a complicated piece of code or just do something where it is not immediately apparent why. Perhaps it totally made sense to you when you wrote that code, but that might not be the case when you… read more