This is a really hot topic these days: chatbots. In this tutorial, we're diving into the world of chatbots and looking at how they are built.

What are chatbots?

Chatbots are systems that can have a fairly complex conversation with humans. They go by different names: Conversational Agents or Dialog Systems. As you've probably guessed, chatbots rely on a lot of Natural Language Processing techniques to understand the human's requests.

The Holy Grail of chatbot builders is to pass the Turing Test. This means that a human can't tell whether they're talking to a machine or to another human. Although we are pretty far from that (especially from a Natural Language Generation point of view), great progress has been made.

Most of the chatbots built these days are goal-oriented agents. This means that they steer the conversation towards achieving a certain predefined goal. For example, a customer support agent figures out what problem the user is facing and then solves it (or just opens a ticket).
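As a toy illustration of the goal-oriented idea, a support bot can be reduced to matching the user's message against a few predefined intents, falling back to opening a ticket when nothing matches. The intents and keywords below are made up for the sketch, not from any real system:

```python
# Toy goal-oriented agent: map a user message to a predefined intent.
# Intents and keyword sets are illustrative assumptions.
INTENTS = {
    "reset_password": {"password", "reset", "login"},
    "billing_issue": {"invoice", "charge", "refund"},
}

def detect_intent(message):
    words = set(message.lower().split())
    for intent, keywords in INTENTS.items():
        if words & keywords:      # any keyword overlap triggers the intent
            return intent
    return "open_ticket"          # fallback: escalate to a human

print(detect_intent("I want a refund for this charge"))  # billing_issue
print(detect_intent("My app keeps crashing"))            # open_ticket
```

Real systems replace the keyword overlap with a trained intent classifier, but the control flow is the same: classify, then steer towards the goal.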

Recently we also started looking at Deep Learning, using Keras, a popular Python library. You can get started with Keras in this Sentiment Analysis with Keras Tutorial. This tutorial will combine the two subjects: we'll be building a POS tagger using Keras and a Bidirectional LSTM layer, trained on a corpus that's included in NLTK.
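Before a Bidirectional LSTM can be trained, each tagged sentence has to be turned into two padded integer sequences, one for the words and one for the tags. Here is a minimal sketch of that encoding step; the two-sentence toy corpus and the index names are assumptions for illustration, not the NLTK data itself:

```python
# Encode (word, tag) pairs as padded integer sequences for a sequence tagger.
# Toy corpus -- in the tutorial this comes from an NLTK tagged corpus.
tagged_sentences = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("cats", "NNS"), ("sleep", "VBP")],
]

# Index 0 is reserved for padding in both vocabularies.
word2idx, tag2idx = {"<PAD>": 0}, {"<PAD>": 0}
for sentence in tagged_sentences:
    for word, tag in sentence:
        word2idx.setdefault(word, len(word2idx))
        tag2idx.setdefault(tag, len(tag2idx))

max_len = max(len(s) for s in tagged_sentences)

def encode(sentence):
    """Map a tagged sentence to (word_ids, tag_ids), padded to max_len."""
    words = [word2idx[w] for w, _ in sentence]
    tags = [tag2idx[t] for _, t in sentence]
    pad = max_len - len(sentence)
    return words + [0] * pad, tags + [0] * pad

X, y = zip(*(encode(s) for s in tagged_sentences))
print(X[1])  # [4, 5, 0] -- the short sentence is padded to max_len
```

These padded arrays are exactly what you would feed into a Keras `Embedding` layer followed by a `Bidirectional(LSTM(...))` layer.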

In the previous tutorial on Deep Learning, we built a super simple network with numpy. I figured that the best next step is to jump right in and build some deep learning models for text. The best way to do this at the time of writing is by using Keras.

What is Keras?

Keras is a deep learning framework that, under the hood, uses other deep learning frameworks in order to expose a beautiful, simple-to-use and fun-to-work-with high-level API. Keras can use any of these backends: TensorFlow, Theano or CNTK.

The purpose of this post is to gather the most important libraries in the Python NLP ecosystem into a single list. This list is important because Python is by far the most popular language for doing Natural Language Processing. It is constantly updated as new libraries come into existence. In case you are looking for a list of useful corpora, check out this NLP corpora list.

Deep Learning is one of those hyper-hyped subjects that everybody is talking about and everybody claims they're doing. In certain cases, startups just need to mention they use Deep Learning and they instantly get appreciation. Deep Learning is indeed a powerful technology, but it's not an answer to every problem. It's also not magic, as many people make it seem.

In this post, we’ll be doing a gentle introduction to the subject. You’ll learn what a Neural Network is, how to train it and how to represent text features (in 2 ways). For this purpose, we’ll be using the IMDB dataset. It contains around 25,000 sentiment-annotated reviews. Deep Learning models usually require a lot of data to train properly. If you have little data, maybe Deep Learning is not the solution to your problem. In this case, the amount of data is a good compromise: it’s enough to train some toy models, and we don’t need to spend days waiting for training to finish or use a GPU.

Introduction

We talked briefly about word embeddings (also known as word vectors) in the spaCy tutorial.
spaCy includes word vectors in its models. This tutorial will go deep into the intricacies of how to compute them and into their different applications.
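Word vectors like the ones bundled with spaCy's models are usually compared with cosine similarity: semantically related words point in similar directions. A minimal sketch with tiny made-up 3-dimensional vectors (real embeddings have hundreds of dimensions, and the values below are purely illustrative):

```python
import math

# Toy 3-dimensional "word vectors" -- values are illustrative only.
vectors = {
    "dog": [0.8, 0.3, 0.1],
    "puppy": [0.7, 0.4, 0.1],
    "car": [0.1, 0.2, 0.9],
}

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Related words should score higher than unrelated ones.
print(cosine(vectors["dog"], vectors["puppy"]) > cosine(vectors["dog"], vectors["car"]))  # True
```

In spaCy itself the same comparison is a one-liner, `token1.similarity(token2)`, which applies this formula to the model's vectors.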

Bag Of Words Model

In most of our tutorials so far, we’ve been using a Bag-Of-Words model.
Take for example this article: Text Classification Recipe. Using the BOW model, we just keep counts of the words from the vocabulary. We don’t know anything about the semantics of the words.
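The Bag-Of-Words idea can be sketched in a few lines of pure Python: build a vocabulary, then represent each text as a vector of word counts. The toy sentences below are made up for the sketch:

```python
from collections import Counter

texts = ["the cat sat on the mat", "the dog sat"]

# Fixed, sorted vocabulary so every text maps to the same vector layout.
vocabulary = sorted({word for text in texts for word in text.split()})

def bow_vector(text):
    """Count how many times each vocabulary word appears in the text."""
    counts = Counter(text.split())
    return [counts[word] for word in vocabulary]

print(vocabulary)                         # ['cat', 'dog', 'mat', 'on', 'sat', 'the']
print(bow_vector("the cat sat on the mat"))  # [1, 0, 1, 1, 1, 2]
```

Notice that word order is thrown away entirely, which is exactly why the model knows nothing about semantics.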

Updates

29-Apr-2018 – Fixed import in extension code (Thanks Ruben)

spaCy is a relatively new framework in the Python Natural Language Processing environment, but it is quickly gaining ground and will most likely become the de facto library. There are some really good reasons for its popularity:

It's really FAST

Written in Cython, it was specifically designed to be as fast as possible

What is Topic Modeling?

Topic modeling, in the context of Natural Language Processing, is described as a method of uncovering hidden structure in a collection of texts. Although that is indeed true, it is also a pretty useless definition. Let’s define topic modeling in more practical terms.

What are Word Clouds?

Word Clouds are a popular way of displaying how important words are in a collection of texts. Basically, the more frequent a word is, the more space it occupies in the image. One of the uses of Word Clouds is to help us get an intuition about what the collection of texts is about. Here are some classic examples of when Word Clouds can be useful:
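The sizing rule behind a word cloud, more frequent means bigger, can be sketched by scaling each word's font size in proportion to its count. The text and the size range below are assumptions for illustration:

```python
from collections import Counter

text = "nlp nlp nlp python python keras"
counts = Counter(text.split())

MIN_SIZE, MAX_SIZE = 10, 40     # font sizes in points (illustrative range)
top = counts.most_common(1)[0][1]

# Scale each word's size linearly by its frequency relative to the top word.
sizes = {word: MIN_SIZE + (MAX_SIZE - MIN_SIZE) * count / top
         for word, count in counts.items()}
print(sizes)  # the most frequent word gets MAX_SIZE
```

Dedicated libraries such as `wordcloud` add the layout and rendering on top, but the frequency-to-size mapping is the core of the technique.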