Facebook Introduces New Model For Word Embeddings Which Are Resilient To Misspellings

Facebook has been investing heavily in natural language processing (NLP). In recent years, the tech giant has reported notable advances in natural language understanding and language translation.

Now, researchers at Facebook are applying semi-supervised and self-supervised learning techniques to leverage unlabelled data, which helps improve the performance of machine learning models.

Recently, researchers at Facebook introduced a new model known as Misspelling Oblivious (word) Embeddings (MOE). The model combines fastText with a supervised task that embeds misspellings close to their correct variants.

fastText is an open-source library designed to help build scalable solutions for text representation and classification. It combines successful concepts such as representing sentences as bags of n-grams, using subword information, and sharing information across classes through a hidden representation.
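To make the idea of subword information concrete, here is a minimal sketch of how a word can be broken into character n-grams. It is not fastText's own code: the function names are hypothetical, the `<` and `>` boundary markers and the 3-to-6 n-gram range follow the commonly described fastText defaults, and the real library also hashes n-grams into buckets, which is left out here.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of a word, with boundary markers.

    Illustrative helper only; the name and defaults are assumptions.
    """
    padded = f"<{word}>"  # mark the start and end of the word
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            grams.append(padded[i:i + n])
    return grams


def shared_ngrams(a, b):
    """Return the n-grams that two words have in common."""
    return set(char_ngrams(a)) & set(char_ngrams(b))


if __name__ == "__main__":
    print(char_ngrams("where", 3, 3))
    # A misspelling still shares many subword units with the correct word.
    print(sorted(shared_ngrams("embedding", "embeding")))
```

Because a misspelt word usually shares many n-grams with its correct form, a model built on these units has something to work with even when the full word was never seen.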

How It Works

MOE retains the fundamental properties of fastText and Word2Vec while giving explicit importance to misspelt words. It can be seen as a generalisation of fastText: in addition to fastText's semantic loss, MOE considers a supervised loss known as the spell correction loss. The spell correction loss aims to map the embeddings of misspelt words close to the embeddings of their correctly spelt variants in the vector space.
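The two-part objective can be sketched roughly as follows. This is a simplified illustration, not the researchers' implementation: the mixing weight `alpha`, the function names and the negative-sampling form of the spell correction term are all assumptions, and the exact weighting used by MOE may differ.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def spell_correction_loss(misspelling_vec, correct_vec, negative_vecs=()):
    """Toy loss: small when the misspelling is close to its correct word.

    negative_vecs are vectors of unrelated words to push away from
    (an assumed negative-sampling setup).
    """
    loss = -log_sigmoid(dot(misspelling_vec, correct_vec))
    for neg in negative_vecs:
        loss += -log_sigmoid(-dot(misspelling_vec, neg))
    return loss


def combined_loss(semantic_loss, spell_loss, alpha=0.5):
    """Weighted sum of the two terms; alpha is a hypothetical weight."""
    return (1.0 - alpha) * semantic_loss + alpha * spell_loss


if __name__ == "__main__":
    close = spell_correction_loss([1.0, 0.0], [1.0, 0.0])
    far = spell_correction_loss([1.0, 0.0], [-1.0, 0.0])
    print(close < far)  # a misspelling near its target is penalised less
    print(combined_loss(semantic_loss=2.0, spell_loss=close, alpha=0.25))
```

With `alpha` set to 0 the sketch reduces to the semantic loss alone, which matches the description of MOE as a generalisation of fastText.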

Dataset Used

MOE embeddings are trained on a new misspelling dataset, a collection of correctly spelt words paired with misspellings of those words. The dataset contains more than 20 million such pairs and is used to measure the spell correction loss.

How Is It Different

This model differs from other well-known word-embedding methods such as word2vec and GloVe. Those methods cannot provide embeddings for words that were not observed during training, known as out-of-vocabulary (OOV) words. This leads to unsatisfactory results when dealing with text that contains slang, misspellings and the like.
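The OOV problem can be illustrated with a toy comparison between a word-level lookup table and a subword-based lookup. Everything here is hypothetical: the tables, the two-dimensional vectors and the function names are made up for illustration, and real models learn these values during training.

```python
def trigrams(word):
    """Character trigrams of a word with boundary markers (toy helper)."""
    padded = f"<{word}>"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


# Hypothetical toy tables standing in for trained parameters.
word_table = {"hello": [0.9, 0.1]}
ngram_table = {gram: [0.9, 0.1] for gram in trigrams("hello")}


def lookup_word_level(word):
    """Word-level models have nothing to return for an unseen word."""
    return word_table.get(word)


def lookup_subword(word):
    """Average the vectors of whatever known n-grams the word contains."""
    vecs = [ngram_table[g] for g in trigrams(word) if g in ngram_table]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


if __name__ == "__main__":
    print(lookup_word_level("helo"))  # None: "helo" was never seen
    print(lookup_subword("helo"))     # built from shared trigrams
```

The misspelling "helo" is missing from the word table, but it shares the trigrams "<he", "hel" and "lo>" with "hello", so the subword lookup can still produce a vector.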

Advantages of MOE

In the real world, people often enter text that contains misspellings, for example while searching the web or chatting with someone. This new method will help improve the ability to apply word embeddings to such scenarios. MOE could be used in various domains, such as customer-centric enterprises that want actionable insights from customer reviews and feedback, chatbots, and music and video recommender systems, among others.

Importance of Word Embeddings Model

Word embeddings have proved highly effective for NLP tasks and have improved a range of machine learning models. They have been used extensively in machine translation, named entity resolution, automatic summarization, information retrieval, document retrieval, speech recognition, and others. At present, one of the most widely used forms of word embeddings is Word2Vec, which is used to analyse survey responses and gain insights from customer reviews, among other things. MOE could change how word embeddings are applied to NLP tasks by improving the ability to use them in real-life cases, where text is often misspelt.

Key Takeaways

MOE aims to overcome the limitations of dealing with malformed words in real-world applications by generating high-quality, semantically valid embeddings for misspellings.

The model outperforms the fastText baseline on the word similarity task when misspellings are involved.

On word analogy questions, MOE preserves the quality of the semantic analogies while improving on the syntactic analogies.

Outlook

The researchers at Facebook are working to improve natural language processing (NLP) tasks through various approaches, such as leveraging Google’s Bidirectional Encoder Representations from Transformers (BERT) and related methods, to push cutting-edge research in conversational AI, improve content understanding systems, and much more.

Recently, Facebook introduced the Robustly Optimized BERT Pretraining Approach, or RoBERTa, an optimised method for pretraining natural language processing (NLP) systems. RoBERTa can be described as a replication of BERT that improves on the original model's performance. The model produces state-of-the-art results on the widely used NLP benchmark General Language Understanding Evaluation (GLUE).


A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box. Contact: ambika.choudhury@analyticsindiamag.com