Making Deep Learning Practical with Smaller Datasets

indico -
October 25, 2017

At the recent VB Summit in Berkeley, Jeff Dean, head of Google Brain, discussed a common challenge in making deep learning a practical solution inside the enterprise: “I would say pretty much any business that has tens or hundreds of thousands of customer interactions has enough scale to start thinking about using these sorts of things,” he said. “If you only have 10 examples of something, it’s going to be hard to make deep learning work. If you have 100,000 things you care about, records or whatever, that’s the kind of scale where you should really start thinking about these kinds of techniques.”

Fact: Training a deep learning model from scratch typically requires tens to hundreds of thousands of examples. This is a large barrier to integrating deep learning into a business’s workflow, either because that much data is simply not available or, for supervised learning techniques, because labeling such a huge dataset costs too much money and time.

What is Transfer Learning?

What if you didn’t have to start from scratch? What if, instead of starting from zero every time you wanted to create a deep learning model, you could start with a model that already understood the basics of language? That’s the promise of an area of machine learning known as transfer learning. By training on a massive corpus of language up front (typically hundreds of millions of labeled examples), a model can build a basic understanding of language. Using this starting point, enterprises can use deep learning to their advantage even if their training datasets are orders of magnitude smaller than what is typically required.

At a low level, transfer learning “teaches” the model basic concepts of language, such as synonyms, grammar, and basic syntactic structures. The model still needs help to understand specific kinds of documents, be they news reports, legal documents, or social media posts, but because it already understands the basic structure of language, it can learn domain-specific concepts much more quickly. You’re not going to get a lawyer model off the shelf, but you can at least get a model that speaks English.
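To make the idea concrete, here is a minimal sketch in Python of the general pattern: keep a pretrained representation frozen and train only a small classifier on top of it with a handful of labeled examples. Everything in it is an assumption made for illustration. The PRETRAINED word vectors are made-up stand-ins for embeddings that would really come from a model trained on a very large corpus, and the embed, train_head, and predict helpers are hypothetical names. It is not indico's implementation.

```python
import math

# Hypothetical "pretrained" word vectors (values invented for illustration).
# In practice these would come from a model trained on a very large corpus.
PRETRAINED = {
    "great": [0.9, 0.1], "excellent": [0.95, 0.05], "good": [0.8, 0.2],
    "terrible": [0.1, 0.9], "awful": [0.05, 0.95], "bad": [0.2, 0.8],
    "service": [0.5, 0.5], "product": [0.5, 0.5],
}

def embed(text):
    """Frozen feature extractor: average the vectors of the known words."""
    vecs = [PRETRAINED[w] for w in text.lower().split() if w in PRETRAINED]
    if not vecs:
        return [0.0, 0.0]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_head(examples, epochs=500, lr=0.5):
    """Fit a small logistic-regression head on top of the frozen embeddings."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for text, label in examples:
            x = embed(text)
            p = _sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - label  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(model, text):
    """Return the predicted probability that the text is positive."""
    w, b = model
    x = embed(text)
    return _sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Only four labeled examples; "excellent" and "awful" never appear in them.
train = [
    ("great product", 1), ("terrible product", 0),
    ("good service", 1), ("bad service", 0),
]
model = train_head(train)

print(predict(model, "excellent service"))  # above 0.5: treated as positive
print(predict(model, "awful product"))      # below 0.5: treated as negative
```

The classifier never sees “excellent” or “awful” during training, yet it handles them sensibly because the frozen vectors already place them near “great” and “terrible”. That is the sense in which prior knowledge of synonyms reduces the number of labeled examples a new task needs.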

Applying Transfer Learning

If you’re working with unstructured text and image data, indico’s Custom Collection API enables you to build custom models with just a few hundred (or at most, a few thousand) labeled examples by taking advantage of our high-quality feature embeddings. For ideas on how to use Custom Collection for the task you’re trying to solve, explore some of our use case tutorials.

And before you ask… yes, we needed hundreds of millions of examples to build the transfer learning model that lets indico customers use deep learning with only a few dozen examples for their use case!