Edge-based Discovery of Training Data for Machine Learning

Generating high-quality training data has become the key bottleneck in applying deep learning across many domains. We describe Eureka, an interactive system that uses edge computing and early discard to greatly improve the productivity of experts constructing a labeled data set. Our experimental results show that Eureka reduces the labeling effort needed to construct a training set by two orders of magnitude relative to a brute-force approach.