EMNIST

Title:

EMNIST: an extension of MNIST to handwritten letters

Authors:

Gregory Cohen, Saeed Afshar, Jonathan Tapson, and André van Schaik

Journal:

arXiv ID: 1702.05373

Year:

2017

Abstract:

The MNIST dataset has become a standard benchmark
for learning, classification and computer vision systems.
Contributing to its widespread adoption are the understandable
and intuitive nature of the task, its relatively small size and
storage requirements and the accessibility and ease-of-use of
the database itself. The MNIST database was derived from a
larger dataset known as the NIST Special Database 19, which
contains digits and uppercase and lowercase handwritten letters. This
paper introduces a variant of the full NIST dataset, called
Extended MNIST (EMNIST), which follows the same
conversion paradigm used to create the MNIST dataset. The
result is a set of datasets that constitute more challenging
classification tasks involving letters and digits, and that share
the same image structure and parameters as the original MNIST
task, allowing for direct compatibility with all existing classifiers
and systems. Benchmark results are presented along with a
validation of the conversion process through the comparison of
the classification results on converted NIST digits and the MNIST
digits.