Feature
learning forms the cornerstone of tackling challenging classification
problems in domains such as speech, computer vision, and natural language
processing. While features were traditionally hand-crafted, the modern
approach is to learn good features automatically through deep learning
or other frameworks. Feature learning can also exploit unlabeled samples,
which are usually available in far larger quantities, to improve
classification performance.
In this talk, we provide a concrete theoretical framework for
obtaining informative features that can be used to learn a
discriminative model of the label given the input. We show that
(higher-order) Fisher score functions of the input are informative
features, and we provide a differential-operator interpretation. Given
access to these score features, we can obtain the (expected)
derivatives of the label as a function of the input (or of some model
parameters). Access to these derivatives is the key to learning
complicated discriminative models such as multi-layer neural networks
and mixtures of classifiers. Thus, the main ingredient for learning
discriminative models lies in accurate unsupervised estimation of the
(higher-order) score functions of the input. This is joint work with
my students Majid Janzamin and Hanie Sedghi.
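To make the central idea concrete, here is a minimal numerical sketch (my own illustration, not code from the talk) of the first-order version of the claim. For a standard Gaussian input, the first-order score function is S_1(x) = -d/dx log p(x) = x, and Stein's identity gives E[f(x) S_1(x)] = E[f'(x)]: correlating the label with the score feature recovers the expected derivative of the label function. The label function f(x) = x^3 below is a hypothetical choice for illustration.

```python
import numpy as np

# Monte Carlo check of Stein's identity for a standard Gaussian input,
# where the first-order score function is simply S_1(x) = x.
rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)

f = x**3          # hypothetical label function y = f(x)
f_prime = 3 * x**2  # its derivative f'(x)

lhs = np.mean(f * x)    # E[f(x) * S_1(x)], estimated from samples
rhs = np.mean(f_prime)  # E[f'(x)], estimated from samples

# Both estimates approach E[3 x^2] = 3 as the sample size grows.
print(lhs, rhs)
```

Higher-order score functions play the analogous role for higher derivatives, which is what enables learning richer models than a single linear map.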