Ultraconservative Online Algorithms for Multiclass Problems

Koby Crammer, Yoram Singer;
3(Jan):951-991, 2003.

Abstract

In this paper we study a paradigm to generalize online classification
algorithms for binary classification problems to multiclass problems.
The particular hypotheses we investigate maintain one prototype vector
per class. Given an input instance, a multiclass hypothesis computes a
similarity-score between each prototype and the input instance and sets
the predicted label to be the index of the prototype achieving the highest
similarity. To design and analyze the learning algorithms in this paper we
introduce the notion of ultraconservativeness. Ultraconservative
algorithms update only those prototypes whose similarity-scores are
higher than the score of the correct label's prototype. We start by
describing a family of additive ultraconservative
algorithms where each algorithm in the family updates its prototypes by
finding a feasible solution for a set of linear constraints that depend on
the instantaneous similarity-scores. We then discuss a specific online
algorithm that seeks a set of prototypes which have a small norm. The
resulting algorithm, which we term MIRA (for Margin Infused Relaxed
Algorithm), is ultraconservative as well. We derive mistake bounds
for all the algorithms and provide further analysis of MIRA using a
generalized notion of the margin for multiclass problems. We discuss
the form the algorithms take in the binary case and show that all the
algorithms from the first family reduce to the Perceptron algorithm while
MIRA provides a new Perceptron-like algorithm with a margin-dependent
learning rate. We then return to multiclass problems and describe an
analogous multiplicative family of algorithms with corresponding mistake
bounds. We end the formal part by deriving and analyzing a multiclass
version of Li and Long's ROMMA algorithm. We conclude with a discussion
of experimental results that demonstrate the merits of our algorithms.
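
The prototype-based hypothesis and one member of the additive
ultraconservative family can be sketched as follows. This is a minimal
illustration, not the paper's reference implementation. It assumes the
similarity-score is an inner product, that the error set contains every
wrong label scoring at least as high as the correct one, and that the
update weights are spread uniformly over that set (a "uniform" update).
The function names are hypothetical.

```python
def scores(prototypes, x):
    """Similarity-score (assumed here: inner product) of x with each prototype."""
    return [sum(w * v for w, v in zip(p, x)) for p in prototypes]


def predict(prototypes, x):
    """Predicted label is the index of the highest-scoring prototype."""
    s = scores(prototypes, x)
    return max(range(len(s)), key=lambda r: s[r])


def uniform_update(prototypes, x, y):
    """One additive ultraconservative step on the example (x, y).

    Only the correct prototype and the prototypes in the error set change.
    The weights tau sum to zero: +1 for the correct label, and
    -1/|error set| for each label in the error set (assumed uniform choice).
    """
    s = scores(prototypes, x)
    error_set = [r for r in range(len(prototypes)) if r != y and s[r] >= s[y]]
    if not error_set:
        # Correct label strictly wins: no update at all.
        return [list(p) for p in prototypes]
    tau = [0.0] * len(prototypes)
    tau[y] = 1.0
    for r in error_set:
        tau[r] = -1.0 / len(error_set)
    return [[w + tau[r] * v for w, v in zip(p, x)]
            for r, p in enumerate(prototypes)]
```

For example, with three all-zero prototypes, the instance `[1.0, 2.0]`,
and correct label 0, both wrong labels tie with the correct one, so each
receives weight -0.5 while prototype 0 receives weight +1. A prototype
that already scores below the correct label is left unchanged.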