Conference Review Session

A New Boosting Algorithm Using Input-Dependent Regularizer
ICML-2003, Yan Liu

AdaBoost has proved to be an effective method to improve the performance
of base classifiers both theoretically and empirically. However,
previous studies have shown that AdaBoost might suffer from the
overfitting problem, especially for noisy data. In addition, most
current work on boosting assumes that the combination weights are fixed
constants and therefore does not take particular input patterns into
consideration. In this paper, we present a new boosting algorithm,
"WeightBoost", which tries to solve these two problems by introducing an
input-dependent regularization factor to the combination weight.
As with AdaBoost, we derive a learning procedure for WeightBoost
that is guaranteed to minimize training error. Empirical studies on
eight different UCI data sets and one text categorization data set show
that WeightBoost almost always achieves a considerably better
classification accuracy than AdaBoost. Furthermore, experiments on data
with artificially controlled noise indicate that WeightBoost is more
robust to noise than AdaBoost.
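To make the idea concrete, here is a minimal Python sketch of boosting
with an input-dependent combination weight. The multiplicative factor
exp(-beta * |H(x)|), which shrinks a round's contribution wherever the
current ensemble H is already confident, is one plausible form of the
regularizer assumed here for illustration; the stump base learner and
all parameter names are likewise illustrative, not the paper's exact
procedure.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def weight_boost(X, y, n_rounds=50, beta=0.5):
    """y in {-1, +1}; returns the ensemble and beta for prediction."""
    n = len(y)
    w = np.full(n, 1.0 / n)        # example weights
    H = np.zeros(n)                # running ensemble score on train set
    ensemble = []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1).fit(
            X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.clip(w @ (pred != y), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        # input-dependent factor: damp this round's vote where the
        # ensemble score |H(x)| is already large (assumed regularizer)
        H += alpha * np.exp(-beta * np.abs(H)) * pred
        ensemble.append((alpha, stump))
        w = np.exp(-y * H)         # reweight against regularized score
        w /= w.sum()
    return ensemble, beta

def predict(ensemble, beta, X):
    H = np.zeros(len(X))
    for alpha, stump in ensemble:
        H += alpha * np.exp(-beta * np.abs(H)) * stump.predict(X)
    return np.sign(H)
```

Setting beta = 0 makes the factor identically 1 and recovers the plain
AdaBoost update, which is a useful sanity check for the sketch.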

We present work in progress on the problem of learning from
labeled and unlabeled data. Following the approach of Blum and Chawla
(2001), we consider algorithms that use pairwise relationships among
examples (both labeled and unlabeled) in order to construct a graph
that can then be partitioned using a mincut algorithm. In our work,
we extend the Blum-Chawla approach by adding noise to the graph
structure and using this noise to provide the algorithm with an
estimate of its own confidence. This gives the mincut approach some
of the properties of the label propagation method of Zhu and
Ghahramani (2002). We present preliminary results of this approach.
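A sketch of how such a randomized mincut might look in Python is given
below; the kNN similarity graph, the multiplicative noise model, and
all names are illustrative assumptions, not the authors' exact
procedure. Per-node confidence is read off as the fraction of noisy
runs that place the node on the positive side of the cut.

```python
import numpy as np
import networkx as nx

def randomized_mincut(X, y, n_runs=20, k=5, noise=0.2, seed=0):
    """y holds +1/-1 for labeled examples and 0 for unlabeled ones
    (at least one labeled example of each class is assumed).
    Returns, per example, the fraction of noisy mincuts that put it
    on the positive side -- a crude confidence estimate."""
    rng = np.random.default_rng(seed)
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    edges = [(i, int(j), np.exp(-dist[i, j]))   # kNN similarity graph
             for i in range(n) for j in np.argsort(dist[i])[1:k + 1]]
    votes = np.zeros(n)
    for _ in range(n_runs):
        G = nx.DiGraph()
        for i, j, w in edges:
            c = w * rng.uniform(1 - noise, 1 + noise)  # perturb weight
            G.add_edge(i, j, capacity=c)
            G.add_edge(j, i, capacity=c)
        for i in range(n):            # tie labeled nodes to terminals
            if y[i] == 1:
                G.add_edge("s", i, capacity=float("inf"))
            elif y[i] == -1:
                G.add_edge(i, "t", capacity=float("inf"))
        _, (source_side, _) = nx.minimum_cut(G, "s", "t")
        for i in source_side - {"s"}:
            votes[i] += 1
    return votes / n_runs
```

Averaging over perturbed cuts is what yields the soft, per-node
confidence that a single deterministic mincut cannot provide.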

This talk presents an overview of our research in developing a
PAC-style framework for learning with both labeled and unlabeled data. To
do this we enrich the standard PAC model with a notion of
compatibility between a concept and a distribution, for expressing our
belief that the target concept is compatible with the distribution
over the unlabeled data. In a sense, this is like modifying the notion
of a Concept Class from a set of concepts to a set of (concept,
distribution) pairs. We analyze, for different settings, how much
unlabeled data and how much labeled data we need in order to
identify with high confidence a hypothesis that is very close to the
target concept we are searching for. We mainly consider using
unlabeled data to adjust our prior, more specifically to order
the functions in the hypothesis space with respect to their degree of
compatibility with the distribution over the unlabeled data, and we
deal only with situations where the compatibility can be
estimated as an expectation over unlabeled examples. We also study how
the need for labeled data depends on the accuracy of our belief (that
the target concept is compatible with the distribution over the
unlabeled data). This is joint work with Avrim Blum.
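A toy sketch of the "order by compatibility" recipe follows, using
threshold classifiers on the line with a margin-based compatibility
score; both choices, along with the parameter names, are illustrative
assumptions rather than the framework's definitions.

```python
import numpy as np

def compatibility(h, X_unlab, gamma=0.25):
    """Estimated compatibility of threshold h: the fraction of
    unlabeled points at least gamma away from h's decision boundary.
    Note this is an empirical expectation over unlabeled examples."""
    return np.mean(np.abs(X_unlab - h) >= gamma)

def learn(X_lab, y_lab, X_unlab, thresholds, top_frac=0.2):
    # 1) unlabeled data orders the hypothesis space by compatibility...
    ranked = sorted(thresholds,
                    key=lambda h: -compatibility(h, X_unlab))
    survivors = ranked[:max(1, int(top_frac * len(ranked)))]
    # 2) ...and labeled data only picks among the compatible survivors
    def train_error(h):
        return np.mean(np.where(X_lab >= h, 1, -1) != y_lab)
    return min(survivors, key=train_error)
```

The point of the sketch is the division of labor: because
compatibility is an expectation over unlabeled examples, it can be
estimated from unlabeled data alone, and the labeled sample is spent
only on choosing among the much smaller set of highly compatible
hypotheses.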