Past sessions 2011 - 12

Abstract: Learning theory and algorithms were originally developed for a chimerical world in which several critical assumptions about the distributions and the sampling would hold. Real-world data sets, however, quite often do not meet these conditions: the distributions from which training and test points are drawn may differ, and they may further drift over time. These problems are not just second-order effects, and solving them does more than slightly improve the performance of some learning algorithms: ignoring them can lead to dramatically poor results, as is easily verified empirically. This talk presents a series of theoretical and algorithmic solutions to address these issues. It also reports empirical results in support of these algorithms in several natural scenarios. The talk includes joint work with Corinna Cortes, Yishay Mansour, and Andres Munoz.

Random forests are a scheme proposed by Leo Breiman in the 2000s for building a predictor ensemble from a set of decision trees grown in randomly selected subspaces of the data. Despite growing interest and practical use, there has been little exploration of the statistical properties of random forests, and little is known about the mathematical forces driving the algorithm. In this talk, we will discuss an in-depth analysis of a random forests model suggested by Breiman in 2004, which is very close to the original algorithm. We show in particular that the procedure is consistent and adapts to sparsity, in the sense that its rate of convergence depends only on the number of strong features and not on how many noise variables are present.
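
As a rough illustration of the idea of averaging trees grown with random splits, here is a minimal sketch of a simplified "centered" forest on the unit cube. It is not the model analyzed in the talk: the class and function names (CenteredTree, fit_forest, predict_forest) are hypothetical, the split coordinate is drawn uniformly at random (an assumption; a sparsity-adaptive procedure would favor the strong features), every cut is placed at the midpoint of the current cell, and an empty cell predicts 0.0 by convention.

```python
import random


class CenteredTree:
    """One tree: a random partition of [0, 1]^dim built from midpoint cuts."""

    def __init__(self, dim, depth, rng):
        self.dim = dim
        self.depth = depth
        self.rng = rng
        self.split_coord = {}  # node path -> coordinate cut at that node
        self.leaf_sum = {}     # leaf path -> sum of training responses
        self.leaf_count = {}   # leaf path -> number of training points

    def _leaf(self, x):
        """Return the path (tuple of 0/1 moves) of the leaf cell containing x."""
        lo = [0.0] * self.dim
        hi = [1.0] * self.dim
        path = ()
        for _ in range(self.depth):
            # Draw the split coordinate lazily, once per node (uniform: assumption).
            if path not in self.split_coord:
                self.split_coord[path] = self.rng.randrange(self.dim)
            j = self.split_coord[path]
            mid = (lo[j] + hi[j]) / 2.0
            if x[j] < mid:
                hi[j] = mid
                path += (0,)
            else:
                lo[j] = mid
                path += (1,)
        return path

    def fit(self, X, y):
        for xi, yi in zip(X, y):
            leaf = self._leaf(xi)
            self.leaf_sum[leaf] = self.leaf_sum.get(leaf, 0.0) + yi
            self.leaf_count[leaf] = self.leaf_count.get(leaf, 0) + 1
        return self

    def predict(self, x):
        leaf = self._leaf(x)
        n = self.leaf_count.get(leaf, 0)
        # Average response in the cell; empty cells predict 0.0 (assumption).
        return self.leaf_sum[leaf] / n if n else 0.0


def fit_forest(X, y, n_trees=50, depth=4, seed=0):
    """Grow n_trees independent random trees on the same sample."""
    rng = random.Random(seed)
    dim = len(X[0])
    return [CenteredTree(dim, depth, rng).fit(X, y) for _ in range(n_trees)]


def predict_forest(trees, x):
    """The forest estimate is the plain average of the tree estimates."""
    return sum(t.predict(x) for t in trees) / len(trees)
```

In this sketch the trees differ only through their random choice of split coordinates, so the averaging step is what smooths the individual piecewise-constant estimates.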

Semidefinite Programming with Applications in Geometry and Machine Learning

This tutorial will start with a very brief primer on semidefinite programming, followed by a discussion of some recent applications to geometrical problems arising in statistics, graph theory, etc. A second part will then focus on applications of these techniques to performance measures for dictionary matrices in compressed sensing.