This two-day course gives a detailed overview of statistical models for
data mining, inference, and prediction. With rapid developments in
internet technology, genomics, financial risk modeling, and other
high-tech fields, we rely increasingly on data analysis and
statistical models to exploit the vast amounts of data at our fingertips.

This course is the third in a series, following our popular past
offerings "Modern Regression and Classification" and "Statistical
Learning and Data Mining".

The two earlier courses are not prerequisites for this new course.

In this course we emphasize tools useful for tackling modern-day
data analysis problems. These include gradient boosting, SVMs and kernel
methods, random forests, the lasso and LARS, ridge regression and GAMs,
supervised principal components, and cross-validation. We also present
case studies in a variety of application areas.

This course focuses on both "tall" data (N > p, where N is the number
of cases and p the number of features) and "wide" data (p > N). Typical
examples of tall data are credit risk and churn prediction, and email
spam filtering. Topics include linear and ridge regression, the lasso
and LARS, support vector machines, random forests, and boosting. We
give an in-depth discussion of validation, cross-validation, and
test-set issues.
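To give a flavor of the cross-validation idea mentioned above, here is a
minimal sketch of k-fold cross-validation for a simple least-squares line
fit, written in plain Python. This is an illustration only, not course
material: the data, the fold count, and the function names (`fit_line`,
`cv_mse`) are made up for this example.

```python
# K-fold cross-validation sketch for a 1-D least-squares fit.
# Illustrative only: data, k, and helper names are invented for this example.

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (closed form)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b

def cv_mse(xs, ys, k=5):
    """Average held-out mean squared error over k folds."""
    n = len(xs)
    fold_errors = []
    for i in range(k):
        test_idx = set(range(i, n, k))  # every k-th point held out
        train = [(x, y) for j, (x, y) in enumerate(zip(xs, ys))
                 if j not in test_idx]
        test = [(x, y) for j, (x, y) in enumerate(zip(xs, ys))
                if j in test_idx]
        a, b = fit_line([x for x, _ in train], [y for _, y in train])
        mse = sum((y - (a + b * x)) ** 2 for x, y in test) / len(test)
        fold_errors.append(mse)
    return sum(fold_errors) / k

# Noiseless line y = 2x + 1: the cross-validated error should be near 0.
xs = [float(i) for i in range(20)]
ys = [2 * x + 1 for x in xs]
print(cv_mse(xs, ys, k=5))
```

The key design point is that each fold's model is fit only on the
training portion and scored only on the held-out portion, so the
averaged error estimates performance on unseen data rather than
training-set fit.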