Scroll down for CMU 15-859(B) Machine Learning Theory, Spring 2014

UIUC CS-589, Fall 2014

TOPICS IN MACHINE LEARNING THEORY

Wed/Fri 11:00-12:15, SC 1109

(Office hours: Wed 4:00-5:00, SC 3212)

Course description:
This seminar class will focus on new results and directions in machine
learning theory. Machine learning theory concerns questions such as:
What kinds of guarantees can we prove about practical machine learning
methods, and can we design algorithms achieving desired guarantees?
(Why) is Occam's razor a good idea and what does that even mean? What
can we say about the inherent ease or difficulty of different types of
learning problems? Addressing these questions will bring in
connections to probability and statistics, online algorithms, game
theory, computational geometry, and empirical machine learning
research.

The first half of the course will involve the instructor presenting
some classic results and background including regret guarantees,
combining expert advice, Winnow and Perceptron algorithms,
VC-dimension, Rademacher complexity, SVMs, and Kernel functions. The
second half will involve student-led discussions of recent papers in
areas such as deep learning, multi-task learning, tensor methods,
structured prediction, dictionary learning, and other topics.
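
For concreteness, here is a minimal sketch of the classic Perceptron
algorithm, one of the first-half topics. This is an illustrative
Python example (not course material); the function name and toy
interface are my own.

    import numpy as np

    def perceptron(X, y, max_passes=100):
        # X: (n, d) array of examples; y: array of +1/-1 labels.
        # Returns a weight vector w with sign(w . x) matching y on
        # linearly separable data.
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(max_passes):
            mistakes = 0
            for i in range(n):
                if y[i] * np.dot(w, X[i]) <= 0:  # mistake: update
                    w += y[i] * X[i]
                    mistakes += 1
            if mistakes == 0:  # all examples correctly classified
                break
        return w

The classic mistake bound states that the number of updates is at
most (R/gamma)^2, where R bounds the norms of the examples and gamma
is the margin of a perfect separator.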

09/10: Do exercises 2 and 3 and problems 4 and 5 on this problem
set. If you have time, see if you can solve problem 6 and look over
exercise 1 (neither needs to be written up). See the group-work
protocol below.

09/12: Do problems 3, 4, and 5 on this problem set; problem 3 is the
hardest. If you have time, do exercise 1 as well (and think about the
total weight). See the group-work protocol below.

Protocol for in-class problem sets: For each class we will have two
"readers/facilitators". Their job is to display the problems on the
projector, read them out to the class, and coordinate. After each
problem is read, the class as a whole works on solving it. Once
somebody believes the problem has been solved, they get up and
explain the solution to the rest of the class (which then either
agrees or finds a bug, and the process repeats). Once the problem has
been solved, 1-3 people volunteer to write up the solution and send
it to the readers/facilitators. Please include the names of everyone
who contributed to the solution in the writeup. The
readers/facilitators will combine the solutions received into a
single document and email it to me (avrim@cs.cmu.edu).

CMU 15-859(B), Spring 2014

MACHINE LEARNING THEORY

MW 10:30-11:50, GHC 4303

Course description:
This course will focus on theoretical aspects of machine learning. We
will examine questions such as: What kinds of guarantees can we prove
about learning algorithms? Can we design algorithms for interesting
learning tasks with strong guarantees on accuracy and amounts of data
needed? What can we say about the inherent ease or difficulty of
learning problems? Can we devise models that are both amenable to
theoretical analysis and make sense empirically? Addressing these
questions will bring in connections to probability and statistics,
online algorithms, game theory, complexity theory, information theory,
cryptography, and empirical machine learning research.
Grading will be based on 6 homework assignments, class participation,
a small class project, and a take-home final (worth about 2
homeworks). Students will also be asked from time to time to help
with the grading of assignments.
[2009 version of the course]

Prerequisites: A Theory/Algorithms background or a Machine
Learning background.

Robert Williamson, John Shawe-Taylor, Bernhard Scholkopf, and Alex
Smola. Sample Based Generalization Bounds. Gives tighter
generalization bounds where instead of using "the maximum number of
ways of labeling a set of 2m points" you can use "the number of ways
of labeling your actual sample".

Maria-Florina Balcan, Avrim Blum, and Nathan Srebro. Improved
Guarantees for Learning via Similarity Functions. Gives a formulation
and analysis for learning with general similarity functions. Also
shows that for any class of large SQ dimension, there is no kernel
with large margin for all (or even a non-negligible fraction) of the
functions in the class.
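
As a rough illustration of the landmark-style use of a similarity
function analyzed in this line of work (a sketch only; the helper
names and the choice of cosine similarity are mine, not the paper's
algorithm verbatim): map each point to its similarities with randomly
drawn "landmark" examples, then learn a linear separator in that
feature space.

    import numpy as np

    def landmark_features(X, landmarks, sim):
        # Map each row of X to its similarities with the landmarks.
        return np.array([[sim(x, l) for l in landmarks] for x in X])

    def cosine_sim(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))                 # toy data
    y = np.sign(X[:, 0] + 0.5 * X[:, 1])          # toy labels
    idx = rng.choice(len(X), size=20, replace=False)
    F = landmark_features(X, X[idx], cosine_sim)  # similarity features
    # Any linear learner (e.g. the Perceptron sketch above) can now be
    # trained on F in place of X.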