Description:
So you've taken a machine learning class.
You know the models people use to solve their problems.
You know the algorithms they use for learning.
You know how to evaluate the quality of their solutions.

But when we look at a large-scale machine learning application that is deployed in practice, it's not always exactly what you learned in class.
Sure, the basic models, the basic algorithms are all there.
But they're modified a bit, in a bunch of different ways, to run faster and more efficiently.
And these modifications are really important—they often are what make the system tractable to run on the data it needs to process.

CS6787 is a graduate-level introduction to these system-focused aspects of machine learning, covering guiding principles and commonly used techniques for scaling up to large data sets.
Informally, we will cover the techniques that lie between a standard machine learning course and an efficient systems implementation.
Topics will include stochastic gradient descent, acceleration, variance reduction, methods for choosing metaparameters, parallelization within a chip and across a cluster, popular ML frameworks, and innovations in hardware architectures.
An open-ended project in which students apply these techniques is a major part of the course.

Prerequisites:
Knowledge of machine learning at the level of CS4780.
Knowledge of computer systems and hardware at the level of CS 3410 would be useful, but is not a prerequisite.

Format:
For half of the classes, typically on Mondays, there will be a traditionally formatted lecture.
For the other half of the classes, typically on Wednesdays, we will read and discuss a seminal paper relevant to the course topic.
These classes will involve a presentation of the paper's contents by a group of students (each student will sign up in a group to present one paper), followed by breakout discussions about the material.

Grading: Students will be evaluated on the following basis.

20%  Paper presentation
10%  In-class quizzes: there will be a quiz before each paper presentation on that paper's content
10%  Discussion participation
30%  Paper reviews: students must submit a review of every paper we discuss
30%  Final project

Paper review parameters:
Paper reviews should be about one page (single-spaced) in length.
Your review should mirror what an actual conference review would look like (although you needn't assign scores or anything like that).
In particular you should at least: (1) summarize the paper, (2) discuss the paper's strengths and weaknesses, and (3) discuss the paper's impact.
For reference, you can read the NIPS reviewer guidelines, starting with the Overview section on page 6.
Of course, your review will not be precisely like a real review, in large part because we already know the impact of these papers.

Final project parameters and course calendar may be subject to change.

Final project parameters: [Project Overview Slides]
The final project can be done in groups of up to three (although more work will be expected from groups with more people).
The subject of the project is open-ended, but it must include:

the implementation of a machine learning system for some task,

using one or more of the techniques discussed in the course (or similar techniques),

to empirically evaluate the performance (throughput or wall-clock time) and compare it with some baseline method.
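As a rough illustration of the third requirement, the sketch below times a baseline and a candidate method and reports wall-clock time, throughput, and speedup. It is only one possible harness, not a required format. The names `benchmark`, `baseline_sum_squares`, `candidate_sum_squares`, and `compare`, as well as the toy workload, are hypothetical stand-ins for your own system and baseline.

```python
import statistics
import time


def benchmark(fn, n_examples, repeats=5):
    """Run fn() several times; return median wall-clock seconds and throughput."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    seconds = statistics.median(timings)
    return {"seconds": seconds, "examples_per_sec": n_examples / seconds}


# Hypothetical toy workload standing in for a real training or inference task.
DATA = [float(i) for i in range(100_000)]


def baseline_sum_squares():
    # Baseline: explicit Python loop.
    total = 0.0
    for x in DATA:
        total += x * x
    return total


def candidate_sum_squares():
    # Candidate: the same computation using the built-in sum.
    return sum(x * x for x in DATA)


def compare():
    """Benchmark both methods and return (baseline, candidate, speedup)."""
    base = benchmark(baseline_sum_squares, len(DATA))
    cand = benchmark(candidate_sum_squares, len(DATA))
    speedup = base["seconds"] / cand["seconds"]
    return base, cand, speedup


if __name__ == "__main__":
    base, cand, speedup = compare()
    print(f"baseline:  {base['seconds']:.4f} s, {base['examples_per_sec']:.0f} examples/s")
    print(f"candidate: {cand['seconds']:.4f} s, {cand['examples_per_sec']:.0f} examples/s")
    print(f"speedup:   {speedup:.2f}x")
```

Taking the median over several repeats is one simple way to reduce the effect of timing noise. In a real project you would also want to check that the candidate still reaches comparable model quality, since a faster method that learns a worse model is not a fair comparison.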

Project proposals are due on Monday, November 13.
The project proposal should satisfy the following constraints:

The main body should be about one page in length.

It should describe the project you intend to do.

It should contain at least one citation of a relevant paper that we did not cover in class (but preferably more).

It should include some preliminary or exploratory work you've already done, that helps to support the idea that your project is feasible (this preliminary work can be very minimal, but should indicate that you've got started—or at least have a clear idea how to do so).

In addition to the one-page text proposal, it should contain one short experiment plan per person, which should consist of:

a hypothesis

a proxy statement which describes what metric you are going to use to measure the variables you care about

a short protocol statement describing what you are going to do

the results you expect to get

The experiment plan should not be longer than half a page, and may be much shorter.

The project will culminate in a project report of at least four pages, not including references.
The project report should be formatted similarly to a workshop paper, and should use the ICML 2017 style or a similar style.
An abstract for the report is due on Monday, November 27, and we will discuss the abstracts in class on that day.
The final project report is due on Wednesday, December 6.

Course Calendar

Wednesday, August 23

No in-person lecture. I am traveling this week. Do not go to the lecture room. No one will be there.