Foundations of Deep Learning :
The advent of big data and powerful computers has made deep learning algorithms the current method of choice for a host of machine learning problems. Over the last few years, deep learning systems have outperformed the previous state-of-the-art systems by a large margin in tasks as diverse as speech recognition, image classification, and object detection.
Deep architectures are composed of multiple levels of non-linear operations, such as in neural nets with many hidden layers. Searching the parameter space of deep architectures is a difficult task, but learning algorithms such as those for Deep Belief Networks have recently been proposed to tackle this problem with notable success. This course will discuss the motivations and principles regarding learning algorithms for deep architectures, starting from the unsupervised learning of single-layer models such as Restricted Boltzmann Machines, and moving on to learning deeper models such as Deep Belief Networks.
The course consists of eight lectures (3h each) and a lab section.

Foundations of Machine Learning (M1) :
Machine learning is essentially the intersection between statistics and computation, though its principles have been rediscovered in many different traditions, including artificial intelligence, Bayesian statistics, and frequentist statistics. This course gives an overview of the most important trends in machine learning, with a particular focus on statistical risk and its minimization with respect to a prediction function. A substantial lab section involves group projects on data science competitions and gives students the ability to apply the course theory to real-world problems.
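
As a sketch of the "statistical risk" mentioned above, the risk of a prediction function and its empirical counterpart are commonly written as follows. The notation is an assumption made here for illustration and is not taken from the course: a loss $\ell$, a data distribution $P$, a sample $(x_i, y_i)_{i=1}^{n}$, and a class of prediction functions $\mathcal{F}$.

```latex
R(f) = \mathbb{E}_{(X,Y)\sim P}\big[\ell(f(X), Y)\big], \qquad
\hat{R}_n(f) = \frac{1}{n}\sum_{i=1}^{n} \ell\big(f(x_i), y_i\big), \qquad
\hat{f}_n \in \arg\min_{f \in \mathcal{F}} \hat{R}_n(f).
```

Minimizing $\hat{R}_n$ over $\mathcal{F}$ stands in for minimizing the risk $R$, which cannot be computed because $P$ is unknown.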

Foundations of Signal Processing & Sparse Coding (M1) :
This class will introduce the mathematical concepts and techniques needed for a solid understanding of the fundamental principles of linear signal processing, as well as recent research on nonlinear signal processing, with a focus on sparse coding. Starting with the fundamentals of linear signal processing, we will see how the main notions of Fourier transforms can be understood in terms of a change of basis, and use this intuition to present both continuous- and discrete-time signal processing. Moving on from the harmonic basis, we will then cover the basics of overcomplete bases, time-frequency analysis and wavelets. This will lead us to techniques developed around sparse coding with overcomplete dictionaries, involving optimization with sparsity-inducing norms and dictionary learning.
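
The change-of-basis view of the Fourier transform can be illustrated with a minimal sketch in plain Python. The function names (`dft_basis`, `inner`, `analyze`, `synthesize`) and the 8-sample cosine are hypothetical choices made for this example, not material from the class. The discrete Fourier coefficients are simply the coordinates of the signal in an orthonormal basis of complex exponentials.

```python
import cmath
import math

def dft_basis(n):
    """Return the n orthonormal harmonic basis vectors of C^n, one per row."""
    scale = 1.0 / math.sqrt(n)
    return [[scale * cmath.exp(2j * math.pi * k * t / n) for t in range(n)]
            for k in range(n)]

def inner(u, v):
    """Hermitian inner product <u, v> = sum_t u[t] * conj(v[t])."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def analyze(signal, basis):
    """Coordinates of the signal in the given orthonormal basis."""
    return [inner(signal, b) for b in basis]

def synthesize(coeffs, basis):
    """Rebuild the signal as a linear combination of the basis vectors."""
    n = len(basis)
    return [sum(c * b[t] for c, b in zip(coeffs, basis)) for t in range(n)]

# Illustrative signal: one period of a cosine sampled at 8 points.
n = 8
basis = dft_basis(n)
signal = [math.cos(2 * math.pi * t / n) for t in range(n)]
coeffs = analyze(signal, basis)          # change of basis: time -> frequency
reconstruction = synthesize(coeffs, basis)  # change back: frequency -> time
```

In this example only the coefficients at frequencies 1 and 7 are non-zero, and synthesizing from the coefficients recovers the original samples up to rounding error. The same two functions would work with any other orthonormal basis.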

Foundations of Discrete Optimization (M1) :
Discrete optimization is concerned with the subset of optimization problems where some or all of the variables are confined to take a value from a discrete set. In this course, we will study the fundamental concepts of discrete optimization such as greedy algorithms, dynamic programming and min-max relationships. Each concept will be illustrated using well-known problems such as shortest paths, minimum spanning tree, min-cut, max-flow and bipartite matching. We will also identify which problems are easy and which problems are hard, and briefly discuss how to obtain an approximate solution to hard problems.

Foundations of Neural Information Processing :
Neural information processing is the study of computational systems for data understanding. It covers a range of techniques including statistical learning theory, information theory, graphical models, and non-linear and discrete optimization, as well as their application to important prediction problems facing science and industry. Summarizing some of the major results of the machine learning research community of the past few decades, as well as their interrelationships, this course covers fundamental techniques that can be applied to a wide variety of real-world problems.

Foundations of Geometric Methods in Data Analysis :
Data analysis is the process of cleaning, transforming, modeling or comparing data, in order to infer useful information and gain insights into complex phenomena. From a geometric perspective, when an instance (a physical phenomenon, an individual, etc.) is given as a fixed-size collection of real-valued observations, it is naturally identified with a geometric point having these observations as coordinates. This course reviews fundamental constructions related to the manipulation of such point clouds, mixing ideas from computational geometry and topology, statistics, and machine learning. The emphasis is on methods that not only come with theoretical guarantees, but also work well in practice. In particular, software references and example datasets will be provided to illustrate the constructions.

Foundations of Polyhedral Combinatorial Optimization :
Polyhedral techniques have emerged as one of the most powerful tools to analyse and solve combinatorial optimization problems. Broadly speaking, combinatorial optimization problems can be formulated as integer linear programs. In this course, we will study the fundamental concepts of polyhedral techniques such as totally unimodular matrices, matroids and submodular functions. Each concept will be illustrated using well-known problems such as bipartite matching, min-cut, max-flow and minimum spanning tree. The course is divided into two parts. In the first part, we will study easy problems (those that admit efficient optimal algorithms). We will use polyhedral techniques to explain why these problems are easy. In the second part, we will study hard problems (specifically, NP-hard problems). We will use polyhedral techniques to obtain provably accurate approximate solutions for various hard problems.

Foundations of Large Scale & Distributed Optimization :
In a wide range of application fields (inverse problems, machine learning, computer vision, data analysis, networking...), large scale optimization problems need to be solved. The objective of this course is to introduce the theoretical background that makes it possible to develop efficient algorithms for these problems by taking advantage of modern multicore or distributed computing architectures. This course will focus mainly on nonlinear optimization tools for dealing with convex problems. Proximal tools, splitting techniques and Majorization-Minimization strategies, which are now very popular for processing massive datasets, will be presented. Illustrations of these methods on various application examples will be provided.
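
To give a flavour of what a proximal tool looks like, here is a minimal sketch in plain Python. It is an assumption-laden toy example, not course material. It minimizes 0.5 * ||x - b||^2 + lam * ||x||_1 by alternating a gradient step on the smooth term with the proximal operator of the l1 norm (soft-thresholding). The function names, the vector `b`, and the values of `lam`, `step` and `iters` are all hypothetical.

```python
def soft_threshold(v, tau):
    """Proximal operator of tau * |.| applied to a scalar v."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0

def proximal_gradient(b, lam, step=0.5, iters=100):
    """Minimize 0.5 * ||x - b||^2 + lam * ||x||_1 by proximal gradient steps."""
    x = [0.0] * len(b)
    for _ in range(iters):
        # Gradient of the smooth term 0.5 * ||x - b||^2 is (x - b).
        grad = [xi - bi for xi, bi in zip(x, b)]
        # Gradient step, then prox of (step * lam) * |.| on each coordinate.
        x = [soft_threshold(xi - step * gi, step * lam)
             for xi, gi in zip(x, grad)]
    return x

# Illustrative data: the small entry is shrunk to exactly zero.
solution = proximal_gradient([3.0, -0.5, -4.0], lam=1.0)
```

For this separable toy problem the minimizer is known in closed form (soft-thresholding of `b` at level `lam`), so the iterates approach [2.0, 0.0, -3.0]. Each coordinate is updated independently of the others, which is one reason such schemes are attractive on multicore or distributed hardware.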