Talk abstract: Gregoire Montavon – Deep neural networks, and what they have learned

Deep neural networks are complex nonlinear functions that come with an efficient learning procedure. They can convert large amounts of data into highly predictive models, which has made them the state of the art in applications where data is plentiful, such as classification of images and texts from the Internet. In many scientific applications, however, data is limited. This makes the learning procedure more prone to overfitting and also favors the emergence of biases in the dataset; as a result, prediction accuracy can drop drastically. A good understanding of these effects, and of techniques to mitigate them, is therefore crucial for practical success. Recently, techniques have been developed to look into a trained deep neural network and verify whether it applies correct reasoning to arrive at its predictions. This can be used to validate a trained model efficiently, beyond its test-set accuracy. Furthermore, once the model has been properly trained and validated, the same techniques can be used to mine the deep network for new insights about the modeled task. The talk will start with an introduction to deep neural networks and how to train them. The problem of limited data will then be discussed, and standard approaches to mitigating it will be presented. The second part of the talk will present recent techniques for interpreting and explaining neural network predictions, and show how these techniques can be used for model validation and for learning from the model.
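The abstract does not specify which mitigation techniques the talk covers. As a minimal, illustrative sketch of one standard approach to overfitting in the limited-data regime, the snippet below trains a tiny one-hidden-layer network by gradient descent with L2 weight decay added to the gradients; the network, data, and all hyperparameters are hypothetical and chosen only for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: few samples, to mimic the limited-data regime.
X = rng.normal(size=(20, 5))
y = np.sin(X[:, :1])

# One-hidden-layer network with tanh activation.
W1 = rng.normal(scale=0.1, size=(5, 16))
W2 = rng.normal(scale=0.1, size=(16, 1))

lam = 1e-3  # weight-decay strength (L2 regularization)
lr = 0.1    # learning rate

def forward(X, W1, W2):
    H = np.tanh(X @ W1)
    return H, H @ W2

_, pred0 = forward(X, W1, W2)
mse_init = float(np.mean((pred0 - y) ** 2))

for step in range(500):
    H, pred = forward(X, W1, W2)
    err = pred - y
    # Gradients of the half-mean-squared-error loss, plus weight decay.
    gW2 = H.T @ err / len(X) + lam * W2
    gW1 = X.T @ ((err @ W2.T) * (1 - H**2)) / len(X) + lam * W1
    W1 -= lr * gW1
    W2 -= lr * gW2

_, pred = forward(X, W1, W2)
mse = float(np.mean((pred - y) ** 2))
```

The decay term shrinks weights toward zero at every step, which limits the model's capacity to memorize the few available samples; other standard mitigations (data augmentation, early stopping, dropout) follow the same spirit of constraining the fit.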

With the availability of large databases and recent improvements in methodology, the performance of deep neural networks is reaching or even exceeding the human level on an increasing number of complex tasks. Impressive examples of this development can be found in domains such as image classification, sentiment analysis, speech understanding, or strategic game playing. However, because of their nested nonlinear structure, these highly successful models are usually applied in a black-box manner, i.e., no information is provided about what exactly makes them arrive at their predictions. Since this lack of transparency can be a major drawback, e.g., in medical applications, the development of methods for visualizing, explaining, and interpreting deep learning models has recently attracted increasing attention. This talk presents a principled technique for explaining the decisions of a deep network and discusses various applications of the method.
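The abstract does not name the explanation technique. As a non-authoritative sketch in the spirit of relevance-propagation explanations (e.g., the epsilon rule of layer-wise relevance propagation, a family of methods associated with the speaker's group), the snippet below redistributes one output score of a small ReLU network back onto its inputs, layer by layer; the random weights and all values are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# A small two-layer ReLU network (random weights stand in for a trained model).
W1 = rng.normal(size=(4, 8))
W2 = rng.normal(size=(8, 3))
x = rng.normal(size=4)

# Forward pass, keeping intermediate activations for the backward redistribution.
a1 = np.maximum(0.0, x @ W1)
out = a1 @ W2

def lrp_epsilon(a, W, R, eps=1e-6):
    """Redistribute relevance R from a layer's output onto its input."""
    z = a @ W                                   # pre-activations
    z = z + eps * np.where(z >= 0, 1.0, -1.0)   # stabilizer avoids division by zero
    s = R / z
    return a * (W @ s)                          # relevance of each input unit

# Explain the highest-scoring output: place all relevance on it.
R_out = np.zeros(3)
R_out[np.argmax(out)] = out.max()

R_hidden = lrp_epsilon(a1, W2, R_out)
R_input = lrp_epsilon(x, W1, R_hidden)
```

A key property of such rules is (approximate) conservation: the input relevances sum to the explained output score, so each input dimension receives a signed share of the decision.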

Talk abstract: Maurizio Filippone – Deep Gaussian Processes

Drawing meaningful conclusions about the way complex real-life phenomena work, and being able to predict the behavior of systems of interest, requires developing accurate and highly interpretable mathematical models whose parameters need to be estimated from observations. In modern applications, however, we are often challenged by the lack of such models, and even when these are available they are too computationally demanding to be suitable for standard parameter optimization/inference methods. While probabilistic models based on Deep Gaussian Processes (DGPs) offer attractive tools to tackle these challenges in a principled way and allow for a sound quantification of uncertainty, carrying out inference for these models poses huge computational challenges that arguably hinder their wide adoption. In this talk, I will present our contribution to the development of practical and scalable inference for DGPs, which can exploit distributed and GPU computing. In particular, after discussing the key challenges in DGP research and drawing connections with deep neural networks, I will introduce a formulation of DGPs based on random features that we infer using stochastic variational inference. Through a series of experiments, I will illustrate how our proposal enables scalable deep probabilistic nonparametric modeling and significantly advances the state of the art in inference methods for DGPs.
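The random-feature idea underlying such formulations can be illustrated in a single-layer setting (this is only a building-block sketch, not the talk's actual DGP construction): random Fourier features turn an RBF Gaussian process kernel into an explicit finite feature map, so that inner products of features approximate kernel evaluations and scale linearly in the number of data points.

```python
import numpy as np

rng = np.random.default_rng(2)

def rbf_kernel(X, Y, lengthscale=1.0):
    """Exact RBF (squared-exponential) kernel matrix."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def random_features(X, Omega, b):
    """Random Fourier feature map: phi(x) = sqrt(2/D) * cos(Omega^T x + b)."""
    D = Omega.shape[1]
    return np.sqrt(2.0 / D) * np.cos(X @ Omega + b)

d, D = 3, 5000          # input dimension, number of random features
lengthscale = 1.0
Omega = rng.normal(scale=1.0 / lengthscale, size=(d, D))  # spectral frequencies
b = rng.uniform(0.0, 2 * np.pi, size=D)                   # random phases

X = rng.normal(size=(10, d))
K_exact = rbf_kernel(X, X, lengthscale)
K_approx = random_features(X, Omega, b) @ random_features(X, Omega, b).T

max_err = float(np.abs(K_exact - K_approx).max())
```

Because the approximation error shrinks as the number of features D grows, the feature weights can be treated as ordinary model parameters, which is what makes stochastic variational inference and GPU/distributed computation applicable when such layers are stacked.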