Learning in undirected graphical models of large treewidth is
difficult because of the hard inference problem induced by the
partition function for maximum likelihood learning, or by finding the
MAP assignment for margin-based loss functions. Over the last decade,
there has been considerable progress in developing algorithms for
approximating the partition function and MAP assignment, both via
variational approaches (e.g., belief propagation) and sampling
algorithms (e.g., MCMC). More recently, researchers have begun to
apply these methods to learning large, densely-connected undirected
graphical models that may contain millions of parameters. A notable
example is the learning of Deep Belief Networks and Deep Boltzmann
Machines, which employ MCMC-based strategies to greedily learn deep
hierarchical models.
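
To make the source of the difficulty concrete, here is a minimal sketch (toy parameters, all values illustrative) of exact partition-function and MAP computation for a tiny pairwise binary MRF by brute force; both require enumerating all 2^n joint states, which is exactly what becomes infeasible at large treewidth:

```python
import itertools
import math

# Tiny pairwise binary MRF on a 4-node cycle; hypothetical uniform coupling.
n = 4
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
coupling = 0.5

def unnormalized_log_prob(x):
    # Sum of pairwise potentials theta * x_i * x_j, with x_i in {-1, +1}.
    return sum(coupling * x[i] * x[j] for i, j in edges)

# Partition function: brute-force sum over all 2^n joint configurations.
Z = sum(math.exp(unnormalized_log_prob(x))
        for x in itertools.product([-1, 1], repeat=n))

# MAP assignment: brute-force maximization over the same exponential space.
x_map = max(itertools.product([-1, 1], repeat=n), key=unnormalized_log_prob)
```

Approximate inference replaces these two exponential-time enumerations with variational bounds or MCMC estimates, which is what makes learning in such models possible at all.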

The goal of this workshop is to assess the current state of the field
and explore new directions in both theoretical foundations and
empirical applications. In particular, we shall be interested in
discussing the following topics:

State of the field:
What are the existing methods and what is the
relationship between them? Which problems can be solved using existing
algorithms and which cannot?

The use of approximate inference in learning:
There are many algorithms for approximate inference, and in principle
any of them can be plugged into a learning algorithm. What are the
relative merits of one approximation versus another (e.g., an MCMC
approximation versus a variational one)? Are there effective combined
strategies?

Learning with latent variables:
Graphical models with latent (or
hidden) variables often possess more expressive power than models with
only observed variables. However, introducing hidden variables makes
learning far more difficult. Can we develop better optimization and
approximation techniques that would allow us to learn parameters in
such models more efficiently?

Learning in models with deep architectures:
Recently, there has
been notable progress in learning deep probabilistic models, including
Deep Belief Networks and Deep Boltzmann Machines, that contain many
layers of hidden variables and millions of parameters. The success of
these models heavily relies on the greedy layer-by-layer unsupervised
learning of a densely-connected undirected model called a Restricted
Boltzmann Machine (RBM). Can we develop efficient and more accurate
learning algorithms for RBMs and deep multilayer generative models?
How can learning be extended to the semi-supervised setting and made
more robust to highly ambiguous or missing inputs? What sort of
theoretical guarantees can be obtained for such greedy learning
schemes?
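
The layer-wise building block mentioned above can be sketched compactly. Below is a minimal, illustrative CD-1 (one-step contrastive divergence) update for a binary RBM; the sizes, learning rate, random data, and the omission of bias terms are all simplifying assumptions, not a definitive implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden, lr = 6, 3, 0.1  # hypothetical toy sizes and step size
W = 0.01 * rng.standard_normal((n_visible, n_hidden))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W):
    # Positive phase: hidden probabilities and a sample given the data v0.
    h0_prob = sigmoid(v0 @ W)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)
    # Negative phase: one Gibbs step back to a visible reconstruction.
    v1_prob = sigmoid(h0 @ W.T)
    h1_prob = sigmoid(v1_prob @ W)
    # CD-1 approximation to the log-likelihood gradient:
    # data statistics minus (one-step) model statistics.
    grad = np.outer(v0, h0_prob) - np.outer(v1_prob, h1_prob)
    return W + lr * grad

v0 = rng.integers(0, 2, size=n_visible).astype(float)  # fake training case
W = cd1_step(v0, W)
```

Greedy deep learning stacks such updates: once one RBM is trained, its hidden activations become the "data" for the next layer. The theoretical status of this truncated-MCMC gradient is precisely one of the open questions raised above.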

Scalability and success in real-world applications:
How well do
existing approximate learning algorithms scale to large-scale problems
in computer vision, bioinformatics, natural language processing, and
information retrieval? How well do these algorithms perform when
applied to modeling high-dimensional real-world distributions (e.g.,
the distribution of natural images)?

Theoretical Foundations:
What are the theoretical guarantees of these
learning algorithms (e.g., accuracy of the learned parameters relative
to the best possible, or asymptotic guarantees such as almost sure
convergence to the maximum likelihood estimator)? What are the
tradeoffs between running time and accuracy?

Loss functions:
In the supervised learning setting, two popular
loss functions are log-loss (e.g., in conditional random fields) and
margin-based loss (e.g., in maximum margin Markov networks). In
intractable models these approaches result in rather different
approximation schemes (since the former requires partition function
estimation, whereas the latter only requires MAP estimates). What can
be said about the differences between these schemes? When is one model
more appropriate than the other? Can margin-based models be applied in
the unsupervised case?
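
The computational contrast drawn above can be seen even in a toy multiclass example with hypothetical scores: the log-loss requires a sum over all outputs (a partition function), while the structured hinge loss only requires the single best loss-augmented output (a MAP computation):

```python
import math

# Hypothetical model scores s(x, y) for three candidate outputs.
scores = {"A": 2.0, "B": 1.2, "C": -0.5}
y_true = "A"

# Log-loss (CRF-style): -s(x, y) + log sum_{y'} exp(s(x, y')).
# The log-sum-exp is a partition function over all outputs.
log_Z = math.log(sum(math.exp(s) for s in scores.values()))
log_loss = -scores[y_true] + log_Z

# Margin loss (max-margin-style, with 0/1 task loss Delta):
# max_{y'} [s(x, y') + Delta(y', y)] - s(x, y).
# Only the maximizing (loss-augmented MAP) output is needed.
hinge_loss = max(s + (0.0 if y == y_true else 1.0)
                 for y, s in scores.items()) - scores[y_true]
```

In structured models with exponentially many outputs, this difference determines which approximate inference routine the learner must call at every gradient step.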

Structure vs. accuracy:
Which graph structures are more amenable to
approximations and why? How can structure learning be combined with
approximate learning to yield models that are both descriptive and
learnable with good accuracy?

This workshop aims to build on the success of the earlier NIPS
workshop "Approximate inference - how far have we come?", but with a
specific emphasis on the learning aspect. Through a series of invited
talks and a panel discussion, the workshop is intended to bring
together machine learning researchers working on approximate inference
in learning, assess the current state of the field, discuss key
challenges, and identify promising directions for future
investigation.