Geoffrey J. Gordon

I'm a professor in the Machine Learning Department at
Carnegie Mellon. I am also
affiliated with the Robotics
Institute. I'm interested in multi-agent planning,
reinforcement learning, decision-theoretic planning, statistical
models of difficult data (e.g. maps, video, text), computational
learning theory, and game theory. My group is called the SELECT
lab (for SEnse, LEarn, and aCT). Here is
its mailing
list.

I spent AY 2003-4 as a visiting professor at the Stanford Robotics Lab.
Before joining CMU I worked for Burning Glass Technologies, a
company that provided intelligent searching and matching software for
resumes and job postings. The company was headquartered in San
Diego, but I worked at its Pittsburgh office.

You can email me with user_ID@cs.cmu.edu, where user_ID is part of the URL for this page.

Teaching

In Fall 2017 I am teaching 10-606 (mini 1) and 10-607 (mini 2), Math Background for ML I & II. This is the same course as 10-600 from Fall 2016, renumbered because CMU's registration system prefers different numbers for the two minis.

Optimization

A very simple implementation of an infeasible interior-point method
for linear and convex quadratic programs, as a Matlab
M-file, and an example of its use.
I also have a slightly more sophisticated
implementation (also in Matlab). If you have access to Matlab's
quadprog, I'd recommend using that instead; I wrote these before I had
access to quadprog.
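To give a flavor of what such a method does, here is a toy primal-dual interior-point iteration in Python (a sketch, not the Matlab M-file above) for the special case min 0.5 x'Qx + c'x subject to x >= 0. All parameter choices (the centering parameter, the fraction-to-the-boundary factor) are illustrative defaults.

```python
import numpy as np

def qp_interior_point(Q, c, tol=1e-8, max_iter=50):
    """Toy primal-dual interior-point method for min 0.5 x'Qx + c'x, x >= 0.

    Starts at the (generally infeasible) point x = z = 1 and takes damped
    Newton steps on the perturbed KKT system:
        Qx + c - z = 0,   x_i z_i = sigma * mu,   x, z > 0.
    """
    n = len(c)
    x, z = np.ones(n), np.ones(n)        # primal iterate and multipliers
    for _ in range(max_iter):
        r_d = Q @ x + c - z              # dual (stationarity) residual
        mu = x @ z / n                   # duality measure
        if np.linalg.norm(r_d) < tol and mu < tol:
            break
        sigma = 0.1                      # centering parameter
        # Eliminate dz from the Newton system:
        #   (Q + diag(z/x)) dx = -r_d + (sigma*mu - x*z)/x
        dx = np.linalg.solve(Q + np.diag(z / x),
                             -r_d + (sigma * mu - x * z) / x)
        dz = (sigma * mu - x * z - z * dx) / x
        # Fraction-to-the-boundary rule keeps x and z strictly positive.
        alpha = 1.0
        for v, dv in ((x, dx), (z, dz)):
            neg = dv < -1e-16
            if neg.any():
                alpha = min(alpha, 0.99 * np.min(-v[neg] / dv[neg]))
        x, z = x + alpha * dx, z + alpha * dz
    return x

# min 0.5*(x1^2 + x2^2) - x1 + x2 over x >= 0; the solution is (1, 0).
x = qp_interior_point(np.eye(2), np.array([-1.0, 1.0]))
```

The elimination step is the usual reduction of the 2x2 block Newton system to a single positive definite solve, which is where real implementations spend their time.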

Support vector machines are an interesting use of optimization, and
there is some interior point code for learning SVMs on
my SVM page. This is not a particularly good
way to optimize SVMs, and perhaps not the best interior-point
implementation, but it may be an interesting example.
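As a sketch of what "learning an SVM by solving a QP" means: the snippet below solves the soft-margin SVM dual on a toy 2-D dataset and recovers the primal weights. It uses scipy's generic SLSQP solver rather than the interior-point code on the SVM page, and the data and penalty C are made up for illustration.

```python
import numpy as np
from scipy.optimize import minimize

# Toy linearly separable data; C and the solver choice are illustrative.
X = np.array([[-2.0, 0.0], [-1.0, -1.0], [1.0, 1.0], [2.0, 0.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
C = 10.0

H = (y[:, None] * y[None, :]) * (X @ X.T)    # linear kernel, label-signed

def neg_dual(a):
    """Negated SVM dual: minimize 0.5 a'Ha - sum(a) over 0 <= a <= C, a'y = 0."""
    return 0.5 * a @ H @ a - a.sum()

res = minimize(neg_dual, np.zeros(len(y)), method="SLSQP",
               bounds=[(0.0, C)] * len(y),
               constraints={"type": "eq", "fun": lambda a: a @ y})
alpha = res.x
w = (alpha * y) @ X                          # primal weights from the dual
sv = int(np.argmax(alpha))                   # a support vector
b = y[sv] - w @ X[sv]                        # its margin is exactly 1
pred = np.sign(X @ w + b)
```

The dual is a box-constrained QP with one equality constraint, which is why QP machinery (interior-point or otherwise) applies directly.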

Reinforcement learning

A (very partial) annotated bibliography on
robot learning via MDPs and related methods. I made this as an
initial cut at readings our multirobot planning group might want to go
over.

Notes on conditioning (the dogs and bells kind), in PostScript (50k, 20 slides) or PDF (80k).

Lecture notes for an intro to reinforcement learning, in PostScript or PDF (215k, 43 slides).

Others

A tutorial
on machine learning for educational data that Emma Brunskill and I
gave at NIPS 2012 (or, direct link to the video).

Software for tracking
dots in images. This is a useful primitive for some types of
computational biology experiments: fluorescently tag something, take
pictures of it, and track how it moves. This software isn't very
polished, but we couldn't find anything out there for the purpose; so
we wrote this, and some friends of mine used it to help with the data
for one of the papers
below.
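For a sense of the core primitive, here is a minimal greedy nearest-neighbour linker: each active track is extended by the closest unclaimed detection in the next frame. This is only an illustrative stand-in for the software above; real trackers also handle detection gaps, merges/splits, and appearance cues.

```python
import numpy as np

def link_dots(frames, max_dist=5.0):
    """Greedy nearest-neighbour linking of dot detections across frames.

    `frames` is a list of (n_i, 2) arrays of dot centroids, one per image.
    A track claims the nearest unclaimed detection within max_dist;
    leftover detections start new tracks.
    """
    frames = [np.asarray(f, dtype=float) for f in frames]
    tracks = [[tuple(p)] for p in frames[0]]
    active = list(range(len(tracks)))        # track id for each current dot
    for pts in frames[1:]:
        claimed, still_active = set(), []
        for t in active:
            d = np.linalg.norm(pts - np.array(tracks[t][-1]), axis=1)
            if claimed:
                d[list(claimed)] = np.inf    # each detection used once
            j = int(np.argmin(d))
            if d[j] <= max_dist:
                tracks[t].append(tuple(pts[j]))
                claimed.add(j)
                still_active.append(t)
        for j in range(len(pts)):            # unmatched detections: new tracks
            if j not in claimed:
                tracks.append([tuple(pts[j])])
                still_active.append(len(tracks) - 1)
        active = still_active
    return tracks

# Two dots drifting across three frames yield two three-point tracks.
frames = [[(0, 0), (10, 10)], [(1, 0), (10, 11)], [(2, 0), (10, 12)]]
tracks = link_dots(frames)
```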

A simple tutorial on the Common LISP
language, written as class material for the AI core course at CMU.

Some publications

This list is approximately in reverse chronological order.
Some of my publications are also available from the CMU SCS tech reports
archive or from arXiv.

2018

Ahmed Hefny, Carlton Downey, Geoffrey Gordon. An Efficient, Expressive and Local Minima-free Method for Learning Controlled Dynamical Systems. In AAAI-18. (sorry, no link yet) (an earlier version of this work was presented at CoRL 2017)

2017

Carlton Downey, Ahmed Hefny, Byron Boots, Geoffrey Gordon, and Boyue Li. Predictive State Recurrent Neural Networks. In Advances in Neural Information Processing Systems (NIPS), 2017. (an earlier version of this work was presented at CoRL 2017; see also the arXiv version below)

Automated Image Analysis of Protein Localization in Budding
Yeast. ISMB/ECCB 2007. With Sam Chen, Ting Zhao, and Bob
Murphy. This paper also appeared in the journal Bioinformatics.
(the code from this paper is available here)

Francisco Pereira
and Geoff Gordon. The
Support Vector Decomposition Machine. To appear in ICML,
2006. (PDF, 8 pages) (there is also an extension of the SVDM
algorithm to be more SVM-like; it was published and presented at the
2006 workshop on Bioimage Informatics at UCSB, and was also the
subject of a talk at the 2006 NIPS workshop on Novel Applications of
Dimensionality Reduction, but I don't have a link up yet)

Geoffrey J. Gordon. No-regret algorithms for
structured prediction problems. Tech report
CMU-CALD-05-112. (45 pages, PDF; or try gzipped
postscript.) This is the tech report version of my paper on
Lagrangian Hedging algorithms, which are for online learning in
problems with structured hypothesis and/or output spaces. This
file replaces an earlier draft which had been available on this
website.

The tech report
version of our paper on ARA* (Anytime Repairing A*), with Maxim Likhachev and Sebastian Thrun.
Describes an anytime modification of A* search which produces a
suboptimal solution quickly, then repeatedly repairs the plan until it
runs out of search time or proves optimality. This version
contains the full proofs of correctness for the algorithm. (2.5M
PDF, 26 pages, CMU-CS-03-148)

My NIPS-02 paper with Nick
Roy, Exponential Family PCA
for Belief Compression in POMDPs. It describes a way to find
structure in robot belief states and take advantage of that structure
for planning. Belief states are probability distributions over
physical states, and therefore high-dimensional. In order to
plan, we must reduce the high-dimensional representation to a
lower-dimensional one; so, we applied a nonlinear component analysis
algorithm to find the low-dimensional features which allow us to
reconstruct our belief most accurately in KL-divergence. (66k
gzipped postscript, 8 pages; or try PDF)
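A minimal sketch of the idea, under simplifying assumptions: each belief row B_i is modeled as softmax(U_i @ V), where U gives k-dimensional features, and the fit minimizes total KL divergence. The paper uses an alternating procedure; this toy version just does joint gradient descent, and the data, step size, and iteration count are illustrative.

```python
import numpy as np

def softmax_rows(T):
    """Row-wise softmax: each row becomes a probability distribution."""
    P = np.exp(T - T.max(axis=1, keepdims=True))
    return P / P.sum(axis=1, keepdims=True)

def compress_beliefs(B, k, steps=2000, lr=0.05, seed=0):
    """Fit B_i ~ softmax(U_i @ V) by gradient descent on total KL divergence.

    The gradient of sum_i KL(B_i || softmax(T_i)) with respect to T = U @ V
    is simply softmax(T) - B, which the chain rule routes into U and V.
    """
    rng = np.random.default_rng(seed)
    m, n = B.shape
    U = 0.01 * rng.standard_normal((m, k))   # low-dimensional belief features
    V = 0.01 * rng.standard_normal((k, n))   # shared basis in logit space
    for _ in range(steps):
        G = softmax_rows(U @ V) - B
        U, V = U - lr * G @ V.T, V - lr * U.T @ G
    return U, softmax_rows(U @ V)

# Beliefs with exact rank-2 structure in logit space, for illustration.
rng = np.random.default_rng(1)
B = softmax_rows(rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5)))
U, P = compress_beliefs(B, k=2)
```

The point of measuring reconstruction in KL-divergence rather than squared error is that the reconstructions stay valid probability distributions, which is what a POMDP planner needs.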

My UAI-02 paper, Distributed
planning in hierarchical factored MDPs (with Carlos
Guestrin). It describes a way to decompose a large MDP with
factored dynamics into several smaller MDPs that run in parallel and
are coupled by constraints, and provides a principled distributed
planning algorithm based on this intuition. (384k gzipped
postscript, 10 pages) (or try 458k PDF)

My COLT-99 paper, Regret bounds for prediction
problems, which proves worst-case performance bounds for some
widely-used learning algorithms (379k, 12 pages). You can also
download some slides (493k, 37
pages). Chapter 3 of my thesis is a slightly longer and more
recent presentation of the material in this paper.
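One of the widely-used algorithms such regret bounds cover is Hedge (multiplicative weights). The demo below, with made-up random losses, shows the guarantee concretely: against any loss sequence in [0, 1], total regret grows only as O(sqrt(T log n)), so average regret vanishes.

```python
import numpy as np

def hedge_regret(losses, eta):
    """Run Hedge (exponential weights) and return its regret: the learner's
    total expected loss minus the total loss of the best single expert.

    losses is a (T, n) array of per-round expert losses in [0, 1]."""
    T, n = losses.shape
    w = np.ones(n)
    total = 0.0
    for loss in losses:
        p = w / w.sum()                  # play experts with these probabilities
        total += p @ loss                # expected loss this round
        w *= np.exp(-eta * loss)         # downweight experts that did badly
    return total - losses.sum(axis=0).min()

rng = np.random.default_rng(0)
T, n = 2000, 5
losses = rng.random((T, n))              # arbitrary bounded loss sequence
regret = hedge_regret(losses, eta=np.sqrt(8 * np.log(n) / T))
```

With this choice of eta the standard analysis bounds the regret by sqrt(T ln(n) / 2), about 40 here, i.e. an average regret of about 0.02 per round.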

The online proceedings
of the workshop on modelling in reinforcement learning, held at ICML-97,
co-organized with Chris Atkeson.

Online
Fitted Reinforcement Learning from the Value Function Approximation
workshop at ML-95: an addendum to the above two papers which extends
some of their techniques to online Markov decision problems.
(81k, 3 pages) (or try PDF)

My NIPS-95 paper. The previous two papers (the one from the ML-95
VFA workshop and the SARSA example) are more recent and cover the same
topics. So, this paper is mostly obsolete. If you want it
anyway, you can click here. (56k, 7
pages)