On the spirit of NIPS 2015 and OpenAI

I just came back from NIPS 2015, which was a clear success in terms of numbers. Note that this growth is not all because of deep learning: only about 10% of the papers were on this topic, which is about double the number on convex optimization, for example.

In this post I want to talk about some of the new emerging directions that the NIPS community is taking. Of course my view is completely biased, as I am more representative of COLT than NIPS (though obviously the two communities have a large overlap). Also, I only looked in detail at about 25% of the papers, so perhaps I missed the juiciest breakthrough. In any case, below you will find a short summary of each of these new directions with pointers to some of the relevant papers. Before going into the fun math I wanted to first share some thoughts about yesterday’s big announcement.

Thoughts about OpenAI

Obvious disclaimer: the opinions expressed here are my own and not those of my employer (or previous employer hosting this blog). Now, for those of you who missed it, yesterday Elon Musk and friends made a huge announcement: they are giving $1 billion to create a non-profit organization whose goal is the advancement of AI (see here for the official statement, and here for the New York Times coverage). This is just absolutely wonderful news, and I really feel like we are watching history in the making. There are very, very few places in the world solely dedicated to basic research and with that kind of money. Examples are useful to get some perspective: the Perimeter Institute for Theoretical Physics was funded with $100 million (I believe it has a major impact on the field), the Institute for Advanced Study was funded with a gift of similar size (a simple statistic gives an idea of the impact: 41 out of 57 Fields medalists have been affiliated with IAS), and more recently, and perhaps closer to us, the Simons Institute for the Theory of Computing was created with $60 million and its influence on the field keeps growing (it was certainly a very influential place in my own career). Looking at what those places are doing with 1/10 of OpenAI’s budget sets the bar extremely high for OpenAI, and I am very excited to see what direction they take and what their long-term plans are!

Now let’s move on to what worries me a little: the 10 founding members of OpenAI are all working on deep learning. Before explaining further why this is worrisome, let me emphasize that I strongly believe that disentangling the mysteries behind the impressive practical successes of deep nets is a key challenge for the future of AI (in fact I am spending a good amount of time thinking about this issue, just like many other groups in theoretical machine learning these days). I also believe that pushing the engineering aspect of deep nets will lead to wonderful technological breakthroughs, which is why it makes sense for companies such as Facebook, Google, Baidu, Microsoft, and Amazon to invest heavily in this endeavor. However, it seems insane to think that the current understanding of deep nets will be sufficient to achieve even very weak forms of AI. AI is still far from being an engineering problem, and there are some fundamental theoretical questions that have to be resolved before we can brute-force our way through this problem. In fact, the mission statement of OpenAI mentions one such fundamental question about which we know very little: currently we build systems that solve one task (e.g., image segmentation), but how do we combine these systems so that they take advantage of each other and help improve the learning of future tasks? While one can cook up heuristics to attack this problem (such as using the learned weights for one task as the initialization for another one), it seems clear to me that we are lacking the mathematical framework and tools to think properly about this question. I don’t think that deep learners are the best positioned to make conceptual progress on this question (and similar ones), though I definitely admit that they are probably the best positioned right now to make some practical progress. Again, this is why all big companies are investing in this, but for an institution that wants to look into the more distant future it seems critical to diversify the portfolio (in fact this is exactly what Microsoft Research does) and not just follow companies who often have much shorter-term objectives. I really hope that this is part of their plans.

I wish the best of luck to OpenAI and their members. The game-changing potential of this organization puts a lot of responsibility on them, and I sincerely hope that they will try to seriously explore different paths to AI rather than chase local-in-time advertisement (please don’t just solve Go with deep nets!!!).

There are a few other topics that caught my attention but I am running out of stamina. These include many papers on the analysis of cascades in networks (I am particularly curious about the COEVOLVE model), papers that further our understanding of random features, adaptive data analysis (see this), and a very healthy list of bandit papers (or Bayesian optimization as some like to call it).

By Hossein Mobahi January 1, 2016 - 3:34 am

To add to your nonconvex optimization section, this year at NIPS there was also the first workshop on “nonconvex optimization” for machine learning. It seems the organizers are interested in continuing that in the future.

[…] is that the entire world has had access to this improved technology. In artificial intelligence, there has been an explosion of interest for a technique called deep learning. Consequently, some have said that 2015 was a breakthrough year for artificial intelligence. What […]

I was at NIPS for the first time, giving a talk at a workshop, and I was also amazed by how busy everything related to deep learning was. I do not agree with your opening statement that the fact that only 10% of the papers were about deep learning shows that the growth is not all because of that. If you have a big music festival with tons of obscure bands and one big name, most of the audience is there for the big name. To me it was clear that the whole NIPS format of a single-track conference is completely inappropriate if you have about 4000 attendees. The main hall had about 2400 chairs, I estimated, and the overflow room was also pretty busy. Both halls were actually full of people, most of whom were working on their laptops during talks and not following the talks. I think at conferences with over 500 attendees single track already does not work anymore.

A music festival was what it reminded me of. As I have been to many of these, I know how to wiggle my way to the front near the stage, and I attended several good talks lying on the floor close to the podium. Maybe we can do stage-diving and crowd-surfing next year when the deep learning rock-star demi-gods perform ;).

By John Schulman December 13, 2015 - 1:26 pm

I’m one of the members of OpenAI and a long-time reader of your blog. Our goal is not to be a representative cross-section of what we think is high-quality ML research. Rather, we’ll focus on a small set of important and impactful topics. And the research will mostly be driven by empirical results rather than theory, since in the realm of neural networks, the phenomenology is way ahead of the theory right now. On the theory side, I just hope I can convince my theory-inclined friends to think about some of the problems I’m interested in 🙂

By Sebastien Bubeck December 13, 2015 - 2:29 pm

Hi John,

of course it makes a lot of sense to choose a few directions and focus on them! The point of an enterprise like OpenAI is certainly not to give a “fair” representation of ML topics; NIPS is there for that ;).

However I disagree with your point of view on theory vs practice. Let me take an analogy (which is a bit naive but still…). Imagine that we are back in 1905, and inspired by the success of the Wright brothers someone decides to invest in perfecting this technique with the objective of getting to the moon. This is not completely crazy, and in fact with only the knowledge of 1905 (and a few more hundred years of pure engineering) this could have had some chance of success. However, what was really missing was quantum mechanics, which led to efficient transistors, which in turn gave the computers and communication systems that we needed to implement the Kalman filter.

While I believe that regarding “intelligence” we are missing something as fundamental as quantum mechanics, I also agree that the current state of affairs is very different from my analogy, as the theory behind the Wright brothers’ feat had already been well understood for many decades when it happened…

By Joan December 13, 2015 - 7:46 am

There were a lot of people at NIPS. A simple solution would be to limit the number of attendees to 3000 and live stream everything for the ones who can’t attend. It’s a pretty common format in lots of successful conferences.

By David Relyea December 13, 2015 - 6:20 am

Nobody has addressed the elephant in the room: NIPS had almost 4000 attendees this year, and it’s set to almost double again next year. There are 100 posters per day. In that room, it was a rugby scrum to see any of the deep learning posters. This process just doesn’t scale.

If everyone at the conference suddenly had the mindset of an early-20-something (including the organizers), all the poster sessions would just be videotaped in advance. Online video does scale to any number of watchers, and questions can then be asked in person during the poster sessions (you could even have an additional online session).

I honestly don’t understand the “a paper is the best way to communicate our results, and thus it should be the only way” mentality in modern-day mathematics. A 3-5 minute overview would save the average reader so much time in the long run, allowing for more efficient knowledge dissemination.