Effective mechanisms for searching the space of machine learning algorithms

For a deeper dive into neuroevolution, check out Kenneth Stanley's upcoming talk, "Evolving neural networks through neuroevolution," at the Artificial Intelligence Conference in San Francisco, September 17-20, 2017.

The book closes with a case study that hits closer to home: the current state of research in AI. One can think of machine learning and AI as a search for ever-better algorithms and models. Stanley points out that gatekeepers (editors of research journals, conference organizers, and others) impose two objectives that researchers must meet before their work gets accepted or disseminated: (1) empirical: their work should beat incumbent methods on some benchmark task; and (2) theoretical: proposed new algorithms are better if they can be proven to have desirable properties. Stanley argues this means that interesting work ("stepping stones") that fails to meet either of these criteria falls by the wayside, preventing other researchers from building on potentially interesting but incomplete ideas.

Here are some highlights from our conversation:

Neuroevolution today

In the state of the art today, the algorithms have the ability to evolve variable topologies or different architectures. There are pretty sophisticated algorithms for evolving the architecture of a neural network; in other words, what's connected to what, not just what the weights of those connections are—which is what deep learning is usually concerned with.
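The topology evolution Stanley describes can be sketched as mutation operators that rewire a network, not just reweight it. This is a minimal illustrative sketch loosely inspired by NEAT-style neuroevolution; the genome representation and function names are assumptions, not Stanley's actual implementation.

```python
import random

def mutate_add_connection(genome, rng):
    """Connect two previously unconnected nodes with a random weight."""
    src = rng.choice(genome["nodes"])
    dst = rng.choice(genome["nodes"])
    if (src, dst) not in genome["connections"]:
        genome["connections"][(src, dst)] = rng.uniform(-1.0, 1.0)

def mutate_add_node(genome, rng):
    """Split an existing connection by inserting a new node in the middle."""
    if not genome["connections"]:
        return
    (src, dst), weight = rng.choice(list(genome["connections"].items()))
    new_node = max(genome["nodes"]) + 1
    genome["nodes"].append(new_node)
    # Replace src -> dst with src -> new_node -> dst, roughly
    # preserving the old behavior at the moment of mutation.
    del genome["connections"][(src, dst)]
    genome["connections"][(src, new_node)] = 1.0
    genome["connections"][(new_node, dst)] = weight

# Start from a trivial two-node network and grow its topology.
rng = random.Random(0)
genome = {"nodes": [0, 1], "connections": {(0, 1): 0.5}}
mutate_add_node(genome, rng)
```

Applied repeatedly across generations, operators like these let the architecture itself—not only the weights—be the subject of the search.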

There's also an idea of how to encode very, very large patterns of connectivity. This is something that's been developed independently in neuroevolution, where there's no really analogous thing in deep learning right now. This is the idea that if you're evolving something that's really large, then you probably can't afford to encode the whole thing in the DNA. In other words, if we have 100 trillion connections in our brains, our DNA does not have 100 trillion genes. In fact, it couldn't have 100 trillion genes. It just wouldn't fit. That would be astronomically too high. So then, how is it that with a much, much smaller space of DNA, which is about 30,000 genes or so, three billion base pairs, how would you get enough information in there to encode something that's 100 trillion parts?
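The compression Stanley alludes to can be sketched as an indirect encoding: instead of one gene per connection, a tiny genome parameterizes a function that generates a weight for every pair of neuron coordinates, so the genome size doesn't grow with the number of connections. This sketch is loosely inspired by CPPN/HyperNEAT-style encodings; the specific generating function here is an illustrative assumption.

```python
import math

def connection_weight(genome, x_src, x_dst):
    """Compute one connection weight from the coordinates of its endpoints."""
    a, b, c = genome
    return math.tanh(a * x_src + b * x_dst + c * math.sin(x_src * x_dst))

def develop_network(genome, n_neurons):
    """Expand a 3-parameter genome into an n x n weight matrix."""
    coords = [i / max(1, n_neurons - 1) for i in range(n_neurons)]
    return [[connection_weight(genome, s, d) for d in coords] for s in coords]

# Three "genes" describe 10,000 connections; scaling n_neurons up
# makes the phenotype larger without changing the genome at all.
weights = develop_network((0.8, -0.5, 0.3), n_neurons=100)
```

The regularities of the generating function (here, a smooth pattern over coordinates) play the role that developmental processes play in biology: a short description unfolds into a large, structured result.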

This is the issue of encoding. We've become sophisticated at creating artificial encodings that are basically compressed in an analogous way, where you can have a relatively short string of information to describe a very large structure that comes out—in this case, a neural network. We've gotten good at doing encoding and we've gotten good at searching more intelligently through the space of possible neural networks. We originally thought what you need to do is just breed by choosing among the best. So, you say, ‘Well, there's some task we're trying to do and I'll choose among the best to create the next generation.’

We've learned since then that that's actually not always a good policy. Sometimes you want to explicitly choose for diversity. In fact, that can lead to better outcomes.
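Selecting explicitly for diversity, as described above, can be sketched as novelty-search-style selection: rank individuals by how far their behavior is from their nearest neighbors in the population, rather than by a fitness objective. The distance measure and helper names here are illustrative assumptions, not the canonical novelty search implementation.

```python
def novelty(index, behaviors, k=3):
    """Mean distance from behaviors[index] to its k nearest neighbors."""
    dists = sorted(abs(behaviors[index] - other)
                   for j, other in enumerate(behaviors) if j != index)
    nearest = dists[:k]
    return sum(nearest) / len(nearest)

def select_for_novelty(population, behaviors, n_survivors):
    """Keep the n most behaviorally novel individuals, ignoring fitness."""
    ranked = sorted(range(len(population)),
                    key=lambda i: novelty(i, behaviors),
                    reverse=True)
    return [population[i] for i in ranked[:n_survivors]]

# Individuals "d" and "e" behave very differently from the cluster
# around 0, so diversity-driven selection keeps them.
population = ["a", "b", "c", "d", "e"]
behaviors = [0.0, 0.1, 0.2, 5.0, 5.1]
survivors = select_for_novelty(population, behaviors, n_survivors=2)
```

A pure fitness-based selector would discard the outliers if they scored poorly on the task, whereas a novelty-based one preserves exactly the stepping stones an objective-driven search tends to throw away.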

The myth of the objective

Our book does recognize that sometimes pursuing objectives is a rational thing to do. But I think the broader point that's important here is there’s a class of discoveries for which it really is against your interest to frame what you're doing in terms of an objective.

The reason we wrote the book is because … I started to realize this principle that ‘sometimes in order to make discovery possible, you have to stop having an objective’ speaks to people beyond just computer scientists who are developing algorithms. It's an issue for our society and for institutions because there are many things we do that are driven by some kind of objective metric. It almost sounds like heresy to suggest that you shouldn't do that.

It's like an unquestioned assumption that exists throughout our culture that the primary route to progress is to set objectives, move toward those objectives, and measure your performance with respect to those objectives. Given the hard empirical results we have, we began to think it is important to counterweight this belief that pervades society with a counterargument that points out that there are cases where this is actually a really bad idea.

The thing I learned more and more talking to different groups is that this discussion is not being had. We're not talking about this, and I think it's a very important discussion because our institutions are geared away from innovation because they are so objectively driven. We could do more to foster innovation if we recognize this principle. A lot of people want this security blanket of an objective because they don't trust anything that isn't driven by an objective.

Actually, it turns out there are principled ways of exploring the world without an objective. In other words, it's not just random and the book is about that. It's about how smart ways of exploring in a non-objective way can lead to really, really important results. We just wanted to open up that conversation ‘society wide’ and not just have it narrowly within the field of computer science because it is such an important conversation to have.

Ben Lorica is the Chief Data Scientist at O'Reilly Media, Inc. and is the Program Director of both the Strata Data Conference and the Artificial Intelligence Conference. He has applied Business Intelligence, Data Mining, Machine Learning and Statistical Analysis in a variety of settings including Direct Marketing, Consumer and Market Research, Targeted Advertising, Text Mining, and Financial Engineering. His background includes stints with an investment management company, internet startups, and financial services.