Artificial Intelligence Pioneers: Peter Norvig, Google

Peter Norvig, director of research at Google Inc., March 17, 2011. Photographer: David Paul Morris/Bloomberg

Artificial intelligence (AI) got a lot of press in 2016, not least because of the victory of Google’s AI program over Lee Sedol, the world’s best Go player. That triumph of machine over human elicited numerous responses, some enthusiastic and some anxious, all sharing the assumption that the goal of artificial intelligence is to achieve “human-level intelligence” or, as some predict, “superintelligence.”

“I don’t care so much whether what we are building is real intelligence,” says Peter Norvig, Director of Research at Google. “We know how to build real intelligence—my wife and I did it twice, although she did a lot more of the work. We don’t need to duplicate humans. That’s why I focus on having tools to help us rather than duplicate what we already know how to do. We want humans and machines to partner and do something that they cannot do on their own.”

Norvig goes further to make a useful distinction between neuroscience and artificial intelligence research. “Understanding the brain is a fascinating problem but I think it’s important to keep it separate from the goal of AI which is solving problems,” he says. Each field can learn from the other, but “if you conflate the two it’s like aiming at two mountain peaks at the same time—you usually end up in the valley between them.” To avoid ending up in the valley, it’s advisable to be clear about goals and to be careful about using misleading labels: “I think we’d be better off if we had better names other than neural nets and maybe we would be better off if the Google Brain team had a different name. What the Google Brain team provides is programming tools for solving problems—it’s not a tool for understanding the brain and it’s not necessarily linked to how the brain works.”

Developing tools for solving specific problems and teaching others how to do that have been the hallmarks of Norvig’s career over the last three decades, ever since he completed his doctoral dissertation at UC Berkeley on improving text understanding by computers. In 1995, he co-authored (with Stuart Russell) Artificial Intelligence: A Modern Approach, which became the leading textbook in the field (now in its third edition), and in 2011, he co-taught (with Sebastian Thrun) an online class, Introduction to Artificial Intelligence, attended by 160,000 students from 209 countries.

For these and other accomplishments, Norvig was named a fellow of the Association for the Advancement of Artificial Intelligence (AAAI) in 2001 (“for significant contributions to educational materials, natural language processing techniques, web-based technology, and research management and leadership”) and a fellow of the Association for Computing Machinery (ACM) in 2006 (“for contributions to artificial intelligence and information retrieval”).

Intelligence, of the real, human kind, has been a preoccupation for Norvig from early on. While in sixth grade, he wrote letters to his local newspaper complaining about innumeracy and sloppy language in science reporting. In high school, a teacher suggested a career as a science reporter, but learning to program in BASIC and a class on linguistics led him to think about using computers to process natural language, and to a different career path. This interest could have also sprung, Norvig says, from the combined influence of his father, a math professor, and his mother, an English literature professor.

At Brown University, while majoring in applied mathematics, he stumbled upon a psychology class that was taught from a cognitive science perspective, and that led him more specifically to artificial intelligence. So after spending two years working at a spin-off from MIT and getting “exposed to the graduate student-university lifestyle,” he thought it would be interesting to get a PhD and enrolled in the computer science department at UC Berkeley.

This was the period of reduced investment in artificial intelligence research (the so-called “AI Winter”), but Norvig felt that “it was the most interesting field—you’re solving the hardest problems. As a grad student, you are expected to dig deep into a field, you don’t expect the result to be a product that will change the world, so it didn’t bother me too much that it was a wintery period.”

It turned out to be a time of significant transition in AI research. “While I was in grad school,” says Norvig, “the field shifted from an expert system point of view where the idea was to hire a lot of grad students to manually write down logical rules, hoping to accumulate enough of them to get a good result. But you never could; it’s just too hard to write down these rules. So we shifted to a probabilistic approach where you are dealing with uncertainty and the goal is to get the best answer, not to try and duplicate the way an expert thinks. And that approach was successful at medical diagnosis and speech recognition and other fields so that got things going.” Another related and significant change “was the use of data rather than thinking about all the rules yourself,” applying statistical analysis to uncover the rules embedded in the data. In an influential paper, Norvig and his Google colleagues urged other researchers in fields such as machine translation and speech recognition to shun theory development and instead “embrace complexity and make use of the best ally we have: the unreasonable effectiveness of data.”

For the last fifteen years, Norvig has worked at Google, the company that helped unleash—and succeeded in mining—the Web-era big data wave. Before that, he experienced different kinds of research environments, working at UC Berkeley, Sun Microsystems, two startups, and NASA Ames Research Center. Why did he leave academia? “As an academic it would be really hard to get the resources to do larger projects,” says Norvig. “It’s all bit by bit, one graduate student at a time, the computing resources that you need, finding a partner that could get you data, all of this is difficult as an academic. In industry, you get all the resources you need.”

From industry, Norvig moved to a government research environment, serving as the head of the Computational Sciences Division at NASA’s Ames Research Center, developing self-driving software for robotic spacecraft such as the Mars Rover. As with industry R&D, basic research there was guided by the need to have a practical, real-world result. “But in terms of the organization and the bureaucracy,” says Norvig, “it couldn’t be more different. If your part doesn’t work, then you lose the entire mission and a couple hundred million dollars. Everything must work, so it’s a bit slower.”

This time-stretching also applies to other aspects of the organization’s work. Given what NASA is trying to achieve, testing is typically performed as a simulation. Norvig contrasts this with Google, where real-world testing is constant, the feedback from real users is immediate, and in case of failure, the problem can be fixed the next day.

The immediacy of Google’s research work pertains not only to time but also to space, as in where people with different jobs sit. Unlike traditional product teams, observes Norvig, “where researchers create a prototype and then throw it over the wall and engineering will re-do it and make it work,” Google has created a unique product development environment where “research” and “development” are equal members of the same team. “We felt like we will always be evolving so we wanted people to be involved all the way through so they continue to make improvements,” says Norvig.

An important motivating factor in this “hybrid research model” is developing the kind of production environment tools that researchers would want to use. This attitude extended to developing a world-class IT infrastructure in-house, providing researchers with state-of-the-art tools and opportunities to build them on their own. Early on, says Norvig, there was some resistance to developing rather than buying hardware and software from IT vendors, but “it was usually the right decision to build something—we took it further than what was done before and we could quickly iterate as opposed to a vendor where you have to go to them to request a change and this slows everything down.” Another incentive for researchers is allowing them to publish papers and supporting other channels for collaboration with the academic community, altogether providing them with opportunities to make a broader impact on research in their fields. And there are also opportunities to break from incremental research and work on “paradigmatic changes” or Google’s “moonshots.”

But possibly the greatest motivator and the main attraction to computer scientists is the data that is at Google’s disposal. At Google, they can base their research on analyzing vast quantities of data with real-world constraints, conducting experiments at a scale that is typically unprecedented for research and development projects.

This large-scale, experimental, iterative, focused research, infused and enriched with the data provided, as Norvig points out, by the more than three billion people with an Internet connection and a “supercomputer in their pocket,” helped Google (and researchers and developers in other companies) invent practical and successful applications of data-centric AI, most recently using deep learning and other new and improved approaches to machine learning.

Over the last few months, I heard Norvig talk on a number of occasions about what he has learned from years of applying machine learning to all the world’s information—at the O’Reilly AI conference, at a meeting of ACM Boston, at an ACM Webinar, and in a phone conversation. He contrasts machine learning with traditional software development, highlighting the former’s advantages and unique challenges. “It’s much less work,” Norvig says about machine learning. “All you have to do is pour some data into the computer instead of coffee and pizza into the programmer. And the output comes out much faster.”

But there are numerous challenges. “Machine learning allows you to go fast,” says Norvig, “but when you go fast, problems can develop and the crashes can be more spectacular than when you are going slow.” Machine learning is more difficult than traditional software in several ways. Debugging is harder: there are no established and proven debugging tools and processes as there are for traditional software, and it is difficult to isolate a bug. If you change anything, you end up changing everything. There are challenges in deciding how and when to apply human assistance, in training the user not to over-rely on the machine, and in accounting for new data or changes in the data used for training. And the use of data in machine learning gives rise to many new and confounding issues of privacy, security, and fairness. Still, Norvig’s bottom line on machine learning borrows from Winston Churchill’s famous dictum about democracy: “It’s the worst possible system except for all the others that have been tried.”

It is a long list of significant issues, and their enumeration by Norvig in his frequent engagements with diverse audiences may help speed up the efforts to solve them. Eventually, with the increased use of machine learning in all types of business and government organizations, machine learning will become more like traditional software in terms of our understanding of what it does and how to manage it. With the constant and rapid changes in what computers can do, however, for every solution to one of the current issues, we can safely predict that a few new challenges will arise.

We are already experiencing today one fundamental change in our interaction with computers, which no doubt will bring with it new challenges as well as opportunities. From an interaction dominated by apps, we are going to shift into interacting with digital assistants, sometimes through a new medium of interaction: voice. “Like the pioneers who had to invent the language for what it’s going to look like when you have the mouse and menus, we have to invent what the interaction is going to be like when you carry all the conversation within a system. Right now, everything on your computer and your phone is siloed into an app and when you do something, the first thing you need to do is to decide which app to use. But when you go to an assistant model, rather than switching control completely to one app, we will have some combination of services that would be assembled together.” How we combine services to perform to our satisfaction will be one of the big challenges in the post-app era, where the new type of human-computer interaction and the technology that makes it work will be a focal point for investment, development, failures, and triumphs.

Like most participants in the recent AI resurgence, Norvig ascribes its new-found sexiness and success to increased computer power and the availability of lots and lots of data. But he also ascribes responsibility to two fundamental shifts, one on the supply side of computer research and programming, the other driven by demand, by what we expect computers to do. Norvig quotes MIT’s Hal Abelson who observed that computer science has changed from mathematical science to natural science, from calculating a correct answer to making observations, from traditional computer software to machine learning. Similarly, demand has shifted from expecting computers to perform well-defined tasks such as adding numbers to the type of more amorphous tasks we really care about, says Norvig: “Reading something interesting, getting the right recommendation, sharing pictures and knowing what’s in the picture—these are all AI tasks. There is not a set of definitive answers, there’s uncertainty, we want to optimize or make the best recommendations.”

To help unlock the opportunities for new applications and uses of computers driven by these shifts, Norvig is interested in making machine learning and artificial intelligence “part of the standard repertoire of all programmers rather than a specialized field.” He is concerned about the limited supply of people who can do AI today and has been focusing for the last year on creating easy-to-use tools. “If you are a competent programmer,” Norvig says, “you should be able to learn just enough to be able to start doing machine learning without having to get a PhD first.”

Many computer programmers and PhD candidates today are not waiting for the democratization of AI and are rushing to re-train and re-focus their work on the most exciting and most talked-about computer science specialty. “Five years ago, I was interviewed by a reporter that said ‘how come AI is such a failure?’” recalls Norvig. “Today, they say ‘how come AI is going to take over the world and kill all humans or steal all our jobs?’”

Delighted about the sudden 30% annual increase in the performance of various AI tasks, Norvig sees the hype generated by this progress as a distraction from the true goals of his chosen discipline. “I don’t get the singularity view of the world. My view is that the world is complicated and that being smarter is not going to solve many of the world’s problems.” And: “More important than human-like performance is something usable that doesn’t pretend to be a human but makes it clear what can be done and what can’t be done.”

Delivering something useful. Solving hard, specific problems. Not getting carried away or distracted by expectations of machines becoming human-like. Spreading practical knowledge about what works and what needs to be improved when training machines to make decisions in conditions of uncertainty. Focusing on augmenting human intelligence and developing the art of people-machine teamwork. This is what “artificial intelligence” is all about, per Peter Norvig.
