Foresight’s mission is essentially an educational one. In simplest terms, we are here to point out foreseeable technological developments that will not only make the future different from the past, but make it different in ways that aren’t obvious and that not everyone is already planning for. Nanotechnology (true nanotech in Drexler’s original sense of having thorough control over the structure of matter at the atomic scale, and thus being able to build productive machinery) is such a development, even though the word “nanotechnology” is widely used for much more mundane, predictable, linear, and non-revolutionary progress.

Similarly, the term “Artificial Intelligence” is widely used for predictable, linear progress in software engineering. The field has come a long way, so that it is getting close to the point that any well-specified human skill, such as driving a car, can be implemented given an appropriate application of talent and resources. Just like “nanotechnology,” though, it originally meant something more revolutionary:

Some years ago, Ben Goertzel coined the term “AGI” (artificial general intelligence) to distinguish the revolutionary goal of AI, as originally seen by such pioneers as McCarthy and Minsky, from the more mundane, incremental work that the term AI had come to cover. This was very similar in spirit to the term MNT (molecular nanotechnology), coined by Drexler and Foresight for essentially the same reason.

Within the past couple of years, the Productive Nanosystems Roadmap was organized and published, under the names of a wide sampling of people from academia, industry, and the national laboratories. This had the effect of making it clear that the ultimate goal of nanotechnology research is indeed “MNT”-style capabilities, and is one that is ultimately feasible and worth working toward.

While the “diaspora” in AI may have been deeper than the one in nanotech, it was also longer ago. The AGI Roadmap therefore has no need to re-establish the possibility of an artificial intelligence in the full sense; its job is rather to make some sense of the state of the art with respect to it, figure out some milestones and metrics that might be used to judge progress, and so forth.

The meeting last weekend at the University of Tennessee, organized by Ben Goertzel and Itamar Arel, served to bootstrap the process and begin to work out what kind of roadmap might be possible. The main problem, of course, is that we don’t really know how intelligence works, which pieces are essential and which ancillary, or indeed whether there are a few powerful underlying principles or a huge kludge of random techniques.

To that end we began by trying to define the kind of tasks that we felt a general intelligence could do but that no hand-coded “narrow AI” could do. The classic such task, of course, is the Turing Test, which has many points in its favor but is also considered (a) too high a bar, and (b) a test of the wrong thing, since it requires fooling a judge as well as exhibiting basic intelligence.

To give some of the flavor of the scenarios, here’s the one I proposed:

The Wozniak Test

In an interview a few years ago, Steve Wozniak of Apple fame opined that there would never be a robot that could walk into an unfamiliar house and make a cup of coffee. I feel that the task is demanding enough to stand as a pons asinorum for embodied AGI.

A robot is placed at the door of a typical house or apartment. It must find a doorbell or knocker, or simply knock on the door. When the door is answered, it must explain itself to the householder and enter once it has been invited in. (We will assume that the householder has agreed to allow the test in her house, but is otherwise completely unconnected with the team doing the experiment, and indeed has no special knowledge of AI or robotics at all.) The robot must enter the house, find the kitchen, locate coffee-making supplies and equipment, make coffee to the householder’s taste, and serve it in some other room. It is allowed, indeed required by some of the specifics, for the robot to ask questions of the householder, but it may not be physically assisted in any way.

The state of the robotics art falls short of this capability in a number of ways. The robot will need to use vision to navigate, identify objects, possibly identify gestures (“the coffee’s in that cabinet over there”), and to coordinate complex manipulations. Manipulation and physical modelling in a tight feedback learning loop may be necessary, for example, to pour coffee from an unfamiliar pot into an unfamiliar cup. Speech recognition and natural language understanding and generation will be necessary. Planning must be done at a host of levels ranging from manipulator paths to coffee-brewing sequences.

But the major advance for a coffee-making robot is that all of these capabilities must be coordinated and used appropriately and coherently in aid of the overall goal. The set-up, task definition, and so forth that come with standard narrow-AI formulations of problems in all these areas are gone; the robot has to find the problems as well as solve them. That makes coffee-making a strenuous test of a system’s adaptiveness and ability to deploy common sense.

I claim that this test addresses the bulk of the aspects of general intelligence that are missing from AI today. Although standard shortcuts might be used, such as having a database of every manufactured coffeemaker built in, it would be prohibitive to have the actual manipulation sequences for each one pre-programmed, especially given the variability in workspace geometry, dispensers and containers of coffee grounds, and so forth. Transfer learning, generalization, reasoning by analogy, and in particular learning from example and practice are almost certain to be necessary for the system to be practical.

Coffee-making is a good test of generality because, although it would be possible to hand-code most of the skills needed, it would be much cheaper simply to build a coffeemaker into the robot! Thus the only economical way to approach the task is to build general learning skills and have a robot that is capable of learning not only to make coffee but any similar domestic chore.

Coffee-making is a task that most 10-year-old humans can do reliably with a modicum of experience. I would guess that a week’s worth of being shown and practicing coffeemaking in a variety of homes with a variety of methods would provide the grounding for enough generality that a 10-year-old could make coffee in the vast majority of homes in a Wozniak test.

5 Responses to “AGI Roadmap meeting”

If we can create narrow AI to drive a car, why can’t we do the same for making coffee? While programming every model of coffee maker might be prohibitive, programming the range of possible controls and behaviors might not be. Also, the task may not be complex enough to *require* natural language processing and generation at a generally intelligent level.

I suspect that if DARPA put up a million for this, we would crack it in a few years sans AGI.

I’ve proposed the Employee Test: Would a business owner be willing to hire your AI/AGI, for the cost of a salary, to replace a valued employee in positions such as software engineering, accounting and project management? Of course, this hits your “(a) too high a bar” but completely avoids “(b) a test of the wrong thing” since we ultimately want AGIs to do various forms of work for us.

Lowering the bar may always be problematic because it increases the probability that the test could be passed with narrow AI.

Well, the Wozniak test is better than the total Turing test because it also measures doing. And it is better than the Nilsson test because it is short and practical.

But it should include a time limit relative to a human test person. And it does not measure improvement from performing the same task over and over again. It also does not explicitly support the development of learning agents, because the top task is fixed. I would prefer a test where an agent has to watch and imitate a human doing some handcraft work. The agent should then be given some time to practice on its own to see if it gets better.

If I were setting up an AI prize, the task I’d pick is cleaning (specifically, start with a building containing the usual large set of cleaning subtasks, rate by percentage accomplished, big penalties for breaking or misplacing things).

Of course it’s not AGI-complete — there are already big incentives to write a human level AGI, but that’s moot because nobody knows how to do it. But it is hard enough to advance the state of the art, and it has extraordinary potential leverage. A prize should kickstart a line of development that is expected to be subsequently self-sustaining. The world spends, at a conservative estimate, more than a trillion dollars a year on cleaning (mostly not paid for under that heading, but the cost is the same). Imagine what would happen if even 1% of that could be spent on developing better machines. Imagine what could be accomplished if that much human effort were freed for better purposes.
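As a rough illustration of the scoring rule suggested above (percentage of subtasks accomplished, with big penalties for breaking or misplacing things), here is a minimal Python sketch. The class name, field names, and penalty weights are all hypothetical, picked only to make the idea concrete; nothing in the proposal specifies them.

```python
from dataclasses import dataclass


@dataclass
class CleaningRun:
    """Outcome of one cleaning trial (hypothetical record, for illustration)."""
    subtasks_total: int       # cleaning subtasks present in the building
    subtasks_done: int        # subtasks the robot completed
    items_broken: int = 0     # objects damaged during the run
    items_misplaced: int = 0  # objects left in the wrong place


def cleaning_score(run: CleaningRun,
                   break_penalty: float = 25.0,
                   misplace_penalty: float = 10.0) -> float:
    """Percentage of subtasks accomplished, minus penalties.

    The penalty weights are assumed values. The score may go negative,
    which is one way to make breakage outweigh partial completion.
    """
    if run.subtasks_total <= 0:
        raise ValueError("subtasks_total must be positive")
    percent_done = 100.0 * run.subtasks_done / run.subtasks_total
    penalty = (break_penalty * run.items_broken
               + misplace_penalty * run.items_misplaced)
    return percent_done - penalty


# Example: 30 of 40 subtasks done, one item broken, two misplaced.
print(cleaning_score(CleaningRun(40, 30, items_broken=1, items_misplaced=2)))
```

With weights like these, a robot that finishes most of the work but breaks things can still score below one that does less and damages nothing, which seems to be the intent of "big penalties".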

Robots with specialized cleaning hardware will perform better in your test. Remember that AI engineers will lie, trick, and cheat whenever they can! So in order to prove intelligence you’ll have to choose your task very carefully. Of course, if you’re a gene marionette and trying to defend your genes against the androids, then choosing a fakeable test is the way for you.