September 20, 2007

We can divide the capabilities of an agent into two categories: its skills and its ability to learn. In terms of the agent’s short-term performance it is skills that matter most, and so it is skills that make the greatest impression on us. However, I would maintain that intelligence is a product of the agent’s capacity to learn, although not every agent that displays some ability to learn would be classified as intelligent. If this is correct then a good deal of work on artificial intelligence may be going down the wrong path, creating agents that possess skills but lack intelligence.

Of course I can’t appeal to any special direct access to the essence of intelligence to say what it really is, but I can point out that the ability to learn will always, in the long run, be superior to skills. No matter what skill one agent possesses, an agent that can learn can always acquire that skill, perhaps not directly. Learning has no limits, and so the agent that can learn will, eventually, be able to produce machines to do whatever the agent with skills can. That is the power of the ability to learn: not necessarily to internalize everything, but to abstract and to solve any problem piece by piece, part of which may involve developing tools to solve the problem. Indeed, this is in many ways how human intelligence works. We develop intellectual tools, such as math and logic, which allow us to tackle problems that we could never deal with head on. Math and logic are mindless abstractions; in many ways they are like machines we have constructed in our heads to allow us to do with formal rules what we cannot do with our native abilities. The fact that we use math to solve certain hard problems is not cheating, and thus doing the same thing in an external fashion, by constructing a machine, isn’t either.

But I digress. I consider it sufficient to point out that while skills can solve a fixed number of problems, the ability to learn allows any problem to be solved. Learning is thus more appropriately associated with intelligence than skills, which can be mastered by “dumb” machines. Of course there are different kinds of learning too, and many of the simplest hardly seem like any kind of intelligence at all. The simplest we might call single skill learning. Single skill learning is rare in nature, but it is almost the only kind produced by artificial intelligence researchers. It occurs when an agent is able to improve at a specific skill, but cannot transfer any of that learning to other areas. For example, a computer that “learns” to play a game by eliminating every losing move it comes across, and never playing that move again, is improving its skill at playing that game, but is getting no better at playing any other game, or at learning how to play games in general. Another example of such learning is developing the ability to sort items into categories by comparing them to examples which experience has shown to be, or not to be, members. Obviously such a capability results in improved categorization over time, but again this improvement doesn’t extend to other skills. Such a limited ability to learn is not necessarily better than a set of skills. Granted, it is likely that single skill learning will, in the long run, make the agent the best possible at that skill, but such a narrow ability is not necessarily going to put the agent in a better position to deal with the world than a wide range of skills would.
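To make the game-playing example concrete, here is a minimal sketch of a learner that improves only by striking out moves that led to a loss. Everything in it is my own assumption for illustration: the toy game (a pile of stones, each player takes 1 to 3, whoever takes the last stone wins), the random opponent, and the names `EliminationLearner`, `punish`, and `train`. It is not meant as a description of any particular research system.

```python
import random

def legal_moves(pile):
    """Moves in the (assumed) toy game: take 1, 2 or 3 stones, never more than remain."""
    return [n for n in (1, 2, 3) if n <= pile]

class EliminationLearner:
    """Remembers, per pile size, which moves have not yet been struck out after a loss."""

    def __init__(self, start_pile):
        self.allowed = {p: set(legal_moves(p)) for p in range(1, start_pile + 1)}

    def choose(self, pile, rng):
        # Pick at random among the moves that are still allowed for this pile size.
        return rng.choice(sorted(self.allowed[pile]))

    def punish(self, history):
        """After a loss, strike the most recent move that still has an alternative."""
        for pile, move in reversed(history):
            if len(self.allowed[pile]) > 1:
                self.allowed[pile].discard(move)
                return

def play_game(learner, start_pile, rng):
    """Learner moves first against a random opponent; taking the last stone wins."""
    pile, history = start_pile, []
    while True:
        move = learner.choose(pile, rng)
        history.append((pile, move))
        pile -= move
        if pile == 0:
            return True, history    # learner took the last stone
        pile -= rng.choice(legal_moves(pile))
        if pile == 0:
            return False, history   # opponent took the last stone

def train(start_pile=10, games=500, seed=0):
    """Play many games, eliminating a move after every loss."""
    rng = random.Random(seed)
    learner = EliminationLearner(start_pile)
    for _ in range(games):
        won, history = play_game(learner, start_pile, rng)
        if not won:
            learner.punish(history)
    return learner
```

Calling `train()` returns a learner whose `allowed` table has shrunk as moves were eliminated. Note that the table is keyed on pile sizes in this one game, so nothing in it could help with a different game, which is exactly the limitation of single skill learning.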

With single skill learning set to one side we can turn our attention to “real” learning. Unfortunately I can’t say with complete precision what “real” learning entails. If I could it would be a short step from there to building human-like artificial intelligence and, as a direct result, being showered with money. I can, however, make some broad observations about some of the capabilities it must include, three of which are: flexible abstractions, meta-learning, and strategies.

If an agent has flexible abstractions this means two things. First, it is able to develop new mental abstractions, meaning that a category such as “large red triangles”, which first arises as the result of deliberation, gradually becomes more and more familiar until it is eventually a judgment that is produced automatically. Being able to generate such abstractions means that the agent is, over time, able to apply its intelligence at a higher level, at a distance from all the details (even though it maintains the ability to look at the details if it wishes). The second thing flexible abstractions entail is the ability to transfer reasoning about one domain over to another. What exactly the reasoning is about is flexible, which allows reasoning by analogy, the simplest kind of reasoning.

If an agent has the capability for meta-learning it means that it is able to observe not only the external world, but its own thinking processes as well, and is thus able to revise those processes on the basis of their success or failure (or to alter them as the situation demands). Meta-learning then entails a capacity to learn to learn, another way in which improvements in one skill can translate to improvements in completely unrelated skills.

Finally, we come to the ability to form strategies. Again, this implies two things. The first is the ability to mentally evaluate a course of action (which could be a line of reasoning), estimating the likelihood of its success. The second is the ability to follow a strategy (either in action or in reasoning) while retaining the ability to deviate from it if circumstances demand. In other terminology, such an agent might be said to have the ability to create and revise its own abstract goals.

I doubt that list is comprehensive, but it is a start in the right direction for understanding “real” learning, meaning the completely general kind. Of course not even humans work exclusively on the basis of learning. It appears that we come with a number of built-in skills, such as 3D mapping and language use, or at least a predisposition for them, making us a combination of skills and the capacity for “real” learning. In purely practical terms this combination makes a great deal of sense, because learning takes time, and people have only limited life spans. And so evolution will generally lead to the essential skills becoming built-in. But when it comes to computers there is no such restriction: once we have successfully taught a single machine we can just make copies of it. So in terms of reward for our efforts, developing machines with the capacity for “real” learning seems infinitely more worthwhile than working on giving them specific skills, or even single skill learning.