pushblog

possibly incomprehensible musings

Monday, October 25, 2004

the meaning of knowledge

I met with some officials from the Department of Defense's National Security Agency today. I was describing our work on commonsense reasoning when the question came up of what we meant by the word "knowledge." In our systems we usually start off with a corpus of commonsense facts, stories, descriptions, etc., expressed in English, which are then converted through a variety of processing techniques into more usable knowledge representations such as semantic networks and probabilistic graphical models. My suggestion was that, from the perspective of the computer, only the latter forms should be considered "knowledge" because they can be put to use by an automated inference procedure. But in the long run, as our parsing and reasoning tools get more sophisticated, we may come to be able to use more and more of the collected corpus.
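To make the distinction a little more concrete, here is a toy sketch in Python of the kind of conversion I have in mind. It is not how our actual systems work; the sentence templates, the relation names (IsA, UsedFor), and the function names are all made up for illustration. It turns a few English sentences into semantic network links, runs a trivial inference over them, and sets aside the sentences it cannot parse.

```python
import re
from collections import defaultdict

# Hypothetical sentence templates (an assumption for this sketch): each one
# maps a simple English surface pattern to a relation name.
PATTERNS = [
    (re.compile(r"^(?:an? )?(.+?) is used for (.+)$"), "UsedFor"),
    (re.compile(r"^(?:an? )?(.+?) is (?:an? )(.+)$"), "IsA"),
]


def parse(sentence):
    """Return a (subject, relation, object) triple, or None if no template fits."""
    text = sentence.strip().rstrip(".").lower()
    for pattern, relation in PATTERNS:
        match = pattern.match(text)
        if match:
            return (match.group(1), relation, match.group(2))
    return None


def build_network(sentences):
    """Build a semantic network as a mapping (subject, relation) -> set of objects.

    Sentences that no template can handle are returned separately: they remain
    in the corpus, but the inference procedure below cannot use them.
    """
    network = defaultdict(set)
    unparsed = []
    for sentence in sentences:
        triple = parse(sentence)
        if triple is None:
            unparsed.append(sentence)
        else:
            subject, relation, obj = triple
            network[(subject, relation)].add(obj)
    return network, unparsed


def is_a(network, concept, category):
    """Simple automated inference: follow IsA links transitively."""
    seen = set()
    frontier = [concept]
    while frontier:
        node = frontier.pop()
        if node == category:
            return True
        if node in seen:
            continue
        seen.add(node)
        frontier.extend(network.get((node, "IsA"), ()))
    return False


CORPUS = [
    "A dog is a mammal.",
    "A mammal is an animal.",
    "A hammer is used for driving nails.",
    "People sleep when they are tired.",
]

network, unparsed = build_network(CORPUS)
print(is_a(network, "dog", "animal"))
print(unparsed)
```

The last sentence in the toy corpus is perfectly good common sense, but none of the templates can handle it, so from the program's point of view it is not yet "knowledge." Better parsing tools would shrink that leftover pile.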

Generally, the word "knowledge" is very inclusive because the AI community has discovered a vast array of knowledge representation forms, and every one of them is useful for some purpose or other. Thus, the more important question may not be what is and isn't knowledge but, given some knowledge, questions such as the following:

For what purposes can it be used? When is it applicable? Is it true? According to whom? Under what circumstances? Who might find this knowledge useful? Is it expressed clearly enough? Are there other units of knowledge that may be useful in conjunction with this one? How long should we expect this knowledge to stay relevant? How might this knowledge have been acquired, and from where might we acquire more like it? What background might you need to make sense of it?

And so forth. The point, I suppose, is that like most words that point to complex ideas, understanding the word "knowledge" requires that we consider its many contexts of use, and the issues that show up in those contexts.

Sunday, October 24, 2004

Why No Vision?

Why is it that computer vision has proven to be such a difficult problem? The strange thing is that computer graphics, which one might regard as the inverse problem, is rapidly closing in on achieving photorealistic rendering of scenes. I'm also puzzled because recognition problems are typically simpler than generation problems. It's certainly true that computer graphics has benefited from much commercial development and Moore's law, but faster computers should help recognition tasks as well.

One idea is that vision suffers from the same kind of problem as commonsense reasoning, namely, the lack of large-scale knowledge bases about the kinds of objects and materials in the world, what they look like from different angles and under different lighting, and so forth. But if this is the case, and computer graphics has advanced so far, it should not be difficult to generate such a corpus with a moderate investment -- a corpus of images, ground truths in terms of 3D and other types of surface models, and connections to more general commonsense knowledge.

Wednesday, October 20, 2004

the paradox of Wikipedia and Open Mind

I ran across this quote from Larry Sanger, one of the co-founders of Wikipedia, about Wikipedia:

It must be full of a bunch of crank submissions, vandalism, and plain old sophomoric stupidity. But it's not. It's not half bad. In places, and increasingly, it's of very high quality. And that's even more paradoxical.

Our Open Mind Common Sense project was greeted with similar skepticism by many in the AI community. I remember a meeting at IBM two years back where I described how we were trying to build a tool that would let members of the general public collaborate to build a commonsense knowledge base. I should have brought some asbestos underwear! The audience, largely members of the mainstream knowledge representation and logical AI community, was livid at the possibility that we might be able to use facts expressed in natural language contributed by the untrained masses. Some of their concerns were valid, but I was astounded at the level of conservatism they demonstrated. I'm happy to say that since then we've had quite a bit of success using the data collected by our Open Mind project.

It's great to see the success of Wikipedia, which confirms that it's possible to create a useful knowledge base from the contributions of a great many people, and especially, that it's possible to manage the vandalism and to some extent the disagreements. I now wish we had been even more courageous with Open Mind, by allowing people to edit and repair each other's contributions. I was worried about vandalism, but perhaps the problem would never have been as severe as we feared, and perhaps we would have developed strategies to deal with such problems.

Tuesday, October 12, 2004

commonsense and the practice of AI

I had a chance today to spend a little time with a division director of the National Science Foundation. I showed him some of the work we are doing at the Media Lab to give computers more "common sense." I spent much of our time on the commonsense knowledge bases and reasoning tools we are developing, and a little of it on the cognitive architecture we are developing to coordinate the use of these systems.

What we are doing is considered very unconventional, but I hope that the larger AI community will soon come to see the value of the commonsense approach. My belief is that the practice of AI will change dramatically as large-scale commonsense resources become more commonly available. Instead of machine learning from scratch, much machine learning will be concerned with learning in the context of substantial existing knowledge. Instead of hand-coding small knowledge bases for their domain, researchers will augment existing commonsense knowledge bases with the additional knowledge their domain needs. Instead of operating mainly bottom-up from the signal, perceptual systems will come to use large amounts of top-down knowledge about a wide range of objects, situations, and events. And so forth.