CMU World Wide Knowledge Base (Web->KB) project

Goal:

To develop a probabilistic, symbolic knowledge base that mirrors the
content of the World Wide Web. If successful, this will make text
information on the web available in computer-understandable form,
enabling much more sophisticated information retrieval and problem
solving.

Approach:

We are developing a system that can be trained to extract symbolic
knowledge from hypertext, using a variety of machine learning methods.

Datasets:

The first experiments consisted of extracting knowledge about computer
science departments. We have assembled two datasets for this task: