How Semantic Web Works

A small sample of the resources and connections that might be found in a Star Wars ontology. You can work these out on your own by watching the movies and surfing the Web, but a computer needs a clear outline to make sense of them.

Another obstacle for the Semantic Web is that computers don't have the kind of vocabulary that people do. You've used language your whole life, so it's probably easy for you to see connections between different words and concepts and to infer meanings from context. Unfortunately, you can't just give a computer a dictionary, an almanac and a set of encyclopedias and let it learn all this on its own. To understand what words mean and how they relate to one another, the computer needs documents that describe the words, plus the logic to make the necessary connections.

In the Semantic Web, this comes from schemata and ontologies. These are two related tools for helping a computer understand human vocabulary. An ontology is simply a vocabulary that describes objects and how they relate to one another. A schema is a method for organizing information. As with RDF tags, references to schemata and ontologies are included in documents as metadata, and a document's creator must declare at the beginning of the document which ontologies it references.
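To make the idea of declaring vocabularies up front more concrete, here is a minimal Python sketch of what those declarations accomplish: each short prefix stands for a namespace, and a prefixed name expands to a full identifier. The W3C namespace addresses are the published ones; the "sw" prefix, its example.org address and the expand function are assumptions made up for illustration.

```python
# Prefix table playing the role of the declarations at the top of an
# RDF document. The W3C addresses are real; "sw" is a hypothetical
# Star Wars namespace invented for this example.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "sw": "http://example.org/starwars#",  # hypothetical
}


def expand(prefixed_name: str) -> str:
    """Turn a short name such as 'rdfs:subClassOf' into a full identifier."""
    prefix, _, local = prefixed_name.partition(":")
    if prefix not in PREFIXES:
        # A term from a vocabulary that was never declared cannot be resolved.
        raise KeyError(f"undeclared prefix: {prefix!r}")
    return PREFIXES[prefix] + local


print(expand("rdfs:subClassOf"))
print(expand("sw:Dagobah"))
```

A name whose prefix was never declared raises an error, which mirrors why the declarations have to come first.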

Schema and ontology tools used on the Semantic Web include:

RDF Schema (RDFS), formally the RDF Vocabulary Description Language - RDFS describes resources in terms of classes, subclasses and properties, creating a basic language framework. For example, the resource Dagobah is a member of the class planet, which could in turn be a subclass of the class celestial body. A property of Dagobah could be its terrain, with the value swampy.

Simple Knowledge Organization System (SKOS) - SKOS classifies resources in terms of broader or narrower, allows designation of preferred and alternate labels and can let people quickly port thesauri and glossaries to the Web. For example, in a Star Wars glossary, a narrower term for Sith Lord could be Darth Sidious and a broader term could be villain. Similarly, alternate labels for Han Solo might be nerf herder and laser brain.

Web Ontology Language (OWL) - OWL, the most complex layer, formalizes ontologies, describes relationships between classes and uses logic to make deductions. It can also construct new classes based on existing information. OWL is available in three levels of complexity -- Lite, Description Logic (DL) and Full.
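The Star Wars examples in this list can be sketched in a few lines of Python. This is only an illustration, not how any real RDFS, SKOS or OWL tool is implemented: statements are stored as plain (subject, predicate, object) tuples, every "sw:" name is hypothetical, and Dagobah is modeled as a member of a planet class, which is an assumption about how the example would be written in practice. The rdf:type, rdfs:subClassOf and skos: terms are real vocabulary terms.

```python
# Statements as (subject, predicate, object) tuples.
# All "sw:" names are hypothetical illustrations.
TRIPLES = {
    ("sw:Planet", "rdfs:subClassOf", "sw:CelestialBody"),
    ("sw:Dagobah", "rdf:type", "sw:Planet"),
    ("sw:Dagobah", "sw:terrain", "swampy"),
    ("sw:DarthSidious", "skos:broader", "sw:SithLord"),
    ("sw:SithLord", "skos:broader", "sw:Villain"),
    ("sw:HanSolo", "skos:prefLabel", "Han Solo"),
    ("sw:HanSolo", "skos:altLabel", "nerf herder"),
    ("sw:HanSolo", "skos:altLabel", "laser brain"),
}


def objects(subject, predicate):
    """Return every object o for which (subject, predicate, o) is stated."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def ancestors(node, predicate):
    """Follow `predicate` links upward from `node`, collecting every hop."""
    found, stack = set(), [node]
    while stack:
        for parent in objects(stack.pop(), predicate):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def classes_of(resource):
    """RDFS-style deduction: a resource belongs to its stated classes
    and to every superclass of those classes."""
    direct = objects(resource, "rdf:type")
    inherited = set()
    for cls in direct:
        inherited |= ancestors(cls, "rdfs:subClassOf")
    return direct | inherited


# Dagobah is only stated to be a planet; the superclass is deduced.
print(sorted(classes_of("sw:Dagobah")))
# Walking the chain of broader terms. SKOS keeps skos:broader itself
# non-transitive; the full chain corresponds to skos:broaderTransitive.
print(sorted(ancestors("sw:DarthSidious", "skos:broader")))
# Alternate labels for Han Solo.
print(sorted(objects("sw:HanSolo", "skos:altLabel")))
```

The sketch covers only the simplest kind of deduction, inheriting class membership through subclass links. OWL's richer logic, such as building new classes from existing ones, is not attempted here.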

The trouble with ontologies is that they are very difficult to create, implement and maintain. Depending on their scope, they can be enormous, defining a wide range of concepts and relationships. Because of these difficulties, some developers prefer to focus more on logic and rules than on ontologies. Disagreement over the role these rules should play is one potential pitfall for the Semantic Web.

Next, we'll tie it all together by looking at our original example -- those "Star Wars Trilogy" DVDs.

Accessing the Metadata

One of the long-term goals of the Semantic Web is to allow agents, software applications and web applications to access and use metadata. A key tool for doing this is the SPARQL Protocol and RDF Query Language (SPARQL), which became a W3C standard in 2008. SPARQL's purpose is to extract information from RDF graphs. It can look for data and limit and sort the results. Because RDF stores information as explicit statements about resources, these queries can be very precise and return very accurate results.
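To give a feel for what such a query does, here is a rough Python sketch of the same steps: match patterns against a graph, then filter, sort and limit the results. It is not a SPARQL engine, and the graph contents, the "sw:" names and the helper functions are all assumptions invented for this example. In real SPARQL, the same request would look roughly like the query shown in the comment.

```python
# Roughly equivalent SPARQL (for comparison only, not executed here):
#   PREFIX sw: <http://example.org/starwars#>
#   SELECT ?film ?year
#   WHERE { ?film sw:partOf sw:OriginalTrilogy .
#           ?film sw:releaseYear ?year .
#           FILTER (?year > 1977) }
#   ORDER BY ?year
#   LIMIT 2

# Hypothetical graph of (subject, predicate, object) statements.
GRAPH = [
    ("sw:ANewHope", "sw:partOf", "sw:OriginalTrilogy"),
    ("sw:EmpireStrikesBack", "sw:partOf", "sw:OriginalTrilogy"),
    ("sw:ReturnOfTheJedi", "sw:partOf", "sw:OriginalTrilogy"),
    ("sw:PhantomMenace", "sw:partOf", "sw:PrequelTrilogy"),
    ("sw:ANewHope", "sw:releaseYear", 1977),
    ("sw:EmpireStrikesBack", "sw:releaseYear", 1980),
    ("sw:ReturnOfTheJedi", "sw:releaseYear", 1983),
    ("sw:PhantomMenace", "sw:releaseYear", 1999),
]


def is_var(term):
    """Variables are strings starting with '?', as in SPARQL."""
    return isinstance(term, str) and term.startswith("?")


def match(pattern, triple, binding):
    """Extend `binding` so `pattern` fits `triple`, or return None."""
    new = dict(binding)
    for want, have in zip(pattern, triple):
        if is_var(want):
            # Bind the variable, or check it against an earlier binding.
            if new.setdefault(want, have) != have:
                return None
        elif want != have:
            return None
    return new


def select(patterns, graph):
    """Return every set of variable bindings that satisfies all patterns."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in graph
            if (extended := match(pattern, triple, binding)) is not None
        ]
    return bindings


rows = select(
    [("?film", "sw:partOf", "sw:OriginalTrilogy"),
     ("?film", "sw:releaseYear", "?year")],
    GRAPH,
)
rows = [r for r in rows if r["?year"] > 1977]  # FILTER
rows.sort(key=lambda r: r["?year"])            # ORDER BY
rows = rows[:2]                                # LIMIT
for row in rows:
    print(row["?film"], row["?year"])
```

Because every statement names its subject and relationship explicitly, the query can ask for exactly the films in one trilogy and nothing else, which is the precision the RDF structure makes possible.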