
Enabling laymen to contribute content to the semantic web: a bottom-up approach to creating and aligning diversely structured data

ENABLING LAYMEN TO CONTRIBUTE CONTENT TO THE SEMANTIC
WEB: A BOTTOM-UP APPROACH TO CREATING AND ALIGNING
DIVERSELY STRUCTURED DATA
by
Baoshi Yan
A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(COMPUTER SCIENCE)
December 2006
Copyright 2006 Baoshi Yan

This dissertation aims at lowering the entrance threshold for laymen to contribute to the Semantic Web. More specifically, it aims at lowering the difficulty for laymen of performing two basic and tightly related tasks on the Semantic Web: semantic data creation and ontology alignment.

The content of the current Web (plain text mixed with HTML tags) is difficult for machines to interpret. Researchers from universities and industry have been working on the development of the next-generation Web: the Semantic Web. The Semantic Web aims at creating and connecting a web of machine-understandable semantic data, which allows for intelligent processing and could bring the Web to a higher level.

For the Semantic Web to succeed, it has to be easy for laymen to make contributions: to contribute semantic data and to operate on heterogeneous data.

Semantic data is structured data marked up with ontology terms. A piece of semantic data specifying one's telephone number could be represented as (JohnSmith, O1:phoneNumber "123-456-7890"), where "phoneNumber" is an ontological term in ontology "O1". Ontology alignment is the process of matching terms from different ontologies. For example, "phoneNumber" in one ontology corresponds to "telephoneNumber" in a second ontology.

The realization of the Semantic Web depends on the creation of a massive amount of semantic data. Semantic data, being encoded in a structured form and marked up with ontologies, is much more machine-friendly and makes more intelligent processing feasible.

Aligning terms from different ontologies is the key to integrating heterogeneous semantic data expressed in different ontologies. Given the openness of the Web, it is unrealistic to expect that all semantic data is created in the same ontology. Ontology alignment could help semantic data creation as well, by supplying existing ontologies and semantic data in the intended domain.

Many tools have been proposed to create semantic data and align ontologies. However, the lack of semantic data is still the biggest problem in current Semantic Web development. In addition, no alignment tool has gained widespread use among ordinary users, and there is little ontology alignment data available on the current Semantic Web.

In this dissertation, we argue that the conventional tools and techniques for semantic data creation and ontology alignment pose a high barrier of entry to laymen. We argue that they share a common characteristic: they are all top-down, ontology-based tools. As a result, they are difficult for laymen to use because of the inherent difficulty of dealing with ontologies: ontologies are abstract, generalized, high-level entities.

Instead, we propose a bottom-up, data-centric approach to semantic data creation and ontology alignment. In bottom-up data creation, users create weakly structured data first and gradually refine it. An ontology is derived as a summary of the data created so far. In bottom-up ontology alignment, laymen, rather than ontology experts, align their semantic data (for their own purposes) within end applications. The inferred implicit and sometimes imprecise ontology alignments are then integrated and mined to produce higher-accuracy ontology alignments. In both tasks, the difficulty of carrying out the task is significantly reduced, making it easier for laymen to contribute to the development of the Semantic Web.
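The bottom-up idea described above can be illustrated with a minimal Python sketch. It is not the dissertation's implementation. It assumes semantic data is held as (subject, property, value) triples, that an ontology can be summarized as the set of properties used so far, and that imprecise user-level alignments can be combined by simple vote counting. The function names, the `min_support` threshold, and the terms `O1:name`, `O2:faxNumber`, and `JaneDoe` are all hypothetical.

```python
from collections import Counter, defaultdict

# Semantic data as (subject, ontology-qualified property, value) triples.
# Only the JohnSmith phone number comes from the abstract; the rest is made up.
triples = [
    ("JohnSmith", "O1:phoneNumber", "123-456-7890"),
    ("JohnSmith", "O1:name", "John Smith"),
    ("JaneDoe", "O2:telephoneNumber", "987-654-3210"),
]


def derive_ontology(data):
    """Summarize the data created so far: the terms used, per ontology."""
    terms = defaultdict(set)
    for _subject, prop, _value in data:
        ontology, term = prop.split(":", 1)
        terms[ontology].add(term)
    return dict(terms)


def mine_alignments(observations, min_support=2):
    """Keep term pairs that at least `min_support` user actions agree on.

    Vote counting is an assumed stand-in for the mining step, which the
    abstract does not specify.
    """
    counts = Counter(frozenset(pair) for pair in observations)
    return {
        tuple(sorted(pair)): votes
        for pair, votes in counts.items()
        if votes >= min_support
    }


# Alignments implied by individual users inside end applications (hypothetical).
observations = [
    ("O1:phoneNumber", "O2:telephoneNumber"),
    ("O2:telephoneNumber", "O1:phoneNumber"),
    ("O1:phoneNumber", "O2:faxNumber"),  # a noisy, imprecise alignment
]

ontology_summary = derive_ontology(triples)
alignments = mine_alignments(observations)
```

Here the single noisy pair falls below the support threshold, while the pair two users agree on is kept. A real system would need a more careful integration step than a fixed threshold.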
