Abstract

The goal of Ontology Learning from Text is to learn ontologies that represent domains or applications that change often. Manually building and updating such ontologies is too expensive, which is why the Ontology Learning discipline emerged. The leading approach to Ontology Learning from Text is the Ontology Learning Layer Cake, which splits the task into four or five sequential subtasks. Each subtask may use diverse methods, ranging from the use of linguistic knowledge to machine learning. The authors review the shortcomings of the Ontology Learning Layer Cake approach and conclude that it is not viable for Ontology Learning from Text. They suggest alternative approaches that may help learn ontologies in an efficient and effective way.

Introduction

An ontology is a formal, explicit specification of a shared conceptualization. The ontology should be machine readable, with explicitly defined types of concepts and constraints on their use. It should capture consensual knowledge, that is, knowledge that is not private to some individual but accepted by a group.
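To make this definition concrete, the following toy sketch shows a machine-readable specification with explicit concept types, a small taxonomy, and a constraint on how a relation may be used. It is purely illustrative: the concepts, relations, and function names are invented for this example and do not come from any ontology standard such as OWL.

```python
# A toy, machine-readable ontology fragment: explicit concept types,
# a taxonomy (is-a links), and a domain/range constraint on a relation.
# All names are invented for illustration; real ontologies use
# standards such as OWL or RDF Schema.

concepts = {          # concept -> parent concept (is-a); None = root
    "Vehicle": None,
    "Car": "Vehicle",
    "Person": None,
}

relations = {         # relation -> (domain concept, range concept)
    "owns": ("Person", "Vehicle"),
}

def is_a(concept, ancestor):
    """Walk the taxonomy upward to test an is-a relationship."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = concepts[concept]
    return False

def valid(subject, relation, obj):
    """Check a fact against the relation's domain/range constraints."""
    domain, rng = relations[relation]
    return is_a(subject, domain) and is_a(obj, rng)

print(valid("Person", "owns", "Car"))      # True: constraints satisfied
print(valid("Vehicle", "owns", "Person"))  # False: violates both constraints
```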

Ontologies are usually created and updated with human intervention. Because they represent a changing reality, they require frequent updates, and both the creation and the update of ontologies are costly activities. The discipline of Ontology Learning emerged to overcome this problem. In this paper we refer specifically to Ontology Learning from text.

Surveys conducted since the early days of Ontology Learning show the different methods used to tackle the problem. The majority of these methods follow an approach named the Ontology Learning Layer Cake, which splits the work into a sequence of steps: term extraction, concept formation, creation of a taxonomy of concepts, relation extraction, and finally rule extraction, as sketched below.
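To make the sequential data flow explicit, here is a schematic sketch of the pipeline in Python. The stage functions are placeholders we introduce for illustration, not the API of any existing system; each concrete method fills in the layers differently.

```python
# Schematic sketch of the Ontology Learning Layer Cake as a strict
# pipeline. The stage functions are placeholders invented for this
# illustration; their bodies are deliberately left empty.

def extract_terms(corpus):
    """Layer 1: find candidate domain terms in the text."""
    ...

def form_concepts(terms):
    """Layer 2: group terms (e.g., synonyms) into concepts."""
    ...

def build_taxonomy(concepts):
    """Layer 3: arrange concepts into is-a hierarchies."""
    ...

def extract_relations(taxonomy, corpus):
    """Layer 4: find non-taxonomic relations between concepts."""
    ...

def extract_rules(relations):
    """Layer 5: induce rules/axioms over the learned structure."""
    ...

def layer_cake(corpus):
    # Each layer sees only the previous layer's output, so any error
    # made early on is inherited by every later layer.
    terms = extract_terms(corpus)
    concepts = form_concepts(terms)
    taxonomy = build_taxonomy(concepts)
    relations = extract_relations(taxonomy, corpus)
    rules = extract_rules(relations)
    return taxonomy, relations, rules
```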

This strategy takes into account neither the conditions of the problem nor the quality of the results obtained with Ontology Learning Layer Cake methods. Working with well-formed text (e.g., books, edited journals, and other quality sources) is not the same as working with potentially lower-quality sources, such as emails.

Moreover, splitting a task into smaller, sequential tasks often helps reduce complexity without undermining the results. However, as this paper shows, splitting the task as the Ontology Learning Layer Cake methods do lets errors made in the early steps propagate to every later step, yielding low-quality ontologies and making the methods unviable.
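A back-of-the-envelope calculation illustrates this compounding effect. The per-stage precision of 0.8 below is a hypothetical figure, and treating the stages as independent is a simplifying assumption; the point is only that the quality of a strictly sequential pipeline degrades multiplicatively.

```python
# Hypothetical illustration of error propagation in a five-stage
# pipeline. The 0.8 per-stage precision is an invented figure and
# stage independence is a simplifying assumption.
per_stage_precision = 0.8
stages = ["terms", "concepts", "taxonomy", "relations", "rules"]

quality = 1.0
for stage in stages:
    quality *= per_stage_precision
    print(f"after {stage:<9}: {quality:.2f}")

# after terms    : 0.80
# after concepts : 0.64
# after taxonomy : 0.51
# after relations: 0.41
# after rules    : 0.33
```

Under these illustrative assumptions, a pipeline whose individual stages each look acceptable in isolation still ends with roughly a third of its output correct, which is the core of the argument against the layered strategy.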