Characterising Representation

The word ontology is highly overloaded and can have a number of interpretations. In addition, there are a variety of representations, languages and approaches that can be used to represent knowledge. Each has characteristics that impact on its appropriate usage or the functionality that it offers. The following attempts to tease out some of the differences between the various representation languages and solutions available. Although some of this may be “well known” or obvious, it is useful to set out clearly the vocabulary that we are using; for example, the term formal is often used to refer to languages which are what we have called here semantically strong.

Content by Sean Bechhofer and Robert Stevens.

Representation Languages and Artefacts

In the following discussion, we distinguish between knowledge artefacts and representation languages. A knowledge artefact (or more simply artefact) captures some particular knowledge about a domain. Artefacts are represented using a representation language which provides syntax (and some degree of semantics) for expressing such knowledge. An example of a representation language is OWL [link], while OBI [link], a particular OWL ontology, is an artefact.

This distinction between the languages and the artefacts is important: for example, there is a relationship between expressive languages and rich representations, but using an expressive language does not necessarily produce rich ontologies.

Establishing a vocabulary allows us to articulate the characteristics of our representations and artefacts, and assists in placing them within a common space. This will help in tasks such as choosing appropriate artefacts for particular purposes, or evaluation. For example, FMA [link] and MeSH [link] are both artefacts that are referred to as ontologies, but they have radically different representation languages, modelling styles and application usages.

Below we set out a number of characteristics that can apply to representation languages and artefacts.

Syntax and Semantics

Representation languages have a syntax — a way of writing statements down. Syntax sets out rules that tell us how to form expressions using the symbols available, and defines what constitutes well formed statements in the language. For example, to use Manchester OWL Syntax:

(X and Y)

is a well-formed class expression, whereas

and X Y()

is not.

Syntax on its own carries no meaning. The syntax rules simply tell us which combinations of symbols are valid. The semantics of the representation language allow us to interpret those combinations of symbols.

Semantic Strength

A strong representation language is one for which explicit semantics are given for the operators or vocabulary of the representation. The semantics tells us what well-formed statements mean, and are often defined as the set of concrete situations (models) that are consistent with a sentence or set of sentences (this is often called a model theory). A strong representation can reduce the ambiguity or vagueness that might be present in determining whether concrete situations are consistent with the statements being made in the vocabulary.
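The model-theoretic view described above can be sketched concretely. The following is an illustrative Python fragment (not part of any standard API): a model is a truth assignment over a tiny propositional vocabulary, the meaning of a sentence is the set of models consistent with it, and one sentence follows from another when every model of the first is also a model of the second.

```python
from itertools import product

# A tiny propositional vocabulary (illustrative).
VOCAB = ("A", "B")

def models(sentence):
    """Return the models (truth assignments) consistent with a sentence.

    A sentence is represented as a predicate over a model; a model maps
    each symbol in the vocabulary to a boolean.
    """
    all_models = [dict(zip(VOCAB, values))
                  for values in product([True, False], repeat=len(VOCAB))]
    return [m for m in all_models if sentence(m)]

def entails(premise, conclusion):
    """premise entails conclusion iff every model of the premise
    also satisfies the conclusion."""
    return all(conclusion(m) for m in models(premise))

# "A and B" entails "A", but not vice versa.
a_and_b = lambda m: m["A"] and m["B"]
just_a = lambda m: m["A"]

print(entails(a_and_b, just_a))   # True
print(entails(just_a, a_and_b))   # False
```

The point of the sketch is that a strong language pins down exactly which concrete situations are consistent with a set of statements, so questions like entailment have determinate answers.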

A weak representation language is one for which no explicit (machine-processable) semantics are given for the operators in the language.

For example, comparing OWL and SKOS, OWL is a stronger representation, while SKOS is weaker. The semantics of the OWL subclass relationship are given in terms of class extensions, while in SKOS the intended interpretations of the hierarchical and related relationships are given in natural language, viz: “A hierarchical link between two concepts indicates that one is in some way more general (‘broader’) than the other (‘narrower’). An associative link between two concepts indicates that the two are inherently ‘related’, but that one is not in any way more general than the other.” Note that for SKOS the situation is slightly complicated, as there are some constructs for which a semantics is provided (for example, the broaderTransitive relation is declared to be transitive, which allows some inferences to be drawn).
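The kind of inference licensed by declaring broaderTransitive transitive can be sketched in a few lines of Python (the concept names below are invented for illustration):

```python
def transitive_closure(pairs):
    """Compute the transitive closure of a set of (narrower, broader) pairs,
    mirroring the inference licensed by a transitive relation."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# Asserted hierarchy: "spaniel" broaderTransitive "dog" broaderTransitive "animal".
asserted = {("spaniel", "dog"), ("dog", "animal")}
inferred = transitive_closure(asserted)
print(("spaniel", "animal") in inferred)  # True: follows by transitivity
```

Because this part of SKOS does have a machine-processable semantics, the extra pair can be inferred mechanically; no such inference is licensed for the merely natural-language description of related.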

Note that the terms “strong” and “weak” are not used here in a pejorative sense. A weaker representation language may be more appropriate than a stronger one in a particular application scenario.

Expressivity

The expressivity of a representation language refers to the ability of the language to distinguish between different kinds of concrete situation. For example, OWL provides different quantifiers that allow us to distinguish between situations where all of the related objects must have some characteristic (universal quantification) and situations where some related object must have a characteristic (existential quantification). Expressivity can be seen as a “sliding scale” — for example, OWL Lite is a less expressive language than OWL DL, as it lacks some of the operators provided in OWL DL (e.g. cardinality restrictions involving numbers other than 1 or 0). Expressivity is related to semantic strength.
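The universal/existential distinction can be illustrated with a small sketch (the data below is invented, not drawn from any ontology): checking an individual against a universal restriction corresponds to all(), while an existential restriction corresponds to any().

```python
# Each individual maps to the classes of its related objects,
# e.g. hasTopping values for two pizzas (illustrative data).
related = {
    "veg_pizza": ["Tomato", "Mushroom"],   # every topping is vegetarian
    "mixed_pizza": ["Tomato", "Ham"],      # only some toppings are vegetarian
}
vegetarian = {"Tomato", "Mushroom"}

def satisfies_universal(individual):
    """hasTopping only Vegetarian: ALL related objects are vegetarian."""
    return all(t in vegetarian for t in related[individual])

def satisfies_existential(individual):
    """hasTopping some Vegetarian: SOME related object is vegetarian."""
    return any(t in vegetarian for t in related[individual])

print(satisfies_universal("veg_pizza"))      # True
print(satisfies_universal("mixed_pizza"))    # False
print(satisfies_existential("mixed_pizza"))  # True
```

A language without both quantifiers simply cannot separate the two situations above; that inability is what lower expressivity means.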

There are also differences between languages which are “naturally” and “un-naturally” expressive. For example, negation is not explicitly present in OWL Lite (there is no complementOf operator), but it can be encoded using combinations of other constructs.

Similarly, representation languages may include “syntactic sugar”, for example allowing equivalence axioms between class expressions rather than requiring explicit pairs of subclass axioms.
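Such sugar can be expanded away mechanically. The following hedged sketch (the tuple encoding of axioms is invented for illustration) rewrites an equivalence axiom into the pair of subclass axioms it abbreviates:

```python
def expand_equivalence(axiom):
    """Rewrite an EquivalentClasses axiom into the pair of SubClassOf
    axioms it abbreviates; pass other axioms through unchanged."""
    kind, left, right = axiom
    if kind == "EquivalentClasses":
        return [("SubClassOf", left, right), ("SubClassOf", right, left)]
    return [axiom]

expanded = expand_equivalence(("EquivalentClasses", "Person", "Human"))
print(expanded)
# [('SubClassOf', 'Person', 'Human'), ('SubClassOf', 'Human', 'Person')]
```

Sugar of this kind adds no expressivity, since anything stated with it can be restated without it; it only reduces the amount that has to be written down.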

A related issue is cognitive complexity: how much does one need to write down in order to express something?

Reasoning

Reasoning refers to the process of answering some semantics-based query, such as determining whether one statement follows from another. Reasoning is generally performed in the context of a particular artefact, using the characteristics or properties of the representation language (for example its semantics). Reasoning thus relies on a strong representation (i.e. a representation with a semantics that can be used as the basis for that reasoning). Reasoning also raises issues of tractability and scalability, which are related to expressivity and richness. [Link to elsewhere]
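As a minimal sketch of this kind of query answering (assuming only asserted subclass axioms and the transitivity of subclass; the class names are invented), one can ask whether a subsumption statement follows from an artefact's axioms:

```python
def follows(axioms, query):
    """Does the SubClassOf query follow from the asserted axioms?

    The only inference rule applied here is the transitivity of
    subclass; a real reasoner uses the full semantics of the language.
    """
    sub, sup = query
    if sub == sup:
        return True  # every class is a subclass of itself
    # search up the asserted hierarchy from sub
    frontier, seen = [sub], set()
    while frontier:
        current = frontier.pop()
        for (a, b) in axioms:
            if a == current and b not in seen:
                if b == sup:
                    return True
                seen.add(b)
                frontier.append(b)
    return False

axioms = {("Dog", "Mammal"), ("Mammal", "Animal")}
print(follows(axioms, ("Dog", "Animal")))  # True
print(follows(axioms, ("Animal", "Dog")))  # False
```

Note how the sketch depends on the semantics of subclass (transitivity, reflexivity): without a strong representation there would be no licensed rule to apply.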

Formality (artefact)

A formal ontology is an artefact in which certain philosophical principles have been applied in order to determine that the underlying content is an “accurate” or faithful representation of the world. Formality is a property of an artefact. There are a number of mechanisms and methodologies for ensuring formality (e.g. OntoClean). Upper ontologies can be used in order to ensure that the organisation of the ontology is consistent with other ontologies (rather than ad hoc). The use of an upper ontology is likely to lead to increased formality.

Formality requires a subjective, qualitative judgement. This may also depend on the ontological modelling style.

Axiomatic Richness (artefact)

The axiomatic richness of an artefact refers to the level of axiomatisation that is present. The term axiomatisation is taken to refer to the expression, in some representation language, of the explicit assumptions being made about the domain described by the artefact. A lack of axiomatic richness limits the possibility of deriving inferences from an artefact. There are no guarantees, however, that an artefact that is axiomatically rich will lead to many new inferences.

Axiomatic richness could be measured in a number of ways. Hayes [ref] for example, in the Naive Physics Manifesto, discusses density. One could also consider coverage, i.e. how much information is being captured about the domain.
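One crude way to operationalise such a measure can be sketched as follows. The ratio used here is an illustrative assumption, not Hayes' definition of density:

```python
def axiom_density(num_axioms, num_terms):
    """A crude richness proxy: average number of axioms per term.

    Illustrative only; this is not Hayes' notion of density, and
    counting axioms says nothing about what they actually capture.
    """
    if num_terms == 0:
        return 0.0
    return num_axioms / num_terms

# A vocabulary with many terms but few axioms scores low...
print(axiom_density(num_axioms=50, num_terms=1000))    # 0.05
# ...while a heavily axiomatised artefact scores higher.
print(axiom_density(num_axioms=5000, num_terms=1000))  # 5.0
```

Any such count-based measure should be treated with caution, for the reason given above: a high score does not guarantee useful inferences.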

Knowledge artefacts may include rich descriptions of their content in terms of natural language descriptions, scope notes etc. For example, a SKOS vocabulary with very detailed descriptions of the Concepts described in it. Such an artefact would not necessarily be considered axiomatically rich: although the artefact contains comprehensive information, this is not being presented in a form amenable to machine processing.

Application/Purpose (artefact)

What is the intended purpose of the artefact? How is it actually used (this could differ from the intended purpose)? Who are the intended consumers? Humans? Machines?

Community/Consensus (artefact)

Does the artefact capture the view of an individual or a community?

Cost

Cost applies to both representation languages and artefacts. Costs are incurred in a number of ways.

Cognitive costs. Understanding a language can be challenging. The articulation of facts using a representation can be difficult.

Management costs. Construction, maintenance, change management, all of which can be hard in collaborative situations.

Examples

This section requires further work.

Language | Strength | Expressivity
OWL | Strong | High
SKOS | Weak | Low
RDF(S) | Strong | Low
RDF | Weak | ?? (it’s just a carrier syntax)

Artefact examples?

GALEN

MeSH

OBI

FMA

GO

Non-rich OWL: something that respects the semantics. There are things around that have been “bent” into OWL, and which thus do not respect the semantics. There is a difference between things that should not be represented in OWL, because they do not respect the semantics, and things which could be represented in OWL, but where there is no point because they are not rich enough.

Acknowledgements

This paper is an open access work distributed under the terms of the Creative Commons Attribution License 3.0 (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided that the original author and source are attributed.

The paper and its publication environment form part of the work of the Ontogenesis Network, supported by EPSRC grant EP/E021352/1.