Glossary

This glossary summarizes the terminology of methods and techniques
for defining, sharing, and merging ontologies.
These definitions, which were written by John F. Sowa,
are based on discussions in the
ontology working group
of the NCITS T2 Committee on Information Interchange and Interpretation.

alignment: A mapping of concepts and relations between two ontologies A and
B that preserves the partial ordering by subtypes in both A and B.
If an alignment maps a concept or relation x in ontology A to
a concept or relation y in ontology B, then x
and y are said to be
equivalent. The mapping may be partial: there could be
many concepts in A or B that have no equivalents in the other ontology.
Before two ontologies A and B can be aligned, it may be necessary
to introduce new subtypes or supertypes of concepts or relations
in either A or B in order to provide suitable targets for alignment.
No other changes to the axioms, definitions, proofs, or computations
in either A or B are made during the process of alignment.
Alignment does not depend on the choice of names in either ontology.
For example, an alignment of a Japanese ontology to an English
ontology might map the Japanese concept Go to the English concept Five.
Meanwhile, the English concept for the verb go would not
have any association with the Japanese concept Go.
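
To make the order-preserving requirement concrete, here is a minimal sketch in Python, with two toy ontologies and an invented mapping (the names loosely echo the Japanese-English example above): each ontology lists the direct supertypes of its types, and the check verifies that whenever x is a subtype of y in A, the image of x is a subtype of the image of y in B.

    # Sketch: checking that an alignment preserves the subtype
    # partial ordering in both ontologies. The two toy ontologies
    # and the mapping are invented for illustration.

    def ancestors(supertype_of, x):
        """All types reachable from x by following supertype links."""
        seen, stack = set(), [x]
        while stack:
            for s in supertype_of.get(stack.pop(), ()):
                if s not in seen:
                    seen.add(s)
                    stack.append(s)
        return seen

    def is_subtype(supertype_of, x, y):
        return x == y or y in ancestors(supertype_of, x)

    def preserves_order(onto_a, onto_b, alignment):
        """True if x <= y in A implies alignment[x] <= alignment[y] in B."""
        return all(
            is_subtype(onto_b, alignment[x], alignment[y])
            for x in alignment for y in alignment
            if is_subtype(onto_a, x, y)
        )

    # Each ontology maps a type to its direct supertypes.
    onto_a = {"Cat": ["Animal"], "Animal": ["Entity"]}
    onto_b = {"Neko": ["Doubutsu"], "Doubutsu": ["Mono"]}

    # A partial alignment: types without equivalents are simply omitted.
    alignment = {"Cat": "Neko", "Animal": "Doubutsu", "Entity": "Mono"}

    print(preserves_order(onto_a, onto_b, alignment))  # True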

differentiae: The properties, features, or attributes that distinguish a type
from other types that have a common supertype. The term comes from
Aristotle's method of defining new types by stating the genus
or supertype and stating the differentiae
that distinguish the new type from its supertype.
Aristotle's method of definition has become the de facto standard
for natural language dictionaries, and it is also widely used for
AI knowledge bases and object-oriented programming languages.
For a discussion and comparison of various methods of definition, see the
notes
on definitions by Norman Swartz.
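
The connection to object-oriented programming can be made explicit: the superclass plays the role of the genus, and the class attributes that distinguish a subtype play the role of the differentiae. A minimal sketch, with invented class names:

    # Aristotle's pattern "genus + differentiae" mirrors subclassing:
    # the superclass is the genus, and the class attributes that
    # distinguish the subtype are its differentiae. Invented names.

    class Animal:                # the genus (common supertype)
        mobile = True

    class Bird(Animal):          # genus: Animal
        has_feathers = True      # differentiae distinguishing Bird
        lays_eggs = True         # from other subtypes of Animal

    class Fish(Animal):          # genus: Animal
        has_gills = True         # differentiae for Fish
        lives_in_water = True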

formal ontology: A terminological ontology whose categories are distinguished
by axioms and definitions stated in logic or
in some computer-oriented language that could be automatically
translated to logic. There is no restriction on the complexity
of the logic that may be used to state the axioms and definitions.
The distinction between terminological and formal ontologies
is one of degree rather than kind. Formal ontologies tend
to be smaller than terminological ontologies, but their axioms and
definitions can support more complex inferences and computations.
The two major contributors to the development of formal ontology
are the philosophers Charles Sanders Peirce and Edmund Husserl.
Examples of formal ontologies include theories in science
and mathematics, the collections of rules and frames in an expert system,
and the specification of a database schema in SQL.
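
As a small illustration (the formulas are invented here, not drawn from the text), such axioms and definitions might be stated in first-order logic. A closed-form definition completely defines a category in terms of others:

    ∀x (Bachelor(x) ↔ Man(x) ∧ ¬Married(x))

while an axiom may merely constrain a category without fully defining it:

    ∀x (Cat(x) → Animal(x))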

hierarchy: A partial ordering of entities according to some relation.
A type hierarchy is a partial ordering of concept types
by the type-subtype relation. In lexicography, the type-subtype
relation is sometimes called the hypernym-hyponym relation.
A meronomy is a partial ordering of concept types
by the part-whole relation. Classification systems sometimes use
a broader-narrower hierarchy, which mixes the type and part
hierarchies: a type A is considered narrower than B if A is a subtype of B
or any instance of A is a part of some instance of B. For example,
Cat and Tail are both narrower than Animal, since Cat is a subtype
of Animal and a tail is a part of an animal. A broader-narrower
hierarchy may be useful for information retrieval, but the two kinds
of relations should be distinguished in a knowledge base because
they have different implications.
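
A minimal sketch in Python of the mixed broader-narrower test, using the Cat, Tail, and Animal example above (the fact tables are invented and only one level deep, for brevity):

    # Sketch: a broader-narrower test that mixes the type hierarchy
    # with the part-whole hierarchy, as described above.

    subtype_of = {"Cat": "Animal", "Crow": "Animal"}   # type hierarchy
    part_of = {"Tail": "Animal", "Feather": "Crow"}    # meronomy

    def narrower(a, b):
        """A is narrower than B if A is a subtype of B, or instances
        of A are parts of instances of B."""
        return subtype_of.get(a) == b or part_of.get(a) == b

    print(narrower("Cat", "Animal"))    # True, by subtype
    print(narrower("Tail", "Animal"))   # True, by part-whole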

identity conditions: The conditions that determine whether two different
appearances of an object represent the same individual.
Formally, if c is a subtype of Continuant, the identity conditions
for c can be represented by a predicate Id_c.
Two instances x and y of type c, which may appear
at different times and places, are considered to be the same individual
if Id_c(x,y) is true. As an example,
a predicate Id_Human, which determines the identity
conditions for the type HumanBeing, might be defined by facial
appearance, fingerprints, DNA, or some combination of those features.
At the atomic level, the laws of quantum mechanics make it difficult
or impossible to define precise identity conditions for entities like
electrons and photons. If a reliable identity predicate
Id_t cannot be defined for some type t, then
t would be considered a subtype of Occurrent rather
than Continuant.
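
A minimal sketch in Python of such an identity predicate, following the Id_Human example (the data structure and feature values are invented here):

    # Sketch of an identity predicate Id_Human for the continuant
    # type HumanBeing, using the features named in the text.

    from dataclasses import dataclass

    @dataclass
    class HumanObservation:          # one appearance of an individual
        fingerprints: str
        dna: str

    def id_human(x, y):
        """Id_Human(x, y): do two observations, possibly made at
        different times and places, denote the same individual?"""
        return x.fingerprints == y.fingerprints and x.dna == y.dna

    a = HumanObservation(fingerprints="fp-001", dna="dna-xyz")
    b = HumanObservation(fingerprints="fp-001", dna="dna-xyz")
    print(id_human(a, b))  # True: judged the same individual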

integration: The process of finding commonalities between two different ontologies
A and B and deriving a new ontology C that facilitates interoperability
between computer systems that are based on the A and B ontologies.
The new ontology C may replace A or B, or it may be used only as
an intermediary between a system based on A and a system based on B.
Depending on the amount of change necessary to derive C from A and B,
different levels of integration can be distinguished:
alignment, partial compatibility, and unification.
Alignment is the weakest form of integration: it requires minimal
change, but it can only support limited kinds of interoperability.
It is useful for classification and information retrieval,
but it does not support deep inferences and computations.
Partial compatibility requires more changes in order to support
more extensive interoperability, even though
there may be some concepts or relations in one system or
the other that could create obstacles to full interoperability.
Unification or total compatibility
may require extensive changes or major reorganizations of A and B,
but it can result in the most complete interoperability:
everything that can be done with one can be done
in an exactly equivalent way with the other.

knowledge base: An informal term for a collection of information that includes
an ontology as one component. Besides an ontology, a knowledge base
may contain information specified in a declarative language such as
logic or expert-system rules, but it may also include unstructured or
unformalized information expressed in natural language or procedural code.

lexicon: A knowledge base about some subset of words in the vocabulary
of a natural language. One component of a lexicon is a terminological
ontology whose concept types represent the word senses in the lexicon.
The lexicon may also contain additional information
about the syntax, spelling, pronunciation, and usage of the words.
Besides conventional dictionaries, lexicons include large collections
of words and word senses, such as WordNet from Princeton University
and EDR from the Japan Electronic Dictionary Research Institute, Ltd.
Other examples include classification schemes, such as the Library
of Congress subject headings
or the Medical Subject Headings (MeSH).
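
As a brief illustration of querying one of these lexicons, the following sketch uses WordNet through the NLTK library, assuming the nltk package and its WordNet corpus are installed:

    # Sketch: querying WordNet with NLTK. Assumes nltk is installed
    # and the WordNet corpus has been downloaded, e.g.:
    #   import nltk; nltk.download("wordnet")

    from nltk.corpus import wordnet as wn

    # Each synset is one word sense of "cat".
    for synset in wn.synsets("cat"):
        print(synset.name(), "-", synset.definition())

    # The hypernym relation is the subtype-supertype link of the
    # lexicon's terminological ontology.
    print(wn.synset("cat.n.01").hypernyms())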

mixed ontology: An ontology in which some subtypes are distinguished by axioms
and definitions, but other subtypes are distinguished by prototypes.
The top levels of a mixed ontology would normally be distinguished
by formal definitions, but some of the lower branches
might be distinguished by prototypes.

partial compatibility: An alignment of two ontologies A and B that supports equivalent
inferences and computations on all equivalent concepts and relations.
If A and B are partially compatible, then any inference or computation
that can be expressed in one ontology using only the aligned concepts and
relations can be translated to an equivalent inference or computation
in the other ontology.

primitive: A category of an ontology that cannot be defined in terms
of other categories in the same ontology. An example of a primitive
is the concept type Point in Euclid's geometry.
The meaning of a primitive is not determined by a closed-form definition,
but by axioms that specify how it is related to other primitives.
A category that is primitive in one ontology might not be primitive
in a refinement of that ontology.
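
For example (an illustrative axiom, not quoted from any particular axiomatization), the primitives Point and Line of geometry can be related by an axiom such as

    ∀x ∀y ((Point(x) ∧ Point(y) ∧ x ≠ y) → ∃!z (Line(z) ∧ On(x,z) ∧ On(y,z)))

that is, through any two distinct points there passes exactly one line. Neither Point nor Line is given a closed-form definition; each acquires its meaning from axioms of this kind that relate it to the other primitives.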

prototype-based ontology: A terminological ontology whose categories are distinguished by
typical instances or prototypes rather than by axioms and
definitions in logic. For every category c in a prototype-based
ontology, there must be a prototype p and a measure
of semantic distance d(x,y,c), which computes
the dissimilarity between two entities x and y when they are
considered instances of c. Then an entity x can be classified
by the following recursive procedure:

1. Suppose that x has already been classified as an instance
of some category c, which has subcategories s_1,...,s_n
with prototypes p_1,...,p_n.

2. If d(x,p_i,c) has a unique minimum value
for some subcategory s_i, then classify x as
an instance of s_i, and call the procedure recursively
to determine whether x can be further classified by some
subcategory of s_i.

3. If c has no subcategories or if
d(x,p_i,c) has no unique minimum for any
s_i, then the classification
procedure stops with x as an instance of c, since no finer
classification is possible with the given selection of prototypes.
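
A runnable Python sketch of this procedure, with an invented toy hierarchy, invented prototypes, and a crude feature-mismatch count standing in for a learned distance measure:

    # Runnable sketch of the recursive classification procedure.
    # The hierarchy, prototypes, and distance measure are toys.

    subcategories = {
        "Animal": ["Cat", "Dog"],
        "Cat": ["BlackCat", "OrangeCat"],
    }

    prototypes = {
        "Cat": {"legs": 4, "sound": "meow"},
        "Dog": {"legs": 4, "sound": "bark"},
        "BlackCat": {"color": "black"},
        "OrangeCat": {"color": "orange"},
    }

    def d(x, p, c):
        """Semantic distance d(x, p_i, c): here, a crude count of the
        prototype features on which x disagrees (c is accepted to match
        the signature in the text but unused in this toy measure)."""
        return sum(1 for f, v in p.items() if x.get(f) != v)

    def classify(x, c="Animal"):
        subs = subcategories.get(c, [])
        if not subs:                       # step 3: no subcategories
            return c
        dists = {s: d(x, prototypes[s], c) for s in subs}
        best = min(dists.values())
        winners = [s for s, v in dists.items() if v == best]
        if len(winners) != 1:              # step 3: no unique minimum
            return c
        return classify(x, winners[0])     # step 2: recurse into winner

    print(classify({"legs": 4, "sound": "meow", "color": "black"}))
    # -> BlackCat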

As an example, a black cat and an orange cat would be considered
very similar as instances of the category Animal, since their common
catlike properties would be the most significant for distinguishing
them from other kinds of animals. But in the category Cat, they
would share their catlike properties with all the other kinds
of cats, and the difference in color would be more significant.
In the category BlackEntity, color would be the most relevant
property, and the black cat would be closer to a crow or a lump of coal
than to the orange cat. Since prototype-based ontologies depend
on examples, it is often convenient to derive the semantic distance
measure by a method that learns from examples, such as statistics,
cluster analysis, or neural networks.

Quine's criterion: A test for determining the implicit ontology that underlies
any language, natural or artificial. The philosopher Willard Van Orman Quine
proposed a criterion that has become famous: "To be is to be the value
of a quantified variable." That criterion makes no assumptions
about what actually exists in the world.
Its purpose is to determine the implicit assumptions made by the
people who use some language
to talk about the world.
As stated, Quine's criterion applies directly to languages
like predicate calculus that have explicit variables and quantifiers.
But Quine extended the criterion to languages of any form, including
natural languages, in which the quantifiers and variables are not stated
as explicitly as they are in predicate calculus. For English,
Quine's criterion means that the implicit ontological categories
are the concept types expressed by the basic content words in the
language: nouns, verbs, adjectives, and adverbs.
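
As a worked illustration (the sentence is chosen here for the example), the English sentence "Some cats are black" can be paraphrased in predicate calculus as

    ∃x (Cat(x) ∧ Black(x))

where the quantified variable x takes cats as its values; by Quine's criterion, anyone who asserts the sentence is thereby committed to an ontology in which cats exist.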

refinement: An alignment of every category of an ontology A to some category
of another ontology B, which is called a refinement of A.
Every category in A must correspond to an equivalent category in B,
but some primitives of A might be equivalent to nonprimitives in B.
Refinement defines a partial ordering of ontologies: if B is a
refinement of A, and C is a refinement of B, then C is a refinement of A;
if two ontologies are refinements of each other, then they must be
isomorphic.

semantic factoring: The process of analyzing some or all of the categories of an
ontology into a collection of primitives. Combinations of those
primitives generate a hierarchy, called a lattice, which
includes the original categories plus additional ones that make it
more symmetric. The techniques of semantic factoring can be applied
to any level of an ontology from the highest, most general concept types
to the lowest, most specialized types. The methods can be automated,
as in formal concept analysis,
which is a systematic technique for deriving a lattice of concept types
from low-level data about individual instances.
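
A minimal Python sketch of formal concept analysis over a tiny invented context (the objects and attributes echo the black cat, crow, and coal example used elsewhere in this glossary): every subset of attributes is closed into a formal concept, a pair of an extent (a set of objects) and an intent (a set of attributes), and the concepts ordered by inclusion of extents form the derived lattice.

    # Sketch of formal concept analysis over a tiny invented context.

    from itertools import combinations

    attrs_of = {                       # the formal context
        "cat":  {"animal", "black"},   # a black cat, for illustration
        "crow": {"animal", "black"},
        "coal": {"black"},
    }
    objects = set(attrs_of)
    all_attrs = set().union(*attrs_of.values())

    def extent(attrs):
        """Objects having every attribute in attrs."""
        return {o for o in objects if attrs <= attrs_of[o]}

    def intent(objs):
        """Attributes shared by every object in objs."""
        if not objs:
            return set(all_attrs)
        return set.intersection(*(attrs_of[o] for o in objs))

    # Brute force: close every attribute subset into a formal concept.
    concepts = {
        (frozenset(extent(set(c))), frozenset(intent(extent(set(c)))))
        for r in range(len(all_attrs) + 1)
        for c in combinations(sorted(all_attrs), r)
    }

    for e, i in sorted(concepts, key=lambda c: -len(c[0])):
        print(sorted(e), "<->", sorted(i))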

semiotics: The study of signs in general, their use in language and reasoning,
and their relationships to the world, to the agents who use them,
and to each other. It was developed independently by the logician
Charles Sanders Peirce, who called it semeiotic, and by the linguist
Ferdinand de Saussure, who called it sémiologie; other variants
are the terms semiotics and semiology. Peirce developed
semiotic into a rich, highly nuanced foundation for formal ontology,
starting with three metalevel categories, which he called Firstness,
Secondness, and Thirdness. Specialized examples of these categories
include Aristotle's triad of Inherence, Directedness, and Containment
in Figure 1 and the triad of Independent, Relative, and Mediating in
Figure 6. One of Peirce's most famous examples is the triad
of Icon, Index, and Symbol.

terminological ontology: An ontology whose categories need not be fully specified by axioms
and definitions. An example of a terminological ontology is WordNet,
whose categories are
partially specified by relations such as subtype-supertype or part-whole,
which determine the relative positions of the concepts with respect
to one another but do not completely define them.
Most fields of science, engineering, business, and law have evolved
systems of terminology or nomenclature for naming, classifying, and
standardizing their concepts. Axiomatizing all the concepts in any such
field is a Herculean task, but subsets of the terminology can be used
as starting points for formalization. Unfortunately, the axioms
developed from different starting points are often incompatible
with one another.

unification: A one-to-one alignment of all concepts and relations in two
ontologies that allows any inference or computation expressed in one
to be mapped to an equivalent inference or computation in the other.
The usual way of unifying two ontologies is to refine each of them
to more detailed ontologies whose categories are one-to-one equivalent.