An object lesson in choosing between a class and an object

Overview

The Web Ontology Language (OWL) and other knowledge representation languages allow an ontologist to distinguish between classes of individuals and the individuals themselves. It is not always obvious when to use a class and when to use an individual. This kblog seeks to help with this choice by offering a series of questions; no single solution is offered (one candidate is considered, but it is rejected).

Introduction

OWL is all about modelling objects and their properties. Often, an OWL ontology describes only classes, but it can also explicitly mention objects.

Let’s first fix our terminology: Classes are called classes, and stand for sets of “things”. What to call these “things” is a little more problematic – the official OWL spec calls them individuals, but this is really clunky. OWL also has object properties, which relate two things, so we can also call these things “objects”, a name that is also used in the OWL specification and that is less clunky and much easier to type. The word “instance” is often bandied about, but we really need to say what a thing is an instance of – “an instance of X” – so we don’t use that one either (unless it is in the form just described). In this kblog we’re going to stay with using “objects” for things, which may or may not be instances of classes.

So the central question we are discussing here is when to introduce a class and when to introduce an explicit object for a thing you want to model (a concept, notion, idea,…). There are at least two approaches to deciding this question: attempting to model things as they actually are, and attempting to model things according to the needs of the target application for the ontology.

Representing the field of interest “as it is”

We can choose to believe that there exists an objective reality, and that people who know a lot about a certain area of this reality, say molecular biology, have a similar conceptualisation of this reality in their minds – and then we can attempt to describe this conceptualisation in an OWL ontology.

This appears to be a simple approach to deciding when to use an object and when to use a class for a given thing:

the class Person and the object Robert Stevens

the class Car and the object car1 that Robert was driven to the station in

the class red blood cell and the object rbc7 that is the individual red blood cell in the capillary at the end of Robert’s finger

the class Haemoglobin and the object hgc24 in the red blood cell rbc7 in that capillary at the end of Robert’s finger

…

One indicator in English is the use of articles: the definite article, as in “the entity”, suggests an object, whereas the indefinite article “a” or “an” suggests a collection of possible entities, that is, a class of entities. There are other linguistic indicators of classes and objects – not wholly reliable, but they can act as a guide.
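The class and object pairs listed above can be sketched in a few lines of Python. This is only a toy illustration, not OWL itself: the dictionary class_members and the helper functions are hypothetical names, the identifiers are simplified spellings of the examples in the text, and the sketch treats a missing assertion as false, whereas OWL reasons under the open world assumption.

```python
# Toy model (an assumption of this sketch, not an OWL API):
# each class name maps to the set of objects asserted to be its instances.
class_members = {
    "Person": {"Robert_Stevens"},
    "Car": {"car1"},
    "RedBloodCell": {"rbc7"},
    "Haemoglobin": {"hgc24"},
}


def is_instance_of(obj, cls):
    """Return True if obj has been asserted to be an instance of cls."""
    return obj in class_members.get(cls, set())


def classes_of(obj):
    """Return the sorted names of all classes that obj is asserted to belong to."""
    return sorted(c for c, members in class_members.items() if obj in members)


if __name__ == "__main__":
    print(is_instance_of("car1", "Car"))
    print(classes_of("rbc7"))
```

The point of the sketch is only the shape of the data: classes are collections, objects are the named things inside them.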

Given that this seems a rather straightforward approach, can it go wrong? Robert Stevens makes sense as an object (ignoring the fact that Robert Stevens existed at different times with different properties), but the individual haemoglobin in the individual red blood cell is probably at far too fine a grain for using named objects. There are indeed “things” where our approach doesn’t seem to help: soup, love, PDF files, the bible, the prime minister, weather forecast, green, etc.

Modelling according to your application’s needs, in a robust way

Given that we may use our ontology in an application, this usage can nicely inform our design choices. So, once reality ceases to work, we can consider the following questions:

are there different manifestations of the thing in question around, and does this matter? Is the love you want to describe always the same as, e.g., the love that Romeo feels – or do you want to distinguish between motherly love, romantic love, and love of ice cream? Does everybody who reads the bible read the same book? And if so, what happens if it is lost? In the latter case, we should of course distinguish between the content of a book and its physical manifestation, and possibly also its different editions. A similar observation holds for PDF files.

are you ever going to refine the thing in question? If you want to talk about green now, but possibly also about lightgreen later, then make both classes, so that you can make lightgreen a subclass of green…and limegreen a subclass of lightgreen, etc.

is the thing in question in a relation to other objects or values? E.g., if the green you consider has the rgb value of, say, 34-139-34, and is related via isColourOf to car1, then you may want to make this green an object – and also an instance of the class Green.

does uniqueness matter? If Robert Stevens has met the queen and Uli Sattler has met the queen, have we met the same person? And has that person met at least two people? If the answer to the last two questions is “yes”, then the queen should clearly be modelled as an object.
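The last three questions can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not an OWL implementation: the names subclass_of, facts, green1, hasRgbValue, hasMet and the_queen are hypothetical, introduced only for this example, while isColourOf and car1 come from the text above.

```python
# Refinement: subclass axioms stored as child -> parent (hypothetical toy structure).
subclass_of = {"LimeGreen": "LightGreen", "LightGreen": "Green"}


def superclasses(cls):
    """Walk up the hierarchy and return all superclasses of cls, nearest first."""
    result = []
    while cls in subclass_of:
        cls = subclass_of[cls]
        result.append(cls)
    return result


# Relations and uniqueness: object-level facts as (subject, property, value) triples.
facts = {
    ("green1", "type", "Green"),
    ("green1", "hasRgbValue", "34-139-34"),
    ("green1", "isColourOf", "car1"),
    ("Robert_Stevens", "hasMet", "the_queen"),
    ("Uli_Sattler", "hasMet", "the_queen"),
}


def subjects(prop, value):
    """Return the sorted subjects that are related to value via prop."""
    return sorted(s for s, p, v in facts if p == prop and v == value)


if __name__ == "__main__":
    # Green can be refined because it is a class.
    print(superclasses("LimeGreen"))
    # Because the_queen is a single named object, both facts are about the same
    # person, and we can count how many people have met her.
    print(len(subjects("hasMet", "the_queen")))
```

If the queen were only a class, the two statements would each say that somebody met some queen, and nothing would force those to be the same person; naming one object is what makes the count meaningful.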

Tips

When in doubt, make it a class.

Use punning when needed: punning refers to the practice of using the same name for both an object and a class, and using it “as a” class or object or even both where appropriate; e.g., we could use Queen both as a class and as an object, and then consider its super classes and the classes it is an instance of.
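Punning can be sketched in the same toy style. In this hypothetical Python model the name Queen appears in two separate stores, once as a class and once as an object; the stores class_axioms and object_facts, the superclass Monarch and the property hasMet are assumptions made for illustration only.

```python
# The same name, "Queen", is used in two roles (hypothetical toy stores).
class_axioms = {
    ("Queen", "subClassOf", "Monarch"),  # Queen used as a class
}
object_facts = {
    ("Queen", "type", "Person"),             # Queen used as an object
    ("Robert_Stevens", "hasMet", "Queen"),   # and as the target of a relation
}


def superclasses_of(name):
    """Superclasses of name when it is read as a class."""
    return sorted(o for s, p, o in class_axioms if s == name and p == "subClassOf")


def types_of(name):
    """Classes that name is an instance of when it is read as an object."""
    return sorted(o for s, p, o in object_facts if s == name and p == "type")


if __name__ == "__main__":
    print(superclasses_of("Queen"))
    print(types_of("Queen"))
```

Keeping the two roles in separate stores is a design choice of this sketch: each question is answered from one view of the name, which is roughly how the two uses are kept apart when a name is punned.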

Summary

OWL is all about objects; it’s just that we usually talk about classes of objects and the things that are true of all objects in that class. However, we can explicitly talk about the objects themselves, and a frequent question is “when do I use a class and when should I use an individual instead?”. We offered two routes: an exact representation of the field of interest or, failing that, simply doing what is best for your application’s needs. With the former, one can model ad nauseam and so end up staying at the class level (and that’s fine). Modelling according to an application’s needs also works, but can ultimately lead to high variation from ontology to ontology. Deciding where the boundary is can be hard, but the default decision is to keep modelling with classes.