Topic Navigation Maps

Scope

This standard provides a mechanism, based on techniques defined in
ISO/IEC 10744:1992, for identifying information objects that share a common
topic. It can also be used to define the relationships between sets of related
topics. In particular, it can be used to define:

tables of contents and subject indexes for individual documents,
or related sets of documents

glossaries that can be shared by more than one document

the relationship between topics within a thesaurus

the relationships between multilingual thesauri, glossaries, etc.

Related Standards

ISO 8879:1986

ISO/IEC 10744:1992

Definitions

TBD

Purpose of the Topic Navigation Map Module

The purpose of this Topic Navigation Map
module is to facilitate the maintainability and usability of topic-based
navigational aids for large corpora of documents containing
interrelated information. The fundamental idea is to make a
distinction between highly concentrated and independent topic maps
-- sets of relations between the topics covered in a given corpus --
and the addresses of relevant information within the corpora
themselves. Such topic maps can improve the accessibility of
information, and they can facilitate and, to some extent, automate the
task of providing navigational resources and of imposing editorial
consistency and maintainability on them. The design of topic maps allows the
groupware-supported production of the data from which navigational
aids such as indexes, glossaries, tables of contents, lists
and catalogs can be generated. Topic maps can also be used to enhance the
navigability of very large information bases.

This Topic Navigation Map module provides a basis for creating and
maintaining information that, in effect, classifies the information in
documents according to topic, and classifies topics with respect
to each other. It is intended to help increase consistency and
decrease redundancy not only in navigational aids within documents,
but also in navigational aids used with multiple documents, such as
master indexes. The discipline that can be imposed by using the
Topic Navigation Map module will also assist those who create and/or
collect libraries of documents, and who then wish to provide a given
collection with a unified, consistent, and minimally redundant topic
index.

The Standard Generalized Markup Language (SGML) defined in ISO 8879:1986
allows all kinds of documents to become databases. For this facility to be
useful there must be ways to navigate data stores so that parts of
documents that are relevant to a particular topic can be easily found
and organized rapidly by machine. However, the number and
complexity of indexable topics and the relationships between them in
all documents greatly exceed the number and complexity of relations
normally represented in traditional databases or, for that
matter, in the kinds of indexes normally found in books. In fact, the
number of topic relationships that might usefully be represented with
respect to any reasonably large collection of documents is, for all
practical purposes, limitless. Moreover, even in archived documents,
new kinds of topic relationships can be expected to appear
from time to time.

Creating and maintaining topic indexes is a difficult and expensive
proposition. Creating a topic index is a complex task, like
designing and constructing a building, involving myriad assumptions and
artistic decisions. Many indexes are indexes in name only: ramshackle
affairs that are unable to bear the stress of the everyday purposes
for which indexes are presumably intended, and are therefore almost
useless. All too often, however, even when an index is well thought
out, well constructed, and useful, little thought is given to its
maintainability. When the time comes to create an updated or
corrected index, the original documentation for the topic architecture
of the index is no longer available. Indeed, it may never have
existed or have been consciously expressed in any abstract way. Even
an index on which enormous maintenance effort is expended can quite
easily become a self-inconsistent hodgepodge, especially when the size
of the indexing task dictates that it must be a cooperative effort, or
when there have been changes in the responsible personnel.

An application-neutral, internationally understandable, rigorous, and
yet flexible and open way to represent topical indexes, such as the
one set forth in this Topic Navigation Map module, can help to make
indexes easier to make, easier to maintain, and easier to use. As new
relationships are discovered and included as part of the topic
architecture, the architecture changes. Many architects may have to
collaborate and contribute, over the years, to an evolving
architecture, which at any given time must unambiguously and
comprehensibly govern all maintenance activities. Unless those who
are adding and/or maintaining anchors have clear
guidance, the instantiation of that architecture -- the index itself
-- may become unsound and unsafe.

A topic architecture fundamentally consists of topics and the
relations that they bear to one another. There is need, therefore,
for a way to permit:

any number of topics to be defined by those with knowledge of the subject matter,

any number of categories of topics, with subcategorization to any level,

any number of relations between topics, and

any number of categories of relations between topics, with
subcategorization to any level

to be represented, universally interchanged, processed, merged, and
used for data navigation. An international standard for representing (among many
other things) arbitrary relationships between arbitrary pieces of
information, wherever they reside, exists in ISO/IEC 10744, which
defines the Hypermedia/Time-based Structuring Language known as HyTime.
In this Topic Navigation Map module a HyTime-based approach to linking
topics with information has been developed,
and an architecture is defined that can support applications
that provide:

the ability for many experts in a given field of knowledge to
share in, and jointly contribute to, the evolution of a common map of
topic relationships in each given field of knowledge;

the ability to merge such maps, whenever multiple fields of
knowledge must be used simultaneously, in such a way as to maximize
the meaningful cross-connections between them; and

the ability to use such maps in a variety of ways for a
variety of purposes, such as extracting printed and online indexes and
glossaries for particular documents. Extracted
indexes are able to reflect the relationships between topics and
subtopics represented by maps of topic relationships, and
are extractable automatically or semi-automatically from
the map of topic relationships as part of a formatting,
pre-formatting, and/or authoring process.

Using this Topic Navigation Map module, a particular topic
architecture, designed for some document, some set of documents, or
even for an entire field of knowledge, can be represented in a
topic map. A topic map consists of a set of topics and
a set of topic relationships. Topics are defined using
CApH.semanticAssignment-form elements, whose anchor
roles are defined by the user; relationships are defined using
CApH.topicRelation-form elements, which identify specific relations between topics.
Categories of topics may be iteratively identified and described by
linking suitable topics to other topics belonging to the category.

A topic is created by linking several pieces of information about
a subject through a semantic assignment link, which is a HyTime
independent link. A topic can be given a definition by
reserving one of the link's anchor roles (anchrole attribute) for it:
whatever anchor corresponds to the definition role in the anchrole
attribute, if any, is considered to be
the definition of the topic. This notion of definition is very
general: a definition can be any portion of information (no specific
internal structure needed) that is pointed to.

Semantic Assignment -- CApH.semanticAssignment

A semantic assignment (CApH.semanticAssignment) is a specialized HyTime
independent link (ilink) that associates all
the information objects sharing a common semantic. This group of
objects is collectively called a topic. The located objects have
the common property of being anchors of a semantic assignment
element. Therefore, one can distinguish:

the semantic assignment, an SGML element that associates all the
related objects with anchors. Each anchor can itself be an
aggregate, as there is no CApH-imposed limit on the nature of the
addressing mechanism used to address the anchors. Any anchor can also
be shared by different semantic assignments.

the topic itself, which includes the semantic
assignment element and all of its anchors. The term topic
is to be understood as a subject (as in a subject index) as
well as a location: the Greek word topos, from
which it originates, means location. In other words, a
topic is a composite object made of several elementary locations about
a subject.

Common examples of topics are index and/or glossary entries: an
index entry is a set of locations sharing common semantics described
by the term that is displayed in the index; index entries are normally displayed in
alphabetical order. A glossary entry is a topic that points to an
occurrence considered as its definition. CApH allows topics to play
the roles of index entry and glossary entry at the same time: one of their
occurrence roles is their definition, the others being the equivalent
of index entries.

The value of the HyTime anchrole attribute is
user-definable and allows the user to distinguish between different
roles of occurrence sets. The only constraint imposed by CApH is that
the first anchor be the semantic assignment. This is the only
way to enable the link to be referred to. All other declared anchors
can be aggregate anchors.

When a semantic assignment is instantiated, its anchrole values have to be
explicitly defined. The role of each anchor is to specify the nature
of the occurrence where the information about a given topic is to be
found. These anchor roles are called "occurrence roles". There is
no limit to what can be represented and distinguished as occurrence
roles, nor to the number of occurrence roles. The only HyTime
limitation is that occurrence roles are fixed in the DTD for a given
semantic assignment element type. It is entirely up to the application
to decide what to do when not all anchors are filled in. (A "null"
address could be interpreted as no occurrence, for example.) The
purpose of differentiating between different kinds of occurrence roles
is to help users distinguish between different kinds of targets and
navigate with more precision in a large set of information objects.

The semantic assignment form can be used to derive as many element
types as desired in an actual SGML document type definition (DTD),
allowing for finer distinctions;
each semantic assignment element type can have a different set of
anchors described in the anchrole attribute.

Endterm values can be associated, by the user, with any
instantiated element to allow the application to display information
that enhances the understanding of information to be found at the
anchor. Subentries in printed indexes sometimes play
such a role, specializing entries under a given topic. HyTime applications
can use the endterm attribute to display this information to
users. Used in the context of a CApH application, the endterm values
point to information whose purpose is to clarify a semantic title,
without adding any extra structural level.

Anchor aggregation may be given special significance in a given
derived type as long as the basic meaning of the CApH form remains
intact. HyTime's aggregate traversal (aggtrav)
attribute may be used with agglink aggregate
anchors independently or as an enhancement of the meaning of a derived
type of an instance.

There is no requirement that the value of the aggloc
attribute of an aggregate anchor of a CApH.semanticAssignment be
agglink; it could be aggloc instead. Moreover,
there is no requirement that any anchor be an aggregate at all. (In
such cases, from a HyTime perspective, the value of the
aggtrav attribute is irrelevant.)

The first anchor, called the topic anchor, must identify the
CApH.semanticAssignment element itself. Making the semantic assignment
one of its own anchors permits users to traverse from some other link (if
any) to the semantic assignment, and thence to any one (or some,
or all) of the semantic assignment's other anchors. HyTime link
traversal is possible only between the anchors of a link; there is no
implicit traversability between a link and its anchor addresses
unless it is itself one of those anchor addresses (anchaddr).

Specification of the link in an anchaddr attribute value can be defaulted. In HyTime
links, if the anchrole attribute indicates one more anchor
than is actually specified via the anchaddr attribute, the
first anchor is always understood to be the link itself; i.e., the link
is the missing anchor.
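
The defaulting rule above can be illustrated with a short, non-normative sketch in Python. The function name resolve_anchors, the role names, and the identifiers are hypothetical, and the sketch assumes that the #AGG markers have already been stripped from the anchrole value:

```python
from typing import Dict, List


def resolve_anchors(link_id: str, roles: List[str], addrs: List[str]) -> Dict[str, str]:
    """Pair each anchor role with an anchor address (illustrative only)."""
    if len(roles) == len(addrs) + 1:
        # One address fewer than roles: the link itself is the missing first anchor.
        addrs = [link_id] + addrs
    elif len(roles) != len(addrs):
        raise ValueError("expected one anchor address per anchor role")
    if addrs[0] != link_id:
        # CApH constraint: the first ("Topic") anchor must be the link itself.
        raise ValueError("first anchor must be the semantic assignment itself")
    return dict(zip(roles, addrs))


# Hypothetical roles and ids; the link id "t1" is omitted from the addresses.
roles = ["Topic", "Definition", "Mention"]
print(resolve_anchors("t1", roles, ["def1", "occ1"]))
```

Writing the link's own id explicitly as the first address gives the same pairing; any other first address is rejected by the sketch because of the CApH constraint on the topic anchor.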

Each of the other anchors (collectively called
occurrences) may identify any number of information objects.
The full power of HyTime's information addressing
facilities can be used to associate semantic definitions with
literally any pieces of information, identified by whatever
structural, contextual, semantic properties, or other means are
convenient.

The CApH-specific mnemonic attribute allows a brief single-token name
to be given to the semantic definition.

The CApH-specific semanticUniverses attribute specifies
the semantic context(s) in which the definition is valid. The generic
identifier of a semantic assignment element implicitly constitutes a
universe. Other tokens may be added to the default one as values of the
semanticUniverses attribute, as there is no limit to the number of
universes attached to any instance of an element.

Depending on the application, the user can choose to constrain the
tokens used in the semantic universes within a predefined list, shared
by a community of users. The semantic universe can be described as a
HyTime-defined parsing context (parsecxt). A
CApH-aware application will allow users to filter those objects
belonging to one, or several, universes, and discard remaining
elements, as if they did not exist, using the omitprop
attribute of the parsecxt definition. This feature helps
authors and editors of hyperdocuments to create and maintain
concurrent universes while giving users access to a known set of
universes. The possibility of maintaining a unique hyperdocument while
allowing several views on it should considerably enhance its
maintainability.

A CApH engine must be able to suppress an element with a
CApH-defined semanticUniverses attribute whenever it is processed
in a context in which none of the universes specified by the token
list in the value of its semanticUniverses
attribute is valid, so that the element can be disqualified. The question of
what it means, in any particular case, for information to be
disqualified is entirely the realm of the application. In
general, though, the purpose of disqualification by semantic universe
is to avoid wasting the user's time and attention on irrelevant
information. It is the responsibility of the application to inform the
CApH engine whenever semantic universes (parsing contexts) become
valid or invalid due to changes in user context; this minimizes
transmission of unwanted information. (In some applications, a user
can say that all universes are always valid, and then see everything.
In other applications, universes can be used to separate
access levels according to the degree of classification of different
parts of the document; these levels are defined by the hyperdocument editor and
cannot be modified by individual end-users.)
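
As a non-normative illustration of disqualification by semantic universe, the following Python sketch treats an element as disqualified when none of its universes is currently valid. The function names, element names, and universe tokens are hypothetical; the sketch assumes, as described above, that the generic identifier of a semantic assignment element is itself an implicit universe:

```python
from typing import Iterable, Set


def effective_universes(gi: str, declared: Iterable[str]) -> Set[str]:
    """The generic identifier is an implicit universe; declared tokens add to it."""
    return {gi, *declared}


def is_disqualified(gi: str, declared: Iterable[str], valid: Set[str]) -> bool:
    """True when none of the element's universes is currently valid."""
    return effective_universes(gi, declared).isdisjoint(valid)


# Hypothetical elements: (generic identifier, declared semanticUniverses tokens).
elements = [
    ("indexEntry", ["public"]),
    ("glossEntry", ["internal"]),
]
valid_now = {"public"}  # universes the application has declared valid
visible = [gi for gi, unis in elements if not is_disqualified(gi, unis, valid_now)]
print(visible)
```

What an application does with a disqualified element (hide it, grey it out, refuse access) is outside the sketch, just as it is outside CApH.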

The CApH application is responsible for maintaining a namespace of
universes for each mnemonic, and a namespace of mnemonics for each
universe. Given a mnemonic, a CApH engine that supports the semantic
assignments must be able to provide a comprehensive list of
all mnemonics declared in semantic assignments
in the bounded object set (BOS).
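
The two namespaces mentioned above can be pictured as a pair of inverted indexes. The following Python sketch is illustrative only; the function name build_namespaces and the sample mnemonics and universes are hypothetical, and a real engine would gather the data from the semantic assignments found in the BOS:

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def build_namespaces(
    assignments: Iterable[Tuple[str, List[str]]],
) -> Tuple[Dict[str, Set[str]], Dict[str, Set[str]]]:
    """Index (mnemonic, universes) pairs in both directions."""
    universes_by_mnemonic: Dict[str, Set[str]] = defaultdict(set)
    mnemonics_by_universe: Dict[str, Set[str]] = defaultdict(set)
    for mnemonic, universes in assignments:
        for universe in universes:
            universes_by_mnemonic[mnemonic].add(universe)
            mnemonics_by_universe[universe].add(mnemonic)
    return dict(universes_by_mnemonic), dict(mnemonics_by_universe)


# Hypothetical semantic assignments: (mnemonic, semantic universes).
sample = [("museum", ["art", "travel"]), ("fresco", ["art"])]
by_mnemonic, by_universe = build_namespaces(sample)
print(sorted(by_universe["art"]))
```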

<!element CApH.semanticAssignment
-- associates portions of
information sharing
a common semantic. --
- O (semanticTitle*) >
<!attlist CApH.semanticAssignment
CApH NAME CApH.semanticAssignment
id ID #IMPLIED
-- CApH strongly encourages the id of a CApH.semanticAssignment
element to be present, so that the topic can be used as an
anchor for a topic relation link. Since not every topic need
be an anchor, the id is not required. --
HyTime NAME ilink
mnemonic -- the short or key name for the subject matter of
this definition; machine-processable identifier. Can be seen
as a "semantically-loaded identifier"
(which may or may not be unique) --
CDATA #IMPLIED
semanticUniverses
-- Defines the semantic universe in which this topic is
useful. This attribute is generally used to filter out the
non-relevant topic according to a list of universes chosen
by the user. --
anchrole NAMES #FIXED
"Topic OccurrenceRole_1 #AGG... OccurrenceRole_n #AGG"
-- The number of anchroles is not specified in the
architectural form because it is application-specific.
The anchors can generally be aggregates (#AGG), although
this is not required by CApH, if some application
needs to specify an anchor role in which the address
is not the address of an aggregate. --
anchaddr -- Anchor addresses. (was "linkends").
Constraint: one anchor per anchor role. --
-- CApH constraint: "Topic" anchor must be the
link itself. --
IDREFS #REQUIRED
extra -- External access traversal rule --
-- Constraint: one per anchor or one for all --
-- Lextype(("E"|"I"|"A"|"N"|"P")+) --
NAMES #IMPLIED -- Default: no HyTime traversal --
intra -- Internal access traversal rule --
-- Constraint: one per anchor or one for all --
-- Lextype(("E"|"I"|"A"|"N"|"P")+) --
NAMES #IMPLIED -- Default: no HyTime traversal --
endterms -- Link end term information.
Constraint: one per anchor or one for all. --
IDREFS #IMPLIED -- Default: none --
aggtrav -- Traversal of agglink anchors: agg or members.
Constraint: one per anchor or one for all.
lextype(("AGG"|"MEM"|"COR"), (s+,
("AGG"|"MEM"|"COR"))*) --
NAMES "MEM AGG MEM"
>

CApH Semantic Title

The optional CApH.semanticTitle element in the content
of a CApH.semanticAssignment element
is intended to contain a brief, single phrase text title for the
semantic: one that is normally longer than the value of the
mnemonic attribute. Generally, a semantic assignment
has one semantic title. There are, however, interesting cases in which
zero or several semantic titles can be useful:

A case where having no semantic title is useful is when a
group of information objects has been gathered under the structure of a topic
without explicitly giving the topic a title. This corresponds to the
common situation of a cross-reference. When two information objects
are linked through a cross-reference, there is no way of knowing what
common semantics link the two anchors of a link expressed
through a relation such as "see also" or "see". It is therefore
possible to consider that cross-references are similar to untitled
topics. The advantage of adopting a common description is that it
encourages upgrades by providing a "hole" to be filled in for the
semantic title of a topic. CApH-aware applications can make it easier
to track these situations in documents and help organize them,
by providing a mechanism to retrofit cross-references as topics.

A case where more than one semantic title may be required is when
two equivalent index entries refer to each other. In this case the
semantic titles are used to help users choose one of a set of options.
For example, "Art Museum" and "Museum of Art" can be
alternate semantic titles for the same topic, offering users
two choices for accessing this topic in an alphabetical list. As
nothing other than the semantic title differs here, this is typically a
case where several semantic titles could be associated with the same
topic. An alternative would be to create two topics with an
"identity" topic relation: it would be left to users to decide
whether they want to introduce a difference between these two
situations.
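
The two cases above can be sketched in Python as follows. This is an illustration under stated assumptions, not part of the architecture: the topic identifiers and the helper names index_entries and untitled_topics are hypothetical. Each semantic title yields one entry in an alphabetical list, and topics without a title are reported separately so that their "hole" can be filled in later:

```python
from typing import Dict, List, Tuple

# Hypothetical data: topic id -> semantic titles (possibly none).
topics: Dict[str, List[str]] = {
    "t.museum": ["Art Museum", "Museum of Art"],
    "t.xref7": [],  # an untitled topic, e.g. a retrofitted cross-reference
}


def index_entries(topics: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """One (title, topic id) entry per semantic title, in alphabetical order."""
    entries = [(title, tid) for tid, titles in topics.items() for title in titles]
    return sorted(entries, key=lambda entry: entry[0].casefold())


def untitled_topics(topics: Dict[str, List[str]]) -> List[str]:
    """Ids of topics that still lack a semantic title."""
    return sorted(tid for tid, titles in topics.items() if not titles)


for title, tid in index_entries(topics):
    print(f"{title} -> {tid}")
print("untitled:", untitled_topics(topics))
```

Both titles lead to the same topic, so the user reaches it from either position in the list.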

Architectural Form for Topic Navigation

Hypertexts contain cross-reference links, and links of other
kinds, that serve various purposes. Some links have explicit topic
implications, and some do not. Some of those that have topic
implications may nonetheless not be explicitly intended by their
author as an indication of what should be provided to users as an aid
to navigation within a topic space.

CApH-conforming links that aid topic navigation
are recognized by the values of their CApH attributes, which must
be either CApH.topicRelation or CApH.semanticAssignment.

Representation of Relationships between
Topics: CApH.topicRelation

Topics may be linked to one another by means of topic relation
links (CApH.topicRelations). These links express
application-defined relationships, if any, between the topics. Any
number of relationships can exist between any two or more topics.
Each topic is specified in the anchaddr
attribute by means of a unique identifier reference that ultimately
resolves to the unique identifier of a CApH.semanticAssignment.
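
A minimal Python sketch of the "ultimately resolves" requirement follows. It is illustrative only: the function name resolve_topic and all identifiers are hypothetical, and the locators table is a simplified stand-in for HyTime location addressing, assumed here to map each intermediate identifier to exactly one further identifier:

```python
from typing import Dict, Optional, Set


def resolve_topic(ref: str, topic_ids: Set[str], locators: Dict[str, str]) -> Optional[str]:
    """Follow indirections until a semantic assignment id is reached, else None."""
    seen: Set[str] = set()
    while ref not in topic_ids:
        if ref in seen or ref not in locators:
            return None  # dangling reference or circular indirection
        seen.add(ref)
        ref = locators[ref]
    return ref


# Hypothetical ids of CApH.semanticAssignment elements and of intermediate locators.
topic_ids = {"t.museum", "t.fresco"}
locators = {"loc1": "t.fresco", "loopA": "loopB", "loopB": "loopA"}

# Check the anchaddr value of a hypothetical topic relation link.
anchaddr = "t.museum loc1"
print([resolve_topic(ref, topic_ids, locators) for ref in anchaddr.split()])
```

A result of None for any reference indicates an anchor that does not lead to a topic; how an application reports this is not addressed here.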

Topics may be linked to one another to create
abstract topic maps that might be used as skeletal structures
onto which exemplary and/or related instances can subsequently be added.

The exact nature of the relationship represented
by a topic relation link element type may be given in a semantic
definition element which is linked to all instances of links bearing
this generic identifier by means of a semantic assignment link.

The content of a CApH.topicRelation is not specified by
CApH.

The value of the semanticUniverses
attribute specifies the name(s) of the universe(s) in which the topic
relationship expressed by the link is valid. A CApH engine
must be able to warn the application
whenever the CApH.topicRelation-form element is
processed in a context in which none of the universes specified by the
token list is valid, so that the topic relationship can be disqualified.

Universe(s) need not be specified, but they can be specified by
defaulting them or fixing them in the DTD, or they can be specified
(possibly overriding a default value given in the DTD) in the start-tag
of each element instance. If there is no default value and none is
specified in the element instance, then the application's behavior
with respect to disqualification is not specified by CApH.