Ontologies are specifications of the concepts in a given field and the relationships among those concepts. The development of ontologies for molecular-biology information and the sharing of those ontologies within the bioinformatics community are central problems in bioinformatics. If the bioinformatics community is to share ontologies effectively, ontologies must be exchanged in a form that uses standardized syntax and semantics. This paper reports on an effort among the authors to evaluate a number of alternative ontology-exchange languages, and to recommend one or more languages for use within the larger bioinformatics community. The study selected a set of candidate languages, and defined a set of capabilities that the ideal ontology-exchange language should satisfy. The study scored the languages according to the degree to which they provided each capability. In addition, the authors performed several ontology-exchange experiments with the two languages that received the highest scores: OML and Ontolingua. The result of those experiments, and the main conclusion of this study, was that the frame-based semantic model of Ontolingua is preferable to the conceptual graph model of OML, but that the XML-based syntax of OML is preferable to the Lisp-based syntax of Ontolingua.

1.

Introduction

Ontologies, as specifications of the concepts in a given field and the relationships among those concepts, provide insight into the nature of information produced by that field and are an essential ingredient for any attempts to arrive at a shared understanding of concepts in a field. Thus the development of ontologies for molecular-biology information and the sharing of those ontologies within the bioinformatics community are central problems in bioinformatics.

If the bioinformatics community is to share ontologies effectively, the ontologies must be exchanged in some standardized form, such as a file with a well-defined syntax and semantics. Exchange of bioinformatics ontologies will be simplified if the community can agree on a relatively small number of such exchange forms; ideally, on one form.

This paper reports on an effort among the authors to evaluate a number of alternative ontology-exchange languages, and to recommend one or more languages for use within the larger bioinformatics community. The evaluation effort involved three separate meetings in 1998 and 1999 by the authors, as well as experiments with the proposed ontology languages. In phase I of the evaluation, the authors selected a set of candidate languages, and a set of capabilities that the ideal ontology-exchange language should satisfy.

The authors then scored the languages according to the degree to which they provided each capability. In phase II of the evaluation, the authors performed several ontology-exchange experiments with the two languages that rated the highest during phase I, which were OML and Ontolingua.

This paper describes the evaluation process and its results in more detail.

A web site maintained by the authors can be found at http://www-smi.stanford.edu/projects/bio-ontology/.

2.

Motivations

This section discusses the motivations for this work in more detail.

Ontology development is important because every biological database employs an ontology, either implicitly or explicitly, to model its data. The more fine-grained the ontology, the more precisely the database will be able to model the nuances of the data that it tries to capture. A coarse-grained ontology will model only superficial aspects of the data, and therefore may not capture data elements that are important for some problem-solving task. For example, a genome-sequence database that fails to record which genetic code is used to encode a given DNA sequence does not provide the information that users of the database will need to reliably translate each DNA sequence into the corresponding protein sequence. A semantically malformed ontology is one that incorrectly models the semantics of its application domain, and therefore yields a database whose structure corrupts or restricts the information that it is intended to hold. For example, a metabolic database that defines a one-to-one relationship between enzymes and the reactions they catalyze cannot reliably model the fact that a bifunctional enzyme catalyzes two separate reactions.
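The bifunctional-enzyme problem can be made concrete with a minimal Python sketch. The enzyme and reaction names below are placeholders invented for illustration, and the two dictionaries stand in for database schemas; none of this is drawn from an actual metabolic database.

```python
from collections import defaultdict

# One-to-one schema (hypothetical): each enzyme maps to exactly one reaction.
one_to_one = {}
one_to_one["bifunctional-enzyme"] = "reaction-A"
# Recording the second reaction overwrites the first, so information is lost.
one_to_one["bifunctional-enzyme"] = "reaction-B"

# One-to-many schema (hypothetical): each enzyme maps to a set of reactions.
catalyzes = defaultdict(set)
catalyzes["bifunctional-enzyme"].add("reaction-A")
catalyzes["bifunctional-enzyme"].add("reaction-B")

print(one_to_one["bifunctional-enzyme"])
print(sorted(catalyzes["bifunctional-enzyme"]))
```

Under the one-to-one schema only the last reaction survives, whereas the set-valued relationship retains both.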

Ontology sharing is important for a number of reasons. First, ontology development is time consuming. Different bioinformatics groups who wish to develop ontologies for the same types of biological information will often arrive at a solution faster by adopting an existing ontology than by developing a new ontology de novo. For example, a group that wishes to define an ontology for microarray gene-expression data will almost certainly accomplish this task more quickly by consulting one or more existing microarray ontologies.

Second, if different bioinformatics databases that cover the same types of data (e.g., protein sequences) employ the same ontology, they simplify the problem of database integration, i.e., of processing queries across multiple biological databases. Different ontologies for the same types of data produce a semantic mismatch that complicates the multidatabase query problem.

Third, bioinformatics databases must make their schemas available to their user communities if the users are to have a full understanding of the semantics of these databases.

Fourth, ontology sharing is important because ontologies themselves constitute a form of biological knowledge that is quite valuable when shared within the bioinformatics community. For example, the taxonomy of enzymatic reactions developed by the Enzyme Commission {EC}, and the taxonomy of gene function developed by Riley {RileyOntol}, are valuable bioinformatics ontologies.

Fifth, differences between ontologies purporting to represent the same biological process may lead to important insights into ways of improving those representations, and/or new insights into the underlying biology.

3.

Terminology

Ontologies are defined in the literature in a number of ways with varying degrees of formality. One prevailing definition of an ontology is a specification of a conceptualization that is designed for reuse across multiple applications. By conceptualization, we mean a set of concepts, relations, objects, and constraints that define some domain of interest.

One can argue at length about what is and is not an ontology {Gruber,Guarino}. Our view is that ontologies exist at several levels of complexity:



A controlled vocabulary is an ontology that simply lists a set of terms.



A taxonomy is a set of terms that are arranged into a generalization-specialization hierarchy. A taxonomy does not define attributes of these terms, nor does it define relationships between the terms.



An object-oriented database schema defines a hierarchy of classes, and attributes and relationships of those classes.



A knowledge-representation system based on first-order logic can express all of the preceding relationships, as well as negation and disjunction.
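The four levels above can be sketched as progressively richer Python data structures. This is only an illustration under stated assumptions: the term names, the `SchemaClass` type, and the tuple encoding of formulas are all hypothetical, and the tiny `holds` evaluator treats any fact that is not listed as false, which is a simplification rather than full first-order logic.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Level 1: a controlled vocabulary is just a set of terms.
vocabulary = {"macromolecule", "protein", "enzyme"}

# Level 2: a taxonomy arranges terms as child -> parent, with no attributes.
taxonomy = {"enzyme": "protein", "protein": "macromolecule"}

# Level 3: an object-oriented schema adds attributes and relationships.
@dataclass
class SchemaClass:
    name: str
    parent: Optional[str] = None
    attributes: Dict[str, str] = field(default_factory=dict)     # attribute -> type
    relationships: Dict[str, str] = field(default_factory=dict)  # relation -> target class

enzyme = SchemaClass(
    name="enzyme",
    parent="protein",
    attributes={"molecular-weight": "float"},
    relationships={"catalyzes": "reaction"},
)

# Level 4: logic-style formulas that add negation and disjunction.
def holds(formula, facts):
    """Evaluate a nested-tuple formula against a set of ground facts."""
    op = formula[0]
    if op == "not":
        return not holds(formula[1], facts)
    if op == "or":
        return any(holds(sub, facts) for sub in formula[1:])
    return formula in facts  # ground atom; unlisted facts count as false

facts = {("catalyzes", "enzyme-1", "reaction-A")}
formula = ("or",
           ("catalyzes", "enzyme-1", "reaction-B"),
           ("not", ("catalyzes", "enzyme-2", "reaction-A")))
print(holds(formula, facts))
```

Each level can represent everything the previous one can, which is the sense in which the levels increase in complexity.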

The GeneClinics experiment (see www.geneclinics.org) illustrates this range of complexity among different ontologies. One of the first steps of the experiment was to augment the object-oriented schema with a richer set of capabilities including disjunction, role restriction and other constraints. In the GeneClinics object database much of this information was in fact represented in the Java software interacting with the database but was hidden from the end user.

4.

Candidate Languages

In this section we discuss the candidate ontology-exchange languages that were evaluated by the authors. We discuss the reasons each language was selected for consideration as a bioinformatics ontology-exchange language, list the developers of each language and the reasons for its development, and provide references for each language.

4.1.

Ontolingua

The Ontolingua language was developed by a group at Stanford University for the exchange of ontologies, and was originally funded by the DARPA Knowledge Sharing Effort (Ref). Ontolingua is one of the most significant efforts to come out of the knowledge representation community and is based on the Knowledge Interchange Format (KIF), a language specifically built for the sharing of knowledge among different knowledge representation systems. The authors believed that any evaluation of languages for the exchange of ontologies must include this project.

The semantics of Ontolingua are based on the frame knowledge representation systems developed by knowledge-representation researchers {Fikes,KarpReview}.

4.2.

CycL

Cyc is perhaps the best-known of the knowledge representation systems and is significant in its scope and its longevity. Cyc was developed by Doug Lenat at MCC but has since been spun off as a commercial entity, Cycorp. The underlying representation language for Cyc is called CycL, which derives from first-order predicate calculus but with extensions for additional expressivity. Cyc is one of the most significant commercial products, if not the most significant, in the marketplace currently. For this reason, as well as its significance within the knowledge representation community and its rich expressive abilities, it was selected for evaluation.

4.3.

OML/CKML

Ontology Markup Language/Conceptual Knowledge Markup Language (OML/CKML) is a relatively new effort coming out of Washington State University that is attempting to base a system for the expression of ontologies on an XML-based syntax. Although the OML effort, begun in the 1990s, is relatively young and untested, the authors believed it to have significant representational power. This representational power combined with the interoperable nature of an XML-based language was believed to be a combination worth investigating. In addition, since OML/CKML is currently under development there is a potential for co-development to allow the bioinformatics community to influence the features and expressive power of the language. There is, though, a possible disadvantage in that the language may evolve in ways that are not to the advantage of the community, or may not be stable or standardized.

4.4.

OPM

OPM was interesting to the authors as a candidate language for exchange of ontologies because of the significance of the OPM system, a product from GeneLogic used in a number of pharmaceutical/biotech organizations. OPM, as a product, is used for the integration of multiple information sources, and uses an underlying object-oriented federated schema for this purpose.

4.5.

XML/RDF

the W3C. The current standard for the XML Schema Language is controlled by the XML Schema Working Group of the W3C. The Resource Description Framework (RDF) is intended to encode metadata concerning web documents. XML/RDF were investigated as a part of the evaluation effort because of the significance of the web and web-based applications. It is clear that the web is rapidly becoming the primary method for the exchange of information and data, and that XML is currently the leading candidate for a generic language for the exchange of semi-structured objects. XML/RDF as is, without a higher-level formalism that encompasses the expressivity present in frame-based languages, does not go far enough to allow the kind of modeling needed in the bioinformatics community.

4.6.

UML

The Unified Modeling Language (UML) provides a set of notational conventions that can be used by software application designers/developers to model their software system. UML was developed by Rational Software and is currently backed by Rational, Microsoft and the OMG. UML was selected for evaluation because it is another widely-used system for the representation of objects and their relationships.

4.7.

OKBC

The Open Knowledge Base Connectivity (OKBC) is an API for accessing and modifying multiple, heterogeneous knowledge bases. OKBC is not actually an ontology-exchange language; it is a programmatic API. This group considered it because its knowledge model was designed to capture ontologies. The OKBC effort began as a part of the recent DARPA High Performance Knowledge Base (HPKB) program, and is the successor of Generic Frame Protocol (GFP), a frame representation system developed at the Artificial Intelligence Center at SRI. OKBC was created because it provides a uniform model that can be understood across a number of knowledge representation systems. The work on OKBC is currently being overseen by a working group led by Richard Fikes at Stanford. Voting members in this group are ISI, Stanford KSL, SRI International, Cycorp, SAIC and Teknowledge.

4.8.

ASN.1

ASN.1 was included in this evaluation because of its historical significance as an early language for the exchange of datatypes and simple objects. The ASN.1 standard was developed as part of the OSI networking stack. It has been, and still is, used in a number of bioinformatics applications from the National Center for Biotechnology Information. ASN.1 was also used in conjunction with the Unified Medical Language System (UMLS) project at the National Library of Medicine (NLM); however, production of ASN.1 encodings of the UMLS has been discontinued because of low demand for ASN.1 by UMLS users.

4.9.

ODL

The Object Definition Language (ODL) is a relatively new standard that came out of the Object Database Management Group (ODMG) in the early 1990s. ODL was selected for evaluation because it is currently a de facto standard for a common representation of objects for object-oriented databases and programming languages and so has the potential to become a standard supported widely throughout the industry. The ODMG member companies include almost all organizations in the ODBMS/ODM industry, and the ODMG is very closely aligned with the OMG.

5.

Evaluation

5.1.

Evaluation Part I: Initial Evaluation

5.1.1.

Selection of Candidate Languages

The evaluation process began with the selection of known languages for expressing ontologies. Our selection process relied on an informal review of current literature and prior knowledge of participants, but, we believe, covers the most viable candidate languages for the exchange of ontologies. The languages, once selected, were then divided among the authors for evaluation.

5.1.2.

Selection of Evaluation Criteria

In order to evaluate the languages in a consistent fashion the authors arrived at a set of questions over which each candidate language would be evaluated. The questions that were distributed to members of the working group are included in Appendix A. The questions were divided into the following five major categories:

1.

Language Support and Standardization: This section includes general questions about the depth of support for the language, including technical support and relationship with standards efforts.

2.

Data model/capabilities: This section asks about the richness of the expressive capabilities of the language.

3.

Querying: This section poses questions about the capabilities of query languages available for a representation language.

4.

Performance: Though not related to issues of the expressiveness of the language, the authors wanted to capture some notion of what might be expected in terms of performance if we were to use a given language.

5.

Other Issues: This section is more concerned with pragmatics, such as current use of the language and representation of, or connectivity to, non-ontology sources.

5.1.3.

Evaluation Matrix

The final judgement of the authors for the initial evaluation phase was guided by a matrix of the aspects of an exchange language that were considered key to its use by members of the Bio-Ontology Consortium (http://www-smi.stanford.edu/projects/bio-ontology/) and other groups who may want to build ontologies in the area of molecular biology. This evaluation matrix is included in Appendix B.

5.1.4.

Selection of Languages for further Evaluation

The authors decided that there was not a single language that stood out as the only appropriate candidate for recommendation as a language for the exchange of ontologies. It was clear that representational expressiveness was not adequate in some languages, and so they were eliminated from consideration. For example, some languages were unable to encode ground facts (instance objects). Also, some languages were in part or in whole proprietary, or had a significant cost associated with them. This was considered prohibitive to the successful adoption and use of the languages and so these languages were also eliminated. It was decided that two languages, Ontolingua and OML/CKML, provided enough expressivity to warrant a more in-depth evaluation.

5.2.

Evaluation Part II: OML and Ontolingua

The second phase of the evaluation process focused on the two candidate languages that were deemed most interesting from the initial evaluation: Ontolingua and OML/CKML.

The authors decided that it would be useful to create a small model in each language in order to judge the utility and the representational richness of each language. A set of experiments was developed to perform this detailed evaluation. Three experiments were undertaken. The three experiments and their results are discussed below.

5.2.1.

OML Representation of the EcoCyc Gene Ontology

Experiment:

Peter Karp's group at Pangea Systems performed an experiment to better understand the OML language by translating the EcoCyc gene ontology into OML. The gene ontology is a taxonomy of 150 classes that classify microbial genes according to their functions, and that was developed by Dr. Monica Riley as part of the EcoCyc project {Riley,EcoCyc}.

Within EcoCyc, the ontology can be accessed at http://ecocyc.panbio.com:1555/class-subs?object=Genes. The OML encoding of the ontology can be accessed at http://ecocyc.panbio.com/~pkarp/omlgenes.txt.

Results:

Our findings were that OML was able to capture most aspects of the gene ontology. However, we identified what we consider to be a number of limitations of OML during the course of this experiment.

1.

A number of aspects of the terminology used in the tags in OML files are not at all intuitive, and are not consistent with the terminology used in the more mainstream ontology community. This terminology will interfere with the acceptance and understanding of the language in the bioinformatics community. We suggested that OML could allow several alternatives for each tag to allow the language to be accepted by different communities that use different terminology.

2.

The OML definitions are not modular, in the sense that the OML definition of a given Class is spread out over several parts of the file, making OML files less human readable.

3.

OML has a number of limitations in its expressive power:

a)

It cannot express facets (attributes of attributes) directly, but R. Kent suggested that N-ary relations can be used to express facets.

b)

It cannot express annotations.

c)

It cannot handle multiple collection types; only sets are supported.

d)

It cannot express cardinality or numeric-range constraints.
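Limitation 3a mentions that facets (attributes of attributes) might be expressed through N-ary relations. The following Python sketch illustrates that general idea only; the frame layout, the slot name `product`, and the facet names are assumptions made for illustration and do not reproduce OML or Ontolingua syntax.

```python
# Frame-style view (hypothetical): a slot carries its own attributes, i.e. facets.
gene_frame = {
    "name": "Gene",
    "slots": {
        "product": {"value-type": "Protein", "minimum-cardinality": 1},
    },
}

def facets_as_relations(frame):
    """Flatten slot facets into 4-ary tuples: (frame, slot, facet, value)."""
    rows = [
        (frame["name"], slot, facet, value)
        for slot, facets in frame["slots"].items()
        for facet, value in facets.items()
    ]
    # Sort on the three string fields so the output order is deterministic.
    return sorted(rows, key=lambda row: row[:3])

for row in facets_as_relations(gene_frame):
    print(row)
```

Each tuple is one instance of a four-place relation, so a language that offers N-ary relations but no facet construct can still carry the same information, at the cost of a less direct notation.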

5.2.2.

Ontolingua Representation of the EcoCyc Gene Ontology

Experiment:

Results:

5.2.3.

Representation of GeneClinics data model as an ontology

Experiment:

Peter Tarczy-Hornoch at the University of Washington, in collaboration with Luca Toldo and Robert Kent, performed an experiment with the general goal of using the existing GeneClinics OODB model as the basis for an ontology to assess OML/CKML and Ontolingua for ontology creation/exchange. The specific goal was to develop a small representative ontology in both Ontolingua and OML/CKML that represents key clinical and molecular entities and their linkages. The design of the experiment was:

1.

A distributed e-mail based experiment involving three investigators at three sites

OML/CKML’s XML syntax makes it easier to learn than Ontolingua with its LISP syntax.

5.

Neither language has the type of documentation of its syntax and semantics that would be needed for a tutorial for a bioinformaticist. Ideally the tutorial/documentation would need to include both a formal representation of the syntax in modified BNF format and selected examples drawn from biology, building in complexity. For example, how do you represent a biological entity like a protein, how do you express the concept that a sequence of DNA codes for that protein, how do you express that proteins have one or more of the following list of functions, etc.

6.

Both languages are very expressive. Ontolingua’s expressivity is easier to see in both LISP and in the Ontolingua ontology-development tool because it is exposed even in simple case examples. OML/CKML expressivity is rich but harder to determine since a) it is not apparent in simpler examples, b) things like local theories and other concepts are powerful but harder to understand (documentation in conceptual graph paradigm, documentation and specification both evolving). In principle the OML/CKML conceptual graph model may be richer and more expressive than the frame model; an exact comparison of the two models would be useful.

7.

Both languages are able to handle the needs of the GeneClinics sample ontology (not a complex ontology).

Though not per se an attribute of the languages themselves, it is important to note that software tools and applications, such as editors, browsers, parsers, translators, and query systems, exist for Ontolingua but not for OML/CKML.

10.

OML/CKML is "an uninstantiated formalism" at some level.

11.

The availability of the developer of OML/CKML (R. Kent) for collaboration on this project was immensely helpful.

Conclusions: The expressive power of the two languages is similar and more than adequate for the purposes of expressing a part of the GeneClinics data model as an ontology. OML/CKML is, however, theoretically more powerful, being based on a conceptual graph methodology. The Ontolingua frames semantics/paradigm, on the other hand, may be easier to learn since it is less of a leap from object database and object programming paradigms. The LISP syntax of Ontolingua could present a challenge to many bioinformaticians and the XML syntax of OML/CKML is likely to be more intuitive. Ideally an ontology exchange language would have an easy to learn basic semantics and syntax (like XML) but be very expressive (like OML/CKML and Ontolingua). Neither language as it stands quite achieves this ideal, though a more frame-based version of OML/CKML or an XML encoding of Ontolingua might come closer. Finally, for the general bioinformatics community (not versed in ontology representation) it might be helpful to create documentation and tutorials that use biological examples.

5.3.

Evaluation Part III: Recommendations

At its last meeting, the BioOntology Core Group reached the following conclusions and recommendations.

The core group reached two major decisions for the selection of a language for the exchange of ontologies for molecular biology:

1)

A traditional frame-based approach for representation of biological entities is sufficient for current needs. In addition, frame-based systems have been in use for a significant period of time and are, in general, stable representation systems. Among frame-based systems Ontolingua is clearly one of the most prominent and has had extensive use for many years.

2)

XML has tremendous momentum with significant interest from commercial organisations and a serious standardisation effort. XML-based tools and web servers supporting XML are beginning to appear, and we anticipate that more are on the horizon.

The belief of the group was that the language that the bioinformatics community needs for the exchange of ontologies should be based on frame-based semantics with an XML expression. However, the group also believed that we did not have such a language before us, since Ontolingua is frame-based but without an XML expression, and OML does have an XML expression but is based on conceptual graphs, not frames.

At the meeting Peter Karp presented preliminary work that he and Vinay Chaudhri, from SRI, had done on producing an XML expression based on the OKBC knowledge model, which in turn is very closely related to Ontolingua (the Ontolingua developers were also involved in the development of OKBC).

The consensus of the group was that we recommend the use of a frame-based language with an XML syntax for the exchange of ontologies, and, to that end, the group requested that Karp and Chaudhri complete their work on the XML expression of Ontolingua, so that the group could complete its evaluation of exchange languages.
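To make the idea of "frame-based semantics with an XML expression" concrete, here is a minimal Python sketch that serializes a frame (class name, parent, multi-valued slots) to XML and reads it back. The element names `class`, `subclass-of`, `slot`, and `value` are hypothetical choices for illustration; they are not the format proposed by Karp and Chaudhri, nor OML, and the frame contents are invented.

```python
import xml.etree.ElementTree as ET

def frame_to_xml(name, parent, slots):
    """Build an XML element for one frame; `slots` maps slot name -> list of values."""
    frame = ET.Element("class", {"name": name})
    if parent is not None:
        ET.SubElement(frame, "subclass-of").text = parent
    for slot_name, values in slots.items():
        slot = ET.SubElement(frame, "slot", {"name": slot_name})
        for value in values:
            ET.SubElement(slot, "value").text = str(value)
    return frame

def xml_to_frame(element):
    """Recover (name, parent, slots) from an element produced by frame_to_xml."""
    parent_el = element.find("subclass-of")
    parent = parent_el.text if parent_el is not None else None
    slots = {
        slot.get("name"): [value.text for value in slot.findall("value")]
        for slot in element.findall("slot")
    }
    return element.get("name"), parent, slots

original = ("Enzyme", "Protein", {"catalyzes": ["reaction-A", "reaction-B"]})
document = ET.tostring(frame_to_xml(*original))   # bytes suitable for exchange
print(document.decode("utf-8"))
print(xml_to_frame(ET.fromstring(document)) == original)
```

The round trip shows the separation the group had in mind: the frame model fixes what can be said, while the XML layer only fixes how it is written down.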

6.

Summary

Over the last two decades, the knowledge representation and object-oriented database communities have developed a number of languages that may be used for the expression of semantic database models. These languages share many elements in common, and are exemplified by the frame knowledge representation systems used in the knowledge representation community. Frame systems have been used in many different bioinformatics projects, and the authors believe that frame systems provide the necessary representational constructs to model ontologies for molecular biology. Furthermore, frame systems have a significant amount of history and use, so that they provide a stable representational paradigm.

The authors also believe that the explosion of the web and the languages associated with it simply cannot be ignored. Acceptance of an exchange language that is expressed in a Lisp syntax will be limited within the bioinformatics community, even though the underlying representational system may be identical to that expressed in a web-based language. For this reason the authors believe that an XML-based syntax must be used for a bioinformatics ontology exchange language to increase the likelihood that the language will see widespread acceptance.

In summary, the results of this evaluation suggest two directions for future work: development of an XML expression for the Ontolingua model, or adaptation of OML/CKML to include a frame-based semantic model.

7.

Future Directions

The authors support the use of a frame-based exchange language using an XML syntax. Several researchers on the evaluation team are currently developing a specification of an XML expression of Ontolingua using OKBC. A separate set of researchers on the team are pursuing a frame-based version of OML.

The exchange language evaluation team will meet again to consider the question of whether either, or both, of these efforts provide an acceptable exchange language that meets the group's requirements.

References

EC

Webb, E.C., "Enzyme Nomenclature, 1992: Recommendations of the nomenclature committee of the International Union of Biochemistry and Molecular Biology on the nomenclature and classification of enzymes", Eur. J. Biochem., Academic Press, 1992.

Fikes

Fikes, R. and Kehler, T., "The Role of Frame-Based Representation in Reasoning", Communications of the Association for Computing Machinery, 1985, 28(9):904-920.

Gruber

Gruber, T.R., "A translation approach to portable ontology specifications", Knowledge Acquisition, 1993, 5:199-220.

Guarino

Guarino, N. and Giaretta, P., "Ontologies and knowledge bases: towards a terminological clarification", in Towards very large knowledge bases, IOS Press, Amsterdam, 1995, N.J.I. Mars, pp. 25-32.

KarpReview

Karp, P.D., "The design space of frame knowledge representation systems", SRI International AI Center, 1992, #520, URL ftp://www.ai.sri.com/pub/papers/karp-freview.ps.Z.

RileyOntol

Riley, M., "Functions of the gene products of Escherichia coli", Microbiological Reviews, 1993, 57:862-952.

8.

Appendix A-

Evaluation Questions

The following questions were asked of each candidate language during the Phase I evaluation process.

Language Support and Standardization:

1.

Is a formal specification of the syntax of the ontology language available? How complex is its syntax? Please present that formal specification of the language at the meeting.

2.

What parsers are available for the language? What translators are available to convert between language L and other ontology-description languages? How complete are those translators?

3.

What other software is available that operates on the language, such as for web-based publishing of ontologies or browsing/editing of ontologies?

4.

What support (documentation, training, tutorials, e-mail) is available for the language?

5.

Does it have any development/usage standards? Who controls this standard?

6.

Does a stable release of the language exist (i.e. one that will not fundamentally change in 6 months)?

Data model/capabilities:

1.

What assumptions does the language make about the ontology to be represented?

2.

Which of the following does the language support:



negation



conjunction



disjunction



recursion



relations



multiple inheritance



multi-valued slots



number restrictions on roles



role hierarchies



transitive roles



axioms



template/default values



method slots (calculated values?)



constraints

3.

If the language supports constraints, how rich is the constraint language? Is the constraint language formally defined?

4.

What are the primitive data types in the language?

5.

What database data model(s) does the language support?

6.

Does the language encode instances as well as classes (data as well as schema)?

Querying:

1.

Are there tools for querying an ontology expressed in this language? If so, ...

2.

How are queries expressed?

3.

Which of the following queries can be expressed in the query language:



what are the parents of concept C?



what are the children of concept C?



what could I say about concept C (e.g. what roles are legally applicable to C)?



is concept C satisfiable?



what role-fillers can a role have for a concept C?



what English expression does C have?



is C a kind of D?



what is the least common parent of C and D



what is the greatest common child of C and D



are C and D equivalent?



Can queries be translated/compiled into a standard programming/query language?
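Several of the queries listed above (parents, children, "is C a kind of D?", least common parent) can be illustrated with a small Python sketch over a taxonomy stored as a mapping from each concept to its direct parents. The concept names and function names are hypothetical; this is not the query interface of any of the evaluated languages, and with multiple inheritance there may be more than one least common parent, so a set is returned.

```python
# Hypothetical taxonomy: concept -> set of direct parents.
PARENTS = {
    "macromolecule": set(),
    "protein": {"macromolecule"},
    "enzyme": {"protein"},
    "transporter": {"protein"},
}

def parents(c):
    """Direct parents of concept c."""
    return PARENTS[c]

def children(c):
    """Direct children of concept c."""
    return {k for k, ps in PARENTS.items() if c in ps}

def ancestors(c):
    """All concepts reachable from c by following parent links."""
    seen, stack = set(), list(PARENTS[c])
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(PARENTS[p])
    return seen

def is_kind_of(c, d):
    """True if c is d or a (transitive) specialization of d."""
    return c == d or d in ancestors(c)

def least_common_parents(c, d):
    """Common generalizations of c and d that have no more specific common one."""
    common = (ancestors(c) | {c}) & (ancestors(d) | {d})
    return {a for a in common
            if not any(a in ancestors(b) for b in common if b != a)}

print(sorted(children("protein")))
print(is_kind_of("enzyme", "macromolecule"))
print(least_common_parents("enzyme", "transporter"))
```

Queries such as satisfiability or equivalence of concepts require reasoning over constraints and are beyond what this purely structural sketch covers.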

Performance:

1.

Are there any limits (or limits of the available translators/parsers) on the size of the ontology, the length of names/values, etc. (theoretical or practical)?

2.

What is the overhead (bytes) for a language parser? interpreter?

3.

For resources which depend on an information service for support (such as Ontolingua), does the service have the capacity to support all of the users of the technology?

Other Issues:

1.

What example applications exist which utilize the language? How many of these are from or representative of the bioinformatics domain?

[The two questions below are asking about the ability to express non-domain relevant information in the ontology, so that, for example, one could include user model information (preferences for viewers, etc.) or database access information (for access to persistent instance-level information) in the domain model.]

2.

Can the ontology be partitioned, for example, into biology and bioinformatics (e.g. a protein has an accession number)?

3.

Can the core ontology be extended to include other information, e.g. mappings to functions in databases, control information for showing the ontology through interfaces?

9.

Appendix B-

Evaluation Matrix, Part I

The table below shows the evaluation of candidate languages over general information.

| Property | ASN.1 | ODL | Onto | OML/CKML | OPM | XML/RDF | UML |
|---|---|---|---|---|---|---|---|
| Formal Syntax? | Yes | No | Yes | Yes | Yes | Yes | Yes |
| Translators | No | Yes | Loom, IDL, KIF, CLIPS, etc. | No | Relational, ASN.1, XML, HTML, ER | No | No |
| Software Tools | Parsers | Parsers | WWW browsers, editors, comparison tools | No | Yes | XML toolkits | Rational Rose |
| Support | ?? | Yes | WWW documentation, FAX, tutorial, support staff | WWW grammars, WWW examples | Documentation, training, tutorials | WWW sites, mailing lists, books | Formal courses, books, tutorials |
| Controlling Org | ISO | ODMG | Stanford U | WSU | GeneLogic Inc | W3C | OMG |
| Stability | Stable | Stable | Stable | Evolving | Stable | Evolving | Stable |
| Users | Yes | OO Vendors | WWW users | Intel apps | Yes, Bix and others | WWW developers | many parts of industry |
| Bioinfo Users | NCBI | Yes | SB, Stanford RoboWeb | Yes | PDB | No | SB (probably many other pharmas) |
| Developers | ?? | OO Vendors | Stanford | WSU | GeneLogic | many, many | Rational Rose |

10.

Appendix C-

Evaluation Matrix, Part II

The table below shows the results of evaluation over detailed properties of the representational expressiveness of candidate languages.

| Property | ASN.1 | ODL | Onto | OML/CKML | OPM | XML/RDF | UML |
|---|---|---|---|---|---|---|---|
| Negation | No | No | Yes | Yes | ?? | No | No |
| Conjunction | No | No | Yes | Yes | ?? | No | No |
| Disjunction | No | No | Yes | Yes | ?? | No | No |
| Relations | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Multiple Inheritance | No | Yes | Yes | Yes | Yes | Yes | No |
| Inverses | No | Yes | Yes | Yes | Yes | No | No |
| Multi-valued slots | Yes | Yes | Yes | Yes | Yes | Yes | No |
| Multiple collection types | Yes | Yes | No | No | ?? | Yes | No |
| Number restrictions | No | No | Yes | Yes | ?? | No | Yes |
| Slot hierarchies | No | No | Yes | Yes | ?? | No | No |
| Facets | No | No | Yes | Yes | ?? | No | No |
| Default Values | No | No | No | Yes | ?? | Yes | No |
| Other slot constraints | No | No | Yes | Yes | No | No | No |
| Primitive Datatypes | Standard | Standard | Standard | Standard | Standard | None | N/A |
| Data Model | Object w/o inheritance | Object | Object/Logic | Object/Logic | Object | Semi-structured data | Object |
| Instances and classes | No | No | Yes | Yes | No | Yes | No |

Comparison of the expressive power of the ontology-exchange languages. The meanings of the rows are:

1.

Negation: Does the language allow the assertion that a relation does not hold between x and y?

2.

Conjunction: Does the language allow the assertion that a relation holds both between (x, y) and between (x, z)?

3.

Disjunction: Does the language allow the assertion that a relation holds either between (x, y) or between (x, z), but not both?

4.

Relations: Does the language allow the mapping of the elements of a set A to the elements of a set B?

Instances and classes: Can the language encode information about instance objects as well as class objects?

Appendix D-

Evaluation Matrix

The table below was used by the authors to evaluate the initial candidate languages after our evaluations over the questions were complete. This table shows the desired attributes of an exchange language, and how each language can be rated along those aspects. A plus sign, '+', indicates a positive. More than one plus sign indicates more significant positives. The minus sign, '-', indicates a negative evaluation of a criterion. Also, AF indicates that the language/product is free to academic organizations, and

Onto

XML/RDF

OML

OKBC

OPM

CycL

UML/XMI

classes & instances

+

+

+

+

-

+

+

multiple inheritance

+

+

+

+

+

+

+

constraints

++

-

++

+

+

+

+

defaults

+

+

+

+

+

+

expressive power

+++

+

+++

++

++

+++

+

tools available*

lisp(AF)

Java

lisp, Java,C

Java,C++

lisp, Java,C(AF)

stability

+

-

+

+

+

+

-

support

+

++

+

+

+

+

-

translators

++

+

?

+

+

KIF. Loom

-

many applications

+

+

+

+

+

+

-

open language

+

+

+

+

+

+

+

simplicity: human

good

low

low

good

good

low

simplicity: formal

good

good

good

good

good

open to collaboration

+

++

++

+

STATUS

out

out

out

out

out

* By "tools available" the authors mean browsers and editors for the language.