Data or Metadata

(from LSID best practices)
Data is defined as a sequence of unchanging bytes. Examples of data are microscope images, a protein sequence, a text file, etc. Metadata is usually information that either describes the data literally (date created, MD5 checksum, size) or describes the relationship between the data and other objects.
If you cannot determine what should be data and what should be metadata from your data model, follow this rule of thumb: Large byte sequences are easier to manipulate as data, while short byte sequences can be included as data, metadata, or made available in both forms.
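As a minimal sketch of the distinction, the following treats a byte sequence as data and derives literal metadata (size, MD5 checksum) from it. The function name `describe`, the `INLINE_LIMIT` threshold, and the sample sequence are assumptions made for illustration; they are not part of the LSID best practices.

```python
import hashlib

# Hypothetical threshold: below this size, a copy of the bytes is also
# exposed through the metadata (the "both forms" case in the rule of thumb).
INLINE_LIMIT = 1024


def describe(data: bytes) -> dict:
    """Return literal metadata for an unchanging byte sequence."""
    metadata = {
        "size": len(data),
        "md5": hashlib.md5(data).hexdigest(),
    }
    if len(data) <= INLINE_LIMIT:
        # Short byte sequences can be made available as metadata too.
        metadata["inline"] = data
    return metadata


sequence = b"MKTAYIAKQR"  # made-up protein sequence, used only as sample data
metadata = describe(sequence)
```

Because the data never changes, calling `describe` again on the same bytes always yields the same metadata.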

Abstraction Hierarchy

Part - simple biological function encoded in DNA

Device - simple logical function; collection of parts

System - collection of devices

Device is_a part in context of the system but also device has_a part.

Device is_a subclass of Part, System is_a subclass of Device
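One possible reading of these is_a and has_a statements, sketched as Python classes. The class layout and the example names ("inverter", "oscillator") are assumptions for illustration, not an agreed data model.

```python
from dataclasses import dataclass, field


@dataclass
class Part:
    name: str


@dataclass
class Device(Part):  # Device is_a Part
    # Device has_a Part: a device is a collection of parts.
    parts: list = field(default_factory=list)


@dataclass
class System(Device):  # System is_a Device
    pass


promoter = Part("promoter")
rbs = Part("RBS")
inverter = Device("inverter", [promoter, rbs])
# A system collects devices; this works because each device is also a part.
oscillator = System("oscillator", [inverter])
```

With this layout a device can be used wherever a part is expected, which is the "device is_a part in context of the system" case.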

How to represent barriers and interfaces between levels of abstraction?

Genetic, protein and cell devices

:RBS :subclassOf :BasicPart OR :RBS :typeOf :BasicPart (instance)

Basic parts: detailed specs and sequence data

Composite parts: basic parts plus assembly (composite parts are the same if they have the same basic parts)
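The identity rule for composite parts can be sketched with value-based equality. Two assumptions are made here: the order of the basic parts stands in for the assembly, and the sequences are made-up placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BasicPart:
    name: str
    sequence: str  # sequence data; the values used below are placeholders


@dataclass(frozen=True)
class CompositePart:
    # Assumption: an ordered tuple of basic parts represents the assembly.
    parts: tuple


promoter = BasicPart("promoter", "TTGACA")
rbs = BasicPart("rbs", "AGGAGG")

a = CompositePart((promoter, rbs))
b = CompositePart((BasicPart("promoter", "TTGACA"), BasicPart("rbs", "AGGAGG")))
c = CompositePart((rbs, promoter))
```

Here `a == b` holds because both are built from equal basic parts, while `c` differs only because this sketch treats order as significant.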

Reasoning: logical inference, "processing knowledge" (implicit knowledge has to be made explicit)

Expressive Power of representation language - able to represent the problem

Correctness of entailment procedure - no false conclusions are drawn

Completeness of entailment procedure - all correct conclusions are drawn

Decidability of entailment problem - there exists a (terminating) algorithm to compute entailment

Complexity - resources needed for computing the solution

Logics differ in terms of their representation power and computational complexity of inference. The more restricted the representational power, the faster the inference in general.
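To illustrate how inference makes implicit knowledge explicit, here is a minimal sketch for a deliberately tiny language that only contains is_a statements. The function name is hypothetical. For this restricted language the procedure always terminates, because only finitely many pairs can be added.

```python
def entailed_subclass_pairs(asserted):
    """Compute the transitive closure of asserted (sub, super) pairs."""
    closure = set(asserted)
    changed = True
    while changed:
        changed = False
        for sub, mid in list(closure):
            for mid2, sup in list(closure):
                # From sub is_a mid and mid is_a sup, conclude sub is_a sup.
                if mid == mid2 and (sub, sup) not in closure:
                    closure.add((sub, sup))
                    changed = True
    return closure


asserted = {("System", "Device"), ("Device", "Part")}
entailed = entailed_subclass_pairs(asserted)
```

The pair `("System", "Part")` is never asserted, yet it appears in `entailed`. More expressive logics need more elaborate procedures, and for full first-order logic no procedure is guaranteed to terminate.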

First-order logic: we can now talk about objects and relations between them, and we can quantify over objects. Good for representing most interesting domains, but inference is not only expensive, but may not terminate.

An ER conceptual schema can be expressed in a suitable description logic theory.

The models of the DL theory correspond with legal database states of the ER schemas.

Mapping ER schema in DL theory:

Reasoning services such as satisfiability of a schema or logical implication can be performed by the corresponding DL theory.

A description logic allows for a greater expressivity than the original ER framework, in terms of full disjunction and negation, and entity definitions by means of both necessary and sufficient conditions.
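As a sketch of what such a mapping can look like, consider a hypothetical ER schema with entities Part, BasicPart, and CompositePart and a binary relationship hasPart. The entity and role names are assumptions chosen to match the notes above, and the encoding shown is one common simplification for binary relationships. The last three axioms use negation, disjunction, and a necessary-and-sufficient definition.

```latex
% Hypothetical ER schema; requires amsmath for align*.
\begin{align*}
\exists \mathit{hasPart} &\sqsubseteq \mathit{CompositePart}
  && \text{(domain of the relationship)} \\
\exists \mathit{hasPart}^{-} &\sqsubseteq \mathit{BasicPart}
  && \text{(range of the relationship)} \\
\mathit{CompositePart} &\sqsubseteq (\geq 1\ \mathit{hasPart})
  && \text{(minimum cardinality 1)} \\
\mathit{BasicPart} &\sqsubseteq \mathit{Part}, \quad
  \mathit{CompositePart} \sqsubseteq \mathit{Part}
  && \text{(ISA links)} \\
\mathit{BasicPart} &\sqsubseteq \neg \mathit{CompositePart}
  && \text{(disjointness)} \\
\mathit{Part} &\sqsubseteq \mathit{BasicPart} \sqcup \mathit{CompositePart}
  && \text{(covering)} \\
\mathit{CompositePart} &\equiv \mathit{Part} \sqcap \exists \mathit{hasPart}.\mathit{BasicPart}
  && \text{(necessary and sufficient definition)}
\end{align*}
```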

Knowledge Bases

Enumerate Objects. Start with a bare list of the elements of the KB; they will become individuals, concepts, or roles.

Distinguish Concepts from Roles. Make a first decision about which objects must be considered roles; remember that some could have a "natural" concept associated with them. The remaining objects will be concepts (or maybe individuals). Also, try to distinguish roles from attributes.

Develop Concept Taxonomy. Try to decide on a classification of all the concepts, imagining their extensions. This taxonomy will be used as a first reference, and can be revised once definitions are given. It will also be used to check whether the definitions meet our expectations (sometimes interesting, unforeseen (re)classifications are found).

Devise partitions. Try to make explicit all the disjointness and covering constraints among classes, and reclassify the concepts.
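The disjointness and covering constraints of this step can be checked against imagined extensions. This is a minimal sketch; the function names, class names, and individuals are hypothetical.

```python
def is_disjoint(classes):
    """True if no individual belongs to more than one of the classes."""
    seen = set()
    for members in classes.values():
        if seen & members:
            return False
        seen |= members
    return True


def covers(parent, classes):
    """True if every member of the parent belongs to at least one class."""
    return parent <= set().union(*classes.values())


# Imagined extensions: sets of individuals per concept.
part = {"p1", "p2", "c1"}
subclasses = {"BasicPart": {"p1", "p2"}, "CompositePart": {"c1"}}

partition_ok = is_disjoint(subclasses) and covers(part, subclasses)
```

A set of subclasses that is both disjoint and covering forms a partition of the parent concept.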

Individuals. Try to list as many generally useful individuals as possible. Some may already have been listed in step 1. Try to describe (classify) them.

Properties and Parts. Begin to define the internal structure of concepts (this process will continue in the next steps). For each concept list:

intrinsic properties, that are part of the very nature of the concept;

extrinsic properties, that are contingent or external properties of the object; they can sometimes change over time;

parts, in the case of structured or collective objects. They can be physical (e.g., "the components of a car", "the casks of a winery", "the students of a class", "the members of a group", "the grape of a wine") or abstract (e.g., "the courses of a meal", "the lessons of a course", "the topics of a lesson").

In some cases some relationships between individuals of classes can be considered too accidental to be listed above (e.g., "the employees of a winery"; but the matter could change if we consider Winery as a subconcept of Firm).

In general, the above distinctions depend on the level of detail adopted.

Some of the listed roles will later be considered definitional, and some incidental.

After this step and the following ones, a check or revision of the taxonomy may be necessary.

Cardinality Restrictions. For the relevant roles for each concept.

Value Restriction. As above. Also, choose the right restriction.

Propagate Value Restrictions. If some value restrictions stated in the previous step do not correspond to already existing concepts, those concepts must be defined.

Inter-role Relationships. Even if they are hard to define in DL, they can be useful during the populating and debugging phases.

Definitional and Incidental. It is important to distinguish between definitional and incidental properties with respect to the particular application.
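The cardinality and value restriction steps above can be sketched as a check on one individual's role fillers. The `Restriction` class, the `satisfies` function, and the wine example data are all assumptions for illustration. The check compares concept names directly and ignores subsumption between concepts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Restriction:
    role: str
    filler: str                      # value restriction: concept of every filler
    min_count: int = 0               # cardinality restriction: lower bound
    max_count: Optional[int] = None  # cardinality restriction: upper bound


def satisfies(role_fillers, concept_of, restriction):
    """Check one individual's role fillers against a restriction."""
    fillers = role_fillers.get(restriction.role, [])
    if len(fillers) < restriction.min_count:
        return False
    if restriction.max_count is not None and len(fillers) > restriction.max_count:
        return False
    # Simplification: exact concept match, no reasoning over the taxonomy.
    return all(concept_of[f] == restriction.filler for f in fillers)


# Hypothetical restriction: a wine has at least one grape, and only grapes.
has_grape = Restriction(role="hasGrape", filler="Grape", min_count=1)

concept_of = {"merlot": "Grape", "oak_cask": "Cask"}
good_wine = {"hasGrape": ["merlot"]}
bad_wine = {"hasGrape": ["oak_cask"]}
empty_wine = {}
```

In this example `good_wine` satisfies the restriction, `bad_wine` violates the value restriction, and `empty_wine` violates the minimum cardinality.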