Definitions

Many of the concepts used by KEES refer to the well-known Semantic Web standards published by the World Wide Web Consortium (W3C).

What is data? In line with common sense, KEES defines data as words, numbers or, in general, any string of symbols. This concept is equivalent to the definition of "literal" in RDF (Resource Description Framework). Examples of data are the string xyz, the numbers 123 and 33.22, or the URI http://LinkedData.Center. Note that data is usually associated with a data type. A data type is just a name that states a set of restrictions on the string of symbols that makes up the data; the data type is not the meaning of the data.

What is information? KEES defines information as data with a meaning. The meaning can be
learned from the context where the data is found, or it can be explicitly defined. From a practical point of view, because KEES adopts the RDF standards, a piece of information is defined by three data items that build up a triple (also known as an RDF statement): a subject, a predicate and an object. The first two elements of the triple (subject and predicate) must be URIs; the last element (object) can be anything. A triple can also be represented as a directed labeled graph.

KEES defines knowledge as a graph of linked information (i.e. linked data). This graph is possible because, in RDF, any URI can be both the object of a triple and the subject of another one or even a predicate for another.
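As a minimal sketch of this idea, the following Python fragment models triples as plain tuples and finds the points where one statement links to another. The Triple type, the linked_pairs helper and the example.org URIs are hypothetical names chosen for illustration; they are not part of KEES or RDF.

```python
from typing import List, NamedTuple, Tuple


class Triple(NamedTuple):
    subject: str    # a URI
    predicate: str  # a URI
    obj: str        # a URI or any other data


# Hypothetical example data: three statements that chain into a small graph.
triples = [
    Triple("http://example.org/jack", "http://example.org/worksFor", "http://example.org/acme"),
    Triple("http://example.org/acme", "http://example.org/locatedIn", "http://example.org/rome"),
    Triple("http://example.org/rome", "http://example.org/name", "Rome"),
]


def linked_pairs(statements: List[Triple]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) where the object of statement i is the subject of statement j."""
    return [
        (i, j)
        for i, a in enumerate(statements)
        for j, b in enumerate(statements)
        if i != j and a.obj == b.subject
    ]


print(linked_pairs(triples))  # [(0, 1), (1, 2)]
```

The two pairs show that the first statement links to the second and the second to the third, which is what turns isolated information into a graph.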

KEES defines a knowledge base (or knowledge graph) as a container of linked data with a purpose, that is, a set of related information that can be combined to answer some questions.

From a theoretical point of view, a knowledge base is composed of information (i.e. facts), plus a formal system of logic used for knowledge representation, plus the open-world assumption.

The information is partitioned into two sets: TBox and ABox. ABox statements describe facts; TBox statements describe the terms used to qualify the meaning of the facts. If you are familiar with the object-oriented paradigm, TBox statements roughly correspond to classes, while ABox statements correspond to individual class instances.
TBox statements tend to be more permanent within a knowledge base and are often grouped in ontologies that describe a specific
knowledge domain (e.g. business entities, people, goods, friendship, offering, geocoding, etc.).
ABox statements are associated with instances of classes defined by TBox statements. ABox statements are much more dynamic in nature and
are populated from datasets available on the web or by reasoning.
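The distinction can be illustrated with a rough heuristic, sketched here in Python. It treats a statement as TBox when it uses a schema-level predicate or declares a class or a property, and as ABox otherwise. The predicate and type lists, the ex: terms and the function names are assumptions made for the example; a real language profile may draw the line differently.

```python
# Prefixed names are used as plain strings to keep the sketch short.
RDF_TYPE = "rdf:type"
SCHEMA_PREDICATES = {"rdfs:subClassOf", "rdfs:subPropertyOf", "rdfs:domain", "rdfs:range"}
SCHEMA_TYPES = {"owl:Class", "rdfs:Class", "rdf:Property",
                "owl:ObjectProperty", "owl:DatatypeProperty"}


def is_tbox(triple):
    """Heuristic: schema predicates and class/property declarations belong to the TBox."""
    _subject, predicate, obj = triple
    return predicate in SCHEMA_PREDICATES or (predicate == RDF_TYPE and obj in SCHEMA_TYPES)


def partition(triples):
    """Split statements into (tbox, abox), preserving the input order."""
    tbox = [t for t in triples if is_tbox(t)]
    abox = [t for t in triples if not is_tbox(t)]
    return tbox, abox


# Hypothetical statements about a small "people" domain.
statements = [
    ("ex:Person", "rdf:type", "owl:Class"),
    ("ex:hasMom", "rdfs:range", "ex:Person"),
    ("ex:jack", "rdf:type", "ex:Person"),
    ("ex:jack", "ex:hasMom", "ex:mary"),
]
tbox, abox = partition(statements)
```

Here the first two statements describe terms (TBox), while the last two describe the individual ex:jack (ABox).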

For practical purposes, KEES assumes that the knowledge base can be defined in a SPARQL service.

The Language Profile (or Application Profile) is the portion of the TBox that is recognized by a specific software application.

The language profile can contain axioms. An axiom describes how to generate or validate knowledge base statements using entailments inferred from the language profile semantics and the known facts. For example, an axiom can be described with OWL and evaluated by an OWL reasoner, or described with SPARQL CONSTRUCT queries or SPARQL UPDATE scripts and evaluated in a SPARQL service.

The KEES Language Profile is the set of all terms, rules and axioms that a software application that wants to use a knowledge base should understand.

Trust is another key concept in KEES. The open-world assumption and RDF allow you to mix any kind of information, even information that is incoherent. For instance, suppose that an axiom in your knowledge base TBox states that the property "person:hasMom" has a cardinality of 1 (i.e. every person has just one "mom"). Your knowledge base could still contain two different facts, (:jack person:hasMom :Mary) and (:jack person:hasMom :Giulia), perhaps extracted from different data sources. In order to decide who is jack's mom, you need to trust your data. If you are sure about the veracity of all data in the knowledge base, you can deduce that :Mary and :Giulia are two names for the same person. If you are not so sure, you have two possibilities: deduce that a data source is wrong, so that you have to choose the most trusted statement with respect to some criteria (even at random, if both statements have the same trust rank), or change the axiom in the TBox, allowing a person to have more than one mom. In any case, you need to have an idea of how much you trust each statement in the knowledge base, both in the ABox and in the TBox. At the very least, you want to know the provenance and all metadata of all information in your knowledge base, because the trust in a single piece of data often derives from the trust in its source or in the creator of the data source.
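The "choose the most trusted statement" option can be sketched in a few lines of Python. The sketch assumes that every fact carries a link to its source and that a numeric trust rank per source is already known; the source URNs, the rank values and the resolve_functional name are hypothetical, and on a tie the sketch simply keeps the first statement it sees.

```python
def resolve_functional(quads, trust):
    """For a property assumed to have cardinality 1, keep, for each
    (subject, predicate) pair, the value coming from the most trusted source."""
    best = {}
    for subject, predicate, obj, source in quads:
        key = (subject, predicate)
        rank = trust.get(source, 0.0)  # unknown sources get the lowest rank
        if key not in best or rank > best[key][1]:
            best[key] = (obj, rank)
    return {key: value for key, (value, _rank) in best.items()}


# The two conflicting facts from the text, each tagged with a hypothetical source.
facts = [
    (":jack", "person:hasMom", ":Mary", "urn:example:source-a"),
    (":jack", "person:hasMom", ":Giulia", "urn:example:source-b"),
]
trust_rank = {"urn:example:source-a": 0.9, "urn:example:source-b": 0.4}

print(resolve_functional(facts, trust_rank))
# {(':jack', 'person:hasMom'): ':Mary'}
```

The point of the example is only that such a decision needs the provenance of each statement; how the trust rank is computed is outside the scope of the sketch.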

KEES Specification

KEES Vocabulary

The KEES vocabulary defines a few new terms in the http://linkeddata.center/kees/v1# namespace (usual prefix kees:).
It consists of some OWL classes and properties, mainly derived from existing ontologies.

A kees:KnowledgeBase is defined as a subclass of sd:Dataset that, in turn, is a specialization of a dataset as described in VoID. A kees:KnowledgeBase is
a collection of named graphs (sd:namedGraph) that contain the linked data.

The main class introduced by the KEES vocabulary is kees:Plan, which describes how to create and update a named graph in the knowledge base, using facts extracted from a data source or derived from axioms.

A kees:KnowledgeBaseDescription is a document that contains the description of the knowledge base, with the purpose of
publishing and transferring the information needed to build the knowledge base.
Think of it as a subclass of foaf:Document that allows you to attach a license and other
metadata.

The kees:Question represents the purpose for the existence of the knowledge base. In other words, the knowledge base exists to answer questions. Questions are natural language expressions that can be expressed as a query on a populated knowledge graph. The answer to a question results in tabular data, a structured document, a logic assertion or a translation of these into natural language sentences.

A kees:Agent describes a processor that understands the KEES language profile and that is able to perform
actions on a knowledge base, starting from its description documents, according to the KEES specifications.
It should be able to learn data, to reason about data and to answer some questions starting from learned facts.

The KEES vocabulary is expressed in OWL (RDF) in the kees.rdf file. The file was edited with the Protégé editor.

Besides a few classes and properties, the KEES vocabulary defines some individuals:

kees:guard is a SPARQL service description feature that states that the RDF store supports the KEES guard specifications (see below).
kees:trustGraphMetric defines a metric that allows evaluating a trust rank for selected information in the daq framework.

kees:append and kees:replace state two possible graph accrual policies: the append policy states that, if new facts are found,
they are to be appended to the existing data. The replace policy states that new data must replace all existing information.
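The effect of the two accrual policies on a target graph can be sketched as follows, modelling a named graph as a Python set of statements. The accrue function is a hypothetical helper written for this illustration.

```python
def accrue(existing, new_facts, policy):
    """Return the content of a graph after applying an accrual policy.

    existing and new_facts are sets of statements; policy is the prefixed
    name of one of the two KEES accrual policy individuals.
    """
    if policy == "kees:append":
        return set(existing) | set(new_facts)  # keep old data, add the new facts
    if policy == "kees:replace":
        return set(new_facts)                  # drop old data entirely
    raise ValueError(f"unknown accrual policy: {policy}")


old = {("ex:jack", "ex:age", "41")}
new = {("ex:jack", "ex:age", "42")}

appended = accrue(old, new, "kees:append")   # both statements are kept
replaced = accrue(old, new, "kees:replace")  # only the new statement survives
```

Note that the append policy can leave contradictory statements side by side, which is one reason why provenance and trust matter.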

The foaf:primaryTopic property is used to link a kees:KnowledgeBaseDescription document to a kees:KnowledgeBase individual.

RDF Store requirement

Knowing the provenance of each statement is of paramount importance to get an idea of data quality. For this reason, KEES requires that all statements have a fourth element that links to a data source. This means that, for practical purposes, the KEES knowledge base is a collection of quads, i.e. triples plus a link to metadata.

Any RDF store that provides a SPARQL endpoint and quad support is compliant with KEES.

During knowledge base building and update, the knowledge base could be in an inconsistent state.
If a statement with the subject urn:kees:kb and the predicate dct:valid exists,
then the knowledge base is safe to be queried. Otherwise, querying the knowledge base should be considered not safe.

To declare that an RDF store is ready to be safely queried, execute a SPARQL UPDATE statement that inserts such a statement.
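As a minimal sketch, such an update could be generated with the Python helpers below. Only the subject urn:kees:kb and the predicate dct:valid come from the text above; the object value (an xsd:dateTime timestamp), the use of the default graph and the function names are assumptions made for illustration.

```python
from datetime import datetime, timezone

DCT = "http://purl.org/dc/terms/"
XSD = "http://www.w3.org/2001/XMLSchema#"


def mark_safe_update(now=None):
    """Build a SPARQL UPDATE string that asserts the 'safe' statement.

    Assumption: the object is the time at which the store became safe.
    """
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = moment.strftime("%Y-%m-%dT%H:%M:%SZ")
    return (
        f"PREFIX dct: <{DCT}>\n"
        f"PREFIX xsd: <{XSD}>\n"
        "INSERT DATA {\n"
        f'  <urn:kees:kb> dct:valid "{stamp}"^^xsd:dateTime .\n'
        "}"
    )


def mark_unsafe_update():
    """Build a SPARQL UPDATE string that removes any 'safe' statement."""
    return (
        f"PREFIX dct: <{DCT}>\n"
        "DELETE WHERE { <urn:kees:kb> dct:valid ?any }"
    )


print(mark_safe_update(datetime(2020, 1, 1, tzinfo=timezone.utc)))
```

The resulting strings can be sent to the update endpoint of the RDF store with any SPARQL client.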

SPARQL service requirements

A KEES compliant SPARQL service SHOULD expose the kees:guard feature.
If this feature is present,
the SPARQL endpoint MUST return the HTTP 503 error when someone tries to query an RDF store that is not in the safe state. A KEES compliant SPARQL endpoint SHOULD disable the guard feature if the HTTP header "X-KEES-guard: disable" is present in the HTTP request.
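The guard behaviour can be summarized by a small decision function, sketched here in Python. The function name and its parameters are hypothetical; how the safe state is detected and how the endpoint is implemented are left out.

```python
def guard_status(kb_is_safe, headers, guard_enabled=True):
    """Return the HTTP status a guarded endpoint would use before evaluating a query.

    kb_is_safe:    True if the 'safe' statement exists in the RDF store
    headers:       mapping of HTTP request headers
    guard_enabled: True if the service exposes the kees:guard feature
    """
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    disabled = normalized.get("x-kees-guard", "").strip().lower() == "disable"
    if guard_enabled and not disabled and not kb_is_safe:
        return 503  # Service Unavailable: the knowledge base is not safe
    return 200      # the query can be evaluated as usual


print(guard_status(False, {}))                           # 503
print(guard_status(False, {"X-KEES-guard": "disable"}))  # 200
```

The second call shows a client, for instance a KEES agent that is still building the knowledge base, explicitly bypassing the guard.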

KEES agent requirements

Workflow

A KEES agent SHOULD perform actions on a knowledge base in a logical sequence of four temporal phases called windows:

1. a startup phase (boot window) to initialize the knowledge base starting from one or more knowledge base descriptions

2. a time slot to populate the knowledge base and to link data (learning window). It consists of the
execution of a plan that requires downloading at least one external resource.

3. a time slot for data inference (reasoning window). It consists of the execution of a plan that requires only axioms and
learned facts.

4. a time slot to access the knowledge base and to answer questions (teaching window)

Steps 2 and 3 can be iterated.

This sequence is called the KEES workflow. It is a continuous integration process that starts on user request, at a scheduled time or
when an event is triggered (e.g. a dataset change).
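The allowed order of the windows can be sketched as a small state machine in Python. The Window enumeration and the is_valid_sequence helper are hypothetical names; the transition from the teaching window back to the boot window is an assumption that models the start of a new workflow run.

```python
from enum import Enum


class Window(Enum):
    BOOT = "boot"
    LEARNING = "learning"
    REASONING = "reasoning"
    TEACHING = "teaching"


# Which window may follow which one.
ALLOWED = {
    Window.BOOT: {Window.LEARNING},
    Window.LEARNING: {Window.REASONING},
    Window.REASONING: {Window.LEARNING, Window.TEACHING},  # learning and reasoning can be iterated
    Window.TEACHING: {Window.BOOT},                        # assumption: a new run restarts from boot
}


def is_valid_sequence(windows):
    """Check that a sequence starts with the boot window and only uses allowed transitions."""
    if not windows or windows[0] is not Window.BOOT:
        return False
    return all(nxt in ALLOWED[cur] for cur, nxt in zip(windows, windows[1:]))


run = [Window.BOOT, Window.LEARNING, Window.REASONING,
       Window.LEARNING, Window.REASONING, Window.TEACHING]
print(is_valid_sequence(run))                              # True
print(is_valid_sequence([Window.BOOT, Window.TEACHING]))   # False
```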

Plan target graph

A target graph is a named graph in the knowledge base referenced by the property kees:builds. In a KEES knowledge base,
every named graph should be referenced by exactly one plan through the kees:builds property.

Plans MUST provide enough information to describe how to build a new named graph or to update an existing one.

A very smart KEES agent (for instance a human being) could be able to understand high level instructions (for instance, plain
English sentences), deducing missing information from the agent context or from experience.
A less smart KEES agent could only be able to interpret low level language instructions.

Plan pre-conditions

Plans MUST be evaluated only if all pre-conditions are satisfied. If just one pre-condition fails, the plan execution
MUST be skipped or postponed without changing the knowledge base.

There are two kinds of pre-conditions, related to two properties: kees:accualPeriodicity and kees:requires.

Required URI pre-conditions

The kees:requires range MUST be a URI that represents a resource in the knowledge base. Multiple kees:requires properties are allowed.

If no kees:requires is present, then the pre-condition is always satisfied.

If the target graph is not yet created, then the pre-condition is always satisfied.

If just one required URI is not present in the knowledge base, then the pre-condition is not satisfied and
the plan execution MUST be postponed.

If all required resources exist and are older than the last target graph creation date, then the pre-condition is not satisfied
and the plan MUST be skipped. A KEES agent SHOULD be able to use the dct:modified and dct:created properties to compare
modification dates.

If one required URI has no modification date, then the pre-condition is satisfied.

A possible implementation of an algorithm that decides whether all kees:requires pre-conditions are satisfied:
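The following Python function is a minimal sketch that applies the rules of this section in the order in which they are listed. The Outcome enumeration, the function name and the shape of its arguments are assumptions made for illustration; in a real agent the dates would be read from dct:modified and dct:created.

```python
from datetime import datetime
from enum import Enum


class Outcome(Enum):
    RUN = "run"            # pre-condition satisfied
    POSTPONE = "postpone"  # a required resource is missing
    SKIP = "skip"          # the target graph is already up to date


def check_requires(required, modified, target_created):
    """Evaluate the kees:requires pre-condition of a plan.

    required:       iterable of required URIs
    modified:       dict mapping each URI present in the knowledge base to its
                    modification date, or to None when no date is known
    target_created: creation date of the target graph, or None if it does not exist
    """
    required = list(required)
    if not required:
        return Outcome.RUN                 # no kees:requires at all
    if target_created is None:
        return Outcome.RUN                 # target graph not yet created
    if any(uri not in modified for uri in required):
        return Outcome.POSTPONE            # at least one required URI is missing
    if any(modified[uri] is None for uri in required):
        return Outcome.RUN                 # a required URI has no modification date
    if all(modified[uri] < target_created for uri in required):
        return Outcome.SKIP                # everything is older than the target graph
    return Outcome.RUN


built = datetime(2024, 2, 1)
print(check_requires(["urn:example:a"], {"urn:example:a": datetime(2024, 1, 1)}, built))
# Outcome.SKIP
```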

Accrual periodicity pre-condition

kees:accualPeriodicity expects exactly one URI that describes a frequency (e.g. once a month, once a year).
The KEES agent SHOULD recognize at least all concepts in the sdmx-code:freq scheme for dct:accrualPeriodicity and use this information to
decide whether to execute a plan or not.

The accrual periodicity pre-condition is satisfied if

no kees:accualPeriodicity property is defined or

no target graph exists

The accrual periodicity pre-condition is not satisfied and the plan MUST be skipped if the last update date of the target graph plus
the accrual period is later than the current time, i.e. if the target graph is still up to date.
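A minimal Python sketch of this check follows. It assumes that a plan is skipped while the target graph is still fresh, that is, while the last update plus the accrual period is later than the current time. The PERIODS table covers only a few frequency codes and uses approximate durations for months and years; the table and the function name are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Approximate durations for a few frequency concepts (assumed mapping).
PERIODS = {
    "sdmx-code:freq-D": timedelta(days=1),    # daily
    "sdmx-code:freq-W": timedelta(weeks=1),   # weekly
    "sdmx-code:freq-M": timedelta(days=30),   # monthly (approximation)
    "sdmx-code:freq-A": timedelta(days=365),  # annual (approximation)
}


def accrual_satisfied(periodicity, last_update, now):
    """Return True if the accrual periodicity pre-condition is satisfied.

    periodicity: frequency code of the plan, or None if none is defined
    last_update: last update date of the target graph, or None if it does not exist
    now:         current time
    """
    if periodicity is None or last_update is None:
        return True
    if periodicity not in PERIODS:
        raise ValueError(f"unsupported frequency: {periodicity}")
    # Satisfied only when the accrual period has fully elapsed.
    return last_update + PERIODS[periodicity] <= now


now = datetime(2024, 3, 1)
print(accrual_satisfied("sdmx-code:freq-M", datetime(2024, 2, 20), now))  # False: still fresh
print(accrual_satisfied("sdmx-code:freq-D", datetime(2024, 2, 20), now))  # True: period elapsed
```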

Default URI space prefix

A KEES agent SHOULD recognize the void:uriSpace pattern in a knowledge base,
making it available with the reserved prefix res_ in graph constructors.

Error conditions and error management

A KEES agent MUST update the RDF store safe statement when it enters or exits the teaching window.

If a KEES agent is unable to successfully complete a plan, it MUST abort if the target named graph was partially built; otherwise
it MUST annotate the named graph with the prov:InvalidatedAtTime property and continue.

If the KEES agent aborts its execution, the knowledge base MUST be left in a "not safe" state.

In normal operations, the existence of prov:InvalidatedAtTime in a named graph MUST prevent the KEES agent from entering the teaching window.
A KEES agent MAY provide a way to force a safe state when some named graph contains a prov:InvalidatedAtTime property.

The existence of prov:InvalidatedAtTime in a named graph SHOULD be signaled by the KEES agent. How to signal it is implementation dependent.

The existence of activities without a plan SHOULD be signaled by the KEES agent. This condition does not prevent the
KEES agent from entering the teaching window.

A KEES agent MUST abort in case of semantic inconsistencies in the KEES knowledge base definition.

Graph constructors

A constructor is a resource referenced by the kees:from property that MUST provide enough information to a KEES agent to populate
the target named graph. A constructor can be a script in some language (e.g. SPARQL) or a data provider.

KEES does not impose any requirement on a constructor, but expects that a KEES agent SHOULD be smart enough to
recognize and manage at least the following kinds of constructors:

a dereferenceable URL that provides RDF triples using one of the standard RDF serializations. In this case
the agent SHOULD be able to download the resource content from the URL following the HTTP(S) GET protocol specification (e.g. managing
redirection and content negotiation) and to extract from it the information serialized according to one of the RDF standards:
RDF/XML, Turtle, JSON-LD, RDFa, Microdata, N3, N-Triples.

an object of type sp:Construct. In this case the KEES agent
should be able to run the SPARQL query described in the object and to inject the results into the target graph specified in the
kees:builds property.

an object of type sp:Update. In this case the KEES agent
should be able to execute the SPARQL Update script described in the object in the knowledge graph database.
The update script MUST NOT modify any graph described in other plans.
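How an agent might tell these three kinds of constructors apart is sketched below in Python. The function name, the returned action labels and the Accept header value are assumptions made for illustration; the sketch only classifies a constructor and does not download or execute anything.

```python
from urllib.parse import urlparse

# Media types an agent could offer during content negotiation (assumed preference order).
RDF_ACCEPT = ("text/turtle, application/rdf+xml, application/ld+json, "
              "application/n-triples, text/n3")


def classify_constructor(uri=None, types=()):
    """Decide which action a constructor calls for.

    uri:   URI of the resource referenced by kees:from, if any
    types: rdf:type values of that resource, as prefixed names
    """
    if "sp:Construct" in types:
        return "run-construct-query"  # inject the query results into the target graph
    if "sp:Update" in types:
        return "run-update-script"    # execute the script in the RDF store
    if uri and urlparse(uri).scheme in ("http", "https"):
        return "download-rdf"         # HTTP GET with the RDF_ACCEPT header, then parse
    return "unsupported"


print(classify_constructor(uri="https://example.org/data.ttl"))     # download-rdf
print(classify_constructor(uri="urn:example:q1", types=["sp:Construct"]))  # run-construct-query
```

Checking the declared type before the URL scheme means that a query described at an HTTP URL is still treated as a query and not as a data dump.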

KEES agent protocol

A KEES agent MUST be able to accept as input one or more URLs dereferencing to kees:KnowledgeBaseDescription resources.
The input method is implementation dependent: a KEES agent can be implemented as a web service, as a command or as a job.

There is no direct way to detect if an agent is running or has aborted. But you can always check if the knowledge base is in a
safe state.

A KEES agent implementation SHOULD provide some logging and monitoring features.

A KEES agent should be able to infer types from functional properties.

A KEES agent MUST implement this process schema:

1. exit the teaching window

2. execute axioms

3. ensure the integrity of the knowledge base descriptions. Abort if there are errors.

4. get a plan that matches all pre-conditions, then take the appropriate action according to the kees:from attribute and
assert all post-conditions, aborting if one of them returns false or if there is a computation error.

5. repeat steps 2-3-4 while a matching plan exists

6. check if there are unexecuted plans (i.e. plans with unsatisfied pre-conditions). If yes, abort.

7. check whether any named graph was invalidated. If yes, abort.

8. enter the teaching window

9. (Optional) print an execution report
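The control flow of this process schema can be sketched in Python as follows. All the callables passed to run_workflow, as well as the AgentAbort exception, are hypothetical hooks that stand for the real work (running axioms, checking descriptions, executing constructors); the sketch only shows the order of the steps and the abort conditions.

```python
class AgentAbort(Exception):
    """Raised when the agent must stop, leaving the knowledge base 'not safe'."""


def run_workflow(plans, is_ready, execute, run_axioms, descriptions_ok,
                 has_invalidated_graphs, set_safe):
    set_safe(False)                           # 1. exit the teaching window
    pending = list(plans)
    executed = []
    while True:
        run_axioms()                          # 2. execute axioms
        if not descriptions_ok():             # 3. integrity of the descriptions
            raise AgentAbort("inconsistent knowledge base description")
        plan = next((p for p in pending if is_ready(p)), None)
        if plan is None:                      # 5. stop when no plan matches
            break
        if not execute(plan):                 # 4. run the constructor and post-conditions
            raise AgentAbort(f"plan failed: {plan}")
        pending.remove(plan)
        executed.append(plan)
    if pending:                               # 6. plans with unsatisfied pre-conditions
        raise AgentAbort(f"unexecuted plans: {pending}")
    if has_invalidated_graphs():              # 7. invalidated named graphs
        raise AgentAbort("a named graph was invalidated")
    set_safe(True)                            # 8. enter the teaching window
    return executed                           # 9. can be used to print a report


# Usage with trivial stand-in hooks.
states = []
done = run_workflow(
    ["plan-1", "plan-2"],
    is_ready=lambda plan: True,
    execute=lambda plan: True,
    run_axioms=lambda: None,
    descriptions_ok=lambda: True,
    has_invalidated_graphs=lambda: False,
    set_safe=states.append,
)
```

Because the safe state is set to true only in the last step, any abort leaves the knowledge base in the "not safe" state.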

Example

[** WARNING: THIS SECTION IS INFORMATIVE AND SUBJECT TO MAJOR CHANGES **]
