The PROV-CONSTRAINTS document complements the first three, and is focused on the notion of valid provenance. The intent of provenance validation is to ensure that a set of PROV statements represents a history of objects and their interactions which is consistent, and thus safe to use for the purpose of logical reasoning and other kinds of analysis.

Thus, the document can be used to design a validator that checks the consistency of a set of PROV statements.

What is in the CONSTRAINTS document?

Three types of constraints are defined.

Uniqueness constraints. These include key constraints, stating for instance that identifier e is a key for statement entity(e,attrs), but also constraints that state the uniqueness of events such as the generation of an entity by an activity. Constraint 25, for example, states that only one generation event can be associated with a generated entity and a generating activity: IF wasGeneratedBy(gen1; e,a,_t1,_attrs1) and wasGeneratedBy(gen2; e,a,_t2,_attrs2), THEN gen1 = gen2.
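To make the uniqueness check concrete, here is a minimal sketch of how a validator might look for candidate violations of Constraint 25. It assumes a simplified, hypothetical `Generation` record (time and attributes omitted) and treats every identifier as a concrete constant; a full implementation would also have to deal with placeholder identifiers, which this sketch ignores.

```python
from collections import defaultdict
from typing import NamedTuple


class Generation(NamedTuple):
    """Hypothetical, simplified wasGeneratedBy(gen; e, a, ...) statement."""
    gen: str       # identifier of the generation event
    entity: str    # generated entity e
    activity: str  # generating activity a


def generation_uniqueness_violations(statements):
    """Map each (entity, activity) pair to its generation ids when it has more than one."""
    ids_by_pair = defaultdict(set)
    for s in statements:
        ids_by_pair[(s.entity, s.activity)].add(s.gen)
    return {pair: sorted(ids) for pair, ids in ids_by_pair.items() if len(ids) > 1}


statements = [
    Generation("gen1", "e", "a"),
    Generation("gen2", "e", "a"),   # same entity and activity, different event id
    Generation("gen3", "e2", "a"),  # different entity, so no conflict
]
violations = generation_uniqueness_violations(statements)
print(violations)
```

Each reported pair is a place where the constraint requires the listed generation identifiers to denote the same event.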

Event ordering constraints. These specify the possible orderings of events (generation, usage, invalidation of entities, start and end of activities) that correspond to a sensible history. For example, an entity should not be used before it is generated (Constraint 39): IF wasGeneratedBy(gen; e,_a1,_t1,_attrs1) and used(use; _a2,e,_t2,_attrs2) THEN gen precedes use.
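As an illustration, the generation-before-usage rule can be sketched as a function that derives "precedes" pairs from generation and usage statements. The `Generation` and `Usage` records below are hypothetical simplifications that keep only the event identifier and the entity; they are not part of any PROV library.

```python
from typing import NamedTuple


class Generation(NamedTuple):
    """Hypothetical, simplified wasGeneratedBy(gen; e, ...) statement."""
    gen: str
    entity: str


class Usage(NamedTuple):
    """Hypothetical, simplified used(use; _, e, ...) statement."""
    use: str
    entity: str


def generation_precedes_usage(generations, usages):
    """Return the set of (gen, use) pairs, each read as 'gen precedes use'."""
    gens_by_entity = {}
    for g in generations:
        gens_by_entity.setdefault(g.entity, []).append(g.gen)
    return {
        (gen, u.use)
        for u in usages
        for gen in gens_by_entity.get(u.entity, [])
    }


generations = [Generation("gen", "e")]
usages = [Usage("use", "e"), Usage("use2", "other")]  # "other" has no known generation
ordering = generation_precedes_usage(generations, usages)
print(ordering)
```

The derived pairs are only the raw material: a validator would combine them with the pairs produced by the other ordering constraints before checking the result for consistency.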

Impossibility constraints. These are used to state, for example, that the same identifier cannot be used in two different relation types (i.e. entity(foo) and activity(foo) is an illegal combination), but also to state properties of relations, for example “specialization is irreflexive” (Constraint 54): IF specializationOf(e,e) THEN INVALID, and “the set of entities and activities are disjoint” (Constraint 57): IF 'entity' ∈ typeOf(id) AND 'activity' ∈ typeOf(id) THEN INVALID.
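Both impossibility constraints quoted above are simple to check once the statements have been collected. The sketch below assumes two hypothetical inputs: a list of (specific, general) pairs taken from specializationOf statements, and a dictionary mapping each identifier to the set of types inferred for it.

```python
def impossibility_violations(specializations, types_of):
    """Return human-readable descriptions of violations of Constraints 54 and 57."""
    problems = []
    # Constraint 54: specialization is irreflexive.
    for specific, general in specializations:
        if specific == general:
            problems.append(f"specializationOf({specific},{general}) is reflexive")
    # Constraint 57: no identifier may be both an entity and an activity.
    for ident, types in sorted(types_of.items()):
        if {"entity", "activity"} <= set(types):
            problems.append(f"{ident} is typed as both entity and activity")
    return problems


specializations = [("e1", "e2"), ("e3", "e3")]
types_of = {"foo": {"entity", "activity"}, "bar": {"entity"}}
problems = impossibility_violations(specializations, types_of)
for p in problems:
    print(p)
```

An empty result means that these two constraints found nothing wrong, not that the provenance is valid overall.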

Example

We now show an inference process involving ordering constraints, which leads to the conclusion that all the events involved in the provenance must be simultaneous. Although this is logically possible, it most likely indicates that some of the statements disrupt the consistency of the entire history. The example involves a case of mutual derivation of an entity from another. Consider the following statements:

Adding this new statement, however, creates a circular derivation between e1 and e2, an invalid situation. We therefore expect our constraint system to tell us something interesting. Indeed, by application of the same Constraint 44, this new statement entails:

Conclusion

This example was simple and may not have required an automated validator to detect the problem. However, as graph patterns become more complex, an automated validator becomes an essential component for provenance users, whether they intend to publish provenance or to consume it. The PROV-CONSTRAINTS document defines a set of constraints that validators are expected to implement.
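To give a feel for what such a validator does with ordering constraints, here is a minimal sketch that takes a collection of "precedes" pairs and reports the groups of events that sit on a common cycle. It assumes, as in the discussion of the example, that "precedes" is non-strict, so events on a cycle are forced to be simultaneous; the function name and event identifiers are hypothetical.

```python
from collections import defaultdict


def simultaneous_groups(precedes):
    """Group events that lie on a common cycle of (before, after) pairs."""
    successors = defaultdict(set)
    events = set()
    for before, after in precedes:
        successors[before].add(after)
        events.update((before, after))

    def reachable(start):
        # Events reachable from start by following one or more precedes pairs.
        seen, stack = set(), [start]
        while stack:
            for nxt in successors[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    reach = {e: reachable(e) for e in events}
    groups = set()
    for e in events:
        # Two events are forced together when each one reaches the other.
        group = frozenset(o for o in reach[e] if e in reach[o])
        if len(group) > 1:
            groups.add(group)
    return sorted(sorted(g) for g in groups)


precedes = [("gen1", "gen2"), ("gen2", "gen1"), ("gen2", "use3")]
groups = simultaneous_groups(precedes)
print(groups)
```

A non-empty result is a signal worth reporting to the user, since it usually points at statements that disrupt the history rather than at events that really happened at the same instant.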

Several people are already working on validators, and we encourage other developers to implement these constraints as well.