Person 'Mike' lives in Country 'UK' ; Person 'Mike' lives in Country 'US'

Person 'Jeff' is domiciled for tax purposes in Country 'CH'

to fact type verbalisation including constraints:

Each Person was born in exactly one Country

Each Person lives in at least one Country

Each Person is domiciled for tax purposes in at most one Country

But for unary facts there are no constraints to qualify fact type verbalisation; this leaves the fact type verbalisations "plain":

fact instance: Person 'Jeff' is a smoker

fact type: Person is a smoker

This tends to confuse business people when validating the model -- they think it means "Each (every) Person is a smoker" rather than "facts of the form 'Person <X> is a smoker' may be asserted".

I was thinking of verbalisations such as "Person is (or is not) a smoker" but there are a couple of problems with that:

i) there are practical difficulties caused by having to identify and correctly negate the verb to form the negative verbalisation -- this could be worked around with a little clumsiness (e.g. "Person is a smoker (or not)") and/or by having the modeler provide alternate readings (as we do for reverse readings on binaries)

ii) this leads directly into the issue of CWA vs OWA vs OWN and how to verbalise them. The "(or not)" verbalisation could represent CWA, for example.

I've read IM&RDB 2ed [Halpin2008] and can't recall (or find in the index) any discussion of issues with unary verbalisation, or of how to verbalise OWA/OWN facts. Can anyone point me in the right direction?

Re: Verbalisations for unary facts (CWA, OWA and OWN)

Thanks for your query about verbalizing unary facts and unary fact types (e.g. Person smokes). This is the first I've heard of any business people misinterpreting NORMA's unary fact type verbalization as implying the fact type role is mandatory. If they read the fact type that way, then I'm surprised they don't read binary and longer fact types the same way (e.g. treating "Person drives Car" to have its first role mandatory). I think our current verbalization (e.g. Person smokes) is OK in this regard, since it treats all fact types consistently, regardless of their arity (number of fact roles).

The main problem with your suggestion of "Person is (or is not) a smoker" is that it is tautologous. For example, if you instantiate it, it doesn't really say anything, e.g. Person 'Fred Bloggs' is (or is not) a smoker.

You do however have a point in that NORMA currently does not support the distinction between CWA, OWA and OWN interpretations of unaries. We plan to add support for this in the future.

Regarding the user supplying negative readings for fact types, this is currently an option we are considering, mainly to support clean verbalizations of formal derivation rules (for subtypes and fact types), textual constraints, and queries. For example, given the fact type "Person drives Car", you might want to define the subtype NonDriver by selecting Person and negating the drives path to car. Rather than verbalizing the definiens clumsily as "Person not drives Car", it is desirable to verbalize this more naturally, e.g. "Person who drives no Car" or "Person who does not drive any Car". Until we add sophisticated linguistic support to automatically generate negation verbalizations, we could allow the user to manually supply that for us when they enter the fact type.

Re: Verbalisations for unary facts (CWA, OWA and OWN)

I think I was a little unclear about my tentative "(or not)" proposal -- my idea was that the "(or not)" qualifier would only be added in the fact type verbalisation (or, equivalently, would be removed in the fact instance verbalisation). That would avoid the grossly tautological fact instance verbalisation -- but you are right that it would be inconsistent with the way that other fact types are verbalised, which is a good reason to reject it.

I think the problem I've found with business user acceptance may be due to the way that I've been presenting the more common binary (and higher arity) fact types. Because these fact types almost always have some constraints (mandatory, uniqueness) I've often verbalised the fact type just using the constraint verbalisation

e.g.

> Each Person was born in exactly one Country

rather than first "defining" the fact type using the unconstrained verbalisation, and then qualifying it

> "Person was born in Country"

> :--- Each Person was born in exactly one Country

(And even where there are no constraints, we can verbalise the lack of constraints as "has zero or more ...").

Of course, unaries don't often have such constraints, so their verbalisation is just the fact type verbalisation, which then "feels" different. Taking the latter (define then constrain) approach for the more common binaries would highlight the similarity in form with the unaries, and should reduce confusion.
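To make the "define then constrain" idea concrete, here is a minimal Python sketch. It is not NORMA's verbalisation engine; the function names `verbalise_binary` and `verbalise_unary` are hypothetical, and it only covers a simple mandatory and/or uniqueness constraint on the first role of a binary fact type.

```python
def verbalise_binary(subject, predicate, obj, mandatory, unique):
    """Return (plain fact type reading, constraint reading) for a binary fact type.

    `mandatory` and `unique` describe constraints on the first role only
    (an assumption made to keep the sketch small).
    """
    # Step 1: "define" the fact type with no constraints at all.
    plain = f"{subject} {predicate} {obj}"

    # Step 2: qualify it with a quantifier derived from the constraints.
    if mandatory and unique:
        quantifier = "exactly one"
    elif mandatory:
        quantifier = "at least one"
    elif unique:
        quantifier = "at most one"
    else:
        quantifier = "zero or more"
    constrained = f"Each {subject} {predicate} {quantifier} {obj}"
    return plain, constrained


def verbalise_unary(subject, predicate):
    """A unary has no such constraints here, so only the plain reading exists."""
    return f"{subject} {predicate}"


if __name__ == "__main__":
    for line in verbalise_binary("Person", "was born in", "Country", True, True):
        print(line)
    print(verbalise_unary("Person", "is a smoker"))
```

Presented this way, the plain binary reading "Person was born in Country" has the same shape as the unary reading "Person is a smoker", and the "Each ..." sentence is visibly an extra layer on top.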

Thinking further, the idea of providing negative readings (we could use the existing / notation, since unaries do not need "reverse order" reading) would be essentially a short-hand for the re-factoring of a single unary into two mandatory disjunctive mutually-exclusive unaries. If that was done, then the constraint-readings would be available, allowing the fact types to be implied from the constraint readings.

> Each Person is a smoker or is a non-smoker but not both.

I'm particularly interested in the use of OWA and OWN in ORM -- for binary and higher facts as well as unaries. I have an application in mind that must deal with missing information (information that should in principle be available but in practice isn't because data becomes available at different times -- alethically OWN but deontically CWA, you might say).

The "negative unary reading" idea, and its equivalence to a mutually-exclusive two-unary model, might help with modelling OWN too (since the CWA two-unary model is easily translated to the OWN equivalent by removing the disjunctive mandatory constraint, and perhaps automatically adding an "It is known that ..." prefix to the fact type and instance verbalisations).
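As a rough illustration of the two-unary model, the Python sketch below checks one Person against the exclusion constraint and, optionally, the disjunctive mandatory constraint. The function name `describe_smoker_status`, the `closed_world` flag, and the wording used when nothing is known are all my own assumptions, not anything NORMA provides.

```python
def describe_smoker_status(name, is_smoker, is_non_smoker, closed_world):
    """Verbalise one Person under the two-unary model.

    closed_world=True  : exclusion + disjunctive mandatory constraint (CWA-style).
    closed_world=False : exclusion only, so "neither" is allowed (OWN-style).
    """
    # The exclusion constraint applies in both variants.
    if is_smoker and is_non_smoker:
        raise ValueError("exclusion constraint violated: smoker and non-smoker")

    # The disjunctive mandatory constraint only applies in the CWA variant.
    if closed_world and not (is_smoker or is_non_smoker):
        raise ValueError("mandatory constraint violated: neither unary is asserted")

    # Under OWN, asserted facts get the "It is known that" prefix.
    prefix = "" if closed_world else "It is known that "
    if is_smoker:
        return f"{prefix}Person '{name}' is a smoker"
    if is_non_smoker:
        return f"{prefix}Person '{name}' is a non-smoker"
    # Only reachable under OWN; this wording is an assumption of the sketch.
    return f"It is not known whether Person '{name}' is a smoker"


if __name__ == "__main__":
    print(describe_smoker_status("Jeff", True, False, closed_world=True))
    print(describe_smoker_status("Jeff", False, True, closed_world=False))
    print(describe_smoker_status("Mike", False, False, closed_world=False))
```

The only difference between the two variants is whether the "neither" case is rejected, which is the sense in which dropping the disjunctive mandatory constraint turns the CWA model into the OWN one.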

Re: Verbalisations for unary facts (CWA, OWA and OWN)

Thanks for your quick response. As you indicate, our approach has always been to first declare the fact type, and later add constraints to it. The ability to speak of a fact type independent of its constraints is helpful in many ways. For example, with multi-model applications, the same fact type may appear in different models but with different constraints. While multi-model support has not yet been added to NORMA, we do plan to add it. The fact-oriented modeling tools Dogma Studio and Collibra already do this, using the term "lexon" for a constraint-free fact type used as an atomic unit of exchange.

Your alethically OWN but deontically CWA example is similar to a general feature we are hoping to support that allows you to initially prototype with incomplete data (where various constraints, especially simple mandatory constraints, are treated as deontic) while requiring strict enforcement at a later stage (where relevant, formerly deontic constraints are now alethic).

Re: Verbalisations for unary facts (CWA, OWA and OWN)

A possible way of handling verbalisation of unary fact types might be found from the outer join syntax for CQL, namely the addition of the prefix maybe:

maybe Person is a smoker

This could be expanded to the SBVR form:

It is possible that Person is a smoker;

Regarding negative fact type invocations (for example, within a query): rather than attempt to build sophisticated linguistics, I simply ban the use in readings of the words no and not, and if those words appear in an invocation anywhere (except prior to the first term, where they're disallowed), the invocation is interpreted in the negative. This doesn't help in providing automatic verbalisation, but CQL doesn't do that; and this rule works for all the cases I've tried. For example, "Person is not a smoker". I don't use the form "Person smokes", which eliminates the need to handle "Person does not smoke". "Person drives no Car" is adequate, though we might normally say "Person does not drive a Car".
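The rule can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not the actual CQL parser; `parse_invocation` is a hypothetical name, terms are assumed to be single words, and recovering the reading by stripping the negation words is an assumption of the sketch.

```python
NEGATION_WORDS = {"no", "not"}


def parse_invocation(invocation, terms):
    """Return (reading, negated) for a fact type invocation.

    `terms` is the set of known object type names, e.g. {"Person", "Car"}.
    """
    words = invocation.split()

    # Locate the first term; negation words are disallowed before it.
    first_term = next((i for i, w in enumerate(words) if w in terms), None)
    if first_term is None:
        raise ValueError("invocation contains no known term")
    if any(w.lower() in NEGATION_WORDS for w in words[:first_term]):
        raise ValueError("'no'/'not' may not appear before the first term")

    # Any negation word after the first term makes the invocation negative.
    negated = any(w.lower() in NEGATION_WORDS for w in words[first_term + 1:])

    # Readings never contain 'no' or 'not', so stripping them gives the
    # candidate reading to match against the declared fact types.
    reading = " ".join(w for w in words if w.lower() not in NEGATION_WORDS)
    return reading, negated


if __name__ == "__main__":
    print(parse_invocation("Person is not a smoker", {"Person"}))
    print(parse_invocation("Person drives no Car", {"Person", "Car"}))
    print(parse_invocation("Person is a smoker", {"Person"}))
```

Because readings can never contain the banned words, there is no ambiguity about whether a "not" belongs to the reading or to the negation.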

Terry:

The "embedded" constraints in CQL are not part of the reading, and the fact type is later invoked without them. The ability to inject simple presence constraints into readings avoids the *much* more verbose syntax for external constraints. It is important to ensure that the fact type readings make sense with the quantifier removed!

In relation to deontic constraints, CQL uses the alethic syntax, but appends the interjection (otherwise verb agent), indicating that the constraint is not necessarily enforced, and instead signifying some enforcement action to take when it's violated. Typical verbs would be email, SMS, alert, log, etc., and the agent is a designation of an intended recipient or target of that action. I have always felt that deontic constraints, though semantically significant, are rather useless without indicating the severity or the required enforcement response to a violation, even though the response is not a part of the conceptual model.