A new tool, the Constellation Query Language

Some of you know that I am working on an implementation of a structured-text language in the ORM/SBVR family. After more than a year of work, I'd like to report some recent milestones.

The CQL Data Definition Language is complete but for most external constraints, which I'm working on completing now. I can dump any nORMa model to CQL, and I can generate Ruby code for an object-oriented API to manage the fact populations allowed by the model. I can also compile the CQL, emitting either the same CQL or the same Ruby code, except where the Ruby code depends on constraints that were not converted in the first instance. The generated Ruby code doesn't yet enforce any constraints other than those implied by its structure.

In some cases, nORMa models require additional readings or changes in the wording of readings to allow CQL to parse the input. For example, it's not legal to use names of existing concepts (entity types or value types) as linking words in a reading; such names may only be used where the implied concept is intended.
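The linking-word rule can be sketched as a small check. This is only an illustration in plain Ruby, not code from ActiveFacts: the token shape (`[:role, name]` for a role player, `[:word, text]` for a linking word), the method name `linking_word_clashes`, and the case-sensitive comparison are all assumptions.

```ruby
require 'set'

# Hypothetical helper: return the linking words in a reading whose text is
# also the name of an existing concept. Comparison is case-sensitive here,
# which is an assumption of this sketch.
def linking_word_clashes(reading, concept_names)
  reading.select { |kind, text| kind == :word && concept_names.include?(text) }
         .map { |_kind, text| text }
end

concepts = Set["Company", "Date", "Meeting"]

# "Company held meeting on Date": no linking word matches a concept name.
ok_reading = [[:role, "Company"], [:word, "held"], [:word, "meeting"],
              [:word, "on"], [:role, "Date"]]

# Same reading with "Meeting" written as a linking word: this one clashes.
bad_reading = [[:role, "Company"], [:word, "held"], [:word, "Meeting"],
               [:word, "on"], [:role, "Date"]]

p linking_word_clashes(ok_reading, concepts)
p linking_word_clashes(bad_reading, concepts)
```

A reading flagged this way would need to be reworded, or an additional reading supplied, before conversion.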

The generators make use of Ruby code to store a compiled model, and this code is itself generated from a nORMa (or CQL!) metamodel. My metamodel adopts relevant terminology from SBVR, so rather than use "model" or "schema", I use "vocabulary". I also use the term "concept" to denote an object type (entity type or value type). The metamodel also includes a number of extensions to the core ORM2 concepts, such as units for value types and the ability to import (and alias concepts from) external vocabularies. These features aren't yet implemented in the CQL language. However, I believe that my metamodel is a suitable basis for evolving a shared understanding of fact orientation, and I'd welcome anyone with an interest in improving it re-opening that discussion.

After completing the external constraints, I will implement relational mapping and conceptual queries (most query syntax is already recognised), and the generated Ruby data management code will be wired up to retrieve, modify and save data from the relational database.

In the meantime, I plan to make the nORMa->CQL converter available as a web service through my website http://dataconstellation.com. Until I do that, feel free to make contact with me by emailing cjh@dataconstellation.com, and send me an nORMa file if you want to know how your model will look in CQL.

At present, almost all of the implementation of CQL is open source, hosted on rubyforge.org as the ActiveFacts project. The Ruby version of these tools will always be open source, but in future, I envisage commercialising application generators for languages other than Ruby, so if you would take objection to helping me in such an endeavour, please refrain! I need some way to rationalise the many months of effort that have gone into the tools so far.

I'll follow up with a couple of simple examples of CQL, and leave it to you to comment.

The CompanyDirector model

/*
 * Value Types
 */
CompanyName is defined as VariableLengthText(48);
Name is defined as VariableLengthText(48);

/*
 * Entity Types
 */
Company is identified by CompanyName where
    Company is called exactly one CompanyName,
    CompanyName is of at most one Company;
Meeting is where
    Company held meeting on Date;
Meeting is board meeting;

Person is identified by given-Name and family-Name where
    Person has exactly one given-Name,
    given-Name is of Person,
    family-Name is of Person,
    Person is called at most one family-Name;
Person was born on at most one birth-Date;
Attendance is where
    Person (as Attendee) attended Meeting,
    Meeting was attended by Attendee;
Directorship is where
    Person (as Director) directs Company,
    Company is directed by at least one Director;
Directorship began on exactly one appointment-Date;
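To make the point about constraints concrete, here is a rough sketch in plain Ruby of part of this model. It is not the code the generator emits; the struct layout and the helper `companies_without_director` are assumptions made for illustration. Single-valued attributes give "at most one" by structure alone, whereas a mandatory constraint such as "Company is directed by at least one Director" needs an explicit check over the fact population.

```ruby
require 'date'

# Hypothetical plain-Ruby stand-ins for three concepts in the model above.
Person       = Struct.new(:given_name, :family_name, :birth_date)
Company      = Struct.new(:company_name)
Directorship = Struct.new(:director, :company, :appointment_date)

# "Company is directed by at least one Director":
# list the companies that take part in no Directorship.
def companies_without_director(companies, directorships)
  directed = directorships.map(&:company)
  companies.reject { |company| directed.include?(company) }
end

acme    = Company.new("Acme")
initech = Company.new("Initech")
fred    = Person.new("Fred", "Bloggs", nil)

directorships = [Directorship.new(fred, acme, Date.new(2008, 1, 1))]

# Initech has no director, so it is reported as violating the constraint.
p companies_without_director([acme, initech], directorships).map(&:company_name)
```

The sample names and date are invented for the example.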

CQL - the OilSupply sample model

/*
 * Value Types
 */
Cost is defined as Money();
Month is defined as VariableLengthText(3);
Product is defined as VariableLengthText(80);
Quantity is defined as UnsignedInteger(32);
RefineryName is defined as VariableLengthText(80);
Region is defined as VariableLengthText(80);
Season is defined as VariableLengthText(6) restricted to {'Spring', 'Summer', 'Autumn', 'Winter'};
TransportMethod is defined as VariableLengthText() restricted to {'Rail', 'Road', 'Sea'};
Year is defined as SignedInteger(32);

/*
 * Entity Types
 */
Refinery is identified by RefineryName where
    Refinery has exactly one RefineryName,
    RefineryName is of at most one Refinery;
TransportRoute is where
    TransportMethod transportation is available from Refinery to Region at at most one Cost per kL,
    TransportMethod transportation is available to Region from Refinery at Cost per kL;

SupplyPeriod is identified by Month and Year where
    SupplyPeriod is in exactly one Month,
    SupplyPeriod is in exactly one Year;
ProductionCommitment is where
    Refinery has committed to produce Quantity of Product in SupplyPeriod at at most one Cost;
RegionalDemand is where
    Region will need at most one Quantity of Product in SupplyPeriod;

/*
 * Fact Types
 */
Month is in exactly one Season;
AcceptableSubstitutes is where
    Product may be substituted by alternate-Product in Season [acyclic, intransitive],
    alternate-Product is an acceptable substitute for Product in Season;
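The `[acyclic, intransitive]` ring constraints on AcceptableSubstitutes can be illustrated with a short check over a fact population. This is a sketch in plain Ruby, not part of the CQL tools. It assumes the constraints are evaluated over the (Product, alternate-Product) pairs for a single Season, and the method names and product names are made up for the example.

```ruby
require 'set'

# pairs: array of [product, alternate_product] for one season (assumed shape).

# Intransitive: if a -> b and b -> c are present, a -> c must not be.
def intransitive?(pairs)
  known = pairs.to_set
  pairs.none? do |a, b|
    pairs.any? { |b2, c| b2 == b && known.include?([a, c]) }
  end
end

# Acyclic: following substitutions never leads back to the starting product.
def acyclic?(pairs)
  successors = Hash.new { |hash, key| hash[key] = [] }
  pairs.each { |a, b| successors[a] << b }
  state = {} # product => :visiting or :done

  visit = lambda do |node|
    return false if state[node] == :visiting # back edge, so there is a cycle
    return true if state[node] == :done
    state[node] = :visiting
    ok = successors.fetch(node, []).all? { |nxt| visit.call(nxt) }
    state[node] = :done
    ok
  end

  successors.keys.all? { |node| visit.call(node) }
end

chain      = [["Unleaded", "Premium"], ["Premium", "Super"]]
transitive = chain + [["Unleaded", "Super"]]
cyclic     = chain + [["Super", "Unleaded"]]

p intransitive?(chain)      # the chain satisfies both constraints
p acyclic?(chain)
p intransitive?(transitive) # the shortcut pair violates intransitivity
p acyclic?(cyclic)          # the closing pair creates a cycle
```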

Re: A new tool, the Constellation Query Language

I'm just now taking a first look at your "Introduction to CQL" draft version pdf file. My first impression is that it has a familiar ring to ORM-trained ears. The reasons for that are obvious (at least to me, as you've talked about the ORM theory basis of your work). That familiarity may be a two-edged sword. On the one hand, it makes it easier for ORM users to get up to speed; on the other, the concepts (if all new to a reader) may be a lot to take in. It looks like you've taken this into account. However, consider making a reference to Object Role Modeling right up front; I think that would help both ORM and non-ORM users. Where a term or concept has an understood meaning in ORM, use a form of notation to indicate whether it means the same as in ORM or has a different meaning (perhaps with a footnote to explain the difference). Given that many of the early adopters of CQL will have an ORM background, the reference notation would be helpful.

White space and comments like C and C++ are allowed: /* comment may span lines */ and // introduces a comment to end of the current line.

"...comment indicators as used in C and C++..." would be clearer.

Each CQL file must start with a vocabulary definition.

This looks to be an important point; separate it and add emphasis.

An import definition imports concept names from another vocabulary, possibly using the alias syntax to rename some terms:

There has been no reference to Vocabulary up to this point. If these terms are used in the SBVR sense, mention that, as I suggested for ORM terms.

Technically, where a fact type isn’t named, it isn’t treated as a concept, as it cannot play roles in other fact types. Syntactically though it’s more convenient to discuss these together.

Indicate how this is related to Objectification, which you mentioned earlier.

Well, that's about a third of the way through. I'll take a shot at the rest another time. Glad you're putting together such a document. I'll bet writing it has helped refine your own thoughts about CQL.

Re: A new tool, the Constellation Query Language

I forgot to mention this earlier, but the .orm files (nORMa tool files) uploaded to the Library section here ought to make good test fodder for your converter (such as the model fragment I uploaded about caller preferences: http://www.ormfoundation.org/files/folders/norma/entry1191.aspx ). Posting the converted file up for comparison couldn't hurt.

Re: A new tool, the Constellation Query Language

CQL for your caller preferences model follows. It lacks subtype exclusion between the CommunicationPath subtypes, by the way. The uniqueness constraints at the end of the file would have been embedded in a reading in the fact type definition, had a reading in the correct role order been provided. This CQL file compiles and re-emits identical CQL, meaning the converter works.

*** Heath.

vocabulary Com;

/*
 * Value Types
 */
CommunicationPath_code is defined as FixedLengthText();
Contact_name is defined as VariableLengthText();
Device_type is defined as VariableLengthText();
TimeSlot_type is defined as VariableLengthText();
Use_type is defined as VariableLengthText();

/*
 * Entity Types
 */
CommunicationPath is identified by CommunicationPath_code where
    CommunicationPath has exactly one CommunicationPath_code,
    CommunicationPath_code is of at most one CommunicationPath;

Contact is identified by Contact_name where
    Contact has exactly one Contact_name,
    Contact_name is of at most one Contact;

Device is identified by Device_type where
    Device has exactly one Device_type,
    Device_type is of at most one Device;

EmailAddress is a kind of CommunicationPath;
Contact has EmailAddress;

PhoneNumber is a kind of CommunicationPath;
Contact has PhoneNumber;
PhoneNumber is for at most one Device;

TimeSlot is identified by TimeSlot_type where
    TimeSlot has exactly one TimeSlot_type,
    TimeSlot_type is of at most one TimeSlot;
ContactPrefersCommunicationPathDuringTimeSlot is where
    Contact prefers at most one CommunicationPath during TimeSlot;

Use is identified by Use_type where
    Use has exactly one Use_type,
    Use_type is of at most one Use;
PhoneNumberIsForUse is where
    PhoneNumber is for Use;

/*
 * Constraints:
 */
each EmailAddress occurs at most one time in Contact has EmailAddress;
each PhoneNumber occurs at most one time in Contact has PhoneNumber;
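As an illustration of what the two uniqueness constraints at the end mean for a fact population, here is a minimal sketch in plain Ruby. It is not generated code; it assumes each fact of "Contact has EmailAddress" is held as a `[contact, email_address]` pair, and `duplicated_role_values` is a hypothetical helper name.

```ruby
# Return the values that occur more than once in the given role position.
# An empty result means the "occurs at most one time" constraint holds.
def duplicated_role_values(facts, role_index)
  facts.map { |fact| fact[role_index] }
       .group_by(&:itself)
       .select { |_value, occurrences| occurrences.size > 1 }
       .keys
end

# Invented sample population for "Contact has EmailAddress".
contact_has_email = [
  ["Alice", "a@example.com"],
  ["Bob",   "b@example.com"],
  ["Carol", "a@example.com"] # same address as Alice: violates the constraint
]

# Role 1 is the EmailAddress role in this pair layout.
p duplicated_role_values(contact_has_email, 1)
```

The same helper applies to "Contact has PhoneNumber" with the PhoneNumber role index.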