Dates

There are two distinct date representations in STEMMA: the
date-value and the date-entity. These are equivalent for many purposes but the
date-entity affords greater flexibility and scope. A date-value is encoded in a
single text string, while a date-entity is a combination of elements and
attributes.

The general representation of date-values is described under
Locale-independence.
STEMMA must accommodate multiple calendars but, at the time of writing, no
international standard exists for representing dates in them. It therefore
introduces a practical date-value string representation for world calendars,
and the STEMMA syntax distinguishes the following ISO and STEMMA forms.

iso-datetime: full ISO 8601 Gregorian date
and time, i.e. yyyy-mm-ddThh:mm:ssZ.

std-date: STEMMA date in a specific calendar, analogous to iso-date. The
calendar can be specified by an optional prefix, as described under the Dates
and Calendars research notes, or implied by the syntactic context. The
reserved value “?” indicates that the date is not known.

std-fulldate: full STEMMA date in a specific calendar, that is, one with a
granularity of exactly one day. The reserved value “?” indicates that the
date is not known.
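As an illustration of the iso-datetime form only, the following minimal sketch parses a string of the shape yyyy-mm-ddThh:mm:ssZ using Python's standard library. The function name parse_iso_datetime is hypothetical and not part of STEMMA, and the sketch assumes that the trailing Z always denotes UTC, as it does in ISO 8601.

```python
from datetime import datetime, timezone


def parse_iso_datetime(text: str) -> datetime:
    """Parse a full 'yyyy-mm-ddThh:mm:ssZ' string into an aware datetime.

    Hypothetical helper for illustration; it accepts only the full form
    and raises ValueError for anything else (including truncated dates).
    """
    # The literal 'T' and 'Z' must be present; 'Z' is taken to mean UTC.
    naive = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return naive.replace(tzinfo=timezone.utc)
```

A call such as parse_iso_datetime("1956-04-03T12:30:00Z") yields a timezone-aware value; the std-date and std-fulldate forms are deliberately not handled here because their calendar prefix syntax is defined elsewhere.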

When a date is referenced in an element, it may have
both a granularity and an imprecision (i.e. a margin of error)
associated with it. The granularity is implicit in the date-value string (see
under Dates
and Calendars). The imprecision can be represented by a +/- offset or by more
explicit min/max limits in a date-entity.

The default calendar name is “Gregorian”. The calendar may
be specified explicitly in the Calendar attribute or in the date-value (as
described above), but they must not conflict.
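The rule above can be sketched as a small resolution function. This is only an illustration under stated assumptions: the name resolve_calendar is hypothetical, the calendar named by the date-value prefix is assumed to have been extracted already by the caller (the prefix syntax is not reproduced here), and calendar names are compared as exact strings.

```python
from typing import Optional

DEFAULT_CALENDAR = "Gregorian"  # default calendar name stated in the text


def resolve_calendar(attr_calendar: Optional[str],
                     prefix_calendar: Optional[str]) -> str:
    """Return the effective calendar name for a date.

    attr_calendar   -- value of the Calendar attribute, or None if absent
    prefix_calendar -- calendar named by the date-value prefix (already
                       extracted by the caller), or None if absent
    """
    # Both sources given but different: the text forbids this.
    if attr_calendar and prefix_calendar and attr_calendar != prefix_calendar:
        raise ValueError(
            f"Calendar conflict: attribute says {attr_calendar!r}, "
            f"date-value says {prefix_calendar!r}"
        )
    # Otherwise use whichever was given, falling back to the default.
    return attr_calendar or prefix_calendar or DEFAULT_CALENDAR
```

With neither source present the function returns "Gregorian"; with conflicting names it raises an error rather than silently preferring one.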

The date-entity effectively allows a specific date to be
represented using a value, or a range of values, from one or more calendars,
and this is used in modelling synchronised
dates (also known as dual dates). The Calc attribute indicates that the value
for that calendar was calculated, as opposed to being recorded as part of the
original information. A discussion of this, with examples, may be found at:
Synchronised Dates.

A date-value may imply a granularity other than one day
by using truncated forms. For Gregorian dates, this includes the normal yyyy-mm
and yyyy, as in the ISO standard, but also yyyy-mm:xx and yyyy:xx. For
comparative purposes (e.g. sorting and collation), the truncated variants are
equivalent to a corresponding pair of <Min> and <Max> elements. The default
error margin is ± 0. The margin units depend on the granularity of the
date-value. Hence, a full yyyy-mm-dd specification would expect a margin in
days. If the date-value is truncated to yyyy-mm, then any margin would be in
months. If the date-value is truncated to yyyy, then any margin would be in
years. The <Min> and <Max> values must always be full-length dates (e.g.
yyyy-mm-dd in the Gregorian case).

A representation of yearly quarters (e.g. Q1 = January to
March) is notably absent from the ISO 8601 standard. Given the way that it
represents week numbers, it should have made provision for the format yyyy-Qq,
e.g. 1956-Q2. STEMMA acknowledges the importance of this granularity for
certain records and so accommodates it in its own world-calendar syntax. The
units of any margin would then be quarters, of course.

The following table indicates how a Margin specification is
interpreted in the context of the date units to yield equivalent Min/Max
values.

Date form  | Margin units | Equivalent Min                                 | Equivalent Max
yyyy-mm-dd | Days         | The day - margin                               | The day + margin
yyyy-mm    | Months       | First day of (month - margin)                  | Last day of (month + margin)
yyyy       | Years        | First day of first month of (year - margin)    | Last day of last month of (year + margin)
yyyy-mm:03 | Quarters     | First day of first month of (quarter - margin) | Last day of last month of (quarter + margin)
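The Min/Max rules in the table can be sketched as follows for the Gregorian case. This is an illustrative sketch, not a normative algorithm: the function names are hypothetical, the date components are assumed to have been parsed already (no date-value string parsing is attempted), and Python's proleptic Gregorian date type, which is limited to years 1 to 9999, stands in for the calendar arithmetic.

```python
import calendar
from datetime import date, timedelta


def _month_span(year, month, count, margin):
    """Min/Max for a block of `count` months starting at (year, month),
    widened by `margin` whole blocks on each side."""
    first = year * 12 + (month - 1)           # absolute index of first month
    lo = first - margin * count               # first month of (block - margin)
    hi = first + margin * count + count - 1   # last month of (block + margin)
    lo_y, lo_m = divmod(lo, 12)
    hi_y, hi_m = divmod(hi, 12)
    last_day = calendar.monthrange(hi_y, hi_m + 1)[1]
    return date(lo_y, lo_m + 1, 1), date(hi_y, hi_m + 1, last_day)


def day_limits(year, month, day, margin=0):
    """yyyy-mm-dd: margin in days."""
    d = date(year, month, day)
    return d - timedelta(days=margin), d + timedelta(days=margin)


def month_limits(year, month, margin=0):
    """yyyy-mm: margin in months."""
    return _month_span(year, month, 1, margin)


def quarter_limits(year, quarter, margin=0):
    """Quarter 1..4 of a year: margin in quarters."""
    return _month_span(year, 3 * (quarter - 1) + 1, 3, margin)


def year_limits(year, margin=0):
    """yyyy: margin in years."""
    return _month_span(year, 1, 12, margin)
```

For example, quarter_limits(1956, 2) gives the pair 1956-04-01 and 1956-06-30, and year_limits(1956, 1) gives 1955-01-01 and 1957-12-31. Working in absolute month indices keeps the month, quarter, and year cases on one code path.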

When deterministic dates, such as our normal Gregorian ones,
are loaded into some type of indexing system, such as a database, it is
expected that they will all be stored as pairs of internal 'timestamp' values,
i.e. one each for the upper and lower limits. Timestamps represent
points-in-time along an absolute timeline, starting at some arbitrary base date
(also known as an epoch). Since these are usually represented as binary long
integers, issues such as the external date representation, imprecision, time
zone, etc., all become irrelevant, and the values can all be handled
efficiently in the same manner.
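A minimal sketch of such a timestamp pair follows. It assumes a resolution of one day and uses Python's date.toordinal(), whose epoch (day 1 is 0001-01-01 in the proleptic Gregorian calendar) is just one arbitrary choice of base date; a real indexing system may well use a finer resolution and a different epoch. The function name to_timestamp_pair is hypothetical.

```python
from datetime import date
from typing import Tuple


def to_timestamp_pair(min_date: date, max_date: date) -> Tuple[int, int]:
    """Convert lower and upper date limits into a pair of integer
    day numbers on a single absolute timeline."""
    if min_date > max_date:
        raise ValueError("lower limit must not be later than upper limit")
    # toordinal() counts days from an arbitrary epoch (0001-01-01 is day 1).
    return min_date.toordinal(), max_date.toordinal()
```

A discrete date is simply the case where both limits are equal, so discrete dates and ranges can share the same two indexed columns.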

The following table indicates how comparisons should be implemented
between dates when either one of them may be a simple discrete date (e.g. A) or
an inclusive date range with an upper and lower limit (e.g. [A1,A2]). In the
context of a date range, equality is roughly translated as “some degree of
overlap”.

A op B | [A1,A2] op [B1,B2]  | A op [B1,B2]      | [A1,A2] op B
A > B  | A1 > B2             | A > B2            | A1 > B
A < B  | A2 < B1             | A < B1            | A2 < B
A = B  | A2 >= B1 & A1 <= B2 | A >= B1 & A <= B2 | A2 >= B & A1 <= B
A >= B | A1 >= B1            | A >= B1           | A1 >= B
A <= B | A2 <= B2            | A <= B2           | A2 <= B
A <> B | A2 < B1 or A1 > B2  | A < B1 or A > B2  | A2 < B or A1 > B
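The comparison table can be sketched in code by treating a discrete date A as the degenerate range [A,A]; under that assumption the range-to-range column reduces to the other three columns, so one set of functions suffices. The names DateRange, point, gt, lt, eq, ge, le, and ne are hypothetical, and the limits are assumed to be integer timestamps on a common timeline.

```python
from typing import NamedTuple


class DateRange(NamedTuple):
    """Inclusive range of timestamps; lo == hi for a discrete date."""
    lo: int
    hi: int


def point(t: int) -> DateRange:
    """Represent a discrete date A as the range [A, A]."""
    return DateRange(t, t)


def gt(a: DateRange, b: DateRange) -> bool:   # A > B   : A1 > B2
    return a.lo > b.hi


def lt(a: DateRange, b: DateRange) -> bool:   # A < B   : A2 < B1
    return a.hi < b.lo


def eq(a: DateRange, b: DateRange) -> bool:   # A = B   : some overlap
    return a.hi >= b.lo and a.lo <= b.hi


def ge(a: DateRange, b: DateRange) -> bool:   # A >= B  : A1 >= B1
    return a.lo >= b.lo


def le(a: DateRange, b: DateRange) -> bool:   # A <= B  : A2 <= B2
    return a.hi <= b.hi


def ne(a: DateRange, b: DateRange) -> bool:   # A <> B  : no overlap
    return a.hi < b.lo or a.lo > b.hi
```

For instance, eq(DateRange(10, 20), DateRange(15, 30)) is true because the ranges overlap, while lt(point(5), DateRange(15, 30)) is true because the discrete date precedes the whole range.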

Q: Should the Dataset header specify a default Calendar or
simply assume Gregorian as the default? Most Datasets will involve dates from
one predominant Calendar and so it would be more convenient to specify a default
for cases when no explicit one has been provided. See Locale-independence
for potential Calendar names.