VOUnits 1.0 TCG Review: 2014 March 1 to 2014 March 31

The VOUnits Proposed Recommendation is now in its TCG review period.

The last VOUnits RFC implied a small collection of document and implementation changes which, it turned out, had consequences more far-reaching than was immediately apparent. As well, some new participants appeared in the discussion at this point. The result is a set of document and syntax changes which, it has been decided, are substantial enough to warrant a new RFC.

The differences between version 1.0-20131224 and 1.0-20140226 are minor wording and layout changes. There are a few subsequent typo-level changes which can be found in the repository version, and which will be included in the next published release (after TCG review).

The main visible changes in version 1.0-20131224 are

the appearance of 'quoted units', thus permitting for example km/'martianDay' (see the document for discussion; this was the principal syntactic change which ended up necessitating the new version)

recognition for unknown functions, so that for example dBA(ct/s) includes an unknown function dBA; this is explicitly permitted by the standard, though a parser should make available the information that the function is not a recognised one

the appearance of binary prefixes (Ki, Mi, ...) on certain units

more explicit statements that the indication that units are unknown is the special string unknown, and that the indication that a quantity is dimensionless is that the units string is the empty string

various textual clarifications to the document.
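
As a rough illustration of the syntactic points listed above, the following Python sketch classifies a unit string using only the examples given here. It is not a VOUnits parser. The function name describe_unit_string and the set KNOWN_FUNCTIONS are assumptions made for this example, and the document remains the authority on which functions are recognised. Binary prefixes are not covered.

```python
import re

# Assumed set of recognised functions, for illustration only;
# consult the document for the authoritative list.
KNOWN_FUNCTIONS = {"log", "ln", "exp", "sqrt"}


def describe_unit_string(s):
    """Return a coarse description of a unit string (hypothetical helper)."""
    # The empty string indicates a dimensionless quantity.
    if s == "":
        return {"kind": "dimensionless"}
    # The special string 'unknown' indicates that the units are unknown.
    if s == "unknown":
        return {"kind": "unknown"}
    # Quoted units appear between single quotes, e.g. km/'martianDay'.
    quoted = re.findall(r"'([^']*)'", s)
    # Remove quoted units before looking for function applications.
    unquoted = re.sub(r"'[^']*'", "", s)
    # A function application is a name immediately followed by '('.
    functions = re.findall(r"([A-Za-z]+)\(", unquoted)
    return {
        "kind": "units",
        "quoted_units": quoted,
        # Unrecognised functions are permitted, but are reported to the caller.
        "unknown_functions": [f for f in functions if f not in KNOWN_FUNCTIONS],
    }
```

With this sketch, km/'martianDay' reports martianDay as a quoted unit, and dBA(ct/s) reports dBA as an unrecognised function rather than rejecting the string.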

The editors believe we have accommodated or acknowledged each of the concerns on the original RFC page, either in the document or on that page. Please feel free to disagree.

We do not believe these are deep changes in principle, so those WG chairs who were happy with the document in its previous incarnation may not need to examine it too closely this time.

Changes are reported in Appendix D of the new version of this proposed recommendation, available below.

Comments from TCG (WG and IG chairs)

WG chairs or vice chairs must read the Document, provide comments if any, and formally indicate whether or not they approve the Standard. IG chairs or vice chairs are also encouraged to do the same, although their inputs are not compulsory.

Actually DAL protocols have little explicit dependency on the way units are written.

This is well stated in 1.1. Queries (PQL or ADQL) are made assuming consistency with some implicit unit rule defined in the protocols and responses are generally in VOTABLE or FITS where VOunit definitions can perfectly apply.

Other formats for responses or retrieval (native xml) could easily be made consistent with the unit specification.

The inclusion of section "2.10 The numeric scale factor" opens the door to many different use cases. I would have preferred to split units into a scaling factor and base units (equivalent to a dimensional equation) as two elements in a DM (so that the software does not need to parse strings) but, with this section included, the spec is a lot more useful, in my view, than the previous version. Also, there are some libraries to parse unit strings.

However, the need for so many warnings about the use of the scaling is not clear to me. As said in the text:

"The advantage of doing so is that the data consumer can translate the column data into well-known physical units without further information, and the data source is thus self-contained. The disadvantage of doing so is (i) that the intention might be obscured (this is a type of provenance information); and (ii) that the measurements may be relative to the actual jupiter mass rather than merely expressed in those terms, so that they should change if the actual mass were to be refined as a result of a recalibration."

Why does this affect only the example (jupiterMass)? For instance, solMass is also an accepted unit, and it is not clear whether this is the current solar mass or the value after "recalibration". Another accepted unit is Crab, which is time varying (and also depends on the spectral coordinate), so we could ask whether the data provider is using Crab as the Crab emission at the time the measurement was made or now. And using which calibration for the Crab emission? Similar arguments can be made for many of the accepted units.

As we are accepting scale factors, it would be simpler to say:

"Data consumer can translate the column data into well-known physical units without further information and the data source is thus self-contained"

or else, though this is cumbersome, we should add similar warnings for the rest of the "problematic" accepted units.

Apart from this, and as I said of the previous version, the document is well written and clear. I approve it.

Thanks for these observations, Jesus. There are indeed many subtleties involved in this, and this general question was discussed at some length (on the semantics list, only) without coming to a clear conclusion. Our intention as authors was to avoid the question (in fact the multiple questions) 'what does jupMass mean?' because, as you illustrate, it's a complicated question. Instead, we decided to focus purely on the unit-string syntax, requiring such semantic subtleties to be communicated through some other channel. The prompt for that discussion was whether 'unknown' units should be completely forbidden, on the grounds that they could all in principle be substituted by a scalefactor+'known unit'; we decided that was going too far. Is there a particular point we should clarify? I take the point about this not being just about jupiter masses: I've adjusted the text to try to make this clearer (cf repo).

Thanks for the clarification in the text. My main point was that I clearly see the benefit of the scaling factor (a more flexible definition of units). I do not consider the disadvantages strong enough to deserve so much space in the text, as they are, in my view, minimal. I do not think there is anything obscure about using a scaling factor: if you are using one it will be, in 99% of cases, because your units are not in the predefined list. After the clarification added in the text, it is clear that there is always some ambiguity when you use units (obviously greater if you need to use units not clearly defined in the spec), so it is OK for me.
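
The split between scaling factor and base units raised in this exchange can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name split_scale_factor is hypothetical, only a leading plain decimal literal such as 1.898E27 is handled, and any other scale-factor forms that section 2.10 may allow are left untouched. The value 1.898E27kg is used purely as an illustrative stand-in for a jupiter mass.

```python
import re

# Leading decimal literal with optional fraction and exponent (assumed form).
_SCALE = re.compile(r"^([0-9]+(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?)(.*)$")


def split_scale_factor(unit_string):
    """Split a unit string into (scale factor, remaining unit string).

    Hypothetical helper: strings without a leading numeric literal are
    returned unchanged with a scale factor of 1.0.
    """
    m = _SCALE.match(unit_string)
    if m is None:
        return 1.0, unit_string
    return float(m.group(1)), m.group(2)


scale, base = split_scale_factor("1.898E27kg")
print(scale, base)
```

A consumer could then multiply the column values by the scale factor and treat the remainder as an ordinary unit string. Whether the measurements were relative to the actual jupiter mass is not recoverable from the string, which is the provenance concern quoted above.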