[freed from spam trap -rrs]
Date: Tue, 21 May 2002 18:28:43 -0400 (EDT)
Message-Id: <a0510030bb9107995001b@[65.217.30.61]>
To: Sergey Melnik <melnik@db.stanford.edu>
From: pat hayes <phayes@mail.coginst.uwf.edu>
Cc: w3c-rdfcore-wg@w3.org
>At the last telecon we briefly discussed the issue related to the
>semantics of literals.
>
>Per the F2F decision, literals have three components (a Unicode
>string, a language tag, and a bit). This representation may not be
>the best. Here are several concerns:
>
>(1) Interpretation
>
>It is unclear what the literals represent. It seems that a literal can denote
>
> a) a character string
> b) a word in a natural language
> c) an XML tree
> d) an abstract structure that consists of a string,
> a tag, and a bit.
>
>Choice d) seems ugly if we think of RDF as a foundation for the SW.
>If we go for a)-c), then the literals become polymorphic...
>Furthermore, defining rules for comparing trees and words seems
>counterproductive.
>
>(2) Extensibility
>
>The language tags keep evolving. How do we accommodate new language
>encoding schemes gracefully?
>
>The current XML standard may be surpassed. How do we indicate what
>particular XML encoding or canonical form (or maybe a completely
>different graph-like structure) is used?
>
>
>In short, I think that we might be doing a bad job on literals. I'm
>afraid that additional difficulties may arise in datatyping (e.g.,
>we might need to deal with XML trees in lexical spaces of datatypes).
>
>BTW, did TimBL and DanC, the original issue raisers, finally take a
>position on the F2F decision (cf. [1])? Unfortunately, I missed
>that F2F.
I also missed it. My understanding of the decision was that a literal
is best thought of as a Unicode character string plus some additional
decorations whose function is to record XML-specific syntactic
information. That information has no RDF semantic content, but RDF
nevertheless needs to record it in order to permit round-tripping
from XML. In particular, both your (b) and (c) are ruled out: the
correct syntactic answer is (d), but the model theory can treat (d)
as though it were (a).
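
To make the syntax/semantics split concrete, here is a rough sketch in
Python. It is only an illustration of my reading of the decision, not
anything the WG has specified: the class name Literal, its field names,
and the function denotation() are all made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Literal:
    """Syntactic literal: option (d), a string plus two decorations.

    All names here are hypothetical, chosen only for illustration.
    """
    string: str                  # the Unicode character string
    lang: Optional[str] = None   # language tag, kept for round-tripping
    is_xml: bool = False         # the "bit": was this parsed as XML content?


def denotation(lit: Literal) -> str:
    """Semantic value: option (a). The decorations are ignored."""
    return lit.string


a = Literal("chat", lang="en")
b = Literal("chat", lang="fr")

# Syntactically distinct: the decorations take part in identity, so a
# serializer can write each one back out exactly as it was read.
assert a != b

# Semantically identical under this reading: both denote the same string.
assert denotation(a) == denotation(b)
```

The point of the sketch is only that equality of the syntactic objects
and equality of what they denote are two separate questions.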
> A cleaner solution might be/have been to leave literals as strings
>and to use bNodes with special properties for representing words and
>XML structures.
That would be cleaner in the RDF graph but would probably break the RDF/XML.
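
For comparison, a rough sketch of what the bNode alternative might look
like at the graph level, with triples as plain tuples. The property
names ex:value and ex:lang and the helper add_word() are invented for
the example; nothing like them has been proposed.

```python
import itertools

_ids = itertools.count()


def new_bnode() -> str:
    """Return a fresh blank node label (hypothetical naming scheme)."""
    return f"_:b{next(_ids)}"


def add_word(graph: set, subject: str, predicate: str,
             text: str, lang: str) -> str:
    """Attach a natural-language word to subject via a blank node.

    Literals stay plain strings; the language tag becomes an ordinary
    property of the blank node rather than a decoration on the literal.
    """
    node = new_bnode()
    graph.add((subject, predicate, node))
    graph.add((node, "ex:value", text))   # assumed property name
    graph.add((node, "ex:lang", lang))    # assumed property name
    return node


graph = set()
node = add_word(graph, "ex:doc", "ex:title", "chat", "fr")

# One tagged literal has become three triples in the graph.
assert len(graph) == 3
```

The graph is uniform this way, but an xml:lang attribute in the RDF/XML
would now have to expand into extra triples, which is where I would
expect the breakage.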
Pat
--