On May 13, 2011, at 10:00 AM, Richard Cyganiak wrote:
> On 13 May 2011, at 15:33, Alex Hall wrote:
>> It's for this reason that I'd prefer to keep rdf:PlainLiteral out of the core RDF specs and reserve it for exchanging language-tagged literals with systems that don't support that notion. Having to deal with the extraneous '@' for literals without language tags seems like needless complexity for what should be a simple string manipulation.
>
> Strong +1. Earlier I tried to work out the changes to the spec that would be required to make rdf:PlainLiteral the unified representation of strings, and it's a bloody mess and I really don't want to go there.
I agree, but if we have to (a) include lang tags and (b) fit within the current RDF description of a datatype (which mentions a mapping from a string to a value, not from a pair to a value) then this is about the best that can be done, I think. (I was part of the debates that led to this design, and tried very hard myself to get rid of the trailing @ at the time, but couldn't find a way to do it.) Actually, I don't think it's ALL that much of a mess: one trailing @ character isn't a bone-breaker, surely, to anyone who has to take a URI apart every now and again. But it would be neater without it, for sure.
HOWEVER... I think we do have another way out. Unlike the designers of rdf:PlainLiteral, who were obliged to work within the constraints of the current RDF design, we can re-design RDF. See below.
> I kept my notes on the wiki anyways:
> http://www.w3.org/2011/rdf-wg/wiki/StringLiterals/SyntacticSugarProposal
>
>> If we're going to say that everything has a datatype, I'd prefer to see "foo" get normalized to "foo"^^xsd:string. But my reasons there are more aesthetic; it just seems wrong to single out that one particular primitive datatype and say that it should not be used.
>>
>
>> FWIW, my preferred approach would be to:
>> 1. Say that every literal has *either* a datatype *or* a language tag.
>> 2. Say that the datatype of the surface form "foo" is xsd:string.
>
> This feels weird. Ok, "foo" is of type string, even though the type is implicit, I can understand that. But why is it no longer a string if I tag it as English? Shouldn't it still have an implicit type of string?
The string itself is still a string, but the literal is not just that string, it's that string plus a tag, i.e. a pair. Which is why it (the literal rather than the string) can't be typed with xsd:string. Sigh.
But try this for size. Plain literals are a very special case, unique to RDF, and it is the language tag which makes them so special and strange. Datatypes are defined currently as mappings from a string to a value (so rdf:PlainLiteral had to smush the tag into the string, hence all the @ business). But we can define a special datatype which maps pairs into values, just for this purpose. We can even call it rdf:PlainLiteral without contradicting the current specs.
It applies to two kinds of lexical forms: strings (these will be the ones with the @ in them), and pairs of a string with a lang tag. The lang tag may be the empty tag, but still we distinguish between S and <S, empty>. Thus, every plain literal is assumed to have a lang tag in it, even when there is no @ in the syntax.
Its value space is the set of strings containing at least one '@' character, and pairs of a string and a language tag. The mapping follows the current rdf:PlainLiteral spec when applied to strings, so that "foo@en"^^rdf:PlainLiteral maps to <"foo", "en">; but in addition, it applies to current plain literal syntax, treated as being a pair of a string and a lang tag, so that "foo"@en also maps to <"foo", "en">. Here is the complete mapping as a table:
Lexical form      Value
"foo@"            "foo"
"foo@tag"         <"foo", tag>
"foo", empty      "foo"
"foo", tag        <"foo", tag>   (when tag =/= empty)
and the plain literal syntax is understood thus: "foo" parses to the pair "foo", empty; and "foo"@tag parses to the pair "foo", tag.
The reason for this empty-tag shuffle is to keep a plain literal string distinguished from the rdf:PlainLiteral string with the trailing @ added, of course. If we could ignore the current rdf:PlainLiteral specs, this would be easier and we could simply map "foo" to itself and "foo"@en to <"foo", en>. But I think the shuffling is worth doing to avoid having even more inter-spec contradictions in this area.
Advantages: gives a type to plain literals; preserves the rdf:PlainLiteral specs (extending them, but not contradicting them); allows people to use plain literals without getting involved with the trailing @; and allows xsd:string to be deprecated in favor of plain literal syntax (or the reverse, of course).
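To make the table concrete, here is a minimal sketch in Python of how the extended mapping might be implemented. It is an illustration only, not spec text: the names EMPTY, parse_plain and plain_literal_value are made up for this example, the empty tag is represented as the empty string, and the string form is assumed to split at the last '@' (language tags cannot themselves contain '@').

```python
from typing import Optional, Tuple, Union

# Hypothetical names for illustration; none of them come from the RDF specs.
EMPTY = ""  # assumption: the empty language tag is modelled as ""

Pair = Tuple[str, str]    # a <string, tag> pair
Value = Union[str, Pair]  # a bare string, or a <string, tag> pair


def parse_plain(text: str, tag: Optional[str] = None) -> Pair:
    """Plain literal syntax: "foo" -> ("foo", EMPTY); "foo"@tag -> ("foo", tag)."""
    return (text, EMPTY if tag is None else tag)


def plain_literal_value(lexical: Union[str, Pair]) -> Value:
    """Map a lexical form (an '@'-string or a <string, tag> pair) to its value."""
    if isinstance(lexical, str):
        # String form: split at the LAST '@', so "a@b@en" gives ("a@b", "en").
        text, sep, tag = lexical.rpartition("@")
        if not sep:
            raise ValueError("string lexical forms must contain at least one '@'")
    else:
        # Pair form: already a string plus a (possibly empty) tag.
        text, tag = lexical
    # An empty tag yields the bare string; otherwise the value is the pair.
    return text if tag == EMPTY else (text, tag)
```

With this sketch, plain_literal_value("foo@en") and plain_literal_value(parse_plain("foo", "en")) give the same pair, and plain_literal_value("foo@") and plain_literal_value(parse_plain("foo")) give the same bare string, which is the point of the empty-tag shuffle.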
Disadvantages: might be thought too complicated; takes the notion of type slightly outside the current RDF datatype specs.
Thoughts?
Pat
> So you have replaced one weird thing (multiple ways of representing a string) with another weird thing (a notion of string datatypes that doesn't make sense).
>
> I think the sensible way would be:
> 1) every literal has *both* a datatype and a (possibly empty) language tag;
EVERY literal? What about numbers and dates and times and ... ?
> 2) of the built-in datatypes, only xsd:string can have non-empty language tags;
> 3) plain literals and rdf:PlainLiterals don't exist;
> 4) "foo" in concrete syntaxes is syntactic sugar for "foo"^^xsd:string.
> 5) "foo"@en in concrete syntaxes is syntactic sugar for "foo"^^xsd:string@en.
>
> This *might* work better than the rdf:PlainLiteral mess when translated into spec changes, but raises BC issues, and requires changes to syntax specs to add the syntactic sugar, so I prefer the proposal that says implementations MAY unify to plain literals, as it doesn't require changes to the abstract syntax.
>
>> As long as the surface forms "foo" and "foo"^^xsd:string get normalized to the same thing (or systems have permission to do such normalization) then I'm happy.
>
> Good to hear that.
>
> Best,
> Richard
------------------------------------------------------------
IHMC (850)434 8903 or (650)494 3973
40 South Alcaniz St. (850)202 4416 office
Pensacola (850)202 4440 fax
FL 32502 (850)291 0667 mobile
phayesAT-SIGNihmc.us http://www.ihmc.us/users/phayes