From: "Eric Prud'hommeaux" <eric@w3.org>
Subject: Re: problems with concise bounded descriptions
Date: Thu, 30 Sep 2004 22:17:32 -0400
[...]
> On Thu, Sep 30, 2004 at 07:39:32PM -0400, Peter F. Patel-Schneider wrote:
> >
> > In the DAWG message archive I came across a reference to a W3C member
> > submission from Nokia on Concise Bounded Descriptions
> > http://www.w3.org/Submission/CBD/.
> >
> > The notion of Concise Bounded Descriptions (CBD) in this note has a number
> > of problems.
> >
> > The initial description of a CBD is severely underspecified. According to
> > the note, ``A [CBD] of a resource is a body of knowledge about that
> > resource which does not include any explicit knowledge about any other
> > resource which can be obtained separately from the same source.''
> >
> > Problem 1: Which source?
>
> The query service.
This doesn't help (much), as a query service could have lots of
information, much of it not in the form of an RDF graph.
> > Problem 2: What is ``explicit'' knowledge?
>
> I'm not sure I would have chosen ``explicit'', but I believe this is
> the set of arcs-out from a resource which is reached in a CBD
> traversal. All arcs-out from R1 are included in the CBD. If that graph
> involves R2 (and R2 isn't a literal or bNode), the client can ask
> about R2 in a separate request. Thus, arcs-out from R2 are not
> included in R1's CBD.
>
> Perhaps ``minutiae'' would be better?
Well, yes, of course the ``answer'' is the result of the CBD process, but
this isn't very helpful. What makes the result of the CBD process any
better than any other answer? For example, why exclude arcs-in? Are they
not explicitly about the resource just as much as arcs-out?
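To make the arcs-out reading above concrete, here is a minimal sketch of that traversal. It is not the procedure text from the note. It assumes triples are plain tuples of strings, that blank nodes are written with a leading "_:", and that the function name and the sample graph (ex:R0 through ex:R3) are hypothetical.

```python
from collections import deque


def is_bnode(term):
    # Assumption: blank nodes are spelled with a leading "_:".
    return term.startswith("_:")


def arcs_out_description(graph, start):
    """Collect arcs-out from start, continuing only through blank-node objects."""
    result = set()
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for s, p, o in graph:
            if s == node:
                result.add((s, p, o))
                # Named resources and literals are not followed; a client
                # would have to ask about them in a separate request.
                if is_bnode(o) and o not in seen:
                    seen.add(o)
                    queue.append(o)
    return result


# Hypothetical sample graph, for illustration only.
graph = {
    ("ex:R1", "ex:p", "ex:R2"),
    ("ex:R1", "ex:q", "_:b1"),
    ("_:b1", "ex:n", '"some literal"'),
    ("ex:R2", "ex:p", "ex:R3"),  # arc-out from R2: not part of R1's description
    ("ex:R0", "ex:p", "ex:R1"),  # arc-in to R1: also left out
}
description = arcs_out_description(graph, "ex:R1")
```

Under these assumptions the arc-in from ex:R0 is dropped even though it mentions ex:R1 directly, which is the asymmetry questioned above.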
> > Problem 3: What is ``obtain separately''?
>
> Subsequent query.
Using what query process? There have been some suggestions in DAWG to
allow b-node identifications in query responses in a way that subsequent
queries can get other information directly about bnodes in the response.
> > Problem 4: A function that always returns nothing satisfies this
> > description, as it certainly does not include any knowledge (explicit or
> > not) that can be obtained (separately or not) from the same source (or indeed
> > any source at all).
>
> Yes, but it is not compliant with the recipe in the
> specification. Perhaps the description could be amended to make it
> more clear, but I wouldn't expect it to stand on its own as the
> definition.
Again, of course, this is not the same as the process given in the note.
However, what makes my process any worse than the given process? In fact,
my process is better than the CBD process! My process satisfies the
description whereas the CBD process does not.
> > The definition of CBD in terms of a procedure on RDF graphs also has
> > serious problems.
> >
> > Problem 5: Given a node in an RDF graph, there is no general way of
> > determining which nodes in the graph are co-denotational with that node.
> > Consider, for example, the RDF graph:
> > _:a ex:b _:c .
> > _:d ex:e _:f .
> > What is the CBD of _:a in this graph?
>
> Being a pragmatist (for which I receive the occasional slap), I would
> say we are responding with a CBD of what we *do* know about _:a, and
> thusly return only the first arc. If we later learn that _:a and _:d
> are the same node, and the client queries again, they get more arcs, but
> nothing contradictory.
Ok, this is a possible answer, but the note does not define the CBD this
way.
> > Problem 6: This definition does not satisfy the initial description of a
> > CBD. Consider, for example, the RDF graph:
> > ex:a ex:b ex:c .
> > ex:r rdf:type rdf:Statement .
> > ex:r rdf:subject ex:a .
> > ex:r rdf:predicate ex:b .
> > ex:r rdf:object ex:c .
> > the CBD of ex:a in this graph is the graph itself, but it includes explicit
> > information about ex:r, a potentially different resource.
>
> I haven't really explored CBDs of reifications. Patrick, do you have
> any fun use cases for this? Regardless, Peter, do you have any
> suggested words for Patrick to include the reification arcs in the
> initial description?
I do not have any intuitions in this area and thus have no ``words of
wisdom''. All I can really do here is point out flaws (technical and
otherwise) in others' ``words of wisdom''.
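For reference, the Problem 6 example can be worked through with a small sketch. It assumes that the procedure pulls in the description of any node that reifies a statement already in the result, which is what the claim "the CBD of ex:a in this graph is the graph itself" implies. The function names are hypothetical, and matching a reification on rdf:subject, rdf:predicate and rdf:object alone is an assumption of this sketch.

```python
def arcs_out_closure(graph, start):
    """Arcs-out from start, continuing only through blank-node objects."""
    result, todo, seen = set(), [start], {start}
    while todo:
        node = todo.pop()
        for s, p, o in graph:
            if s == node:
                result.add((s, p, o))
                if o.startswith("_:") and o not in seen:
                    seen.add(o)
                    todo.append(o)
    return result


def reifiers_of(graph, triple):
    """Nodes whose rdf:subject/predicate/object arcs describe the given triple."""
    s, p, o = triple
    subjects = {x for x, _, _ in graph}
    return {
        r for r in subjects
        if (r, "rdf:subject", s) in graph
        and (r, "rdf:predicate", p) in graph
        and (r, "rdf:object", o) in graph
    }


def description_with_reifications(graph, start):
    """Arcs-out closure plus the closure of every reifying node (assumed rule)."""
    result = arcs_out_closure(graph, start)
    changed = True
    while changed:
        changed = False
        for triple in list(result):
            for r in reifiers_of(graph, triple):
                extra = arcs_out_closure(graph, r) - result
                if extra:
                    result |= extra
                    changed = True
    return result


# The graph from Problem 6.
graph = {
    ("ex:a", "ex:b", "ex:c"),
    ("ex:r", "rdf:type", "rdf:Statement"),
    ("ex:r", "rdf:subject", "ex:a"),
    ("ex:r", "rdf:predicate", "ex:b"),
    ("ex:r", "rdf:object", "ex:c"),
}
description = description_with_reifications(graph, "ex:a")
about_r = {t for t in description if t[0] == "ex:r"}
```

With this rule the description of ex:a is the whole graph, and four of its five triples have ex:r as subject.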
> > Problem 7: This definition does not provide enough information to
> > distinguish the node from other distinguishable nodes in the graph.
> > Consider, for example, the RDF graph:
> > ex:r rdf:type owl:InverseFunctionalProperty .
> > _:a ex:r _:b .
> > _:b ex:r _:a .
> > _:a ex:s "NODE A" .
> > _:b ex:s "NODE B" .
> > Then the CBD of _:a in this graph is
> > _:x1 ex:r _:x2 .
> > _:x2 ex:r _:x1 .
> > which is the same as the CBD of _:b in this graph but _:a and _:b are
> > distinguishable in the graph and thus should have different CBDs.
>
> Yeah, but nothing else solves that either. They're ambiguous to the
> server and they're ambiguous to the client. The only additional info
> that the server has is that there exists in the domain of discourse
> another bNode. I don't think it's worth telling the client about it.
The whole point in this example is that _:a and _:b are *not*
indistinguishable to the server. _:a is "NODE A" and _:b is "NODE B". As
the two nodes are distinguishable to the server, they should have different
CBDs.
There are lots of other examples that give rise to similar problems. For
example, consider
_:a ex:r _:b .
_:b ex:r _:a .
ex:c ex:s _:a .
ex:d ex:s _:d .
_:a and _:b are distinguishable here but have the same CBD.
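This last example can be checked mechanically. The sketch below assumes the arcs-out-only reading of the procedure (follow blank-node objects, ignore arcs-in). The triple encoding and the helper names are hypothetical.

```python
def arcs_out_closure(graph, start):
    """Arcs-out from start, continuing only through blank-node objects."""
    result, todo, seen = set(), [start], {start}
    while todo:
        node = todo.pop()
        for s, p, o in graph:
            if s == node:
                result.add((s, p, o))
                if o.startswith("_:") and o not in seen:
                    seen.add(o)
                    todo.append(o)
    return result


def has_arc_in(graph, node, predicate):
    """True if some triple with this predicate has node as its object."""
    return any(o == node and p == predicate for _, p, o in graph)


# The four triples from the example above.
graph = {
    ("_:a", "ex:r", "_:b"),
    ("_:b", "ex:r", "_:a"),
    ("ex:c", "ex:s", "_:a"),
    ("ex:d", "ex:s", "_:d"),
}
cbd_a = arcs_out_closure(graph, "_:a")
cbd_b = arcs_out_closure(graph, "_:b")

# Both descriptions are the same two-triple cycle ...
same_description = cbd_a == cbd_b
# ... yet only _:a is the object of an ex:s arc in the source graph.
told_apart_by_arcs_in = (
    has_arc_in(graph, "_:a", "ex:s") != has_arc_in(graph, "_:b", "ex:s")
)
```

Under these assumptions both flags come out true: the two descriptions coincide, while the arc-in from ex:c is what separates the nodes in the source graph.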
[...]
> --
> -eric
Peter F. Patel-Schneider