More Linked Data and RDF

Thank you to everyone who took the time to share a wide range of views in response to yesterday’s post in its comments, on Twitter, and out on your own blogs. Although reduced to silence throughout the day because of other commitments, I have been reading and learning from all of you. And, despite the sometimes intemperate language of my original post, your contributions have all been thoughtful, measured, and informative.

Several comments raised the duality of RDF; RDF the model and RDF the format (which can itself be expressed in more forms than the RDF/XML of which most might think). Kingsley’s right, of course, when he asks;

“Is RDF a Data Model or a Format re this discussion. The answer to this question is of utmost importance re coherence.”

Honestly, I’m not sure that I know which it was meant to be… but I can fairly safely suggest that the concerns I expressed become increasingly pronounced as we move from ‘model’ toward ‘format.’ I’m still worried about insisting upon the RDF model in anything other than its loosest sense, but can at least see a glimmer of justification for doing so… whereas insisting upon the format seems several steps too far.
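As a rough illustration of the model/format distinction, the sketch below holds one small set of triples and writes it out in two different ways: as N-Triples-style lines and as a plain JSON list. The `example.org` URIs, the `near` property, and the JSON layout are all made up for the example; only the general `<s> <p> <o> .` line shape is borrowed from N-Triples, and the parser handles only that URI-only subset.

```python
import json

# Hypothetical triples; the example.org URIs are placeholders.
TRIPLES = {
    ("http://example.org/place/london",
     "http://example.org/vocab/near",
     "http://example.org/place/oxford"),
    ("http://example.org/place/oxford",
     "http://example.org/vocab/near",
     "http://example.org/place/reading"),
}


def to_ntriples(triples):
    """Serialise URI-only triples as N-Triples-style lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(triples))


def from_ntriples(text):
    """Parse the URI-only lines produced by to_ntriples()."""
    triples = set()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Drop the trailing " ." and the angle brackets around each term.
        s, p, o = (term.strip("<>") for term in line.rstrip(" .").split())
        triples.add((s, p, o))
    return triples


def to_json(triples):
    """Serialise the same triples in an ad hoc (non-standard) JSON layout."""
    return json.dumps([{"s": s, "p": p, "o": o} for s, p, o in sorted(triples)])


def from_json(text):
    """Parse the ad hoc JSON layout back into a set of triples."""
    return {(d["s"], d["p"], d["o"]) for d in json.loads(text)}
```

The two serialisations look nothing alike on the wire, yet both parse back to the same set of statements, which is the sense in which the model can be held constant while the format varies.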

“Surely the critical issue is whether the semantics are available, not whether they are in RDF. If a csv file is published AND suitable semantics are available, then you know which columns are URIs or whatever else.”

and

“if you publish data on the web and a suitable semantics for interpreting that data and linking it to other data, then why isn’t it Linked Data? It just so happens that RDF has a clear(er) semantics describing the interpretation of its data elements (URIs in particular) than a spreadsheet does; it doesn’t mean you couldn’t apply similar semantics to a spreadsheet if you were so inclined.”

Indeed.
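A minimal sketch of what "a csv file AND suitable semantics" might look like in practice follows. Everything here is an assumption made for illustration: the CSV content, the `SEMANTICS` dictionary layout, and the `example.org` URIs are invented, and no existing standard for describing CSV columns is implied.

```python
import csv
import io

# Hypothetical CSV published on the web.
CSV_TEXT = """id,name,near
london,London,oxford
oxford,Oxford,
"""

# Hypothetical semantics published alongside it: which column identifies
# the row, what each other column means, and whether its values are URIs.
SEMANTICS = {
    "base": "http://example.org/place/",
    "subject_column": "id",
    "columns": {
        "name": {"property": "http://example.org/vocab/name", "is_uri": False},
        "near": {"property": "http://example.org/vocab/near", "is_uri": True},
    },
}


def rows_to_statements(csv_text, semantics):
    """Interpret each CSV cell as a (subject, property, value) statement."""
    statements = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = semantics["base"] + row[semantics["subject_column"]]
        for column, meaning in semantics["columns"].items():
            value = row[column]
            if not value:
                continue  # an empty cell makes no statement
            obj = semantics["base"] + value if meaning["is_uri"] else value
            statements.append((subject, meaning["property"], obj))
    return statements
```

With the semantics in hand, a consumer knows that `near` holds a link to another resource while `name` holds a plain string, without the publisher ever having produced RDF.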

Although I actually agree with every single word, Justin Leavesley’s comment possibly gets close to the nub of things;

“Yes the same mistake was made with the rise of the web.

Once you had URIs and HTTP you already had plain text which is a perfectly good way to encode content. By adopting the STANDARD convention of HTML, all sort of existing text based formats with their various mark ups were locked out. That locked out a lot of content that already existed and required anyone who wanted to play to convert existing content into a html format.

Of course it did have the small side effect that to consume web content you only needed a browser that understood one convention i.e. html.

The same is true of RDF. XML is the equivalent of ascii in this regard. Sure it is a good way to write down data, but it isn’t sufficient to actually use that data unless you understand the various special conventions.

RDF gives you a standard way to understand TYPES of data that you have never seen before. You simply cannot do that with XML alone. You must build a convention at a different level from syntax, which can be expressed as XML. We have, its called RDF!

Ask yourself the question. Why hasn’t the linking of data taken off before? If there is all this data out there, why didn’t it just get linked together?

Because linking between different conventions isn’t very useful.

The problem has never been the linking of data, that is easy as soon as you have URIs. It is meaningfully linking different data so that you have something useful not just a mess. This itself then pulls in more data. Just as we have seen with the growth of the web and just as we are now seeing with the growth of ‘linked data’.”

There are surely far more failed attempts to prematurely constrain in the name of ‘standardisation’ than successful ones. If we’re trying to grow and nurture a market (in more senses than just the commercial,) shouldn’t we be more permissive? I’d far rather be engaged in ‘selling’ (to maintain the market metaphor) the benefits of RDF than apologising for its imposition, wouldn’t you?

RDF (definitely the syntax, possibly the model) is a point in time solution to a set of problems that we collectively consider worthy of resolution. The problems will still be there — and hopefully still worthy of resolution — long after the next technical solution has come along.

A lot of the comments, too, talk about ‘converting’ to RDF. Toby, for example, writes;

“Linked data certainly needs to be *linked*, and after that, it’s pretty important to describe the relationship that each link between resources represents (i.e. “this is a link to a parent resource”, “this is link to a resource that represents a place nearby to this place”).

Once you have that, the idea of a triple emerges almost by itself, and what you have is suddenly starting to look very much like RDF. If your format is not RDF, then it’s likely to be convertable to RDF fairly trivially.”

Yes… but if, for the sake of argument, I happen to have ‘the idea of a triple’ in some other form, I may not want or need to convert. RDF is a solution, not the end-goal.
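To sketch what "the idea of a triple in some other form" could mean, the example below keeps labelled links in ordinary nested dictionaries keyed by HTTP URI and answers a simple pattern query over them directly, with no RDF serialisation step. The `RECORDS` structure, the `parent` and `near` relationship URIs, and the `match` helper are hypothetical, chosen only to echo the relationships Toby mentions.

```python
# A hypothetical non-RDF structure: plain dicts keyed by HTTP URI,
# with each outgoing link labelled by the relationship it represents.
RECORDS = {
    "http://example.org/place/oxford": {
        "http://example.org/vocab/parent": [
            "http://example.org/place/oxfordshire",
        ],
        "http://example.org/vocab/near": [
            "http://example.org/place/reading",
            "http://example.org/place/abingdon",
        ],
    },
    "http://example.org/place/abingdon": {
        "http://example.org/vocab/parent": [
            "http://example.org/place/oxfordshire",
        ],
    },
}


def statements(records):
    """Walk the nested structure, yielding (subject, relationship, target)."""
    for subject, links in records.items():
        for relationship, targets in links.items():
            for target in targets:
                yield (subject, relationship, target)


def match(records, subject=None, relationship=None, target=None):
    """Return statements matching the pattern; None acts as a wildcard."""
    pattern = (subject, relationship, target)
    return [
        st for st in statements(records)
        if all(want is None or want == got for want, got in zip(pattern, st))
    ]
```

For instance, `match(RECORDS, relationship="http://example.org/vocab/parent", target="http://example.org/place/oxfordshire")` finds both places whose parent is the county. Whether such a structure should also be converted to RDF is exactly the question under discussion; the sketch only shows that the triple-shaped information is usable either way.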

Alan Morrison says something similar, again assuming (?) RDF to be something more than it necessarily is;

“The RDF family provides a metadata umbrella that non-RDF can fit under. It’s possible to avoid religious arguments by allowing alternatives as long as they can be converted to fit under the umbrella.”

“The microdata in HTML5 discussions suggests to me that the first thing that goes out the window when you accept RDF as optional (or more typically, a more pejorative unneeded overkill) is ironically the feature most important to both RDF and linked data: the URI (microdata allows one to use string or reverse DNS identifiers instead for property names and types).”

I’d like to learn more about that, and understand the forces at play there…

And after all the comment and discussion… I’m still convinced that RDF’s model and format are important and useful, and still convinced that they should not be mandatory for Linked Data. Mandatory for ‘Linked Data in RDF,’ yes. Mandatory for ‘Linked Data,’ no.

http://alan-morrison.blogspot.com/ alan-morrison

Paul, if RDF were simply a triples format, there’d be no discussion. I hear you that making RDF a precondition for linked data is problematic, but without what Mike Bergman calls a “canonical data model,” you’re not solving the metadata problem, and therefore not the data interoperability problem either. You don’t have to trust me on this–Mike Bergman provides ample clarification: http://www.mkbergman.com/?p=483.

http://ansell.pip.verisignlabs.com/ Peter Ansell

I don’t think RDF is eternally necessary for Linked Data, but if another model supersedes that of RDF, it should allow for unambiguous definitions of things which can have more than trivial properties, and can be universally recognised.

If another model is designed that fixes some of the current issues with RDF it should be necessary for it to be as flexible as RDF. If the next generation model can’t be as easily mashable as RDF is currently then it won’t provide much of an improvement in my opinion and likely won’t take off like RDF has.

Unless there is another model that has been designed that fixes some of the issues with RDF and still provides for the above, then Linked Data can comfortably stay as synonymous with RDF for the moment. Is there another model that fits the bill currently?

danny.ayers

My first cent: I agree with your basic premise, but noting that virtually all data can be mapped to the RDF model (often not completely, but enough to be useful).

But again it’s worth noting that the value of data in terms of integration and reuse can be significantly increased by exploiting the RDF model, so making a mapping will usually be worthwhile (the only major exceptions being purely numeric/binary datasets – and in those cases RDF metadata is useful).

http://myopenlink.net/dataspace/person/kidehen Kingsley Idehen

Paul,

Clearly your original thoughts were about RDF/XML the format. I discern that with clarity from your comments.

As I indicated in response to Ross, I can bear with the continued conflation of the RDF model with its data representation formats, especially since version 1.0 of the RDF story is fully to blame for this confusion.

Today though, we do have a clear distinction between the model and its data representational formats (RDF/XML, RDF/JSON, Turtle, N3, TriX, TriG, and others).

If any data space that connects to the Web uses HTTP URIs as identifiers for records, record attributes, and record attribute values, we have Linked Data on the Web as espoused by the Linked Data meme.

Does the model need to be literally “RDF”?

Answer: No.

Does the model need to provide a machine readable mechanism for describing Entities via a constellation of characteristics (properties) that coalesce around the Identifier of the entity being described?

Answer: Yes.

What I described above is simply a “Metadata Model”. A model that enables granular linkage on the Web between data items via their dereferenceable metadata meshes.

The Linked Data meme is fundamentally about a single HTTP URI as vehicle for three vital things:

If any Web friendly mechanism delivers the above, then it will be compatible with the Linked Data meme.

Even after all of the above, I still don’t see the need for adding “RDF” and “SPARQL” to the updated Linked Data meme rules. Instead, they should have remained confined to the realm of “suggested implementation options and details”.

http://claimid.com/bes Bernhard Schandl

Paul, it would be really interesting to see one example that you consider to be Linked Data but is not RDF.

ryefriday

Paul,

I find this statement confusing: “Yes… but if, for the sake of argument, I happen to have ‘the idea of a triple’ in some other form, I may not want or need to convert.”

All RDF is, is a way of talking about linking URIs with URIs, in the form of triple statements. Everything else (N3, RDF/XML, Turtle, etc.) is just a format for writing it down.

When you understand that, Toby’s comment is simply correct, and nothing else needs to be said. If you have triples and URIs it is hard to come up with something different from RDF.

If someone comes up with something different, it will have to be so similar to RDF that arguments for using this (to be invented) approach versus RDF are not likely to be very substantial. And in any case, it will be quite easy to convert between it and RDF.

Irene

http://myopenlink.net/dataspace/person/kidehen Kingsley Idehen

Ryefriday,

The answer to your question lies in simply understanding that RDF is not the Atom re. graph data models and metadata.

micahherstand

>>There are surely far more failed
>>attempts to prematurely constrain in
>>the name of ‘standardisation’ than
>>successful ones. If we’re trying to
>>grow and nurture a market (in more
>>senses than just the commercial,)
>>shouldn’t we be more permissive? I’d
>>far rather be engaged in ‘selling’ (to
>>maintain the market metaphor) the
>>benefits of RDF than apologising for
>>its imposition, wouldn’t you?

No. It doesn’t work to argue about hypothetical analogies. Linked Data means all databases are connected (“linked”). As soon as you allow multiple formats, you might as well stick with web 2.0’s APIs. Whether RDF or not, there needs to be a single format (or at least a few, standard formats) that a technology such as SPARQL can parse through. As soon as computers can’t *automatically* understand the semantics of a piece of data, it is no longer the semantic web. As soon as computers can’t *automatically* link to and from a piece of data, it is no longer linked data.

If multiple formats were argued for by the community strongly enough, then I suppose we could use the idea of ontologies so that SPARQL-like software could look up how to interpret each format and find the implied triple. But wouldn’t that just be more of a headache than requiring everyone to use RDF? Front-end web designers (the thousands who will need to be convinced for this to really take off) are used to a one-language-per-function paradigm (HTML for structure, CSS for style, JavaScript for behavior). As soon as we give them choice, we’re just asking for frustration from all sides.