Linkin' Park

As XML camp followers, you'll know that discussion in the XML
community is characterised by a series of dominant recurring
topics, so-called "permathreads." Many of these topics have
their roots in XML pre-history, and none more so than the topic
of linking. One of the original trinity of XML work areas,
along with XML itself and XSL, XML linking has progressed
inconclusively, never quite finding the happy resolution of its
fellows.

A recent discussion on XML-DEV turned to the question of what
was left undone in the great XML project, and thus conversation
turned again to linking. In this column, I'll attempt to
summarise the debate, which contained plenty of resonance for
those who've been following XML awhile.

Can We Make Linking Work?

Jonathan Robie picked up on my comment in last week's column
that XQuery was the "last great project of XML standardization,"
and asked XML-DEV if they thought there was anything left to do.
Bob DuCharme quickly noted that some thought XML linking was
still an unattained goal.

I think part of the problem with XLink's deployment on the Web
is that it didn't have a clear relationship to HTML and to
web-based multimedia. There are lots of use cases for
"linkbases," not least in connection with distributed
annotations, but the ability to take an arbitrary attribute in
an XML or XHTML document, or the content of an arbitrary
element, and say, "use this URI as the destination for *this*
link, with this text in French and this in Italian, and if the
user clicks here, offer a choice of these seven links with their
associated titles" was a step beyond the horizon of most web
developers, and old hat to the SGML hypertext crowd.

I don't know how to bridge that gap, and I'm not sure who
does, if anyone, but until it's bridged, I don't think we'll see
great leaps forward in the XLink area. It's cultural and
political more than technical.

Quin goes on to provide a list of what he would have liked to
have seen on the Web by now. It's not strictly related to the
linking discussion, but worth noting just to reinforce the lowly
state of browser technology.

The corporate desktop isn't somewhere you'd expect to see innovation,
and indeed, now that the corporate desktop seems to drive the Web, we're
not seeing innovation.

I'd love to see a more sophisticated XSL-FO being used in browsers. I'd
love to see web browsers making use of embedded RDF to display
information about web pages. I'd love to see multi-channel audio being
used on the Web, both for accessibility and for the user experience.

So, that's linking from the Web's point of view. What about
another big user of linking, SVG? Robin Berjon provided
feedback on XLink's deployment in that specification.

On the good side: contrary to a lot of what has been heard, people don't
complain about the namespace thing. Having discoverable
links is generally liked ... with the increased integration of SVG with arbitrary
namespaces, having clearly identified links is turning out to be quite
useful.

On the bad side: you can't have an XLink without specifying at least two
attributes, xlink:href and xlink:type, even for simple links. That's
something I would call a glitch and would suggest fixing with an erratum
... Another downside is that the simple web
authoring community didn't get the href/src/ref distinction that it's
used to dealing with anywhere near as easily as without XLink, while the
link junkies get nowhere near where they would like to go.
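
To make the two-attribute point concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The document, the "ex" vocabulary with its namespace URI, and the helper name find_simple_links are all my own illustrative assumptions, not anything posted in the thread. It shows both sides of the trade-off: every simple link has to carry xlink:type as well as xlink:href, but in return a generic tool can discover links in any vocabulary without knowing its element names.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical document: SVG mixed with a made-up "ex" vocabulary.
DOC = """\
<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:xlink="http://www.w3.org/1999/xlink"
     xmlns:ex="http://example.org/notes">
  <a xlink:type="simple" xlink:href="http://www.w3.org/TR/xlink/">
    <text>XLink spec</text>
  </a>
  <ex:note xlink:type="simple" xlink:href="notes.xml#n1">See note 1</ex:note>
  <ex:note href="notes.xml#n2">Plain href attribute, not an XLink</ex:note>
</svg>
"""


def find_simple_links(root):
    """Return (local element name, href) for each XLink simple link found."""
    links = []
    for elem in root.iter():
        # Both attributes must be present in the XLink namespace.
        if elem.get(f"{{{XLINK_NS}}}type") == "simple":
            href = elem.get(f"{{{XLINK_NS}}}href")
            if href is not None:
                local_name = elem.tag.rsplit("}", 1)[-1]
                links.append((local_name, href))
    return links


links = find_simple_links(ET.fromstring(DOC))
for name, href in links:
    print(name, href)
```

The second ex:note is skipped because its href is an ordinary attribute, which is exactly the kind of link a generic processor cannot recognise without vocabulary-specific knowledge.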

So much for an assessment of XLink's success thus far. Not
everyone's on board with the idea that XLink's a necessity
anyway. No, thank you, said Michael Kay.

Hyperlinks belong in the user interface space; XML should
represent information independently of the user interface. It
was always architecturally wrong to do hyperlinking at the XML
level and the attempt should not be repeated.

"Modelling relationships in XML" -- that would be different.

What consummate skill in setting the stage for the debate! In
just two paragraphs Kay not only baited the acolytes of Nelson,
but provided a foothold for the RDF permathread to get started
too. Surely a classic discussion is in prospect.

Let's take the user interface point first.
Len Bullard pointed out that practically all implementations of
linking in the past have at least partly used queries and
procedural link building anyway. So is there any point in
pursuing a declarative language for putting links in documents,
when we can do what we want with mechanisms such as JavaScript?
In response, Quin outlined some persuasive points for including
links in documents.

Google can't follow procedural links, and isn't likely to.
Procedural (ECMAscript) links are hard to manage and maintain.
They're hard to reason about. They're hard for archiving bots
to follow. They're often not made accessible, because the web
designer made them rather than the browser/UA developer. They
can be hard to internationalize.
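
The first of those points can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the page markup, the script expression inside onclick, and the helper name declared_links are invented, and a real crawler is far more elaborate. The idea is simply that a tool reading the markup can collect links that are declared as data, while a destination that only exists once a script runs never shows up.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: one declarative link, one link built by script.
PAGE = """\
<html>
  <body>
    <p><a href="archive/2004.html">2004 archive</a></p>
    <p><span onclick="location.href = base() + '/latest.html'">Latest</span></p>
  </body>
</html>
"""


def declared_links(markup):
    """Collect link targets that are stated in the markup itself."""
    root = ET.fromstring(markup)
    return [a.get("href") for a in root.iter("a") if a.get("href")]


found = declared_links(PAGE)
print(found)
```

The second destination is only knowable by executing the script, which is the step Quin doubts search engines and archiving bots will take.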

Now on to modeling relationships. The notion of relationships
between data items tends to subsume links, which become
relationships that are made navigable in some way by an
application. Enter, stage left, XML linking's prodigal son, Ben
Trafford.

I think that modeling relationships sets a foundation for
linking. Linking behavior belongs specifically
in the user interface space. And we've seen that CSS and
XSL-FO can be used hand-in-hand with XML quite nicely to
determine document behavior.

Trafford goes on to note that separating links from content
authoring is all very well from the modeling point of view, but
it creates real difficulties for content authors themselves.

I may have a specific reason when authoring content to want a
link to appear in a certain way, but I may have no skill or
ability to write the stylesheet that makes sure that happens. I
need to know how to write my tag to specify the linking behavior
I want.

In other words, I need for that preference
to be somehow displayed at the XML level.

Kay sniffed out Trafford's document-centricity, and accepted
those use cases, but returned to the shortcomings of XLink as a
solution.

Part of the problem, I think, is the focus on URIs as
identifiers (and links). I've heard a number of talks recently
advocating that we should use URIs whenever we want to identify
anything, and I simply don't think that's the right
direction. To my mind <postcode>RG4 7BS</postcode> is a
perfectly good identifier (for a small piece of geography in
which my house is found), and any
technology that requires me to write it differently if I'm going
to use it for linking purposes is too constraining.

So what would Kay do? Not so fast, he was only defining the problem, not a
solution. But the unease certainly runs deep:

I'm not even comfortable that the hierarchic relationships
should be special. Why can't we have multiple hierarchic views
of the same network? Why do all my queries have to change
depending on whether my footnotes are inline, out-of-line
referenced by IDREFs, or in external documents referenced by
URI? What happened to the old doctrine of data independence?

Speaking of RDF, doesn't it solve all the relationship-modeling
problems XLink could anyway? DuCharme suggests that once user
interface is out of the picture, RDF should do.

I think that modeling relationships is a bit ambitious,
though; a "model" makes me think of an ordered structure of components.
Perhaps "representing relationships" would be the lower-hanging
fruit, but we don't need a new standard to say that resource X
has relationship Y to resource Z; we've got RDF for that.

You're proposing we say, "Look, you want links? Well, first
write yourself a DTD. Then a stylesheet. Oh, and by the way,
you'll need to write an RDF vocabulary to represent the data
relationships, too. Hope you like colons!"

Somehow, I just don't think people will be jumping on that
bandwagon.

Taken on the level of using the RDF/XML syntax, Trafford has a
point. However, integration of any linking technology with RDF
would make lots of sense on the Web, given that RDF definitely
has been a success within its sphere. Earlier in the discussion,
Quin had affirmed this in the context of web metadata.

Embedding metadata makes a lot of sense -- e.g. the author of this
document. Doing so in a model that's compatible with RDF also makes
some sense. I don't care about the syntax.

Most RDF aficionados would agree with this viewpoint these
days: it's the model that counts.

So where will a solution be found that works effectively across
documents? Mike Champion echoes Quin's suggestions that query could be an answer,
picking up on Kay's link-by-value use case for his postcode.

Maybe the lesson here is that the relational model approach of
defining links dynamically based on relationships on
the values of information items rather than predefined
links really is the way to do what XLink tried to do.

The thing I always liked best about XQuery is simply the
addition of a Join operation into the XML corpus. Until this
thread I hadn't thought of this as a replacement for XLink, but
that idea is starting to take root in my head ...
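
As a rough illustration of link-by-value, here is a sketch in Python rather than XQuery. The two documents, their element names, the second postcode, and the function join_on_postcode are hypothetical; only the RG4 7BS value comes from Kay's example. Neither document contains a link to the other: the relationship is computed by joining on equal postcode values, much as a where clause comparing the two postcode elements would do in a query language.

```python
import xml.etree.ElementTree as ET

# Two hypothetical, independently authored documents with no links between them.
PEOPLE = """\
<people>
  <person><name>Resident A</name><postcode>RG4 7BS</postcode></person>
  <person><name>Resident B</name><postcode>ZZ9 9ZZ</postcode></person>
</people>
"""

AREAS = """\
<areas>
  <area><postcode>RG4 7BS</postcode><depot>Depot North</depot></area>
  <area><postcode>AA1 1AA</postcode><depot>Depot South</depot></area>
</areas>
"""


def join_on_postcode(people_xml, areas_xml):
    """Pair each person with the depot whose postcode value matches."""
    depots = {
        area.findtext("postcode"): area.findtext("depot")
        for area in ET.fromstring(areas_xml).iter("area")
    }
    pairs = []
    for person in ET.fromstring(people_xml).iter("person"):
        code = person.findtext("postcode")
        if code in depots:
            pairs.append((person.findtext("name"), depots[code]))
    return pairs


pairs = join_on_postcode(PEOPLE, AREAS)
print(pairs)
```

The postcode is written the same way whether or not anyone intends to link on it, which is the property Kay was asking for.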

Eric van der Vlist opined that RDF was in just the right place
to do this, clarifying that he meant of course the data model
and not the syntax! One of his reasons was that he didn't think
Champion's addition of the Join operation was enough.

... the Join operation isn't enough if all you can join
are tree fragments that are by nature not "merge friendly" and one of
the biggest (and usually underestimated) benefit of RDF is its ability
to "auto-merge" information from multiple sources.
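
The "auto-merge" property is easy to see if you model statements as plain triples. The sketch below is a deliberate simplification in standard-library Python, not an RDF toolkit: the URIs, property names, and the describe helper are invented for illustration, and it ignores blank nodes, which a real RDF merge has to keep distinct. With those caveats, merging two sources is just a set union, with no decision needed about where one tree should be grafted into another.

```python
# Each "graph" is a set of (subject, predicate, object) triples.
source_a = {
    ("http://example.org/doc1", "dc:creator", "A. Author"),
    ("http://example.org/doc1", "dc:title", "Linking Notes"),
}
source_b = {
    # Same statement as in source_a: the union stores it only once.
    ("http://example.org/doc1", "dc:creator", "A. Author"),
    ("http://example.org/doc1", "ex:cites", "http://example.org/doc2"),
}

# Merging is a set union; statement order and nesting play no part.
merged = source_a | source_b


def describe(graph, subject):
    """List every (predicate, object) pair known about a subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


for predicate, obj in describe(merged, "http://example.org/doc1"):
    print(predicate, obj)
```

Doing the same with two XML fragments would first require agreeing on which element is the parent of which, and that is the "merge friendly" gap van der Vlist describes.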

And there we will leave the linking debate for now. In short:
more questions and few answers. Personally I think there's
distinct promise in a link-by-query approach. One of the things
that makes me think this is the increasing use of search on the
Web and now on the desktop for locating data. Search will never
be as good as explicit linking, but it may well be the "good
enough" solution for many uses that the Web was to hypertext.

The Last Word on XQuery

Last week's XML-Deviant covered most of the debate about the
length of time XQuery was taking to develop, but I wanted to
bring you some of the closing words in the thread. Hindsight may
be 20/20, but people still seem to see different things.

Bah. If the WG had followed the advice they got three years ago
to just abandon all the type-based elaborations and the tight
coupling to XSD, then they would have shipped XQuery two years
ago, and they would have discovered that the market apparently
cares more about update facilities than it does about
schema-driven querying, and they could be building that right
now.

An open source documentation tool for W3C XML Schema
based on XSLT. With xsddoc you can generate documentation
of your XML Schema in a JavaDoc-like visualisation.

Scrapings

If an element type falls in a forest and nobody has a name for it ...
DuCharme's metadata fling helpfully augmenting mail headers ...
164 messages to XML-DEV last week, XLink quotient 20% ... XML Spy
continues to attract ire in its interpretation of W3C XML Schema
... Quotes and reportage from a conference focusing on the
Microsoft and web services view of XML ... Apparently Don Box
kept his clothes on.