Innovation and best practices for the Web

About this Blog

The blog is written by Brian Kelly. Brian is the Innovation Advocate based at CETIS, University of Bolton.

This blog functions as an open notebook which provides personal thoughts, reflections and observations on the role of the Web in higher and further education which I hope will inform readers and stimulate discussion and debate, both on this blog and elsewhere, including on Twitter.

Microformats and RDFa: Adding Richer Structure To Your HTML Pages

Revisiting Microformats

If you visit my presentations page you will see an HTML listing of the various talks I’ve given since I started working at UKOLN in 1996. The image shown below gives a slightly different display from the one you will see, with a number of Firefox plugins providing additional ways of viewing and processing this information.

This page contains microformat information about the events. We first made use of microformats at UKOLN’s IWMW 2006 event, where they were used on the event Web site to mark up the HTML representation of the speakers and workshop facilitators, together with the timings for the various sessions. At the event Phil Wilson ran a session on “Exposing yourself on the Web with Microformats!”. There was much interest in the potential of microformats back in 2006, when they were the hot new idea. Since then I have continued to use microformats to provide richer structural information for my events and talks. I’ll now provide a summary of the ways in which the microformats can be used, based on the image shown above.

The Operator sidebar (labelled A in the image) shows the Operator Firefox plugin which “leverages microformats and other semantic data that are already available on many web pages to provide new ways to interact with web services”. The plugin detects various microformats embedded in a Web page and supports various actions – as illustrated, for events the date, time, location and summary of the event can be added to various services such as Google and Yahoo! Calendar.
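As a rough sketch of the kind of detection such a plugin performs (this is not Operator’s own code), the following Python uses only the standard library to pull hCalendar events out of a page. The class name `HCalendarParser`, the helper `extract_events` and the sample markup are all my own illustrative assumptions, and only a handful of hCalendar properties are handled.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are kept off the stack.
VOID = {"br", "hr", "img", "input", "link", "meta"}
# The small subset of hCalendar properties this sketch looks for.
FIELDS = {"summary", "dtstart", "dtend", "location"}


class HCalendarParser(HTMLParser):
    """Collect one dict per element carrying class="vevent"."""

    def __init__(self):
        super().__init__()
        self.events = []      # completed events
        self._current = None  # event being built, if we are inside a vevent
        self._stack = []      # one (is_root, text_fields) entry per open element

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        attr = dict(attrs)
        classes = set((attr.get("class") or "").split())
        is_root = "vevent" in classes and self._current is None
        if is_root:
            self._current = {}
        text_fields = set()
        if self._current is not None:
            for field in classes & FIELDS:
                if attr.get("title"):
                    # Simplification: prefer a machine-readable title
                    # attribute (as in <abbr title="...">) when present.
                    self._current[field] = attr["title"]
                else:
                    self._current.setdefault(field, "")
                    text_fields.add(field)
        self._stack.append((is_root, text_fields))

    def handle_data(self, data):
        # Append text to every property whose element is still open.
        for _, text_fields in self._stack:
            for field in text_fields:
                self._current[field] += data

    def handle_endtag(self, tag):
        if tag in VOID or not self._stack:
            return
        is_root, _ = self._stack.pop()
        if is_root:
            # Collapse runs of whitespace before storing the event.
            self.events.append(
                {k: " ".join(v.split()) for k, v in self._current.items()}
            )
            self._current = None


def extract_events(html):
    parser = HCalendarParser()
    parser.feed(html)
    parser.close()
    return parser.events


# Made-up sample markup, not taken from any real event page.
SAMPLE_EVENT = """
<div class="vevent">
  <span class="summary">Example workshop session</span>
  <abbr class="dtstart" title="2010-01-01T10:00">1 January 2010, 10am</abbr>
  <span class="location">Example University</span>
</div>
"""

if __name__ == "__main__":
    print(extract_events(SAMPLE_EVENT))
```

The sketch assumes reasonably well-formed markup; a real tool would need to cope with unclosed tags and with the full set of hCalendar properties.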

The RDFa in Javascript bookmarklets (labelled B) are simple JavaScript tools which can be added to a variety of different browsers (they have been tested on IE 7, Firefox, Mozilla and Safari). The License bookmarklets will create a pop-up alert showing the licence conditions for a page, where this has been provided in a structured form. UKOLN’s Cultural Heritage briefing documents are available under a Creative Commons licence. Looking at, for example, the Introduction to Microformats briefing document, you will see details of the licence conditions displayed for reading. However, in addition, a machine-readable summary of the licence conditions is also available which is processed by the Licence bookmarklet and displayed as a pop-up alert. This information is provided by using the following HTML markup:

The power is in the rel="license" attribute which assigns ‘meaning’ to the hypertext link.
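To illustrate how little a tool needs in order to pick up this attribute, here is a minimal Python sketch using only the standard library. It is not the bookmarklet’s code: the names `LicenseLinkParser` and `find_licenses` are hypothetical, and the sample markup and licence URL are illustrative assumptions rather than the markup used on the briefing documents.

```python
from html.parser import HTMLParser


class LicenseLinkParser(HTMLParser):
    """Collect the href of every <a> or <link> whose rel includes "license"."""

    def __init__(self):
        super().__init__()
        self.licenses = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        # rel may hold several space-separated values, e.g. rel="license nofollow".
        rels = (attr.get("rel") or "").lower().split()
        if "license" in rels and attr.get("href"):
            self.licenses.append(attr["href"])


def find_licenses(html):
    parser = LicenseLinkParser()
    parser.feed(html)
    parser.close()
    return parser.licenses


# Illustrative markup only; the licence URL is an assumption.
SAMPLE_LICENCE = """
<p>This document is available under a
<a rel="license" href="http://creativecommons.org/licenses/by/3.0/">Creative
Commons licence</a>. See also our <a href="/about/">about page</a>.</p>
"""

if __name__ == "__main__":
    print(find_licenses(SAMPLE_LICENCE))
```

The ordinary link to the about page is ignored; only the link carrying rel="license" is reported.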

The link to my Google Calendar for each of the events (labelled C) is provided by the Google hCalendar Greasemonkey script. Clicking on the Google Calendar icon (which is embedded in the Web page if hCalendar microformatting markup is detected – although I disable this feature if necessary) will allow the details to be added to my Google Calendar without me having to copy and paste the information.

The additional icons in the browser status bar (labelled D) appear to be intended for debugging of RDFa – and I haven’t yet found a use for them.

The floating RSS Panel (labelled E) is another Greasemonkey script. In this case the panel does not process microformats or RDFa but autodetectable links to RSS feeds. I’m mentioning it in this blog post in order to provide another example of how richer structure in HTML pages can provide benefits to an end user. In this case it provides a floating panel in which RSS content can be displayed.
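Feed autodetection relies on a link element in the page which has rel="alternate" and an RSS media type. The following Python sketch shows the idea; it is not the Greasemonkey script itself, and the names `FeedLinkParser` and `find_feeds` and the example.org URLs are assumptions made for illustration.

```python
from html.parser import HTMLParser

RSS_TYPE = "application/rss+xml"


class FeedLinkParser(HTMLParser):
    """Collect (title, href) for each autodiscoverable RSS feed link."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        mime = (attr.get("type") or "").lower().strip()
        if "alternate" in rels and mime == RSS_TYPE and attr.get("href"):
            self.feeds.append((attr.get("title") or "", attr["href"]))


def find_feeds(html):
    parser = FeedLinkParser()
    parser.feed(html)
    parser.close()
    return parser.feeds


# Made-up page head with one stylesheet link and one feed link.
SAMPLE_HEAD = """
<head>
  <link rel="stylesheet" type="text/css" href="/style.css">
  <link rel="alternate" type="application/rss+xml"
        title="Blog feed" href="http://example.org/feed/">
</head>
"""

if __name__ == "__main__":
    print(find_feeds(SAMPLE_HEAD))
```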

RDFa – Beyond Microformats

The approaches I’ve described above date back to 2006, when microformats were the hot new idea. But now there is more interest in technologies such as Linked Data and RDF. Those responsible for managing Web sites with an interest in emerging new ways of enhancing HTML pages are likely to have an interest in RDFa: a means of including RDF in HTML resources.

The RDFa Primer is sub-titled “Bridging the Human and Data Webs”. This sums up nicely what RDFa tries to achieve – it enables Web editors to provide HTML resources for viewing by humans whilst simultaneously providing access to structured data for processing by software. Microformats provided an initial attempt at doing this, as I’ve shown above. RDFa is positioned as providing similar functionality, but coexisting with developments in the Linked Data area.

The RDFa Primer provides some examples which illustrate a number of use cases. My interest is in seeing ways in which RDFa might be used to support Web sites I am involved in building, including this year’s IWMW 2010 Web site.

The first example provided in the primer shows how RDFa can be used to state that a Creative Commons licence applies to a Web page; an approach which I have described previously.

The primer goes on to describe how to provide structured and machine-understandable contact information, this time using the FOAF (Friend of a Friend) vocabulary:

In previous years we have marked up contact information for the IWMW event’s program committee using hCard microformats. We might be in a position now to use RDFa. If we followed the example in the primer we might use RDFa to provide information about the friends of the organisers:

However this would not be appropriate for an event. What would be useful would be to provide information on the host institutions of the speakers and workshop facilitators. In previous years such information has been provided in HTML, with no formal structure which would allow automated tools to process such institutional information. If RDFa was used to provide such information for the 13 years since the event was first launched this could allow an automated tool to process the event Web sites and provide various reports on the affiliations of the speakers. We might then have a mechanism for answering the query “Which institution has provided the highest number of (different) speakers or facilitators at IWMW events?”. I can remember that Phil Wilson, Andrew Male and Alison Kerwin (née Wildish) from the University of Bath have spoken at events, but who else? And what about the universities which I am unfamiliar with? This query could be solved if the data was stored in a backend database, but as the information is publicly available on the Web site, might not using slightly more structured content on the Web site be a better approach?
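Once the affiliations had been extracted from the pages, answering that query would be straightforward. The Python sketch below assumes that some tool has already produced (year, speaker, institution) records from the marked-up pages; the function name `speakers_by_institution` and the sample records are hypothetical placeholders, not real IWMW data.

```python
from collections import defaultdict


def speakers_by_institution(records):
    """Rank institutions by their number of *different* speakers.

    records is an iterable of (year, speaker, institution) tuples.
    Returns (institution, count) pairs, highest count first, with
    ties broken alphabetically by institution name.
    """
    speakers = defaultdict(set)
    for _year, speaker, institution in records:
        # A set ensures somebody who spoke in several years counts once.
        speakers[institution].add(speaker)
    return sorted(
        ((inst, len(names)) for inst, names in speakers.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Placeholder records for illustration only.
SAMPLE_RECORDS = [
    (2008, "Speaker A", "Example University"),
    (2009, "Speaker A", "Example University"),
    (2009, "Speaker B", "Example University"),
    (2009, "Speaker C", "Another University"),
]

if __name__ == "__main__":
    for institution, count in speakers_by_institution(SAMPLE_RECORDS):
        print(institution, count)
```

The hard part is, of course, not this counting step but getting consistent, machine-readable affiliation data onto the pages in the first place.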

Really?

When we first started making use of microformats I envisaged that significant numbers of users would be using various tools on the browser to process such information. However I don’t think this is the case (and I would like to hear from anybody who does make regular use of such tools). I have to admit that although I have been providing microformats for my event information, I have not consumed microformats provided by others (and this includes the microformats provided on the events page on the JISC Web site).

But before investing time and energy into using RDFa across an event Web site the Web manager will need answers to the questions:

What benefits can this provide? I’ve given one use case, but I’d be interested in hearing more.

What vocabularies do we need to use and how should the data be described? The RDFa Primer provides some examples, but I am unsure as to how to use RDFa to state that, for example, Brian Kelly is based at the University of Bath, to enable structured searches of all speakers from the University of Bath.

What tools are available which can process the RDFa which we may choose to create?

We considered using RDFa (with the Bibo ontology) to enable sharing of ‘references’ between students (and staff) as part of the TELSTAR project. Having looked at the area, we eventually decided not to do this – there were a number of reasons:

1) Lack of (consumer focussed) tools available to use RDFa. In terms of referencing, Zotero is either going to consume it or is already consuming it (can’t find the latest status of this at the moment), but I’m not aware of any of the other popular referencing tools doing this (although I’d love to be contradicted on this)
2) Problems with publication of RDFa (a). To publish pages with RDFa you need to publish documents as XHTML+RDFa – this isn’t happening (in general) in consumer level publication platforms (e.g. wordpress.com), so it wasn’t possible to offer a ‘cut and paste this into your blog’ type function
3) Problems with publication of RDFa (b). Our main use case was the sharing happening within the Moodle VLE – which is also not published as XHTML+RDFa, and we weren’t in a position to change this for this single use within the project

These issues, plus the fact that our main use case (sharing references within the Moodle environment) could be handled without the need to go to RDFa, meant we decided it wasn’t worth it for the project. I was disappointed as I had really liked this idea – I think if we revisited this in a year’s time we might find the balance had moved a bit.

Hi Owen
Thanks – a summary of the reasons why it was decided not to use a particular technology is very useful (and such information is often not shared as much as it could be).
The lack of tools to consume RDFa is a main concern that I have too. The dependency that RDFa has on XHTML is interesting – and how will this relate to the interest in HTML 5?

and that “defines rules and guidelines for adapting the RDF in XHTML: Syntax and Processing (RDFa) specification for use in the HTML5 and XHTML5 members of the HTML family. The rules defined in this specification not only apply to HTML5 documents in non-XML and XML mode, but also to HTML4 and XHTML documents interpreted through the HTML5 parsing rules.”

In short, work is well advanced to try to ensure that RDFa is usable in HTML5.

I’ll be writing more about RDFa on eFoundations, err, some time real soon….

Thanks Pete. You point out (and it may be worth emphasising as it’s not apparent from the URIs) that “RDFa in XHTML: Syntax and Processing” is an official W3C Recommendation, which is appropriate for mainstream deployment. The “HTML+RDFa: A mechanism for embedding RDF in HTML” document is a working draft and may be subject to change – or even rejected as a W3C Recommendation (although I suspect this is very unlikely). Anyway, my point is that real world deployment of RDFa in HTML 5, whilst valuable for testing and gaining experience, may require subsequent modifications once the draft is approved.

We’ve provided a microformat vcard in the ECS staff profile page for years. Nobody cares and it makes our HTML ugly.

I hate the mix of visual & data markup. Better to just have blocks of RDF (in N3 for preference) in an element next to the item being talked about, or just in the page . Like a <link rel='alternate' header but embedded in the page?

Nate Solas said:

I had high hopes to have working examples by now, but have been seriously derailed with other work. Still, I think the possibilities of RDFa for museums – and particularly our collection sites – are exciting. And I also really think a good way to pitch this to people is to point at the enhanced search results from Google and Yahoo. If they can start to understand our metadata, it’s only a matter of time before they or someone else starts to present it better.

As for Christopher’s comment about mixing data and visual markup: to me that’s part of the appeal. I only have to annotate my existing markup, and it’s done. No fancy redirects to an RDFa version of the document, etc.

You might be interested in two resources that should help understand RDFa (publication and consumption) concerning Linked Data a bit better. In [1] we described the first Linked Data set in RDFa, and [2] is the RDFa/Linked Data Tutorial, currently in preparation. Will take the input from here to integrate it in the tutorial.

Thanks Michael, those resources are very useful. I also found the RDFa Tools wiki which provided some links to tools for creating and consuming RDFa – although I was disappointed at the small number of tools and the shortage of shrink-wrapped products.

Hiya Brian – interesting reading. Do you have a link that would point specifically to an RDFa syntax used to describe events? I get hints of it here and there (esp. eg. http://www.jisc.ac.uk/events.aspx (p.s. why is that link obscured with go2.wordpress.com detritus?)) but haven’t found a complete element list in any of the documentation thus far.

My own approach thus far is an adaptation of RSS 2.0 but I’d like to ensure compatibility with RDFa events metadata. More details very shortly.