As mentioned in this recent post, I’ve been experimenting with WordPress for presenting OER and have been testing a pre-release version of a WordPress plug-in developed by the Triton project at the University of Oxford to facilitate a dynamic collection of OER in a WordPress blog.

Developer @patlockley describes the overall functionality of the plug-in here and covers some of the limitations posed by the broader OER infrastructure here, emphasising that “no standard API exists across repositories so as to facilitate a single approach to aggregation for an aggregation creator”. A separate post here considers limitations of the WordPress platform itself used in this context and the associated technical considerations.

Here I’ll briefly describe my experiences of using the plug-in. I’ll be fairly candid, in the hope that this will be useful feedback to Pat and Triton, albeit with the initial caveat that any issues I’ve encountered are just as likely to be down to my limited experience of WordPress and my shambrarian status (I simply haven’t had time to hone the search terms as carefully as I would like) as to the plug-in itself (which, of course, is pre-release).

Once installed (famously straightforward in WordPress, even prior to release, via FTP), you get a new “Dynamic Collection” tab in the dashboard where you can add a new collection. Pretty much at random, I chose an undergraduate course from Leeds Met – Civil Engineering – around which to build my dynamic collection. It’s then just a matter of adding a title and search terms, updating the feeds from the three source repositories and publishing:

This admittedly unsophisticated search returned 9 results:

Obviously the plug-in is only as effective as the keyword data / API / source repository(ies) it is using, and the fifth link here actually points at an entirely different resource (in Jorum) with no relevance to Civil Engineering, presumably due to an error at some point along its, er, conjugation. As the plug-in does not search Jorum directly, this must have come via Xpert, which does harvest Jorum. While experimenting with the plug-in I’ve also had instances where links have returned 404s or been otherwise broken, so one requirement, I think, would be the option to remove links from the collection that are incorrect, broken…or simply less relevant, to allow the WordPress administrator fuller control of the collection.
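Though the plug-in doesn’t offer this yet, part of the pruning step could be as simple as checking each collected link’s HTTP status and letting the administrator drop anything that doesn’t respond. A rough sketch in Python – the `http_status` helper and the item shape are my assumptions, not anything from the plug-in:

```python
# Sketch: flag collection items whose links are broken so an admin can prune them.
from urllib.request import urlopen
from urllib.error import URLError, HTTPError

def http_status(url, timeout=10):
    """Return the HTTP status code for a URL, or None if unreachable."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code
    except URLError:
        return None

def prune_collection(items, check_status=http_status):
    """Keep only items whose 'link' responds with 200; return (kept, dropped)."""
    kept, dropped = [], []
    for item in items:
        status = check_status(item["link"])
        (kept if status == 200 else dropped).append(item)
    return kept, dropped
```

The status checker is injectable so the filtering logic can be exercised (or swapped for a cached/batched check) without touching the network.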

The plug-in has also installed several new tabs under Settings, used to add a blog or podcast (I don’t think the Feed management / Collection statistics / Collection tabs are yet fully functional in the version I am testing):

Under the Dynamic Collection Options there are fields to add RSS feeds from blogs or podcasts:

I’m very optimistic about the potential of this approach to allow WordPressing course leaders, perhaps with support from learning technologists, to quickly and easily assemble a dynamic collection of OER for their students and look forward to the formal release of the finished product* – in the meantime, in true Blue Peter stylee, here are a number of collections that Pat made earlier to give a sense of what should be possible:

* The only caveat from my perspective is that my own institution does not formally support the use of WordPress. Nevertheless, there is certainly a requirement, explicitly identified by senior stakeholders, to develop tools to cross-search Open Educational Resources and, in this context, I think we can learn a lot from the Triton project.

Motivated by this post on the OpenSpires blog from @patlockley, I’ve been experimenting with WordPress with a view, ultimately, to providing a one-stop OER environment for my institution. Pat has written a plug-in that allows the WordPress admin to specify search terms to create (a) dynamic collection(s) from Xpert, Merlot and OER Commons via their APIs (it also searches Wikipedia, Wikibooks and Wikiversity for openly licensed materials, openly licensed blogs on politics, and Mendeley for journals, as well as political podcasts from OpenSpires). For examples of the plug-in in action see http://politicsinspires.org/oer/political-theory/
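Conceptually the aggregation step is straightforward: query each source with the same terms and merge the results into one collection, de-duplicating on URL. A minimal sketch of that merge – the item shape is an assumption on my part; the real plug-in talks to the Xpert, Merlot and OER Commons APIs, whose request and response formats all differ:

```python
# Sketch: merge per-source search results into one de-duplicated collection.
def merge_results(result_sets):
    """Merge lists of {'title', 'link', 'source'} dicts, de-duplicating on link."""
    seen, merged = set(), []
    for results in result_sets:
        for item in results:
            if item["link"] not in seen:   # first source to return a URL wins
                seen.add(item["link"])
                merged.append(item)
    return merged
```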

The plug-in isn’t yet publicly available – I’m hoping that I can have a go fairly soon *waves at Pat*. I’m no WordPress developer and am just finding my way round a test install of the platform, experimenting by pulling in different feeds from various sources (our own repository, Jorum, HumBox) using a plug-in called FeedWordPress – http://feedwordpress.radgeek.com/. It’s dead easy to syndicate one (or multiple) feeds to a designated posts page, but what I can’t figure out is how I might push different feeds to different pages so I could, say, have one page that auto-publishes from the Leeds Met repository, one from Jorum, one from HumBox etc.
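The missing step is essentially per-source routing. A sketch of the idea, assuming each syndicated item carries an identifier for the feed it came from – the feed identifiers, page slugs and item shape here are illustrative, not FeedWordPress’s own data model:

```python
# Sketch: route syndicated items to per-source pages instead of one posts page.
FEED_TO_PAGE = {
    "leedsmet": "leeds-met",   # hypothetical feed identifiers -> page slugs
    "jorum": "jorum",
    "humbox": "humbox",
}

def route_items(items, feed_to_page=FEED_TO_PAGE):
    """Group syndicated items into {page_slug: [items]} by their source feed."""
    pages = {}
    for item in items:
        slug = feed_to_page.get(item["feed"], "unsorted")
        pages.setdefault(slug, []).append(item)
    return pages
```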

Below: Syndicated posts from Jorum (HE – Architecture, Building and Planning) to a “Jorum” page…but how can I push separate HumBox and Leeds Met feeds to the respective pages?

Owen helped me put together a simple pipe that took this feed and used regex to replace an identifier from the record and redirect to the Open Search URL (which is built from this identifier). Like this:
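The heart of the pipe, sketched in Python: a regex pulls the record identifier out of the item’s link and substitutes it into the Open Search URL. Both the identifier pattern and the URL template below are illustrative assumptions, not the exact strings the real pipe uses:

```python
# Sketch: rewrite a publicURL-style link into a link to the Open Search record page.
import re

# hypothetical Open Search record-page template
OPEN_SEARCH = "http://repository.leedsmet.ac.uk/main/view_record.php?identifier={id}"

def open_search_link(url):
    """Extract the record identifier from a link and rebuild it as an Open Search URL."""
    match = re.search(r"id=(\d+)", url)  # assumed identifier pattern
    if match is None:
        return url  # leave unrecognised links untouched
    return OPEN_SEARCH.format(id=match.group(1))
```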

So far so good…however, when I originally defined my Application Profile for research in intraLibrary I used multiple instances of &lt;lom:description&gt;, with the first holding the ISSN (frustratingly missing from intraLibrary’s Bib extensions) and the abstract held in a second instance of &lt;lom:description&gt;, meaning it isn’t exposed in an RSS feed.

So…I thought that if I used an SRU query instead as a pipe input there would be a lot more data to play with in Pipes and hopefully I would be able to build a better RSS feed – including author and abstract.

After some initial problems with Pipes taking an SRU input, Owen responded to my plea for help with this pipe that extracts title and abstract from the SRU by defining the path through the XML to the relevant fields and mapping them to title and description:
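The equivalent of Owen’s field-mapping step, sketched with Python’s ElementTree: follow the path through the SRU XML to each record’s title and description and map them to RSS-style fields. The namespace URIs are the usual SRW/LOM values but are assumptions here, as is the exact record layout:

```python
# Sketch: map each SRU record's LOM title/description to RSS title/description.
import xml.etree.ElementTree as ET

NS = {
    "srw": "http://www.loc.gov/zing/srw/",
    "lom": "http://ltsc.ieee.org/xsd/LOM",
}

def sru_to_items(xml_text):
    """Return a list of {'title', 'description'} dicts, one per SRU record."""
    root = ET.fromstring(xml_text)
    items = []
    for record in root.findall(".//srw:record", NS):
        general = record.find(".//lom:general", NS)
        if general is None:
            continue
        title = general.findtext("lom:title/lom:string", default="", namespaces=NS)
        desc = general.findtext("lom:description/lom:string", default="", namespaces=NS)
        items.append({"title": title.strip(), "description": desc.strip()})
    return items
```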

However, I still need to figure out how to link the title to the respective record on Open Search. There is a link in the XML but this is no good as, once again, it points to the resource in the wild rather than the record on Open Search…somehow I need to use the identifier to build a link to the respective record on Open Search.

And frankly now I’m a bit stumped again – the regex function from the first pipe presumably needs to be in there somewhere…first vague attempt (doesn’t actually return any output – but this is a brain dump!):

However, due to the way I initially mapped intraLibrary LOM, the abstract is held in a second instance of the description field, so my RSS feeds are just title (I’ve got the ISSN in the first instance, which I hide). I’m sure there must be a way to produce an RSS feed from an SRU query including this field using Yahoo Pipes…this is the SRU query:

I’ve been enjoying exploring Yahoo Pipes and have now managed to generate a feed for OER that also incorporates the author vCard; I’m sure it must also be possible to extract other data including rights information (though I’m not quite sure how!):

(vCard not visible in the Pipe but is displayed if the Pipe is rendered as RSS)

It also occurred to me that it may be possible to use a similar method to extract the abstract for research RSS feeds (which is in the second description field) and I’ve been grappling with Pipes to this end, but am struggling at the moment – tantalisingly, I am able to construct a “Path to item list” that returns an individual abstract but, for the life of me, I can’t figure out how to return abstracts for all records – SRW:records.SRW:record.1.SRW:recordData.lom:lom.lom:general.lom:description.1.lom:string.content
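The wrinkle is that the path contains two numeric indices doing different jobs: the one after SRW:record pins the path to a single record (which I don’t want), while the one after lom:description selects the second description instance, the abstract (which I do). The selection I’m after, sketched in Python with the SRW/LOM namespaces and record layout assumed:

```python
# Sketch: pull the second lom:description (the abstract) from EVERY record.
import xml.etree.ElementTree as ET

NS = {"srw": "http://www.loc.gov/zing/srw/",
      "lom": "http://ltsc.ieee.org/xsd/LOM"}

def all_abstracts(xml_text):
    """Return the second description instance from each SRU record."""
    root = ET.fromstring(xml_text)
    abstracts = []
    for record in root.findall(".//srw:record", NS):   # iterate, don't index
        strings = record.findall(".//lom:description/lom:string", NS)
        if len(strings) > 1:                            # first instance holds the ISSN
            abstracts.append(strings[1].text.strip())
    return abstracts
```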

I’ll certainly continue to use Pipes as it’s a very powerful tool, especially for a non-programmer like me, however, for the time being this mini-project is postponed. Nothing is ever wasted though and I’m sure what I’ve learned will come in useful somewhere down the line….

This afternoon I’ve been grappling with Yahoo Pipes, trying to generate a feed that incorporates metadata from an SRU query. I’ve made a modicum of progress and, as I’m now a bit stuck, this is just a quick post to document that modicum before it all leaks away over the weekend.

The Pipe module that seems most appropriate is “Fetch Data”: “This module retrieves any XML, JSON, iCal or KML file and tries to extract a list of elements using the provided path parameter.”

I input an appropriate URL to query “ukoer” which will return all OERs uploaded as part of UniCycle:

And read the instructions here which describe how to use the “Path to item list” field in order to extract just a portion of the data by listing the nested XML elements, separating each with a dot (“.”)

After a bit of trial and error I was able to zero in on the “Description” field for the first record by entering SRW:records.SRW:record.0.SRW:recordData.lom:lom.lom:general.lom:description.lom:string.content into “path to item list”

However, when I attempt to run the pipe, nothing is returned – though the field IS displayed correctly in the debugger panel:

Also, what I would like to do of course, is to return the “Description” field for ALL records but I have no idea how to achieve this…I’ll have another look when I’m nice and fresh on Monday morning!
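As I understand it, the numeric segment in the path is what restricts it to one record: “Path to item list” walks the parsed document one dot-separated segment at a time, a number selects a single element of a list, and walking a list *without* an index fans out over every element – which is what returning all records needs. A toy re-implementation over nested dicts/lists (my approximation of how Pipes behaves, not its actual code):

```python
# Toy version of Pipes' "Path to item list" over nested dicts/lists.
def fetch_path(data, path):
    """Follow a dot-separated path; fan out over any list hit without an index."""
    nodes = [data]
    for segment in path.split("."):
        next_nodes = []
        for node in nodes:
            if isinstance(node, list):
                if segment.isdigit():
                    next_nodes.append(node[int(segment)])      # pick one element
                else:
                    next_nodes.extend(child[segment] for child in node)  # fan out
            else:
                next_nodes.append(node[segment])
        nodes = next_nodes
    return nodes
```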

In theory RSS is simple – Really Simple – but the way the technology is implemented by our underlying repository software (intraLibrary), and the way we have needed to integrate that software within our repository infrastructure in order to ensure appropriate Open Access, has meant that, in reality, it has been anything but.

Broadly speaking, the issues are twofold:

The fields exposed by the intraLibrary RSS feed are limited to “Title” and “Description”.

The URL exposed by the feed points to the public URL generated by intraLibrary (which is simply the resource itself, either a file or a URL, i.e. without the context of the metadata record), whereas I need it to point to the Open Search metadata page.

I have been aware of these issues for some time but finding a resolution has been elevated in priority recently due to two separate, though similar, use-cases being explored by JorumOpen and the Xpert project at Nottingham University, which effectively seek to extend RSS from simply being a notification system to potentially also being used to harvest repository content (some are of the view that this is not an appropriate use of the technology and that there are established technologies more suitable, specifically OAI-PMH and SWORD – see Intrallect’s Charles Duncan’s contribution to the discussion on Lorna Campbell’s blog). The full discussion on Lorna’s post is certainly worth reading and also includes several contributions from Xpert’s Julian Tenney and Pat Lockley.

By now I have had an extended correspondence with Pat, initially prompted when I submitted the generic intraLibrary feed to be harvested by Xpert. Preliminary feedback was that when Xpert tried to harvest our feed, a randomly generated “key” appeared to be added to each URL, meaning they were seeing these URLs as new resources whereas they were actually duplicates (a consequence of the manner in which intraLibrary generates a publicURL, with a new machine-generated “key”, each time a record is returned).

As I described in a recent post, a little bit of Twitter serendipity – and specifically input from Owen Stephens – subsequently led me to use Yahoo Pipes to redirect to the Open Search metadata page instead of the publicURL. Yahoo Pipes also allows a pipe to be rendered as RSS, and it occurred to me that this new feed could be resubmitted to Xpert for harvest. Sure enough, Pat confirmed that the feed rendered from the pipe could indeed be harvested and that each URL was definitely unique (generated, of course, from resource unique IDs by http://repository.leedsmet.ac.uk/main/index.php – see https://repositorynews.wordpress.com/2009/11/09/leeds-met-repository-open-search-version-2-0/ for more info). However, Pat also emphasised that it would be nice to find a generic way of harvesting without the pipe, and for the time being I’m not sure we are in a position to implement such a solution; the work we have done at Leeds Met on the Open Search interface, and my personal obsession with redirecting RSS feeds, is quite distinct from Intrallect’s primary commercial interest and so unlikely to be *officially* supported by the company. (Note: as always, Intrallect have been very supportive and have proactively supported the development of our infrastructure throughout; it’s just that this isn’t a priority for them in the same way.)

As Owen has pointed out, all the Pipe does is take the ID, which is part of the item’s GUID in the original RSS, and construct a link to the metadata page which is put back into the RSS feed. You could obviously do this programmatically (which might be a solution to Pat’s requirement of harvesting without the Pipe?) – it was just easier to throw something together quickly using Pipes, and we simply don’t have the resources to explore another (programmatic) solution in detail.

Owen has also suggested that, as intraLibrary supports OAI-PMH, this would be the ‘supported’ mechanism for harvesting which really brings us full circle back to Charles’ argument referred to earlier in this post.

Note: On Lorna’s blog, Julian argues back at Charles that “OAI-PMH/SWORD etc are big technical barriers for many people who have resources to expose and that anyone can make a feed”; though in a subsequent comment, Pat does acknowledge that “it might be logical that if we are making a second RSS for harvesting we might use some other technology instead.”

All of which doesn’t get me very much further with my other RSS issue: the limited metadata, with just &lt;title&gt; and &lt;description&gt; exposed by RSS. It was this that seemed to be more of an issue for harvest by JorumOpen, with feedback from Gareth Waller confirming that the feed from intraLibrary “does not represent the metadata for the individual items (or the top level feed) in DC format (except of course date)” and that, in order for our feed to be processed with the code Gareth has implemented in JorumOpen, the feed would “need to contain DC metadata for each of the items and, more importantly, licence information”. The feed from the Pipe, of course, still comprises just &lt;title&gt; and &lt;description&gt; and so would not be suitable for registering our OER content in JorumOpen. (N.B. The limited metadata will surely also affect the quality of metadata harvested and searchable via Xpert?)
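For reference, an item carrying the kind of DC and licence metadata Gareth describes might look something like this – the element choices (dc:rights for the licence, a CC licence URL, the sample values) are my guesses at the sort of thing JorumOpen expects, not a confirmed format:

```xml
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Leeds Met OER</title>
    <item>
      <title>Example resource</title>
      <description>An abstract for the resource.</description>
      <link>http://repository.leedsmet.ac.uk/main/index.php</link>
      <dc:creator>A. N. Author</dc:creator>
      <dc:date>2010-05-14</dc:date>
      <dc:rights>http://creativecommons.org/licenses/by/2.0/uk/</dc:rights>
    </item>
  </channel>
</rss>
```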

What would really make my life easier, I think, is for JorumOpen and Xpert to harvest my metadata using OAI-PMH (go on, you know you want to!) but for the time being I am just happy to have found a method to generate RSS feeds that point to my static URLs at http://repository.leedsmet.ac.uk/main/index.php!