The OU’s VC, Martin Bean, gave the opening keynote, and I have to admit it really did make me feel that the OU is the best place for me to be working at the moment :-)

… though maybe after embedding that, my days are numbered…? Err…

Anyway, I feel like I’ve not really been keeping up with other Martin’s efforts, so here’s a quick hack a placemarker/waypoint in one of the directions I think the captioning could go – deep search linking into video streams (where deep linking is possible).

Rather than search the content, we’re going to filter captions for a particular video, in this case the twitter caption file from Martin (other, other Martin?!) Bean’s #JISC10 opening keynote. The pipework is simple – grab the URL of the caption file and a “search” term, parse the captions into a feed with one item per caption, then filter on the caption content. I added a little Regular Expression block just to give a hint as to how you might generate a deeplink into content based around the tart time of the caption:

One thing to note is that it may take some time for someone to tweet what someone has said. If we had a transcript caption file (i.e. a timecoded transcript of the presentation) we might be able to work out the “mean time to tweet” for a particular event/twitterer, in which case we could backdate timestamps to guess the actual point in the video that a person was tweeting about. (I looked at using auto-genearated transcript files from Youtube to trial this, but at the current time, they’re rubbish. That said, voice search on my phone was rubbish a year ago, but by Christmas it was working pretty well, so the Goog’s algorithms learn quickly, especially where error signals are available. So bear in mind that if you do post videos to Youtube, and you can upload a caption file, as well as helping viewers, you’ll also be helping train Google’s auto-transcription service (because it’ll be able to compare the result of auto-transcription with your captions file…. If you’re the Goog, there are machine learning/supervised learning cribs everywhere!))

(Just by the by, I also wonder if we could colour code captions to identify in a different colour tweets that refer to the content of an earlier tweet/backchannel content, rather than the foreground content of the speaker?)

Unfortunately, caption files on Youtube, which does support deep time links into videos, only appear to be available to video owners (Youtube API: Captions), so I can’t do a demo with Youtube content… and I so should be doing other things that I don’t have the time right now to look at what would be required deeplinking elsewhere…:-(