Just to give you a quick bit about my background and why I’m here: I’ve been working on a project called Ensemble since October 2008, which is funded under the Technology Enhanced Learning research programme. Ensemble has been conducting research into case-based learning in higher education, and exploring the potential for semantic technologies to support this. It involves six UK universities; I’m based at two of the institutions involved, with posts at the University of Cambridge and here at City, where I work in the Learning Development Centre. But what do I mean when I say ‘semantic technologies’? When I talk to people about the project, they’ve often heard of the semantic web or semantic technologies, but these terms are not synonymous and need clarification.

To get to grips with semantic technologies, we need to talk about the semantic web first. I know I just said that they aren’t the same thing, but you need to be a bit familiar with the semantic web for semantic technologies to make sense. The semantic web is the concept of all the data and information online being available in standardised, machine-readable formats. Standardised formats would mean that computers can apply logic and reasoning across datasets. You might be thinking, isn’t the internet machine-readable anyway? Just because information is digital doesn’t necessarily mean it is machine-readable. Machine-readable means that computers can actually make sense of the data that is presented, to an extent: they can reason across datasets and make inferences from them. I’m not going to go too much into the technical details about the semantic web, but I find it helpful to think about two principles being required for the semantic web to work: one, that data and presentation are separated; two, that data is structured.

Taking that first point, that data and presentation are separated. The information should exist separately from the way it is being displayed. For example, let’s consider this graph showing changes in the level of carbon dioxide in the earth’s atmosphere. If it was embedded in a website as a static picture file like a bitmap or jpeg, because there is no actual numerical data in the file, it’s just a picture, all that a computer can do with it is to display the picture on the screen. But, if the actual data exists as a file, and it is displayed on the screen using a visualisation tool rather than as a static picture, there is much more that can be done with it. You could bring in other datasets too – for example, here’s how oxygen levels varied over the same period. The computer might then be able to apply reasoning to detect the dramatic fall in CO2 levels here, and cross-reference this with data about developments in plant morphology from the fossil record to find a correlation with the appearance of roots, and suggest further reading about types of plants that appeared in that period. Having data being independent from presentation also means that if you wanted to take that data and use it in a different visualisation, you wouldn’t have to start again from scratch.

This kind of thing is only possible if the data is available in standard formats, which are logically structured so that computers can read them. The standard format for data in the semantic web is something called RDF, which is short for ‘Resource Description Framework’. This format allows hierarchies within information to be maintained. To think about RDF and its importance, let’s take this piece of text about me as an example. A person can read it and comprehend it quite easily, but the semantic web would need it to be formatted in a way that a machine could make use of. This is the same information but rendered in RDF. It first specifies the vocabulary that it is going to use – this sets out the available fields for the tags here – and because the information is about a person, we’re going to use one called FOAF, which stands for ‘friend of a friend’. So it’s using my homepage here as a unique identifier, and then specifying that I’ve got the following properties: my name, title, workplace homepage, and current project. The semantic web would then be able to use this information if you wanted to run queries like ‘show me all my publications’, using the name tag to search bibliographic databases, or ‘who does Katy work with’, looking for others sharing the currentProject tag.
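To make ‘machine-readable’ a bit more concrete, here is a minimal sketch in Python of how a program might read the FOAF description from the slide and answer a ‘who works on this project’ style of query. It uses only the standard library's XML parser rather than a proper RDF toolkit, and the function names (`describe_people`, `people_on_project`) are made up for illustration; a real semantic web agent would do far more than this.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF_NS = "http://xmlns.com/foaf/0.1/"

# The FOAF description from the slide, tidied so that it is well-formed XML.
RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:about="http://www.katyjordan.com">
    <foaf:name>Katy Jordan</foaf:name>
    <foaf:title>Miss</foaf:title>
    <foaf:workplaceHomepage>http://www.city.ac.uk</foaf:workplaceHomepage>
    <foaf:currentProject>Ensemble</foaf:currentProject>
  </foaf:Person>
</rdf:RDF>"""


def describe_people(rdf_xml):
    """Map each person's identifier (rdf:about) to a dict of their properties."""
    root = ET.fromstring(rdf_xml)
    people = {}
    for person in root.findall(f"{{{FOAF_NS}}}Person"):
        identifier = person.get(f"{{{RDF_NS}}}about")
        properties = {}
        for child in person:
            # ElementTree tags look like "{namespace}name"; keep the local part.
            local_name = child.tag.split("}", 1)[1]
            properties[local_name] = (child.text or "").strip()
        people[identifier] = properties
    return people


def people_on_project(people, project):
    """Hypothetical query: identifiers of everyone sharing a currentProject."""
    return sorted(
        identifier
        for identifier, properties in people.items()
        if properties.get("currentProject") == project
    )


people = describe_people(RDF_XML)
print(people["http://www.katyjordan.com"]["name"])
print(people_on_project(people, "Ensemble"))
```

The point is simply that once the name, title and project are labelled fields rather than words in a sentence, a query becomes a lookup instead of a guess.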

A great deal of work has gone into the technical issues, such as the specifications for different data formats and vocabularies; although this is important, it’s been largely unseen and behind the scenes. The semantic web is still a bit patchy, and there are plenty of data providers about which you find yourself wishing ‘if only so-and-so type of data was available in a SW-ready format…’, but there has been progress in recent years; some high-profile providers of semantic web data are data.gov.uk, Ordnance Survey, and various other providers involved in the Linked Data initiative. It is an enormous, seismic shift for the internet, so it’s understandably a slow process, but you’ve probably used semantic web-based websites without even realising it; LinkedIn and the BBC website both use semantic technology, for example.

Having established what is meant by the semantic web, and ascertained that ‘creating the semantic web’ is a task far beyond the scope of one project, what then are semantic technologies? ‘Semantic technologies’ doesn’t have a fixed definition, but instead has been applied as a generic term to cover a wide range of online technologies which are constructed with the principles behind the semantic web in mind. Semantic technologies use the benefits of the semantic web vision, even if they are not in a position to fully exploit the semantic web itself yet. So, one of the implications of a machine-readable web is the ability to integrate different types of data; another is that if you’re making the data easier for machines to read, visualisations are going to be important to make it accessible to humans again; or they might use ontologies or taxonomies to structure complex data, or perhaps use simple artificial intelligence such as recommender systems. There are no fixed ‘rules’ though about what semantic technologies do or how they work; any tool which uses one or more of these affordances could be argued to be a semantic technology.

One of the main types of semantic technology we’ve been using throughout the Ensemble Project is the SIMILE toolkit. This was developed at MIT, with the aim of making it easier for non-technical people to develop semantic applications. It’s a collection of tools, including converters that turn spreadsheet data into semantic web-style formats, and visualisation tools. The examples that I’m about to show you were created using SIMILE, using the principles of having structured data separate from presentation, by people involved in education in different disciplines.

I just want to take a minute here to unpick these stages a bit and show you what’s going on ‘behind the scenes’ at each of them. Existing practice was to list references at the end of each lecture handout. Each lecture handout was available as a PDF, like this one. There isn’t much you can do with this; each lecture is isolated.

The first and most labour-intensive step of translating this information into a semantic application involved collecting the references from all the lecture handouts and putting them into a spreadsheet. Information about which lecture handout they feature in now becomes a type of metadata for each reference. Splitting up the citation data into different columns allows each column to become a potential way of filtering the data. This was labour-intensive to populate initially, but once you’ve got the initial spreadsheet, it’s easy to make changes from year to year.

The spreadsheet was then converted to a format called JSON, using an online web service called Babel, which is provided by SIMILE. If we take a look at this, it is starting to look a lot more like the RDF we saw earlier. The data is now ready to be hooked into a visualisation, such as Exhibit. Exhibits are normal HTML pages, which call the Exhibit API from SIMILE in the header. This then means that you can use what’s known as the ‘dot notation’ to specify where the page should display information from different fields in the spreadsheet; for example, this puts in a facet allowing the data to be filtered by lecture.
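As a rough illustration of the spreadsheet-to-JSON step, here is a minimal sketch in Python. It is not Babel itself, just a stand-in that shows the shape of the transformation. The column names and the two placeholder rows are hypothetical, and the output layout (a top-level `items` list where each record has a `label` and a `type`, with semicolons separating multiple values in a cell) is my assumption about what an Exhibit page expects, so check it against the SIMILE documentation before relying on it.

```python
import csv
import io
import json

# Hypothetical reading-list spreadsheet, exported as CSV. The rows are
# placeholders, not real references from the course.
SPREADSHEET_CSV = """label,author,year,lecture
Example reference A,Author One,2001,Lecture 1
Example reference B,Author Two,2005,Lecture 1;Lecture 3
"""


def spreadsheet_to_items(csv_text, multi_value_columns=("lecture",),
                         item_type="Reference"):
    """Turn each spreadsheet row into a record; each column becomes a field."""
    items = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        item = {"type": item_type}
        for column, value in row.items():
            value = (value or "").strip()
            if not value:
                continue  # leave empty cells out of the record
            if column in multi_value_columns:
                # Assumed convention: semicolons separate multiple values.
                parts = [part.strip() for part in value.split(";") if part.strip()]
                item[column] = parts if len(parts) > 1 else parts[0]
            else:
                item[column] = value
        items.append(item)
    return {"items": items}


data = spreadsheet_to_items(SPREADSHEET_CSV)
print(json.dumps(data, indent=2))
```

Because the lecture a reference appears in is just another field on the record, a page built on this data can offer it as a filter without any extra work.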

This tool is called Maths for Engineers, and it was created in a very similar way. This is a collection of resources intended to help new engineering students with the preparatory problems booklet they have to complete before starting their course at Cambridge. The resources were initially kept in a website with a hierarchical structure of pages, which made it difficult to find the right help. Now, this data-driven page means that students can approach the resources from different angles, from a broad topic to a specific question.

This is an example based on data from an online database called the Global Biodiversity Information Facility, using data about the distribution of a number of rare plant species in the UK. Like the others, it is spreadsheet-based, which allows faceted browsing, that is, filtering the records by the values in any of the columns; the map visualisation can be used if you have location-based data, and also means you can bring in overlays from Google Maps too.
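Faceted browsing is easier to see than to describe, so here is a minimal sketch of the idea in Python. The records and field names are hypothetical placeholders rather than real distribution data, and the two functions are only an illustration of the principle, not how Exhibit implements it.

```python
from collections import Counter

# Hypothetical placeholder records; each key is a column in the spreadsheet.
RECORDS = [
    {"label": "Record 1", "species": "Species A", "county": "County X"},
    {"label": "Record 2", "species": "Species A", "county": "County Y"},
    {"label": "Record 3", "species": "Species B", "county": "County X"},
]


def facet_counts(records, field):
    """Count how many records fall under each value of one column."""
    return Counter(record[field] for record in records if field in record)


def apply_facets(records, selections):
    """Keep only records matching every selected facet value.

    `selections` maps a column name to the set of values ticked in that facet.
    """
    return [
        record
        for record in records
        if all(record.get(field) in values for field, values in selections.items())
    ]


species_facet = facet_counts(RECORDS, "species")
filtered = apply_facets(RECORDS, {"species": {"Species A"}, "county": {"County X"}})
print(dict(species_facet))
print([record["label"] for record in filtered])
```

Each facet is just a count of the values in one column, and ticking boxes in several facets narrows the list to the records that satisfy all of them.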

This is one of our most widely used tools, an interactive timeline of plant evolution. This visualisation tool brings together data about the various different biotic and abiotic factors which have influenced the evolution of plants over millions of years. It has proved popular with learners in Cambridge and also across the globe as we make it publicly available through our website.

The Programme Overview Browser was created by James Toner, who works in the City Law School. It uses the technology to allow students on the Bar Professional Training Course to browse the programme components in novel ways, such as by case, by week or by subject area, which is much more flexible and provides a clearer overview than conventional course documentation.

So: these are all examples of things that have been achieved by academic practitioners using semantic technologies. If you’re thinking ‘great! I’ve got something which I use in my teaching that would really benefit from this kind of transformation’, then to take this forward, we’re planning on setting up a community on Moodle. This presentation will go on there, along with demos and information about how they were created, and discussion forums for help and support. I hope this has been informative and interesting. Thank you for your attention – any questions?

5.
2. Structured data
Miss Katy Jordan works at City University, London where she works on the Ensemble Project. Her homepage is www.katyjordan.com.
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:about="http://www.katyjordan.com">
    <foaf:name>Katy Jordan</foaf:name>
    <foaf:title>Miss</foaf:title>
    <foaf:workplaceHomepage>http://www.city.ac.uk</foaf:workplaceHomepage>
    <foaf:currentProject>Ensemble</foaf:currentProject>
  </foaf:Person>
</rdf:RDF>

6.
Life with the Semantic Web
“At the doctor’s office, Lucy instructed her Semantic Web agent through her handheld Web browser. The agent promptly retrieved information about Mom’s prescribed treatment from the doctor’s agent, looked up several lists of providers, and checked for the ones in-plan for Mom’s insurance within a 20-mile radius of her home and with a rating of excellent or very good on trusted rating services. It then began trying to find a match between available appointment times supplied by the agents …” Berners-Lee et al, 2001

10.
Semantic technologies in education: Reading lists – ‘under the bonnet’
• To begin with, references were just listed at the end of each lecture handout (PDF)
• The references were collected into a single spreadsheet
• This was converted to JSON using Babel
• Then, data hooked into an Exhibit page

15.
Semantic technologies in education: What’s next?
• These are all examples of what can be achieved by practitioners with existing structured data
• We want you! (And your spreadsheets!)
• We will set up a community in Moodle soon to include examples, ‘how to’ guides and a forum to help you create your own semantic applications