A Symbiotic Relationship: Information Architecture and the Semantic Web

Linked open data technology and the Semantic Web open up new opportunities for information architects in connecting, managing and publishing information. Information architects, in turn, can help to make resources published through the Semantic Web more user friendly. Traditionally, locating and querying information within the Semantic Web has been difficult for general web users who are not familiar with the structure and query response formats commonly used by linked open data projects. However, by combining user-centered design principles from information architecture with the organization and curation of information through linked open data technologies, librarians, information scientists and information architects can work together to create new, intuitive and powerful web interfaces for connecting researchers and readers to the information they are seeking.

The BBC’s Wildlife Finder is a notable example of how linked open data technologies and web resources can be leveraged by information architects to create dynamic and coherent websites. Information architects at the BBC were challenged by the lack of coherence across the BBC’s family of websites. Traditionally the BBC relied on microsites, each built around a specific television program, show or series. Although visitors could browse within the confines of each microsite, there was no way to combine information held in different microsites. Information therefore remained arbitrarily siloed according to the program or series in which it appeared, making it harder for users to work out where on the BBC website to look without resorting to a search box or returning to Google or another search engine.

Before the creation of the BBC Wildlife Finder site, resources related to a specific species of animal were scattered throughout the BBC’s web content in different program microsites, and deeper research into a species’ habitat or characteristics could require a user to perform additional searches. Looking for information on lions might have required a user to search through different nature programs and news stories and to sort through different content types (videos, pictures, articles, blog comments, etc.). Furthermore, a user who wanted additional information about lions’ natural habitats would have to perform different searches or follow different paths to find it, beginning the exhaustive search process once again.

To tackle these issues, the BBC’s information architects first considered users’ needs. The BBC’s UX researchers hypothesized that casual users are less interested in a specific document than in a concept, or in finding out about a specific ‘thing.’ That thing might be a lion when they first engage with a website and, through browsing, could later become animal social structures, carnivores, or even antelope. The information architects and user experience researchers then tested this model with users to ensure that their designs would be consistent with users’ mental models.

Building on the model of a ‘thing’- or ‘concept’-driven website, in which concepts in turn shepherd documents and content, the BBC began to publish species, habitat and behavior pages for its wildlife clip archive project. By publishing hub pages where all of the information about a specific topic is presented and linked, the BBC created a densely connected network of information and links rather than the siloed video clip archive it could have built instead.

The BBC’s information architects took this model further by using DBpedia’s unique identifiers for each ‘thing’ (species, habitat, and behavior in the case of the BBC Wildlife Finder) as the unique identifiers within the BBC’s website. If the topic of a hub page did not already have a Wikipedia entry, and therefore a DBpedia URI (uniform resource identifier), the BBC used the information it had on the topic to create a new Wikipedia entry, thereby minting a new DBpedia URI.
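The identifier-sharing idea can be sketched in a few lines of Python. This is only an illustration, not the BBC's implementation: it assumes the common convention that a DBpedia resource URI is the base http://dbpedia.org/resource/ followed by the Wikipedia article title with spaces written as underscores. The function names, the set of characters left unescaped, and the local "/things/" path are hypothetical.

```python
from urllib.parse import quote

# Assumed convention: DBpedia resource URIs share this base.
DBPEDIA_RESOURCE_BASE = "http://dbpedia.org/resource/"


def dbpedia_uri(wikipedia_title: str) -> str:
    """Derive a DBpedia-style resource URI from a Wikipedia article title."""
    # Wikipedia and DBpedia write spaces in titles as underscores.
    slug = wikipedia_title.strip().replace(" ", "_")
    # Percent-encode other characters; which ones stay unescaped is an
    # assumption made for this sketch.
    return DBPEDIA_RESOURCE_BASE + quote(slug, safe="_(),")


def hub_page_path(wikipedia_title: str) -> str:
    """Hypothetical local hub-page path that reuses the same key as the URI."""
    key = dbpedia_uri(wikipedia_title)[len(DBPEDIA_RESOURCE_BASE):]
    return "/things/" + key


print(dbpedia_uri("African bush elephant"))
print(hub_page_path("Lion"))
```

Because the website and DBpedia agree on one key per concept, anything else published about that URI elsewhere can be joined to the hub page without a separate mapping table.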

Additionally, because the site uses DBpedia URIs and linked open data, a box on each BBC Wildlife Finder entry page now includes general information on the topic harvested from Wikipedia. Connecting the Wildlife Finder pages to Wikipedia ensures that even species pages with only one video attributed to them have built-in contextualizing text, without the extra work of writing new content. Furthermore, by using linked open data to connect content with concepts, new videos about lions can be published automatically onto the lion page without delay or manual updating.
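A minimal sketch of this concept-keyed publishing, under stated assumptions: the class and field names are hypothetical, the clip titles and summary text are made-up placeholders, and the summary lookup is a plain dictionary standing in for text harvested from Wikipedia. The point is only that a page is assembled from whatever is tagged with a concept's URI, so a newly tagged clip appears without anyone editing the page.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    title: str
    concept_uri: str  # the URI of the 'thing' this clip is about


class ConceptIndex:
    """Hypothetical in-memory index from concept URI to tagged clips."""

    def __init__(self):
        self._clips = defaultdict(list)

    def publish(self, clip):
        # Tagging a clip with a concept URI is the only publishing step.
        self._clips[clip.concept_uri].append(clip)

    def page_for(self, concept_uri, abstracts):
        # The page is built on demand from everything sharing the URI.
        return {
            "uri": concept_uri,
            "summary": abstracts.get(concept_uri, ""),
            "clips": [c.title for c in self._clips.get(concept_uri, [])],
        }


LION = "http://dbpedia.org/resource/Lion"
index = ConceptIndex()
index.publish(Clip("Sample clip one", LION))
abstracts = {LION: "Placeholder summary standing in for harvested text."}

before = index.page_for(LION, abstracts)
index.publish(Clip("Sample clip two", LION))  # no page is edited by hand
after = index.page_for(LION, abstracts)
print(before["clips"], after["clips"])
```

A page with a single clip still carries its summary text, which mirrors the contextualizing role the Wikipedia box plays on sparsely populated species pages.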

Using linked open data also opens up the opportunity for the BBC’s information to be linked to other institutions that use linked open data, including news sources like the New York Times and many national libraries and museums. The BBC could, for example, create a widget to connect visitors on a species page to resources available at their local library. The BBC has also created more ways for other institutions and individuals to link and connect to its assets, thereby improving its search rankings and creating new ways for these resources to be discovered.

After the success of the BBC Wildlife Finder, the BBC began to apply this model throughout its family of websites, including its news, sports and music websites. By creating URIs for BBC programs, people and things, the BBC has created new paths for content to be interconnected and contextualized. By creating RDF triples that connect BBC hosts to content, the BBC can now publish information about all of the shows that a host has presented, and can even create ways for BBC journalists to aggregate news stories or essays into relevant topics, columns or subjects.
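To make the idea of triples concrete, here is a small sketch that stores subject, predicate, object statements as Python tuples and answers the question "which programs has this host presented?" by pattern matching. Every URI, the predicate name, and the sample programs are hypothetical placeholders under example.org; they are not the BBC's actual identifiers or vocabulary.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

EX = "http://example.org/"
PRESENTED_BY = EX + "vocab/presentedBy"  # hypothetical predicate

# Made-up sample statements linking programs to their hosts.
TRIPLES: List[Triple] = [
    (EX + "programme/sample-one", PRESENTED_BY, EX + "person/host-a"),
    (EX + "programme/sample-two", PRESENTED_BY, EX + "person/host-a"),
    (EX + "programme/sample-three", PRESENTED_BY, EX + "person/host-b"),
]


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples matching the pattern; None acts as a wildcard."""
    for triple in triples:
        if all(want is None or want == got
               for want, got in zip((s, p, o), triple)):
            yield triple


def programmes_presented_by(host_uri: str) -> List[str]:
    """List the subjects of every 'presented by' triple for one host."""
    return [s for s, _, _ in match(TRIPLES, p=PRESENTED_BY, o=host_uri)]


print(programmes_presented_by(EX + "person/host-a"))
```

The same pattern-matching function could serve a topic page equally well by swapping the predicate, which is what makes a single graph of triples reusable across news, sports and music content.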

Using principles of information architecture together with linked open data web technologies can help not only to strengthen the Semantic Web but also to promote the use and reuse of the assets and resources that cultural heritage institutions publish online as linked open data. Linked open data makes these assets easier to connect and point to, thereby increasing the findability of these institutions online through increased search rankings and new online connections. Unfortunately, traditional methods of querying linked open data, such as SPARQL queries or API requests, can be difficult and confusing for many web searchers, and often return results formatted for computers rather than human readers (JSON files, RDF triples, CSV files, etc.). However, when information architects incorporate linked open data resources into the design and framework of websites, this information can be used to automate the publishing and curation of content in a human-readable format.
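The gap between machine-oriented results and readable content can be illustrated with a short sketch. It assumes a response in the W3C SPARQL JSON results layout, where variable names sit under "head" and rows sit under "results" and "bindings". The sample response below is hand-written placeholder data, not real output from DBpedia or any other endpoint, and the function name is hypothetical.

```python
import json

# Hand-written sample in the SPARQL JSON results layout (placeholder values).
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["label", "abstract"]},
  "results": {"bindings": [
    {
      "label": {"type": "literal", "xml:lang": "en", "value": "Lion"},
      "abstract": {"type": "literal", "xml:lang": "en",
                   "value": "Placeholder abstract text."}
    }
  ]}
}
"""


def render_bindings(raw_json: str) -> str:
    """Turn SPARQL JSON result rows into plain 'Name: value' text."""
    data = json.loads(raw_json)
    variables = data["head"]["vars"]
    blocks = []
    for row in data["results"]["bindings"]:
        # A variable may be unbound in a given row, so skip missing ones.
        lines = [f"{var.capitalize()}: {row[var]['value']}"
                 for var in variables if var in row]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


print(render_bindings(SAMPLE_RESPONSE))
```

On a real site this rendering step would sit behind the page template, so that visitors see the label and summary while the query and its JSON response stay out of view.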

This symbiotic relationship between user-centered design and the Semantic Web could create new ways for librarians, information scientists and information architects to collaborate on intuitive, dynamic websites that leverage open access materials and are built with users’ mental models as the underlying architecture. One potential application of this collaboration would be a new model for library guides, in which librarians use the Semantic Web to bring together resources from their own collections with open access resources on topic pages. These pages, designed with users in mind, could include infoboxes populated from Wikipedia or even from collections like the BBC Wildlife Finder. By shifting the focus of library websites from document management to the creation of strong content-driven models, with an emphasis on users’ needs and desires, librarians would be able to curate online resources that help to build a dense network of cultural heritage links and assets.