Crowdsourcing Our Cultural Heritage

About this blog

Posts from a cultural heritage technologist on digital humanities, heritage and history, and user experience research and design. A bit of wishful thinking about organisational change thrown in with a few questions and challenges to the cultural heritage sector on audience research, museum interpretation, interactives and collections online.

Tag: Science Museum

This may be familiar to you if you’ve worked on a museum website: an object will capture the imagination of someone who starts to spread the link around, there’s a flurry of tweets and tumblrs and links (that hopefully you’ll notice in time because you’ve previously set up alerts for keywords or URLs on various media), others like it too and it starts to go viral and 50,000 people look at that one page in a day, 20,000 the next, furious discussions break out on social media and other sites… then they’re gone, onto the next random link on someone else’s site. It’s hugely exciting, but it can also feel like a missed opportunity to show these visitors other cool things you have in your collection, to address some of the issues raised and to give them more information about the object.

There are three key aspects to riding these waves of interest: the ability to spot content that’s suddenly getting a lot of hits; the ability to respond with interesting, relevant content while the link is still hot (i.e. within anything from a couple of hours to a couple of days); and the ability to put that relevant content on the page where fly-by-night visitors will see it.
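The first of those three abilities, spotting a page that is suddenly getting a lot of hits, could be sketched roughly as below. This is only an illustration under stated assumptions: it assumes hourly hit counts per page can be exported from your analytics tool, and the function name, URLs, thresholds and numbers are all hypothetical rather than taken from any real museum system.

```python
from statistics import median

def find_spikes(hourly_hits, min_hits=500, ratio=10.0):
    """Return pages whose latest hourly count is far above their usual level.

    hourly_hits maps a page URL to a list of hourly hit counts, oldest first.
    min_hits and ratio are illustrative thresholds, not recommended values.
    """
    spikes = []
    for url, counts in hourly_hits.items():
        if len(counts) < 2:
            continue  # no history to compare against
        *history, latest = counts
        # Use the median so one earlier busy hour does not hide a new spike,
        # and never let the baseline drop below 1 for very quiet pages.
        baseline = max(median(history), 1)
        if latest >= min_hits and latest / baseline >= ratio:
            spikes.append((url, latest, baseline))
    # Busiest pages first
    return sorted(spikes, key=lambda spike: spike[1], reverse=True)

# Hypothetical data: one object page goes viral, the others behave normally.
hits = {
    "/objects/artificial-arm": [12, 9, 15, 11, 2400],
    "/objects/steam-engine": [300, 280, 310, 295, 320],
    "/visit/opening-times": [900, 950, 880, 910, 940],
}
flagged = find_spikes(hits)
print(flagged)
```

Comparing against each page's own history, rather than a site-wide threshold, means a normally busy page like opening times is not flagged while a normally quiet object page is.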

For many museums, caught between a templated CMS and layers of sign-off for new content, it’s not as easy as it sounds. When the Science Museum’s ‘steampunk artificial arm’ started circulating on Twitter and then made Boing Boing, I was able to work with curators to get a post on the collections blog about it the next day, but there was no way of adding that link to the Brought to Life page that was all most people saw.

Someone shares an old article with their friends, some of their friends either already use or install the app, and the viral effect begins to take hold. … We’ve got over 1.3 million articles live on the website, so that is a lot of content to be discovered, and the app means that suddenly any page, languishing unloved in our database, can become a new landing page. When an article becomes popular in the app, we sometimes package it with content. Because we know the attention has come at a specific time from a specific place, we can add related links that are appropriate to the audience rather than to the original content. …when you’ve got the audience there, you need to optimise for them

As a content company with great technical and user experience teams, the Guardian is better placed to put together existing content around a viral article, but still, I’m curious: are any museums currently managing to respond to sudden waves of interest in random objects? And if so, how?

[Update: I’m working on a shorter version with fewer long words. Something like crowdsourcing geolocated historical materials/artefacts with specialist users/academic contributors/citizen historians.]

A few people have asked me about my PhD* topic, and while I was going to wait until I’d started and had a chance to review it in light of the things I’m already starting to learn about what else is going on in the field, I figured I should take advantage of having some pre-written material to cover the gap in blogging while I try to finish various things (like, um, my MSc dissertation) that were hijacked by a broken wrist. So, to keep you entertained in the meantime, here it is.

Please bear in mind that it’s already out-of-date in terms of my thinking and sense of what’s already happening in the field – I’m really looking forward to diving into it but my plan to spend some time thinking about the project before I started has been derailed by what felt like a year of having an arm in a cast.

* I never got around to posting about this because my disastrous slip on the ice happened just two days after I resigned, but I’m leaving my job at the Science Museum to take up the offer of a full-time PhD in Digital Humanities at the Open University in mid-March.

This project begins with the assumption that researchers are already digitising and geo-locating materials and asks whether it is possible to create systems to capture and share this data. Could the digital records and knowledge generated when researchers access primary materials be captured at the point of creation and published for future re-use? Could the links between materials, and between materials and locations, created when researchers use aggregated or mass-digitised resources, be ‘mined’ for re-use?

Through the use of a case study based around discovering, collating, transforming and publishing geo-located resources related to early scientific women, the project aims to discover:

how geo-located materials are currently used and understood by researchers;

what types of tools can be designed to encourage researchers to share records digitised for their own personal use;

whether tools can be designed to allow non-geospatial specialists to accurately record and discover geo-spatial references; and

the viability of using online geo-coding and text mining services on existing digitised resources.

Possible outcomes include an evaluation of spatially-oriented approaches to digital heritage resource discovery and use; mental models of geographical concepts in relation to different types of historical material and research methods; contributions to research on crowdsourcing digital heritage resources (particularly the tensions between competition and co-operation, between the urge to hoard or share resources) and prototype interfaces or applications based on the case study.

The project also provides opportunities to reflect on what it means to generate as well as consume digital data in the course of research, and on the changes digital opportunities have created for the arts and humanities researcher.

** This case study is informed by my thinking around the possibilities of re-populating the landscape with references to the lives, events, objects, etc, held by museums and other cultural heritage institutions (e.g. outside museum walls), and by an experimental, collaborative project around ‘modern bluestockings’ that aimed to locate and re-display the forgotten stories around unconventional and pioneering women in science, technology and academia.

Call me mildly obsessive (sad, even), but I got really excited when I read this and mentally replaced ‘BBC programme’ with ‘museum object’. From the BBC Internet Blog:

Today sees the launch of Shownar; a new prototype from BBC Vision which aims to track online buzz around BBC TV and radio programmes and reflect it back in useful and interesting ways, aiding programme discovery and providing onward journeys to discussion about those programmes on the wider web.

…

Shownar aims to track the wealth of activity that takes place around BBC programmes online and work out which are currently gaining the most attention.

…

So, how does it work? In the first instance, we decided to focus on tracking in-bound links to programme-related pages on bbc.co.uk, so we could be confident that the discussions were actually about a BBC programme … We took a look at a range of possible suppliers, and for this initial prototype chose data provided by Yahoo! Search BOSS, Nielsen Online’s BlogPulse (which indexes over 100 million blogs), and Twingly (which searches microblogging services like Twitter, Jaiku and Identi.ca for links, even when they are shortened using URL shortening services such as TinyURL and bit.ly). We are also ingesting data from LiveStats, the BBC’s own real-time indicator of traffic. Once ingested, this data is processed according to a specially created algorithm to calculate the ‘buzz measure’ for every BBC programme – more detail on the algorithm can be found on Shownar’s Technical information page.

The post discusses some of the interfaces and benefits – I think the possibilities are pretty endless, and will be exploring how it might enhance the discoverability of and harness conversations about the Science Museum’s online collections over the year.
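To make the ‘museum object’ version of this a bit more concrete, here is a rough sketch of what a buzz measure over inbound links might look like. It is emphatically not Shownar’s algorithm (that is described on their Technical information page, not here); it is a guess at the general shape, assuming you already have a list of inbound links gathered from whatever search or microblogging services you use. The function name, the half-life weighting and the sample data are all hypothetical.

```python
from collections import defaultdict

def buzz_scores(mentions, now_hour, half_life_hours=24.0):
    """Score each object page by how many recent inbound links it has.

    mentions is a list of (object_url, linking_url, hour_seen) tuples.
    Each distinct linking page counts once per object, and its weight
    halves every half_life_hours so that older links fade away.
    """
    # Keep only the most recent sighting of each (object, linking page) pair
    latest_seen = {}
    for object_url, linking_url, hour_seen in mentions:
        key = (object_url, linking_url)
        latest_seen[key] = max(hour_seen, latest_seen.get(key, hour_seen))

    scores = defaultdict(float)
    for (object_url, _linking_url), hour_seen in latest_seen.items():
        age = max(now_hour - hour_seen, 0)
        scores[object_url] += 0.5 ** (age / half_life_hours)
    return dict(scores)

# Hypothetical inbound links, with hours counted from an arbitrary start.
mentions = [
    ("/objects/arm", "blog-a", 48),
    ("/objects/arm", "blog-a", 48),   # same link reported by two sources
    ("/objects/arm", "tweet-b", 24),
    ("/objects/engine", "blog-c", 0),
]
scores = buzz_scores(mentions, now_hour=48)
print(sorted(scores.items(), key=lambda item: item[1], reverse=True))
```

Counting each linking page once avoids double-counting when several data suppliers report the same link, and the decay means an object that was popular last month does not crowd out one that is being talked about today.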

More inspiration for Ada Lovelace Day 2009 from Discover magazine’s 2002 list of The 50 Most Important Women in Science. Not everyone listed is directly involved with technology, but it’s worth checking them out anyway, because as the article points out, ‘[i]f just one of these women had gotten fed up and quit—as many do—the history of science would have been impoverished’:

Three percent of tenured professors of physics in this country are women. Nonetheless, a woman physicist stopped light in her lab at Harvard. Another woman runs the linear accelerator at Stanford. A woman discovered the first evidence for dark matter. A woman found the top quark. The list doesn’t stop there, but the point is clear.

Three years ago, Discover started a project to look into the question of how women fare in science. We knew there were large numbers of female researchers doing remarkable work, and we asked associate editor Kathy A. Svitil to talk to them. The result of her investigation is a selection of 50 of the most extraordinary women across all the sciences. Their achievements are detailed in the pages that follow.

To read their stories is to understand how important it is that the barriers facing women in science be broken down as quickly and entirely as possible. If just one of these women had gotten fed up and quit—as many do—the history of science would have been impoverished. Even the women who have stuck with it, even those who have succeeded spectacularly, still report that being a woman in this intensely male world is, at best, challenging and, at worst, downright disheartening.

It will take goodwill and hard work to make science a good choice for a woman, but it is an effort at which we cannot afford to fail. The next Einstein or the next Pasteur may be alive right now—and she might be thinking it’s not worth the hassle.

NEW YORK, March 10, 2009 – ACM, the Association for Computing Machinery, has named Barbara Liskov of the Massachusetts Institute of Technology (MIT) the winner of the 2008 ACM A.M. Turing Award. The award cites Liskov for her foundational innovations to designing and building the pervasive computer system designs that power daily life. Her achievements in programming language design have made software more reliable and easier to maintain. They are now the basis of every important programming language since 1975, including Ada, C++, Java, and C#. The Turing Award, widely considered the “Nobel Prize in Computing,” is named for the British mathematician Alan M. Turing. The award carries a $250,000 prize, with financial support provided by Intel Corporation and Google Inc.

The first woman to be awarded a Ph.D. from a Computer Science department (in 1968 from Stanford University), Liskov revolutionized the programming field with groundbreaking research that underpins virtually every modern computer application for both consumers and businesses. Her contributions have led to fundamental changes in building the computer software programs that form the infrastructure of our information-based society. Her legacy has made software systems more accessible, reliable, and secure 24/7.

This is a rough transcript of my lightning talk ‘Happy developers, happy museums’ at JISC’s dev8D ‘developer happiness’ days last week. The slides are downloadable or embedded below. The reason I’m posting this is because I’d still love to hear comments, ideas, suggestions, particularly from developers outside the museum sector – there’s a contact form on my website, or leave a comment here.

“In this talk I want to show you where museums are in terms of data and hear from you on how we can be more useful.

If you’re interested in updates I use my blog to [crap on a bit, ahem] talk about development at work, and also to call for comment on various ideas and prototypes. I’m interested in making the architecture and development process transparent, in being responsive to not only traditional museum visitors as end users, but also to developers. If you think of APIs as a UI for developers, we want ours to be both usable and useful.

I really like museums, I’ve worked in three museums (or families of museums) now over ten years. I think they can do really good things. Museums should be about delight, serendipity and answers that provoke more questions.

A recent book, ‘How does one become a scientist? : survey on the birth of a Vocation’ states that ‘60% of scientists over 30 and 40% of scientists under 30 claim, without prompting, that the Palais de la Découverte [a science museum in Paris] triggered their vocation’.

Museums can really have an impact on how people think about the world, how they think about the possibilities of their lives. I think museums also have a big responsibility – we should be curating collections for current and future audiences, but also trying to provide access to the collections that aren’t on display. We should be committed to accessibility, transparency, curation, respecting and enabling expertise.

So today I’m here because we want to share our stuff – we are already – but we want to share better.

We do a lot of audience research and know a lot about some of our users, including our specialist users, but we don’t know so much about how people might use our data, it’s a relatively new thing for us. We’re used to saying ‘here are objects in a case, interpretation in label’, we’re not used to saying ‘here’s unmediated access, access through the back door’.

Some of the challenges for museums: technology isn’t that much of a challenge for us on the whole, except that there are pockets of excellence, people doing amazing things on small budgets with limited resources, but there are also a lot of old-fashioned monolithic project designs with big overheads that take a long time to deliver. Lots of people mean well but don’t know what’s possible – I want to spread the news about lightweight, more manageable and responsive ways of developing things that make sense and deliver results.

We have a lot of data, but a lot of it’s crap. Some of what we have is wrong. Some of it was written 100 years ago, so it doesn’t match how we’d describe things now.

We face big institutional challenges. Some curators (though it does depend on the museum) fear loss of control, fear intellectual vandalism, and fear that mistakes in user-generated content published on museum sites will cause people to lose trust in museums. We have fears of getting the IT wrong (because for a while we did). Funding and metrics are a big issue – we are paid by how many people come through our door or come to our websites. If we’re doing a mashup, how do we measure the usage of that? Are we going to cost our organisations money if we can’t measure visits and charge back to the government? [This is particularly an issue for free museums in the UK, an interesting by-product of funding structures.]

Copyright is a huge issue. We might not even own an object that appears in our collections, we might not own the rights to the image of our object, or to the reproductions of an image. We might not have asked for copyright clearance at the time when an object was donated, and the cost of tracing it might be too high, so we can’t use that object online. Until we come up with a reliable model that reduces the risk to an institution of saying ‘copyright unknown’, we’re stuck.

The following are some ways I can think of for dealing with these challenges…

Limited resources – we can’t build an interface to meet every need for every user, but we can provide the content that they’d use. Some of the semantic web talks here have discussed a ‘thin layer’ of application over data, and that’s kind of where we want to go as well.

Real examples to reduce institutional fear and to provide real examples of working agile projects. [I didn’t mean strictly ‘agile’ methodology but generally projects that deliver early and often and can respond to the changing technical and social environment]

Finding ways for the sector to reward intelligent failure. Some museums will never ever admit to making a mistake. I’ve heard over the past few days that universities can be the same. Projects that are hyped up suddenly aren’t mentioned, and presumably they’ve failed, but no-one [from the project] ever talks about why so we don’t learn from those mistakes. ‘Fail faster, succeed sooner’.

I’d like to hear suggestions from you on how we could deal with those challenges.

What are museums known for? Big buildings, full of stuff; experts; we make visitors come to us; we’re known for being fun; or for being boring.

Museum websites traditionally appear to be about where we are, when we’re open, what’s on, is there a cafe on site. Which is useful, but we can do a lot more.

Traditionally we’ve done pretty exhibition microsites, which are nice – they provide an experience of the exhibition before or after your visit. They’re quite marketing-led, they don’t necessarily provide an equivalent experience and they don’t really let you engage with the content beyond the fact that you’re viewing it.

We’re doing lots of collections online projects, some of these have ended up being silos – sometimes to the extent that if we want to get data out of them, we have to screen-scrape our own data. These sites often aren’t as pretty, they don’t always have the same design and usability budgets (if any).

I think we should stick to what we’re really good at – understanding the data (collections), understanding how to mediate it, how to interpret it, how to select things that are appropriate for publication, and maybe open it up to other people to do the shiny pretty things. [Sounds almost like I’m advocating doing myself out of a job!]

So we have lots of objects, images, lots of metadata; our collections databases also include people, events, dates, places, businesses and organisations, lots of qualified information around things like dates, they’re not necessarily simple fields but that means they can convey a lot more meaning. I’ve included that because people don’t always realise we have information beyond objects and object metadata. This slide [11 below] is an example of one of the challenges – this box of objects might not be catalogued as individual instruments, it might just be catalogued as a ‘box of stuff’, which doesn’t help you find the interesting objects in the box. Lots of good stuff is hidden in this way.

We’re slowly getting there. We’re opening up access. We’re using APIs internally to share data between gallery interactives and the web, we’re releasing them as data points, we’re using them to provide direct access to collections. At the moment it still tends to be quite mediated access, so you’re getting a lot of interpretation and fewer objects because of the resources required to create really nice records and the information around them.

‘Read access’ is relatively easy, ‘write access’ is harder because that’s when we hit those institutional issues around authority, authorship. Some curators are vaguely horrified that they might have to listen to what the public have to say and actually take some of it back into their collections databases. But they also have to understand that they can’t know everything about their collections, and there are some specialist users who will know everything there is to know about a particular widget on a particular kind of train. We’d like to capture that knowledge. [London Transport Museum have had a good go at that.]

We’re taking our content to where people hang out. We’re exploring things like Flickr Commons, asking people to tag and comment. Some museums have been updating collections records with information added by the public as a result. People are geo-tagging photos for us, which means you can do ‘then and now’ mashups without a big metadata enhancement budget.

I’d like to see an end to silos. We are kinda getting there but there’s not a serious commitment to the idea that we need to let things go, that we need to make sure that collections online are shareable, that they’re interoperable, that they can mesh with other things.

Particularly for an education audience, we want to help researchers help themselves, to help developers help others. What else do we have that people might find useful?

What we can do depends on who you are. I could hope that things like enquiry-based learning, mashups, linked data, semantic web technologies, cross-collections searches and faceted browsing (to make complex searches easy) would be useful, and that the concept of museums as a place where information lives – a happy home for metadata mapped around objects and authority records – is useful for people here, but I wouldn’t want to put words into your mouths.

There’s a lot we can do with the technology, but if we’re investing resources we need to make sure that they’re useful. I can try things in my own time because it’s fun, but if we’re going to spend limited resources on interfaces for developers then we need to know that it’s actually going to help some group of people out there.

The philosophy that I’m working with is ‘we’ve got really cool things, but we can have even cooler things if we can share what we have with everyone else’. “The coolest thing to do with your data will be thought of by someone else”. [This quote turns out to be on the event t-shirts, via CRIG!] So that said… any ideas, comments, suggestions?”

And that, thankfully, is where I stopped blathering on. I’ll summarise the discussion and post back when I’ve checked that people are ok with me blogging their comments.

[My slide images include the Easter Egg museum in Kolomyya, Ukraine and ‘Laughter in Odd Places’ event at the Museum of London.]

This is a quick dump of some of the text from an interview I did at the event, cos I managed to cover some stuff I didn’t quite articulate in my talk:

[On challenges for museums:] We need to change institutional priorities to acknowledge the size of the online audience and the different levels of engagement that are possible with the online experience. Having talked to people here, museums also need to do a bit of a sell job in letting people know that we’ve changed and we’re not just great big imposing buildings full of stuff.

[What are the most exciting developments in the museum sector, online?] For digital collections, going outside the walls of the museum using geo-location to place objects in their original context is amazing. It means you can overlay the streets of the city with past events and lives. Outsourcing curation and negotiating new models of expertise is exciting. Overcoming the fear of the digital surrogate as a competitor for museum visits and understanding that everything we do builds audiences, whether digital or physical.