Thursday, 8 October 2009

During the project advisory board meeting we briefly discussed the legal issues involved in archiving web2.0 sites. I’ve already been doing a bit of investigating, looking at what the various service providers’ Terms of Service say, and thought I’d share what I’ve found.

The Terms of Service are all basically the same, though a few are more specific about what is and is not allowed. Here’s a basic table where you can see briefly what each ToS contains (sorry it's a bit small):

As you can see, all the ToS agree that the account holder owns their content, which is good news for archivists as well as account holders. However, they also agree that the site provider owns the copyright, trademarks, logos and any other intellectual property in the site itself. This means that if an archive wants to harvest a site's interface, not just a user’s data, then it needs to obtain the site provider’s permission.

A second problem is that most sites restrict data harvesting. Facebook bans it outright, whereas Twitter only prohibits scraping; crawling is allowed “if done in accordance with the provisions of the robots.txt file” (provisions which the ToS itself doesn't state). Myspace, meanwhile, only prohibits automated harvesting of data “for the purposes of sending unsolicited or unauthorised material”, which implies that harvesting data for archival purposes is allowed. However, this isn’t stated directly, and since some stipulations are quite specific I’d be inclined to check with the service provider rather than rely on assumptions.
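Since Twitter's terms hinge on robots.txt, here is a rough sketch of how a crawler could check a URL against a site's robots.txt rules before fetching it, using Python's standard `urllib.robotparser` module. The robots.txt text, the `archive-bot` user agent and the example.org URLs are all made up for illustration; they are not Twitter's actual rules, and passing this check says nothing about the rest of a ToS.

```python
from urllib.robotparser import RobotFileParser


def may_crawl(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content, invented for this example
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for url in ("http://example.org/public/page",
                "http://example.org/private/page"):
        print(url, may_crawl(EXAMPLE_ROBOTS, "archive-bot", url))
```

In practice a harvester would fetch the live robots.txt (for instance with the parser's `set_url()` and `read()` methods) and run this check for every URL before requesting it.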

Interestingly, Twitter used to have a rather vague ToS which said nothing about other people using its logos and trademarks. However, the terms were updated on 18th September and now include restrictions on using Twitter’s intellectual property.

So altogether it looks like an archivist can’t do much with a web2.0 account without the service provider’s permission. Now it just depends on how amenable they’d be to granting it.
