Wednesday, September 23, 2009

Emails, blog posts, and tweets are flying by regarding OCLC's recent message to OAI-PMH data providers asking them to agree to a set of Terms & Conditions allowing OCLC to include data harvested via OAI-PMH in both free and toll services that OCLC provides. We do love our drama in the library community!

I agree with the predominant theme that this has all been handled very poorly, but I think the biggest problem lies somewhere else entirely. OCLC has set this whole system up completely backwards. OAI-PMH is a mechanism to share metadata widely, without having 1:1 agreements between data providers and service providers (harvesters). The entire point is to reduce the overhead of sharing. OCLC asking each data provider to check their status and preferences against OCLC's ideal is the wrong way 'round! The way this really should be done is with data providers making clear statements about what can and can't be done (per both copyright and license) with the metadata they're sharing. And, oh, look, OAI-PMH already lets data providers do that.

To be fair, there's lots of data provider software out there that doesn't support this optional part of the profile. Still other data providers use software that provides for this but don't go to the effort of using it. My own repository doesn't have this mechanism in place. (Working on it, I promise!) But this really is the way it has to be for any kind of open data initiative to work. I as a data provider put my metadata (and content if I can!) up, make it clear what copyright terms apply and what license terms I place on its use, and let the sharing begin. The burden must be on the service provider (or harvester, OCLC/OAIster in this case) to determine if the use they want to put the data to conforms with my terms. Service providers should bear the load of managing multiple data providers - it's part of the work they have to do to set up the service. If they want the free stuff, they have to do the work to figure out if their efforts are kosher. OCLC must be responsible for protecting themselves from lawsuits stemming from their use of stuff they're not supposed to use, rather than transferring that responsibility to us as data providers.
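For the curious, here is a rough sketch of what "doing the work" might look like on the harvester side. It assumes the data provider follows the optional OAI guidelines for conveying rights about metadata, putting a rights element with a rightsReference pointer inside each record's about container; the namespace and element names are as I understand those guidelines, so check them before relying on this. The sample record, the identifier, and the function names are made up for illustration, and treating "no rights statement" as "don't use it" is just one possible policy, not anything the protocol requires.

```python
import xml.etree.ElementTree as ET

# Namespaces as I understand them from the OAI-PMH 2.0 spec and the
# optional OAI rights guidelines (an assumption worth verifying).
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
RIGHTS_NS = "http://www.openarchives.org/OAI/2.0/rights/"

# Hypothetical record, shaped like one item from a ListRecords response.
SAMPLE_RECORD = """\
<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:example.org:item-1</identifier>
    <datestamp>2009-09-23</datestamp>
  </header>
  <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>Example item</dc:title>
    </oai_dc:dc>
  </metadata>
  <about>
    <rights xmlns="http://www.openarchives.org/OAI/2.0/rights/">
      <rightsReference ref="http://creativecommons.org/licenses/by/3.0/"/>
    </rights>
  </about>
</record>
"""


def rights_references(record_xml):
    """Return the license URLs a record points to in its about/rights block."""
    record = ET.fromstring(record_xml)
    refs = []
    for about in record.findall(f"{{{OAI_NS}}}about"):
        for rights in about.findall(f"{{{RIGHTS_NS}}}rights"):
            for ref in rights.findall(f"{{{RIGHTS_NS}}}rightsReference"):
                url = ref.get("ref")
                if url:
                    refs.append(url)
    return refs


def may_use(record_xml, acceptable_licenses):
    """Hypothetical harvester policy: use a record only if it states its
    rights and every stated license is one the harvester has vetted."""
    refs = rights_references(record_xml)
    return bool(refs) and all(r in acceptable_licenses for r in refs)


if __name__ == "__main__":
    vetted = {"http://creativecommons.org/licenses/by/3.0/"}
    print(rights_references(SAMPLE_RECORD))
    print(may_use(SAMPLE_RECORD, vetted))
```

The point of the sketch is where the decision lives: the data provider states its terms once, in the record, and the harvester checks each record against its own list of vetted licenses. No 1:1 agreement required.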

But I have to temper the other side of this too. I was a member of the group that developed this set of recommendations, urging data providers not to put undue restrictions on reuse of their metadata. I really believe this is the right way to go. Of course we as data providers are sometimes under legal (copyright, contract, etc.) constraints that limit what we can do with our metadata. We have to honor those agreements. But for the vast majority of our stuff, we can share without restriction if we choose to. Giving up control is part of sharing, and we have to learn to live with that. Blessing certain uses and banning others is a dangerous business, and one that doesn't mix very well with the open sharing of information libraries are all about. As the Creative Commons recently found, even "non-commercial use" isn't a very straightforward issue, so I don't think it serves us well to fall back on that old standby. Freedom is about taking the inevitable small amounts of bad with the overwhelming good, and I really do believe those principles apply to information sharing as well. Let's spend our efforts on sharing more and better information, and less on meting out what we do have.