Harvard bibliographic data released with prominent nod to OCLC

Into the flow. Back in October we were excited to announce the final step in a project on which OCLC Research worked with the University of Cambridge – the release of their library catalog data as both MARC21 and Linked Data. They worked with us and implemented our provisional recommendation to use an Open Data Commons Attribution license for the data release, which includes data that was derived from WorldCat. While we are working to finalize and formalize that recommendation (it was a major discussion item at last week’s OCLC Global Council meeting), other institutions have been working on their own data releases.

Today the Harvard University Libraries released their library catalog of more than 12 million bibliographic records. This release furthers the mandate from their Library Board and Faculty to make as much of their metadata as possible available through open access in order to support learning and research, to disseminate knowledge and to foster innovation. It also aligns with the very public and established commitment that Harvard has made to open access for scholarly communication. I’m pleased to say that they worked with OCLC as they thought about the terms under which the release would be made. Although Harvard Libraries did not ultimately accept our recommendation about the ODC-BY license, the approach they chose takes into account some of the primary aspects of OCLC’s recommendation.

Specifically, our discussions acknowledged the Harvard mandate as well as what was most important to the OCLC cooperative – receiving attribution and making others aware of the cooperative’s norms and expectations of one another with regard to data derived from WorldCat. And again I’m pleased to say that our Harvard colleagues took the cooperative’s desires into account. The dataset is being released subject to the Creative Commons Public Domain designation (CC0), but Harvard requests that subsequent use provide attribution to Harvard, OCLC and the Library of Congress. They also request that users be aware of and act in a manner consistent with the OCLC cooperative community norms and provide a link to those norms. We think this is a well-intentioned and well-executed compromise.

It’s true we don’t think that public domain dedications for data derived from WorldCat are consistent with the OCLC cooperative’s norms as expressed in the WorldCat Rights and Responsibilities (WCRR) statement, particularly at Section 3.B.5. We also recognize that the WCRR statement is not a legally binding document and that interpretations of these community norms within the cooperative may differ. Releasing data is ultimately the choice of the OCLC member institution, as are the terms. Would other members of the cooperative consider the release of the Harvard dataset under these terms and conditions bad acting and a risk to the long-term viability and sustainability of WorldCat? Probably not, particularly when attribution, awareness of the norms and responsible treatment of WorldCat-derived data are requested so prominently.

Our discussions and this outcome are evidence that interpretations of community norms within the cooperative may differ. The mandates of institutional mission, the imperatives of emerging local policy, and national and supra-national structures may all contribute to a differing view and legitimately demand precedence. In our discussions with Harvard we acknowledged that their direction was their choice. Their mandates took precedence. They acknowledged the cooperative’s concerns and responded as a responsible cooperative citizen by requesting attribution, and awareness of and adherence to the community norms of the OCLC cooperative. The discussion was frank and mutually supportive. After all, OCLC, like its member institutions, is in the early stages of large shifts in data technology and policy. There are inevitable tensions and conflicting goods that will need to be reconciled over time. The process in which we are engaged will, if we continue to work together with good will, ultimately lead to a new suite of best practices that balance the common good and institutional sustainability.

Jim coordinates the OCLC Research office in San Mateo, CA, and focuses on relationships with research libraries and work that renovates the library value proposition in the current information environment.

Well, the requirements to re-use or publish openly the records that the libraries *created* are not reasonable, so I suspect most will go their own way. Yes, attribution is nice, mentioning OCLC is good (cooperatives work because they work together…), but requiring those restrictions is too much, IMHO. To paraphrase the words from the Internet’s earlier days, “libraries own their own records,” whether created locally or with OCLC’s system.
My view,
DrWeb