Monday, November 05, 2007

Where Will We Get Our Metadata in the Future?

While I agree that cataloging could benefit greatly from
industrialization, do we really want cataloging to go the way of
library systems? Libraries are increasingly held hostage by the fact
that they don't control their own systems, and the vendors who do
control our systems are not willing or able to keep up with our needs.
And since system/software development is a lot more lucrative than
metadata creation and management, I don't hold out much hope for
getting metadata of acceptable quality out of vendors unless libraries
are willing/able to finance the industry. Doubtful at best.

On several fronts, we're hearing that one of the changes for bibliographic control will be re-using metadata from sources outside the library. So this is an interesting question. How much interest is there from publishers and book vendors in the area of metadata creation?

I just listened to Karen Calhoun's presentation recently given at the OCLC Members Council Meeting, WorldCat and the Future of Bibliographic Control [mp3]. She states that re-use of metadata from other constituencies is a big part of OCLC's vision for the future of bibliographic control. Any thoughts?

Comments

I think it's important to keep in mind that publishers and vendors are in the business of printing and selling books, not creating metadata. Because technology allows them to readily create metadata that we can access and use, we should take advantage of this to ease our own workload. However, I agree that the quality will likely be minimal. Because we as catalogers are in the business of creating metadata, we should always consider it our role to review and enhance this metadata.

I agree with Jenny: Publishers and cataloguers are in distinctly different businesses and have distinctly different foci. Metadata creation by publishers is a cost.

We could try to persuade them that finding their products more efficiently might help to sell more of their product, but then we'd also have to work with Amazon etc. so that we'd all be using similar metadata standards.

Alternatively, we could give up entirely on creating library metadata, which would eliminate all cataloguing costs. We might then find that, with no bibliographic control, people stopped using libraries (if users find an information source unreliable, how long until they stop using it altogether?).

Jennie, I agree that it will be the cataloger's and metadata librarian's role to review and enhance metadata. This is one of the reasons I've argued that PCC and OCLC's Enhance programs need to lower their threshold for participation. This way we could enrich metadata beyond our local library. Also, this role will most likely grow in the future since we will be dealing with a multiplicity of metadata formats, not just MARC.

Carlos, I think the three of us are in agreement. We'll have to see how much crosswalks can do. Publishers and book vendors may be exceptions, but in general, I don't see the business world (Amazon, et al.) being willing to conform to library metadata standards. And, of course, why should they, since they've already created their own metadata standards?

And the second option you mentioned, giving up on creating library metadata altogether, seems untenable. I think Martha Yee's point about the self-immolation of our profession is accurate. If this option were taken to its logical conclusion, users would really give up on libraries.

I remember being asked not too long ago whether we (those of us actively engaged in "metadata futures"--for lack of a better term) were still concerned about "bibliographic control." It was a good question, but my off-the-cuff answer then is still the one I'd give today. Usually when catalogers talk about "bibliographic control" they're really talking about controlling as much as possible about the source and configuration of the data that describes their resources, and, quite frankly, that's a conversation I think is increasingly irrelevant. I think we have to stop wishing for the return of the golden age of cataloging (whenever that was, exactly) and get real about what faces us.

It makes no sense to continue to wail and moan about the quality of what we might be getting from publishers or other vendors--in fact it reminds me very much of the early days of shared cataloging, when there was much dark talk about how accepting cataloging from "libraries out there who weren't as good as we are" would inevitably reduce the usefulness of OUR catalog for OUR users. Well, guess what? We figured that out, we figured out shared authority control, and we moved on.

We know a hell of a lot more than we sometimes give ourselves credit for about managing and using data--lots of different data (isn't that what we've been doing all along, with some help from our shared standards?), and we're going to have to learn a lot more, because it's the price of participation in the information environment to come.

The first step, I think, is to consider what functions we need our data to support (aside from our OPACs, which I suspect will be largely backend support for most libraries within the next few years, and not something we expose to our users). If we can't do that in a much broader context than we have been doing--and in addition stop talking about "quality" as if it were something that only blessed, classically trained catalogers understood--we're the ones waving the matches and incendiary materials around, preparing to set the fire that begins the "self-immolation of our profession."

It ain't about control, really: it's about using what we know to provide the information that keeps libraries viable, and visible, among the other services our users are likely to see first. We can't control much these days, but our best hope is to compete.

Diane, I appreciate your rant very much, thank you. We all need to wake up and smell the coffee with regard to the golden age of cataloging being over. And we all need to learn and prepare for change. I agree and I'm trying.

Initially, I found this anonymous quote interesting because of a conversation I had with a tech-savvy public librarian. She said that she would never pay for bibliographic records. That prompted me to think about the vendor's role in the local library, especially with the open source ILS options out there. It doesn't hurt to take the next step and ask--Could our metadata go open source also?

That aside, I'd like to expand on what I mean by enhancing metadata (and it's not fooling around with punctuation). I mean adding real value to records. For example, in theological libraries my ATLA colleagues and I sometimes need to add subject headings to get as close as we can to the vocabulary of our users. We've been encouraged by librarians from the Library of Congress to do this subject analysis from the perspective of our specific religious denominations, etc. rather than from LC's more general approach. I hope this type of quality issue will still be relevant even though how we get there will be very different from what we're used to.

I think your points are very important. The distribution systems we now have for our data (apparently now mostly OCLC) rely on assumptions about what we want in our data that do not easily accommodate the distribution of "enhancements," nor do they enable the recording of the source of those enhancements.

This is an important issue, and I know some folks don't think we should be discussing it because they think it's in conflict with our general mandates to be considering the economics of what we do. It's my contention that we need to look beyond our current distribution mechanisms and attempt to think about what our needs really are, and how to get there in efficient and economically sustainable ways.

One bottleneck, of course, in enhancing the value of our data in useful and economically viable ways, is that our current cataloging rules and processes are absolutely not designed to help us accomplish this. I think your point about "open source data" is very much on point here. What if libraries distributed their enhanced data via OAI, even as they continued to use the older distribution systems? Wouldn't that enable the newer open source developers to come up with some new solutions, even as we work to update our cataloging rules?
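To make the OAI idea a little more concrete, here is a minimal sketch of how a harvester might build an OAI-PMH `ListRecords` request against a library's repository. The base URL and the function name are hypothetical; `verb`, `metadataPrefix`, `from`, and `set` are standard OAI-PMH request arguments, and `oai_dc` is the Dublin Core format every OAI-PMH repository is required to support.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Selective harvesting: only records changed on or after this date.
        params["from"] = from_date
    if set_spec:
        # Optional set, e.g. one a library might define for enhanced records.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Assumed endpoint, for illustration only.
print(list_records_url("https://library.example.org/oai", from_date="2007-11-01"))
```

A harvester would fetch that URL and parse the XML response; the older distribution systems would not need to change for this to work alongside them.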

Diane, I really like your suggestion: "What if libraries distributed their enhanced data via OAI, even as they continued to use the older distribution systems?" A both/and paradigm shift for metadata sharing and distribution makes sense. Opening up our metadata for harvesting (like the OAI model) as well as continued contribution to older distribution systems would allay one of my fears: data loss. I fear that solely focusing on economics will mean a loss of valuable data in the rush to implement change. For example, I have the sense that for some libraries there was data loss in moving from a card environment to machine-readable bibliographic records. There's really no reason that should happen now if we do it right! So, more than one distribution mechanism sounds attractive to me.

Also, from an economic standpoint, getting on the Web through different channels will give libraries the viability (and visibility) they need. This is where your other point about the barrier we face with our current cataloging tools and processes really hits home. If our tools don't change, we can't move forward. I'm looking forward to seeing how the LC Working Group addresses this issue in their draft report.

Amazon, etc., care about receiving high quality data from publishers, and I don't think their standards are that far off from ours. I was at the Annual Meeting of the Book Industry Study Group earlier this week, which is made up of publishers, distributors, and retailers (libraries are welcome to the group, but there was only one there). A new initiative for 2007-2008 is to kick off a Product Data Certification program to improve the quality of data coming from publishers. It was acknowledged somewhat off the record that quality is a problem right now, but the certification panel is made up of people from Amazon, Ingram, B&T, B&N, etc. whose core business requires that the problem be fixed. Certification is based on an 86-page Metadata Best Practice document, which I haven't had a chance to read yet but which appears to be aimed at ensuring that publishers comply with ONIX (which may be easier to work with than MARC from a strictly data perspective, since every data element has its own field).

I find it somewhat unfortunate that publishers and libraries are so far apart in their work and are actually duplicating efforts. Publishers like to get their work in front of people, and the essential information to do that is required by both a library and a retailer - i.e., title, author, publisher, format, etc. If we can know the data is going to be in correct ONIX, it's not too hard to get it to MARC (not perfect, but not impossible).
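As a rough illustration of that kind of crosswalk, here is a minimal sketch that maps a few ONIX-style elements onto MARC fields. The flat dictionary input, the `ONIX_TO_MARC` table, and the function name are simplifying assumptions made for this example (a real ONIX record is nested XML, and a real crosswalk covers far more elements); the MARC targets are the usual ones: 020 $a for ISBN, 100 $a for the main author, 245 $a for the title, and 260 $b for the publisher.

```python
# Hypothetical, deliberately tiny crosswalk table: ONIX-style element name
# -> (MARC tag, subfield code). Not a complete or official mapping.
ONIX_TO_MARC = {
    "ISBN": ("020", "a"),
    "PersonNameInverted": ("100", "a"),
    "TitleText": ("245", "a"),
    "PublisherName": ("260", "b"),
}


def onix_to_marc(onix_record):
    """Return (tag, subfield, value) tuples for the elements we can map."""
    fields = []
    for element, (tag, code) in ONIX_TO_MARC.items():
        value = onix_record.get(element)
        if value:
            fields.append((tag, code, value))
    # Unmapped elements are silently dropped here; a real tool would
    # report them so a cataloger can review what was lost.
    return sorted(fields)


sample = {
    "TitleText": "Moby Dick",
    "PersonNameInverted": "Melville, Herman",
    "ProductForm": "BB",  # no mapping in this sketch, so it is skipped
}
for tag, code, value in onix_to_marc(sample):
    print(f"{tag} ${code} {value}")
```

The dropped `ProductForm` value is the "not perfect" part: anything outside the mapping table needs either a new rule or a human decision.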

Not at all to be an apologist for publishers & retailers, but just because they're doing something for a profit doesn't mean a library can't gain some value from it if we're willing to compromise just a little bit now and then.

Re-use and enhancement of metadata content from any and all sources is exactly what I would love to be able to do. I'm sick and tired of having to do manual data entry to get basic information from one format, system, etc. to another. Often, I can't even get a basic title list out of vendors for online products that we've purchased, let alone real metadata records. Another major problem is the lack of systems or even software tools that make it easy for catalogers in a local library context to gather and transform metadata. I still have to get data into MARC, because the only bibliographic database I have to work with is our ancient Innovative Interfaces ILS. MarcEdit is about the best tool I have at this point. I'd be especially interested in hearing from Diane whether she knows of any projects out there looking at defining needs and/or developing "next generation" cataloging systems, to work at either the "metadata vendor" or the local library level. I have a sabbatical coming up and I would love to sign on to work on such a project. The future of libraries depends upon the development of such systems and tools.