Category: catalogs

I am at the ELUNA (Ex-Libris user’s group) conference, and just saw a presentation on Ex Libris strategy from Oren Beit-Arie, chief strategy officer of Ex Libris, and Catherine [someone], URM project manager.

I was quite impressed. The URM strategy is basically Ex Libris’ vision for a new kind of ILS (post-ILS ILS?). [I talked about this before after last year’s ELUNA] I hope they make the slides and such from this presentation public, because I can’t quite cover what they did. They showed basically a structural diagram of the software (fairly consistent with what I wrote on future directions of library systems), and some mock-ups of staff user interfaces for workflow.

But my basic summary is: Yes! This is indeed a vision for an ILS that makes sense. They get it. I imagine most code4libbers if they saw this presentation would agree, wow, yeah, that’s the way an ILS should actually work: supporting staff workflow in an integrated way that actually makes sense, modular and componentized, full of APIs and opportunities for customer-written plugins, talking to various third-party software (from vendors (of various classes) to ERP software etc.), etc etc.

The vision was good. What’s the execution going to be? So enough praise, let’s move on to questions, concerns, and some interesting implications.

Timelines?

The timelines given were pretty ambitious. Did they really mean to say that those mock-ups of staff interfaces Catherine showed are planned to be reality by the end of 2010? The software is ambitious enough to begin with, but on top of that the mock-ups shown were heavily dependent on being able to talk to external software via web services, and getting other people’s software that may or may not have those services available to do what’s needed in that time too… I’m a bit skeptical. [And with the staffing required to pull this off and based on pricing for other products… I would have to predict that end pricing is going to be in the mid six figures].

On top of that, when Oren supplied his timeline there seemed to be a bit of sleight of hand going on that confused me. He said that the next version of SFX was expected at the end of 2009, and with that lit up a bunch of boxes on his structural diagram that he said this release of SFX would fulfill. If SFX is really going to fill those boxes for the integrated and modular architectural vision he outlined (with rationalized workflow and data management not based on the borders of silos that exist for historical reasons, borders which SFX definitely exhibits; and SFX is not known for a staff interface that supports rational workflow)…

…then either SFX is going to become quite different software than it is now (by end of 2009?), or the execution is going to be significantly less than the vision offered.

Network-level metadata control?

Part of the vision involved a network-level metadata store, with individual libraries linking holdings to truly shared metadata records. (Catherine at one point said “MARC… er… descriptive records”. That was an actual slip of the tongue, not one made intentionally for effect. I suspect she intended to avoid saying “MARC” at all, so as not to bring up the issue… hm.) The example used was that 4200 libraries use essentially the same record for Freakonomics, and they all manage it separately, and when they enhance it, their enhancements are seldom shared, which makes no sense.
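To make the "holdings linked to a shared record" idea concrete, here is a minimal sketch of the data model as I understand it. Everything in it (the class names, the fields, the `MetadataStore` methods) is my own invention for illustration, not anything Ex Libris showed; it only demonstrates that when holdings point at one shared record, an enhancement made by one library is immediately visible to every other library holding the item.

```python
from dataclasses import dataclass, field


@dataclass
class SharedRecord:
    """One descriptive record shared by every library (hypothetical shape)."""
    record_id: str
    title: str
    subjects: set = field(default_factory=set)


@dataclass
class Holding:
    """A library's local data, linked to a shared record by id only."""
    library: str
    record_id: str
    call_number: str


class MetadataStore:
    """Toy network-level store: records are stored once, holdings many times."""

    def __init__(self):
        self.records = {}
        self.holdings = []

    def add_record(self, record):
        self.records[record.record_id] = record

    def attach_holding(self, library, record_id, call_number):
        # Refuse to link a holding to a record the store does not have.
        if record_id not in self.records:
            raise KeyError(record_id)
        self.holdings.append(Holding(library, record_id, call_number))

    def enhance(self, record_id, subject):
        # An enhancement is made once, on the shared record itself.
        self.records[record_id].subjects.add(subject)

    def catalog_for(self, library):
        # Each library sees the shared record joined with its own holdings.
        return [(self.records[h.record_id], h)
                for h in self.holdings if h.library == library]


store = MetadataStore()
store.add_record(SharedRecord("rec1", "Freakonomics"))
store.attach_holding("Library A", "rec1", "HB74.P8 L479")
store.attach_holding("Library B", "rec1", "330 LEV")
store.enhance("rec1", "Economics")  # say Library A's cataloger adds this
```

After the `enhance` call, `store.catalog_for("Library B")` returns the record with the new subject attached, even though Library B never touched it. The call numbers and subject here are made-up sample values.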

We all know this vision makes sense. We also all know that this is Ex Libris challenging OCLC’s bailiwick. And, I will say, with some sadness, the fact that this vision sounds so enticing and is so different from what we’ve got is kind of an indictment of OCLC’s stewardship of what they do. This is how OCLC should be working, we all know. So Ex Libris has a vision to make it work.

How is this going to interfere with OCLC? Where are they going to get all these records from? In the flowchart of where these records would come from, libraries were identified. Can libraries legally share these records with an Ex Libris central metadata store? Will OCLC let them? (Not if they can help it!) The screenshots of staff workflow imply that when a new acquisition comes in (or really, even at the suggestion/selection level), a match will immediately be made to a metadata record in the central system. This implies the central system will have metadata records for virtually everything (i.e., like Worldcat). Can they pull this off?

If they DO pull it off, it’ll be a great system, and it will divide library metadata into several worlds: some libraries using a central metadata system provided by a for-profit vendor, others using the OCLC cooperative, others using… what? It’ll be sad to me to fragment the shared cataloging corpus like this, divide it along library ‘class’ lines, and surrender what had been owned by the collective library community through a non-profit cooperative to a for-profit vendor.

On the other hand, lest we forget, the shared metadata world really already is divided on “class” lines: many libraries cannot afford to live in the OCLC world (although I suspect those same libraries will not be able to afford to live in the Ex Libris shared metadata world either). And if OCLC actually acted more like the non-profit cooperative representing the collective interests of the library sector, as it claims to be… it would be even more distressing that it is not succeeding in supplying the kind of shared metadata environment that a for-profit vendor is envisioning challenging them with.

True Modularity?

Oren suggested that their vision was open-ness, integration, and modularity. The implication to me is that I should be able to mix-and-match components of this envisioned URM architecture with other non-ExLibris (proprietary or open source) components.

Is this really going to be? As I in fact said last year after the introduction of the URM strategy, this URM strategy is technically ambitious even without this kind of true mix-and-match modularity. To add that in in a realistic way makes it even more ambitious. And to what extent does it really meet Ex Libris business interests (not suggesting they are misleading us about the goal, but when your plan is too ambitious to meet the timelines you need to stay in business… what’s going to be the first thing to drop?).

For instance, to go back to our Metadata discussion, if Worldcat (or LibLime, or anyone else) did provide a central metadata repository with the kind of functionality envisioned there, and further provided a full set of web services (both read and write) for this functionality… could I use Worldcat in this URM vision instead of the Ex Libris centralized metadata repository? (by 2010?). Really?

For another example, Primo is in some ways written quite satisfactorily to be inter-operable with third party products. But what if I want to buy just the “eShelf” function of Primo (because it’s pretty good), and use someone else’s discovery layer with this? What if I want to buy Primo without the eShelf and use some other product for personal collection/saved record/eShelf functionality? Right now I can’t. How truly “modular mix-and-match” something is depends on where you draw the lines between modules, doesn’t it?

[If Ex Libris really were interested in prioritizing this kind of mix-and-match modularity, I’d encourage them to explore using the Evergreen OpenSRF architecture in an Evergreen-compatible way. But, yes, to do this in a real way would take additional development resources in an already ambitious plan.]

[And in Primo’s defense, if I wanted to use Primo with a third-party “eShelf”, or use the Primo eShelf with a third party discovery layer, Primo’s configuration and customizability would _probably_ support this. The former might even make financial sense with no discount—buying Primo just for the eShelf obviously would not. As more and more complex features are added, however, will they be consistently modularized to even allow this technically? It’s definitely not a trivial technological project.]

May you live in interesting times

If nothing else, it’s disturbingly novel to be presented with a vendor who seems to really get it. They are talking the talk (and mocking the interface mock-ups) that match where library software really does need to go.

Will they pull it off? If they do, will it really be open in the modular mix-and-match way we all know we need to avoid vendor lock-in and continue to innovate? Will we be able to afford it? We will see.

I would think the strength of this vision would light a fire under their competitors (including OCLC, and maybe the open source supporters too), spurring more rational vision-making from others in the library industry, and making it more clear (if it wasn’t already) that keeping on going the way you have been going is not an option. I hope so. (On the other hand, Ex Libris is clearly targeting a library customer field that can afford it, in several definitions of ‘afford’. Do other vendors think they can keep on keeping on to target different markets?)

Update: 17 Dec 2008: This old blog post is getting a LOT of traffic, so I thought it important to update it with my current thoughts, which have kind of changed.

Lots of library applications out there are using Amazon cover images, despite the ambiguity (to be generous; or you can say prohibition if you like) in the Amazon ToS. Amazon is unlikely to care (it doesn’t hurt their business model at all). The publishers who generally own copyright on covers are unlikely to care (in fact, they generally encourage it).

So who does care? Why does Amazon’s ToS say you can’t do it? Probably the existing vendors of bulk cover images to libraries. And, from what I know, my guess is that one of them had a sufficient relationship with Amazon to get them to change their terms as below. (After all, while Amazon’s business model isn’t hurt by you using cover images for your catalog, they also probably don’t care too much about whether you can or not.)

Is Amazon ever going to notice and tell you to stop? I doubt it. If that hypothetical existing vendor notices, do they even have standing to tell you to stop? Could they get Amazon to tell you to stop? Who knows. I figure I’ll cross that bridge when we come to it.

Lots of library apps are using Amazon cover images, and nobody has formally told them to stop yet. Same for other Amazon Web Services other than covers (the ToS doesn’t just apply to covers).

But if you are looking for a source of cover images without any terms-of-service restrictions on using them in your catalog, a couple of good ones have come into existence lately. Take a look at CoverThing (with its own restrictive ToS, but not quite the same restrictions) and OpenLibrary (with very few restrictions). Also, the Google Books API allows you to find cover images too, but you’re on your own trying to figure out what uses of them are allowed by their confusing ToS.

And now, to the historically accurate post originally from March 19 2008….

Following the release of the Customer Service Agreement from Amazon this past December, we requested clarification from Amazon regarding the use of AWS for library catalogs and received the following response:

“Thank you for contacting Amazon Web Services. Unfortunately your application does not comply with section 5.1.3 of the AWS Customer Agreement. We do not allow Amazon Associates Web Service to be used for library catalogs. Driving traffic back to Amazon must be the primary purpose for all applications using Amazon Associates Web Service.”

There are actually a bunch of reasons library software might be interested in AWS. But the hot topic is cover images. If libraries could get cover images for free from AWS, why pay for the expensive (and more technically cumbersome!) Bowker Syndetics service to do the same? One wonders what went on behind the scenes to make Amazon change their license terms in 2007 to result in the above. I am very curious as to where Amazon gets their cover images and under what, if any, licensing terms. I am curious as to where Bowker Syndetics gets their cover images and on what licensing terms–I am curious as to whether Bowker has an exclusive license/contract with publishers to sell cover images to libraries (or to anyone else other than libraries? I’m curious what contracts Bowker has with whom). All of this I will probably never know unless I go work for one of these companies.

I am also curious about the copyright status of cover images and cover image thumbnails in general. Who owns copyright on covers? The publisher, I guess? Is using a thumbnail of a cover image in a library catalog (or online store) possibly fair use that would not need copyright holder permission? What do copyright holders think about this? This we may all learn more about soon. There is buzz afoot about other cover image services various entities are trying to create with an open access model, without any license agreements with publishers whatsoever.

I haven’t actually read it yet, but just the abstract alone of this Dlib article makes me think of a recurrent problem I think about. If showing the user all the subjects that matched their query along with hits is useful (we often describe this as ‘faceted’ display, which I think is actually a misnomer), that might work well when you only have LCSH, but what the heck do you do when you have a corpus involving disparate controlled vocabularies?

Just listing all the controlled terms raw can easily give users misleading ideas in several ways, or just be plain confusing.

And what if some items in the corpus don’t have controlled subject/genre vocab at all?

So on reading that abstract I think, hmm, assuming LCSH is still the most common controlled vocab in your corpus could you use automated clustering algorithms to map other items to LCSH, to actually provide a meaningful list of subjects across your corpus?
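Just to show the shape of what I mean, here is a deliberately crude sketch. It is not a real clustering algorithm and has nothing to do with whatever the Dlib article actually does: it builds a bag-of-words profile for each LCSH heading from the titles of items that already carry that heading, then suggests the closest heading (by Jaccard overlap) for an item that has none. The function names, the threshold, and the sample titles are all made up for illustration.

```python
import re
from collections import defaultdict


def tokens(text):
    """Lowercased alphabetic words of a title, as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def build_profiles(labeled_items):
    """Map each heading to the union of title words of items that carry it.

    labeled_items is a list of (title, [headings]) pairs.
    """
    profiles = defaultdict(set)
    for title, headings in labeled_items:
        for heading in headings:
            profiles[heading] |= tokens(title)
    return profiles


def suggest_heading(title, profiles, threshold=0.1):
    """Return the heading whose profile best overlaps the title, or None.

    Overlap is Jaccard similarity: |intersection| / |union|.
    """
    words = tokens(title)
    best, best_score = None, 0.0
    for heading, profile in profiles.items():
        union = words | profile
        score = len(words & profile) / len(union) if union else 0.0
        if score > best_score:
            best, best_score = heading, score
    return best if best_score >= threshold else None


# Items that already have LCSH headings (sample data, not real records).
labeled = [
    ("Introduction to library automation systems", ["Libraries--Automation"]),
    ("Principles of economics", ["Economics"]),
    ("Behavioral economics and markets", ["Economics"]),
]
profiles = build_profiles(labeled)
```

With these profiles, `suggest_heading("Automation in the academic library", profiles)` picks `Libraries--Automation`, and a title with no overlapping words gets `None` rather than a bad guess. Anything usable in production would need stop words, abstracts or full text, and a proper clustering or classification method, but the overall flow (learn from the LCSH-tagged part of the corpus, project onto the rest) would be the same.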

While it was mainly designed to be used within the ONIX SOH and SPS formats, they wisely decided to publish it as a free-standing schema too: “The Coverage Statement may also be used to express holdings or coverage in XML structures other than those specified in ONIX for Serials.”

I think this is a great idea, along the lines of the ‘mix and match’ incipient semantic web we find ourselves in. If you look at the standard, it is really a very nice way to describe serial holdings coverage, in ways very amenable to machine calculation. For instance, to answer the question: “Is this particular issue X within the holdings?” Or, to combine various holdings statements into a contiguous human-displayable statement. Etc. This is something our current systems have trouble doing, because we don’t store the necessary data in machine-actionable ways.
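Here is a small sketch of the two calculations mentioned above, once coverage is stored as data instead of as a display string. This is not the ONIX schema itself: I am assuming a simplified model where each coverage statement is just an inclusive pair of dates (the real standard also handles enumeration like volume/issue, open-ended ranges, and more), and the names `CoverageRange`, `covers`, `merge_ranges` and `display` are mine.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class CoverageRange:
    """Simplified coverage statement: inclusive start and end dates."""
    start: date
    end: date


def covers(ranges, issue_date):
    """Is an issue published on issue_date within the holdings?"""
    return any(r.start <= issue_date <= r.end for r in ranges)


def merge_ranges(ranges):
    """Combine overlapping or directly adjacent ranges into contiguous ones."""
    merged = []
    for r in sorted(ranges, key=lambda r: r.start):
        # Adjacent means the next range starts no later than the day
        # after the previous range ends.
        if merged and r.start <= merged[-1].end + timedelta(days=1):
            last = merged[-1]
            merged[-1] = CoverageRange(last.start, max(last.end, r.end))
        else:
            merged.append(r)
    return merged


def display(ranges):
    """Human-displayable statement built from the merged ranges."""
    return "; ".join(
        f"{r.start.isoformat()} to {r.end.isoformat()}"
        for r in merge_ranges(ranges)
    )


# Coverage from three hypothetical providers for the same journal.
holdings = [
    CoverageRange(date(2000, 1, 1), date(2004, 6, 30)),
    CoverageRange(date(1995, 1, 1), date(1999, 12, 31)),
    CoverageRange(date(2006, 1, 1), date(2007, 12, 31)),
]
```

With this data, `covers(holdings, date(2001, 3, 1))` is true, `covers(holdings, date(2005, 5, 1))` is false, and `display(holdings)` collapses the first two statements into one run followed by the 2006 to 2007 run. Both are a few lines of code once the data is machine-actionable, which is exactly what free-text holdings strings make hard.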

While the standard says it’s “designed to convey information about online serial resources from suppliers – such as hosting services, publication access management services, agents or publishers – to end customers in subscribing libraries.”, there’s really nothing about it that’s limited to that context.

If anyone is writing software where they need to store or exchange serials coverage data, I’d encourage them to check out ONIX For Serials Coverage. It’s very elegant, seems to me to be just the right level of complexity and flexibility to do what it needs to do, without being overly abstract/complex/flexible. Should be quite easy to work with. Hats off to the standards writers here.

Among other things, Oren Beit-Arie from Ex-Libris gave a presentation on their “Unified Resource Management” strategy for their products. I’m afraid I can’t find even any good marketing materials from EL on this online, let alone Oren’s slideshow, which would be great. But here’s what I took away from it (anyone else feel free to correct me).

A) Get Ex Libris products working together in an integrated way, working with one common data model, as ‘services’ layered on top of a common database. Buy what services you want, all the services work well together in an integrated fashion. (There was a nifty boxes-and-lines diagram here about the idea for architecture).

B) An openness to… openness. The individual services should be mix-and-matchable with services from other providers (vendors or open source).

Now, I have no doubt that ‘A’, even by itself, is the right strategy for architecting library software. The current divisions between products and ‘modules’ we have are not always rational, leading to both user interface problems (lack of integration, have to look in different places for things that should be together), and workflow problems (again, staff has to enter things multiple times, and work with multiple products/modules/screens to accomplish what is one task to their workflow).

The architecture Oren was talking about is right from a technical perspective, and it’s indubitably right from a business perspective to give customers what they need/want. In fact, this strategy is no doubt in part a response to competitor Serials Solutions, whose products more-or-less already work like this. SerSol recently rebranded their products as the ‘360 suite’ to emphasize this level of integration. With the important caveat that SerSol’s suite does NOT include a traditional ILS or most of the functions/interfaces traditionally housed there. Which is the most complicated/difficult part, of course.

But it’s Part B that is really exciting. SerSol doesn’t do that.

Now, achieving part A alone will be a technical challenge. But on top of that, making all the pieces mix-and-matchable with other vendors’ products? It’s an even bigger technical challenge (does that mean that common data schema needs to be some kind of standard common across vendors?). It’s also a political challenge and a business challenge for Ex Libris.

Will they be able to get other vendors to cooperate? (The incipient emergence of real-world stable open source library software makes this more likely, and provides some projects that are likely to cooperate regardless of what the traditional ones do).

Will they be able to actually make this work for themselves, actually commit to it, not be scared of what it could mean to their own bottom line? Now, Ex Libris has always been one of the vendors that is most comfortable with open-ness, and really trying to provide software that works with their competitors, instead of relying on lock-in. Not to detract from ‘ideological’ motivations in the company which are probably present, this is also due in part to their business position. (There are always ‘material conditions’ helping to determine things, sez this Marxist). Their strongest product was SFX–NOT an ILS–and their ILS product always had a relatively weak market share in North America anyway. So they had no choice but to promote interoperability.

It’s clear that they want Primo, their new ‘front end’ unit, to inter-operate with everyone else’s stuff, so other vendors’ customers can still buy Primo.

But do they really want all their other products–including back-ends–to inter-operate with OTHER vendors front-ends and back-ends too? So I could even–gasp–choose to use a Primo competitor with their back-end products? Are they really going to be willing/able to pull this off? It’s going to take serious technical resources, and in the long term, they’re not going to be able to justify such resources to themselves if it results in lowering their own ‘lock in’–or are they?

What Ex Libris says they want to sell me is indeed what I want to buy. Will they be able to make it happen? Will they be able to do it in time for it to matter? The URM ‘strategy’ [Oren was clear that it’s not a ‘product’, it’s a ‘strategy’] is just being born, and the timeline was “next 5 years or more”.

Time will tell. But I am encouraged that they seem to have a strategy which makes sense from a technical perspective, not just a business perspective, something we are not used to expecting from our vendors, and which we all desperately need. [Of course a vendor with a great technical idea that goes out of business helps nobody either–I’m not saying the business can be sacrificed to the technical.]

Update 12 Nov 2008: Hey Ex Libris, you’ve been talking about this for nearly two years now at ELUNA presentations, why is there still nothing on your web site about it? Why is this important initiative re-orienting the entire strategy of your company not being shared with your customers and potential customers in written form? Why is this blog post here still the first google hit for ‘ex libris urm’?

Specifically with the stuff that generates the coverage strings missing from the SFX API. In some cases of SFX including services from ‘related’ SFX objects, my current code will generate empty or incorrect coverage strings that do not match what the SFX menu itself provides.

I am trying to come up with a very hacky workaround to this. If you want the new version, contact me in a couple days (I should REALLY put this stuff in a publicly available SVN). But a VERY hacky workaround is what it will have to be, and it will probably still have some edge cases it cannot deal with correctly. To explain what the deal is… it’s confusing to explain, but I’ll just give you my Ex Libris enhancement request:

Our ILSs were originally intended to support our work flow. The old-fashioned-sounding phrase we still use–“automation”–points at this. We were providing ‘automation’ to make individual jobs easier. Whether the ILSs we have do this well or not is another question, but some still assume that the evaluation of an ILS starts and ends here.

But in fact, in the current environment there is another factor to evaluate. Not just work flow support, but the ILS serving as a data store for the ‘business’ of the library. The work that the ILS is supporting produces all sorts of data (the most obvious being bib records and holdings information, but it doesn’t stop there). The ILS needs to provide this data, not just to support current identified work flow and user tasks, but to be a data store in and of itself for other software, existing and yet to be invented, supporting tasks existing and yet to be discovered. The ILS needs to function as a data store for the ‘single business’ of our library. Not necessarily the only one (although there’s something to be said for a single data store for our ‘single business’), but the data that IS in there needs to be accessible.

Thanks to Lorcan Dempsey for pointing out the concept of ‘single business systems environment’ in the National Library of Australia’s IT Architecture report, which got me articulating in this way. I think it’s on the way to a useful way to articulate things which are often understood by ‘us’ but not articulated well to ‘them’ (administrators, vendors, our less technically minded colleagues).

So I feel like I just won a wrestling match with the Horizon Information Portal (our OPAC), and wanted to gloat and share what I’ve done. The rules of the match were ‘XSLT’, because HIP has a point of intervention where you can change display using XSLT (not always very easily). My stuff isn’t entirely deployed yet, but I’ll show you on our test server.

1) New product will be based on Unicorn. Will moving from Horizon to Unicorn be no easier than moving from Horizon to another vendor’s ILS…. will it be harder than moving to Evergreen?

2) Prior to the end of 2008, will we have a few more successful implementations of open source ILSs, in libraries comparable to our own, to give our own timid libraries enough confidence to make that move? Fall 2008 instead of Summer 2007, the previous (well, the latest previous) Corinthian release date, gives us more time.

The biggest winners of this announcement are Evergreen and Koha.

Oh, and all the rest of us too. Sometimes the only way to get off the sinking ship is to be pushed.