MW2008 - Search

This session has been great for me, as this is very much where my head is at right now with ArtsConnectEd… My live notes follow:

Brian Kelly chairs a session on Search, announcing that with the smaller size both speakers are willing to make this a bit more workshop-like. Terry Makewell starts by introducing his project: 9 partners making up the National Museums Online Learning Project. He goes over some of the goals of the project, and the current state of things, and the realization that some sort of federated search was needed to span the partners’ collections.

How to do the federated search? A multi-institution project meant different technical teams, different technologies, and limited resources in some cases. See the paper for more details, but the two technologies they considered most carefully are OAI-PMH and OpenSearch.

OAI, the path we’re going down with ArtsConnectEd, harvests records into a central repository and runs the searches there. OpenSearch sends each search out to every institution, then merges and re-orders the results locally before returning them.
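To keep the two models straight in my own head, here is a rough sketch (mine, not from the talk). The partner names, records, and the naive title-match scoring are all made up for illustration; real partners would be remote endpoints rather than in-memory lists.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partner collections standing in for remote museum systems.
PARTNERS = {
    "museum_a": [
        {"title": "Bronze horse", "url": "http://a.example/object/1"},
        {"title": "Silver cup", "url": "http://a.example/object/2"},
    ],
    "museum_b": [
        {"title": "Horse and rider", "url": "http://b.example/object/7"},
        {"title": "Portrait of a lady", "url": "http://b.example/object/8"},
    ],
}


def score(record, query):
    """Naive relevance: how often the query appears in the title."""
    return record["title"].lower().count(query.lower())


def search_partner(name, query):
    """Stand-in for one partner answering a search request."""
    hits = []
    for record in PARTNERS[name]:
        s = score(record, query)
        if s:
            hits.append({**record, "source": name, "score": s})
    return hits


def federated_search(query):
    """OpenSearch-style: fan the query out, then merge and re-order locally."""
    with ThreadPoolExecutor(max_workers=len(PARTNERS)) as pool:
        per_partner = list(pool.map(lambda n: search_partner(n, query), PARTNERS))
    merged = [hit for hits in per_partner for hit in hits]
    return sorted(merged, key=lambda h: (-h["score"], h["title"]))


def build_central_index():
    """OAI-style: harvest every record once into a single local repository."""
    return [
        {**record, "source": name}
        for name, records in PARTNERS.items()
        for record in records
    ]


def central_search(index, query):
    """Search the harvested copy; no partner is contacted at query time."""
    hits = [{**r, "score": score(r, query)} for r in index]
    return sorted((h for h in hits if h["score"]), key=lambda h: (-h["score"], h["title"]))
```

Both paths return the same hits here; the difference is where the work happens. The harvested index has to be built and kept fresh, while the fan-out version depends on every partner answering at query time.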

OpenSearch fit the project requirements and timeline most efficiently, so that was their choice. He discusses their prototyping effort: scraping search results to generate the RSS for OpenSearch. They now have a single page with a configuration file they can drop on each partner’s website and it will “just work”. Potential caveats: what if the search result page changes? Also, the OpenSearch results can only come back as fast as the slowest partner responds.
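For my own reference, a minimal sketch of what scraping a results page into OpenSearch-flavoured RSS might look like. This is my guess at the general shape, not their code: the `class="result"` convention, the function names, and the example URLs are all assumptions. The `opensearch` namespace URI is the one from the OpenSearch 1.1 spec.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"


class ResultScraper(HTMLParser):
    """Collects <a class="result" href="...">title</a> links from a results page.

    The class name is a made-up convention; each real partner page would
    need its own rule, which is exactly what a config file could hold.
    """

    def __init__(self):
        super().__init__()
        self.results = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "result":
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            self.results.append({"title": title, "link": self._href})
            self._href = None


def to_opensearch_rss(query, search_url, results):
    """Wrap scraped results in an RSS 2.0 channel with an OpenSearch count."""
    ET.register_namespace("opensearch", OPENSEARCH_NS)
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"Search results for {query}"
    ET.SubElement(channel, "link").text = search_url
    ET.SubElement(channel, "description").text = "Scraped search results"
    ET.SubElement(channel, f"{{{OPENSEARCH_NS}}}totalResults").text = str(len(results))
    for r in results:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = r["title"]
        ET.SubElement(item, "link").text = r["link"]
    return ET.tostring(rss, encoding="unicode")
```

The fragility is easy to see: if the partner redesigns the results page and the link markup changes, the scraper silently returns nothing.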

He shows the working prototype, and I’m excited to see they’ve got thumbnails where available – their scraper must be fairly robust for each partner.

Lessons learned: federated search doesn’t have to be expensive or complicated, and it can work with small and large museums equally well. Their method pushes the work offsite, requiring minimal or no effort on the museum’s part.

(Note to self: end slide show with a kitten and you’re in.)

Q&A – Scalability issues come up; they’re aware those are coming. Asked if they considered Google Co-Op: yes, but they quickly found that Google was unable to deeply crawl many of the partners’ collections due to dynamic URLs. Lots of Twitter traffic in this session too.

Very interesting debate for me to hear on OAI vs. OpenSearch. Many institutions are moving towards OAI, but the scope of implementing it is a barrier for most. My feeling that OAI gives more searchable fields is somewhat refuted by the idea that the average user has no interest in or knowledge of these fields (culture, era, etc.)…

Johan Møhlenfeldt Jensen from the Museum of Copenhagen, Denmark, speaks next on his paper. Trying to catch up, I was distracted for the beginning.

The example he’s showing now exposes some fields for filtering, rather than just keywords. Interesting. Another example showing map-based searching, says it’s immensely popular. Easy to make for photographic collections since the address is known, much harder for other sorts of objects sometimes.

Interesting discussion on “advanced search” – he says studies show it’s minimally used, Google has changed everything. People just want a single field. Hmm… Are we wasting time and overbuilding if we have anything more advanced than a single field?? This is the question I’m banging against as I listen to these speakers.

He asks “is the best the enemy of the good?” Good question. Do we wait forever getting it right? Clearly, no, but how far do we go?

They both have good input on the question I ask about overbuilding: move the advanced search behind the scenes and make it more semantic. Still need the metadata, but don’t ask users to know about it. Also need a way to drill down after search: start with simple search, and then apply filters.
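A quick sketch of how I picture that pattern: one keyword box up front, with the metadata doing its work only after the search. The records, field names, and function names are all hypothetical, and the substring match is a stand-in for a real search engine.

```python
from collections import Counter

# Hypothetical records; a real collection would have far richer metadata.
RECORDS = [
    {"title": "Bronze horse", "culture": "Greek", "era": "Classical"},
    {"title": "Horse and rider", "culture": "Chinese", "era": "Tang"},
    {"title": "Marble horse head", "culture": "Greek", "era": "Hellenistic"},
    {"title": "Silver cup", "culture": "Greek", "era": "Classical"},
]


def simple_search(query, records=RECORDS):
    """The single search field: a plain keyword match on the title."""
    q = query.lower()
    return [r for r in records if q in r["title"].lower()]


def facet_counts(results, fields=("culture", "era")):
    """Count the values present in the results so filters can show totals."""
    return {field: Counter(r[field] for r in results) for field in fields}


def apply_filters(results, **filters):
    """Drill down: keep only results matching every chosen filter value."""
    return [r for r in results if all(r.get(k) == v for k, v in filters.items())]


hits = simple_search("horse")
print(facet_counts(hits))
print(apply_filters(hits, culture="Greek"))
```

The user never has to know that a "culture" field exists before searching; the filters only offer values that actually occur in the results they are looking at.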

Very good comment on positioning: where and at what point in the process do you expose filters and result counts?

Brian summarizes the importance of getting static URIs for resources: then Google will “just work”…