Andornot Blog

Inmagic recently blogged about the limitations of using SharePoint for library applications, and this prompted me to write this post sharing my recent experiences setting up a SharePoint site for a library catalogue.

We have been working with a client to create a SharePoint 2010 site for a new resource library to manage codes, standards and related documents. SharePoint is this client’s preferred platform, and because their processes for approving any new software, such as a proper integrated library system, are onerous, time-consuming and often futile, it was decided to just accept the limitations of SharePoint.

Once it was established that we would need to design a library catalogue in SharePoint, I went searching the web for advice and suggestions. This in itself is not easy, as a core concept in SharePoint is “Libraries”, so it is hard to differentiate terminologies and find results relevant to SharePoint usage in a corporate library setting. The references I did find were mostly concerned with how unsuitable SharePoint was, although none gave any specifics of particular issues. I found one SharePoint-based library system advertised, but the vendor website is no longer active, and I chatted to a reputable ILS vendor who mentioned spending three years trying unsuccessfully to port their ILS to SharePoint.

The prospects for designing a catalogue in SharePoint for our client were therefore not promising! I started our project with SharePoint 2007, but very fortunately the client was able to upgrade the site to SharePoint 2010 midway through. I would never attempt to design a catalogue (or anything else) in SharePoint 2007 again. However, with either version there are still many frustrations, especially as in our situation we were not allowed access to SharePoint Designer, which allows editing of the underlying website and HTML. We were required to work with our client’s templates, stylesheets and site structures to ensure consistent branding across all their SharePoint sites. All comments below are therefore based on just the out-of-the-box functionality available to a site administrator.

Designing any site in SharePoint needs a thorough planning process, and discussion of this is beyond the scope of this post. However, for anyone contemplating designing a catalogue in SharePoint, here are some factors to consider.

Specifying content types:

Most corporate library catalogues will include different types of material, e.g. books, reports, journals, videos and websites. Some of these may require columns (fields) unique to a specific type. For example, you will probably want to add a Frequency column for a journal but not for the rest.

By default, all columns show in all displays regardless of whether they have data. (This reminds me of the original library systems which have now all long since hidden any empty fields!)

To get around this, we set up different reusable Content Types each inheriting from a core set, and different views (display forms) for each type of material.

Depending on your version of SharePoint and your specific site settings, there may be a lengthy list of content types and existing site columns to choose from. There is a very rudimentary description of the expected content for each column, but no indication in advance of parameters such as whether the column type is pre-set, e.g. as a single line of text, multiple lines of text, choice or lookup. Changing a column from one type to another after the fact is often not an option. Some columns may also have unexpected settings, e.g. the Route to External Location column. There is no indication when adding it to your content type that this is a Required Yes/No column, or that it is a persistent or “sealed” column that cannot be deleted! There are 28 or so of these persistent columns, including others with innocuous-sounding names such as Article Date.

SharePoint has several reserved column names that cannot be changed. For example, “Author” in SharePoint terminology is the person creating the resource (record), not the author of a book. It’s not difficult to add a new column for BookAuthor or equivalent, but in the default search results all records include this SharePoint Author column, which is of course inappropriate in a library context. “Date” is also included by default, but this is the date the record was entered, not a publication date.

Formatting views:

Most default views in SharePoint are columnar, which is perfect for many types of information but does not work well with variable library data where, for example, a title can be very short in one record and very long in the next. There is no easy way to force a set column width unless you have access to SharePoint Designer.

There is a Datasheet view option which is very similar to Excel and would be great for quick editing, but SharePoint does not support this type of view if your content type includes any Managed Metadata columns.

Managed Metadata:

Managed Metadata provides a new taxonomy capability in 2010 which mitigates some of the other negatives when working with SharePoint.

We are using this new column type in several ways:

As a controlled vocabulary for our LC Subject Headings so that our technician can start typing and any matching terms are displayed.

Synonyms or abbreviations can be included, so we use this for Publishers so that they are findable by both their full name and their acronym.

Terms can be added in a hierarchy so we use this for specifying a general Location and then a specific Office where the items are stored.

Multiple terms can be added to a record quickly, and new ones added either on the fly, or through the Term Store. (However there is no way to batch add an existing list without SharePoint Designer.)

Best of all, we can use these Managed Metadata columns as Search Refiners to produce a faceted search results page.

The downsides are that you cannot import records from a spreadsheet or use a Datasheet view if the list contains any Managed Metadata columns.

Search Refiners:

We were able to set up several custom search scopes and set the default search to the Library Catalogue only.

Our custom search results page is set up with multiple Search web parts including a Refinement Panel. Choosing which columns to use as refiners is fiddly, as it requires editing XML in a popup editor, but at least it can be done without SharePoint Designer. However, we have not been able to force a consistent order for displaying these refiners, so if a result set mostly belongs to the same material type, that refiner is not considered important and appears lower down the list.

We have had to lower our expectations regarding what we will be able to accomplish without SharePoint Designer or any IT support. Fortunately the collection is predominantly virtual, so we have not had to think about printing spine labels or shelf lists sorted by LC Classification. We now have a functioning catalogue and some workflow created with InfoPath forms to support requesting and approving new orders, but there is no question that a purpose-built integrated library system would be preferable.

It may appear that migrating an existing library system to SharePoint, or starting a new catalogue there, would be a cost-saving measure if an organization already has SharePoint. However, as there are no commercial library packages offered on the SharePoint platform, any system will have to be developed and maintained internally. This reminds me of the many library systems set up over the years in Microsoft Access that ended up unsupported when the particular developer left. We have converted many of these Access databases to standard library software, but this can be a time-consuming process, as the records often have limited fields or authority control, requiring us to upgrade the cataloguing.

I needed a typeahead suggestion (autocomplete) solution for a textbox that searches titles. In my case, I have a lot of magazines that are broken down so that each page is a document in the Solr index, with metadata that describes its parentage. For example, page 1 of Dungeon Magazine 100 has a title: "Dungeon 100"; a collection: "Dungeon Magazine"; and a universe: "Dungeons and Dragons". (Yes, all the material in my index is related to RPGs in some way.) A magazine like this might consist of 70 pages or so, whereas a sourcebook like the Core Rulebook for Pathfinder, a D&D variant, boasts 578, so title suggestions have to group on title and ignore page counts. Further, the Warhammer 40k game Dark Heresy also has a Core Rulebook, so title suggestions have to differentiate between the two.

To build this typeahead solution, I:

added new Solr field types to schema.xml to support ngram matching

added a /suggest handler to solrconfig.xml that weights matches appropriately

bound the suggestions in JSON format to Twitter's typeahead.js

Example 1: two core rulebooks.

Example 2: "dark" matching in Title and Collection

Add new field types to Solr schema.xml

text_suggest_ngram

For partial matches that will be boosted lower than exact or left-edge matches, e.g. match 'bro' in "A brown fox".

text_suggest_edge

text_suggest

For whole term matches. These will be weighted the highest.

These field types are taken lock, stock and barrel from https://github.com/cominvent/autocomplete. In that project, the suggest engine takes the form of an entirely separate core - I have simplified matters for myself. Great stuff, though.
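To make the difference between the three kinds of match concrete, here is a rough Python model of them. It is only a sketch: it is not the analyzer chain from the cominvent project, and the gram sizes, the lowercasing and the per-word splitting are all assumptions made for illustration.

```python
def ngrams(term, min_size=2, max_size=15):
    """All substrings of `term` between min_size and max_size characters (assumed sizes)."""
    return {
        term[start:start + size]
        for size in range(min_size, min(max_size, len(term)) + 1)
        for start in range(len(term) - size + 1)
    }


def edge_ngrams(term, min_size=1, max_size=15):
    """Only the prefixes of `term`, i.e. the 'left-edge' grams."""
    return {term[:size] for size in range(min_size, min(max_size, len(term)) + 1)}


def match_kinds(user_input, title):
    """Return which of the three matching styles fire for `user_input` against `title`."""
    query = user_input.lower()
    terms = title.lower().split()
    kinds = set()
    if query in terms:
        kinds.add("whole_term")   # what a text_suggest-style field is for
    if any(query in edge_ngrams(term) for term in terms):
        kinds.add("left_edge")    # what a text_suggest_edge-style field is for
    if any(query in ngrams(term) for term in terms):
        kinds.add("ngram")        # what a text_suggest_ngram-style field is for
    return kinds
```

In this model, "row" matches "A brown fox" only as an ngram, "bro" matches as both an ngram and a left edge, and "fox" matches in all three ways, which is why the three field types can be boosted differently.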

Make copies of relevant fields in Solr schema.xml

As noted above, the fields in play for me are title, collection, and universe. Note I am also making a string copy of each to group on.
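The effect of those copies on a single page document can be pictured with a small Python sketch. The target field names below (title_suggest, title_str and so on) are hypothetical, as is the choice to give universe only a string copy; in Solr itself the copying is declared in schema.xml rather than done in code.

```python
# Hypothetical mapping of source fields to their copies.
COPY_TARGETS = {
    "title": ("title_suggest", "title_suggest_edge", "title_suggest_ngram", "title_str"),
    "collection": ("collection_suggest", "collection_suggest_edge",
                   "collection_suggest_ngram", "collection_str"),
    "universe": ("universe_str",),  # assumed: only needed for grouping, not matching
}


def with_copies(doc):
    """Return a new document that also carries the copied fields."""
    expanded = dict(doc)  # leave the caller's document untouched
    for source, targets in COPY_TARGETS.items():
        if source in doc:
            for target in targets:
                expanded[target] = doc[source]
    return expanded


page = {
    "title": "Dungeon 100",
    "collection": "Dungeon Magazine",
    "universe": "Dungeons and Dragons",
}
indexed = with_copies(page)
```

Each copy holds the same text; what differs is how the field type analyses it, with the string copy left unanalysed so that it can serve as an exact grouping key.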

Add /suggest request handler to solrconfig.xml

The /suggest handler looks for user input matches within the suggest fields defined in the qf parameter. Each field has a boost assigned: the higher the boost number, the more a match on that field will contribute to the final document score. I found I had to play around with the boost numbers relative to each other before getting the behaviour I really wanted. Boosting the whole-term text_suggest fields highest was not an automatic route to success. Your mileage may vary.

The pf parameter is additional to qf: it boosts documents in cases where user input terms appear in close proximity.
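As a rough illustration of how qf and pf fit together, the Python sketch below builds the equivalent request by hand. Everything specific in it is an assumption: the field names, the boost numbers, the base URL, and the use of the edismax parser (qf and pf are dismax-family parameters). In the setup described here these values live in the handler definition in solrconfig.xml, not in client code.

```python
from urllib.parse import urlencode

# Hypothetical fields and boosts, for illustration only.
QF_BOOSTS = {
    "title_suggest": 30,
    "title_suggest_edge": 50,
    "title_suggest_ngram": 10,
    "collection_suggest": 20,
}
PF_BOOSTS = {"title_suggest_edge": 100}


def boosted_fields(boosts):
    """Format a field -> boost mapping as the 'field^boost' list that qf and pf take."""
    return " ".join(f"{field}^{boost}" for field, boost in boosts.items())


def suggest_params(user_input, qf_boosts, pf_boosts):
    """Request parameters for a suggest query (assumes an edismax-style parser)."""
    return {
        "q": user_input,
        "defType": "edismax",
        "qf": boosted_fields(qf_boosts),  # fields to match in, with relative weights
        "pf": boosted_fields(pf_boosts),  # extra boost when input terms sit close together
        "wt": "json",
    }


def suggest_url(base_url, user_input, qf_boosts=QF_BOOSTS, pf_boosts=PF_BOOSTS):
    """Full URL for the request; base_url is a placeholder for your own core."""
    return f"{base_url}/suggest?{urlencode(suggest_params(user_input, qf_boosts, pf_boosts))}"
```

Keeping the boosts in one mapping makes it easy to try different relative weights, which in my experience is where most of the tuning time goes.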

Above, I mentioned that a Solr document in this index is equated with a single page from a book. If a book is 50 pages long, then a naive suggester is going to return 50 documents when that book's title is matched. The suggest handler avoids that problem by collapsing (grouping) on the fields in play, which explains why the universe field is referenced there, even though it's not being used to match query input. With grouping, a unique suggestion consists of universe+collection+title. Note that the group.sort and sort parameters differ: the former must produce valid groups, while the latter determines the order in which suggestions are displayed to the user.
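The collapsing idea can be mimicked in a few lines of Python. This is only a model of the outcome, not of how Solr implements grouping, and the sample documents (including the collection and universe values given to the two Core Rulebooks) are made up for illustration.

```python
def unique_suggestions(page_docs):
    """Collapse page-level hits into one suggestion per universe+collection+title.

    `page_docs` is assumed to be ordered by relevance already, so the first
    page seen for each book decides where that book appears in the list.
    """
    seen = set()
    suggestions = []
    for doc in page_docs:
        key = (doc["universe"], doc["collection"], doc["title"])
        if key not in seen:
            seen.add(key)
            suggestions.append(
                {"universe": key[0], "collection": key[1], "title": key[2]}
            )
    return suggestions


# Illustrative page-level hits for the input "core": several pages per book.
hits = [
    {"title": "Core Rulebook", "collection": "Pathfinder", "universe": "Dungeons and Dragons", "page": 1},
    {"title": "Core Rulebook", "collection": "Dark Heresy", "universe": "Warhammer 40k", "page": 1},
    {"title": "Core Rulebook", "collection": "Pathfinder", "universe": "Dungeons and Dragons", "page": 2},
    {"title": "Core Rulebook", "collection": "Dark Heresy", "universe": "Warhammer 40k", "page": 2},
    {"title": "Core Rulebook", "collection": "Pathfinder", "universe": "Dungeons and Dragons", "page": 3},
]
suggestions = unique_suggestions(hits)
```

Because the key includes collection and universe as well as title, the two Core Rulebooks stay distinct while their individual pages collapse away, regardless of how many pages each book has.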

Conclusion

In a future post, I will describe how I bound the results from the /suggest handler to Twitter's typeahead.js on the front end to produce the suggestions shown in the example screenshots above.

All clients with a current Inmagic maintenance subscription for either DB/Text for SQL or WebPublisher PRO should soon be receiving an email from Inmagic with the download information for version 14. Version 14 of the non-SQL version of DB/TextWorks was released last year.

If you have a current maintenance subscription but do not receive a notification email within the next week, please email advantage@inmagic.com with your serial number and email address so it can be resent. Please also remember to let us know if your contact information has changed so we can update our records and pass this on to Inmagic.

New and enhanced features in WebPublisher PRO include:

Support for renaming query logs and starting new ones on a scheduled basis.

Ability to edit validation lists.

Ability to expose the Find button and disable find-as-you-type in InmagicBrowse.

InmagicBrowse can now update records after a validation term is changed to another term already in the list.

Improved support for Internet Explorer v10.

Please contact us if you would like assistance upgrading or would like to renew an expired maintenance subscription. We can also help you update your current interface to include the latest features available in the software itself, or with our add-on products.