LiveSerials

Monday, April 21, 2008

Your ideas, please!

While the conference and its myriad blog postings are fresh in everyone's memory, we'd like to ask for your input on a related subject.

The UKSG publishes an Open Access e-book called "The E-Resource Management Handbook", which is intended to be an evolving source of comprehensive support for information community practitioners. The Handbook has been well reviewed and currently covers a wide range of subjects.

New chapters are commissioned on a regular basis to address emerging issues - we are seeking to ensure that it remains current and relevant to readers with different regional and professional perspectives. So we invite you to contribute suggestions for new chapters. For example:

Could you identify discrete aspects of your profession (whether that is librarianship, publishing, marketing, technology, information management...) for which you, or your colleagues and peers, could use a practical guide?

Have you recently sought information about a topic only to find the current literature is inadequate, out-of-date or irrelevant to your particular needs?

Is there an area in which you have expertise from which you think the wider community could benefit?

Are you aware of emerging technologies, new product categories, evolving business functions or changing supply chain processes that need to be documented?

Would you say there are developments outside of our core community (e.g. political, economic, social, technical, legal, environmental) which nonetheless have an impact on our activities and could usefully be assessed within our context?

Please let us know your suggestions, either by commenting on this posting or by emailing marketing@uksg.org. Our thanks, in advance, for helping us to keep the handbook topical and current as the landscape of scholarly communication continues to evolve.

Friday, April 11, 2008

UKSG at Torquay

UKSG went to the seaside and everyone came too ... beautiful location (if a little chilly, especially in the Sunday morning snow), inspiring programme, and top-notch catering - so nice to have a full meal for delegates on the first evening before the conference proper. There's something rather bizarre about staying in a greasy-spoon B&B so early in the year, but that's part of the charm - my hotel had me and a coachload of pensioner bingo fanatics until the Swedish contingent arrived!

Will remember to take my camera next year to catch the view from the harbour :)

Wednesday, April 09, 2008

Feels Free - Jim Griffin, OneHouse LLC

What is happening now in freeing up information digitally is the most important shift since the Gutenberg press. Sharing ideas, innovation, culture and the arts will lead to all the other important stuff.

Music has been the canary in the coal mine for much of what is happening with digital data. Paying for music has become entirely voluntary. Legally that may not be the case; morally it may not be right either. But practically, it really is voluntary to pay - it's a choice we make when we consume media. It used to cost $15 to buy, say, a band's album, but now you can download or share it for free if you wish. It is effectively a giant tip jar. But society cannot tolerate funding anything important this way - as a sign in Jim's local bookstore once said: "People who say they like poetry and don't buy some are cheap sons of bitches!"

At the same time, we can't restrict access to material based on ability to pay (or parents' ability to pay) - this is absolutely essential. Libraries speak to this notion. Jim fears that if we had no libraries and someone proposed them today, they would never happen: the idea would be seen as socialist or uncommercial. It's therefore an important tradition, and restrictions on any information need to be considered in that context.

We are witnessing a "bionomic" flood - there are economic effects in the network, but the network behaves increasingly like a biological system. The flow of data looks more like the flow of a stream, or the veins in a body. The rising tide of digitisation happens like a flood and has potentially devastating effects. However, most people in the creative industries have to move forward with digitisation in order to survive; you can't stop to think about whether it is a good idea or not. You want the shortest path to deliver content to people. Lowered costs of distribution are good: you can get to more content. But some are not rewarded in this model and so have no incentive to create. These things are not simply good or bad - sometimes they are both. Cars, for instance, are good for commerce and freedom but can also kill you. You must address risk and how to deal with it. You may try to believe you can address risk with control, but sometimes you have to compensate rather than control.

Moore's Law will hold through to at least 2029 - computing power doubling and price halving - whether or not there is a network, so this trend wouldn't stop even if the internet were shut down. Our old notion of product economics is not coming back.

The major change that technology brings to information industries has not really begun; exponential change has only just started. Soon, wifi and processing power will mean we can have what we want, wherever and whenever we want it. Storage becomes no issue at all; distribution and delivery become the key issues, and these will change dramatically. We won't remain a download culture. Warehouses in the modern world are signs of inefficiency, and hard drives will soon be seen as little warehouses. Just-in-time delivery will come to digital industries too. Today is about digital distribution; tomorrow is about digital delivery.

Are these predictions of the far-flung future? Jim was reading Marshall McLuhan the other day, who said that you never understand the media of your own time - you are largely unconscious of it, like a fish in water. If you want to understand your media, you need to look in the rear-view mirror of history. This leads Jim back to the library and microfiche (the smell of microfiche!) and the 1920s, when electricity really kicked off. You realise that the change then was far more savage than the last 3,000 days of dotcom fever. Acoustic becoming electric was far more savage than electric becoming digital. Before it, your feet dictated your audience (if people were not in the room, no one could see or hear you); within 20 years you had sound recording, radio and television. That is enormous change - that is savage. How did we deal with that change, and how can it help us deal with the future? There are great conversations to be had (in the next few years only) with those who remember that change and can help us understand.

Borge and Victor Hugo both dealt badly with their work being performed - trying to get orchestras and reading aloud banned! A first licensing scheme for music was set up then, and it is still the model today: people pay a licence and fees are distributed from the pool. This will be the way of the future - paying into an actuarial pool that is then distributed. Should robots be factored in? This will be fascinating as we move forward.

We live in a time of Tarzan economics - you cling to your vine of product and treat content as if it is in a bound format. At some point you must drop the vine of product and grab the vine of service. Libraries have an advantage here: there is a female bias, and they realise the value of relationships that never end. The music industry is full of men who cling to power and don't realise the value of their relationships. This feminisation of the market will help us understand markets much better and use information more intelligently.

We need to run things so well that no-one wants to be a pirate: this would be an enormous move forward. While the expense of being legal is high, people won't behave legally. If your business model requires restriction, then you have an impossible journey ahead of you. All information businesses have to open up more. Granularity is the enemy of ideas: bundling is crucial. If the music industry made one key mistake, it was the abandonment of albums in favour of singles - it has been disastrous. Pirates seem to be the problem, but they are not: time and money are, as people have other uses for both. Finally, embrace the next generation: Facebook and the like come from schools, which are the hotbed for innovation.

Jim hopes that people have been stimulated to think but perhaps not to agree wholly yet. "We will have done the right thing when our information feels free without being free".

Jim concluded that the things you learn when you are under 13 you take for granted. If you get into a technology industry between 13 and 30, you will probably make a career of it. If you only encountered it over the age of 30, you are of the age group that just finds it weird or unnatural, and wants it regulated, banned etc. But we're on the way out. We have to adapt to that which is surely coming.

If you can remember today, it should be as the day that you went forth from Torquay to rebuild the Library of Alexandria, to build a new vision from the world of digitisation or, to be sure, to realise that you can hold more in an open hand than in a closed one... An inspiring end to the conference, to be sure!

Breakout Session B (22) Knowing Your Users: Research You Can Do - Judi Briden, Digital Librarian for Public Services, University of Rochester, NY

Although the project described today took place at a (campus-based) academic institution, Judi Briden began the session by explaining that its methods can be applied to users in any institution.

Background:

An IMLS grant in 2003-2004 funded a study of faculty work practices - there was an institutional repository in place (GSpace) but it wasn't being used properly or enough

The libraries hired an anthropologist - this was a very useful move. After studying staff, they began studying the (roughly 5,000) undergraduates, as the approach had proved so valuable.

"What do students really do when they write research papers?" was the key question, but more generally they wanted a better understanding of students and of how library facilities, web pages etc. could work better for them. Plans were originally submitted to the research board, and the libraries have been very careful about protecting data and respecting privacy: consent forms and the right to leave the study were both used.

Retrospective interviews

Recently completed papers - the team wanted a concrete example with details

From receiving the assignment to turning it in (step by step questions through the process)

Each step illustrated on a poster (students would draw in the step as well as describing it)

Interviews video recorded and transcribed.

Judi showed an example poster of the process of writing an assignment. Students never wrote in a step about talking to librarians, though they did include use of the library website and services. The various stages of outlining, feedback etc. were described and were informative for libraries. Some students consulted teaching staff; some consulted family (a revelation to the librarians). Students included information about the stages where they got distracted and why, and about which spaces are better for studying.

Judi showed a senior student's honours paper process poster - a complex poster covering several years. His process much more closely resembled a graduate student's.

The next step, after collecting the data, was to look at the data. Research team and librarian staff co-viewed videos, transcripts and drawings in viewings with discussion and brainstorming. The process engendered widespread staff participation and was used at every stage of the project.

What did they learn? That students:

Work on their papers in chunks, with days or weeks in between

Asked family and friends for help choosing a topic or editing their papers (one student said her dad had edited all her papers since the 3rd grade!)

Some students assumed that a Google search included the library's resources (so they didn't go back and look at the library's offerings afterwards) - so services must be better, but resources must also be on Google!

Did evaluate resources - just not in the ways librarians recommend (e.g. finding and printing out articles, not reading them for a few days, and then discarding some)

Don't remember who gave their library session

Another technique used was the photo survey - this lets you investigate environments you would not normally be able to see, and when you look at the photos with interviewees you learn more than you would by just talking to them. Students were given a disposable camera and a list of photos to take (the places they study, what they always have with them, how they keep track of time etc.) as well as a few "free" pictures they could use as they wished. All images were developed and put on CD, and a session to discuss them was then scheduled.

Judi showed a student's photo (from 2004): a mobile phone is present (almost all students on campus had cell phones). Dorm-room pictures were rich in detail that the research team would never have thought to ask about otherwise. The photos of things students take to class revealed that no students were taking their laptops to lectures; once alerted to this, the team knew they needed to ask about it. An image of a colour-coded diary showed the one thing a student couldn't be without. The team found that students were highly and complexly scheduled with work and activities, with no two days looking the same.

The team also gave students a map of campus and asked them (for one day) to write down where they went and when. It was very easy to do but very informative, and again gave information that was not available any other way. The example map Judi showed covered 8am to 12am the next day and traced a complex pattern over the campus covering 2.5 miles (on a fairly compact campus). Students are out all day and carry all their stuff around all day. This explains the laptop issue - they didn't want to lug one around all day, but would use it when sat in one place for a long time, usually the library at night.

The team also looked at website design through design workshops. As a warm-up, students were asked to:

Create a device that did everything that students wanted it to do and yet could still be small and light

They then had to redesign the library webpage and, in another session, marked up a version of the current library homepage with any changes they would make if they could.

Judi explained that even the warm-up device exercise was informative in terms of students' concerns.

The marking up of homepages was very valuable and they repeated this exercise with faculty and graduate students later on.

Another study focused on library space. There was an area the library wanted to make into a collaborative work space, so students' input on what should be there was sought. Getting participation was tough; in the end students were recruited on the day with posters, pizza and a small payment for taking part in a design workshop. This walk-in workshop asked students to imagine that the library had a big new empty space that they could design as their own space in the library - it's built and you love it... what does it look like?

Students were asked for 20 minutes of their time, but many stayed for over an hour and got very involved in the design process. Many pictures went into lots of detail: comfort, daylight, wifi, bookshelves etc. were all important to different students, and "quiet but not silent" recurred as an idea. Taking all the ideas and analysing them, they covered five key areas:

flexibility to meet a variety of needs (need to be able to move things in the space / do several things in the same space)

comfort, with a family-room feel and attention to environment (natural light etc.)

Additional student interviews took place in the student union at a time of year when papers were being worked on - a student worker at the library (the same age as the interviewees) recruited participants, and the interviews were done by a recent anthropology graduate. By using a young research team, the libraries expected to get more honest answers to their questions. They asked whether students felt they had enough time for papers, whether it mattered, and who they had asked for help (with a prompt as to whether they'd asked a librarian), and also when they had last worked on their paper and when they would next work on it.

The results showed that most students had used library resources and had been able to find what they wanted. They also felt they had enough time. They didn't feel that organizing and writing (especially narrowing the topic) was going as well. Students saw professors and TAs (specifically professors) as subject experts, but saw librarians as experts on finding specific books. All students expected to do well, or as well as needed (many were prioritizing several papers).

The faculty study had included interviews in offices, which had proved valuable; for studying undergrads it seemed important to go to their dorm rooms. It was felt this could be tricky, but students were extremely welcoming and open (putting the onus on researchers to be responsible with the data). Two dorms were studied, and the anthropologists went out between 11pm and 1am, as other research showed that students worked at that time of night. They only went to rooms where they had explicit permission, and asked students to do whatever they would normally do (students did!) as the team observed them. What was interesting was students' use of technology, particularly their computer desktops. Judi showed a video of a student using his computer (a Mac with lots of items on the dashboard - conversion tables, sticky notes with reminders, quotes etc., mostly not assignments). Judi noted how little physical space was taken up by assignments (one paper sticky note on the monitor) - assignments are not as large in students' general sphere of activity as librarians expected. Librarians now have a better sense of proportion when discussing papers as a result.

Dorm observations:

Lots of distractions - music, video games, people, IMs, Facebook; NOT a lot of reading

"My room is your room" culture - sort of communal; people wander in and pop through, and there's lots of sharing going on

Freshman dorms much more active vs. upperclassman dorms (busy but less chaotic) - makes sense, but seeing it made librarians realise that the library is a refuge, a place where students count on a lack of distractions to get things done when they need to

What are they doing differently?

Gleason Library - a 24/7 collaborative space. When the architect was selected, they had to incorporate students' ideas, and the architects were willing and excited to do that. The work went on during the summer when students were not on campus, or they would have been included directly; instead, members of the research team worked with the architects on what students wanted, despite some discomfort about not being able to consult students properly. When it came to the final layout, students were returning to campus, so they were asked to place drawings of furniture around the plan as they would like. Consistently, the students did two unexpected things: the space had a new wall of windows where the architects had put comfy chairs, but students wanted the natural light for work tables, so that actual work rather than relaxation could happen there; and they demanded quiet study areas, not just open space (no doors, but divisions to separate noisier from quieter areas). The students love the place. Judi showed images of the room - it's busy and popular all the time, and the furniture moves all the time; Judi walks through the area to get to her office and it always changes. There are some frosted-glass cubicle areas where the glass walls are whiteboards - they are always in use and prove really useful. Flipcharts asking what students thought of the space brought requests for even more whiteboards, so they are there too. Students multitask (one image showed a student knitting and reading!)

Night owl reference at paper crunch-time - Judi showed a "Whooo's working late?" poster promoting help with assignment papers (at key times in the year) until 11pm several nights a week, when students were busiest. It proved useful and has now been fine-tuned; it happens every semester around crunch time to accommodate students' needs.

Parents' breakfast at orientation - given the importance of parents to the learning process, the library now hosts the orientation breakfast, listens to concerns, talks about the process, and then tells parents that every single class has a librarian who knows about that class and about resources for that class - so parents can refer their kids back to the library later on.

Experimenting with webpage redesign - the new design, based on student ideas, will go live in the fall. Widgets are being used, making pages customisable and/or rearrangeable for specific sessions. This will make the pages more attractive, but the library will also be able to learn from what students choose to include or exclude.

Changing the way that library sessions are taught - librarians now feel much more comfortable experimenting with students; one librarian is now a writing instructor, and sessions include discussion of what's going on. We understand how confident and competent students feel - pairs of students are given a resource that they can play with and must present to the group: what it does, how to search, and when you find something you like, how do you get it, and in what other ways could you get it? Librarians add extra info as needed but just facilitate; students lead and share, and it works really well.

Long term benefits

understand how our undergraduates live and work on campus - this two-and-a-half-year project has been fascinating and really fun. We like and know our students a whole lot better, and we are motivated to make the library better for them

Understand their use of library

High staff engagement and participation - librarians now have a more personal perspective and are more communicative with students

Greater comfort and lower overhead for trying new ideas - major change

Continuing this type of research

All students and users are different, so you need to find out about yours. You don't need an anthropologist - you can do great things with low-tech, low-cost, small programmes. Get a small team of interested staff together and build from there; staff will become interested, it will be fun, and you will find great results!

Q & A

Q: What was the sample size?
A: It varied - the space designs were formed from 19 drawings, 8 students did the photo project, 20 students took part in interviews, etc.

We've written a book, Studying Students, which is available for free download, and we have a project website with lots more info: http://www.tinyurl.com/f63dj

Q: Are you reviewing this process?
A: All the time. We know students love the library and we are listening to them. It's granular and hard to add up. Over time we're looking for more interaction, though, and the writing and library classes are very much following that and are much more participative. No metrics yet, though.

Richard Withey started by saying that though he is a trained librarian, his background is in newspapers - an area so stressful he's lost the long flowing locks of hair that once got him mistaken for a female prostitute in Paris... Now there's a start to a talk...

Newspapers are declining in print, but there are areas for growth online; many pages are published online, and not just in the US. Newspapers make their money from advertising, and online advertising is growing consistently, with page-level classified ads a particular breakthrough.

The UK is the biggest and most volatile internet market in the world, and advertising spend on the internet in the UK is far higher than elsewhere (including the US). By 2009 online advertising will be the main medium. 60% of UK homes now have broadband and, at the same time, TV watching and other key advertising activities are in decline.

If newspapers were to remain print vehicles, they would be going against the trend, so analogue (print) will die out. One thinker has suggested the last paper will be printed on April 11th 2041, but Richard thinks this is wrong: there will always be print niches.

Business models have to change in the move from print to online. Newspapers make up to two thirds of their money from advertising; only the biggest could survive on cover price alone. Newspapers are subsidised by advertising, which has led to a cosy, virtuous circle that all but excludes consumers: newspapers and advertisers are in constant communication with each other, and prices reflect prestige, not necessarily the value of that advertising spend. There remains a lot of discussion of audience fragmentation, disintermediation and personalization. People can now do things on broadband that were only imagined in the '90s and were not possible on dialup. robinsloan.com (see the Epic 2015 flash movie) imagines a future without newspapers - a concept Rupert Murdoch has taken note of with his recent re-investment in the web. Search engines invest constantly to keep their technology relevant and effective; newspapers don't invest enough in search, but it's hugely important. Thus many newspaper companies are undervalued, as investors still see their development in old terms.

Usage of all media channels is currently declining, though there is an upward trend in multi-channel viewing in the Sky/Virgin/Freeview direction. Traditionally, print papers have done giveaways to boost circulation, but if you have to do that, how useful is it, and how relevant is it to the new audience?

And that is leading to a new, emerging media ecosystem. Blogs, wikis etc. are now much more important as newspapers become a community activity; thus the work of newspapers has changed enormously in the last 10 years.

In a very short time the whole industry has changed. Trained in the new media, readers are now learning habits that have a significant effect on the publishing business. This is the first technology roll-out that has been consumer- rather than business-led: consumers are beginning to control the agenda, and this is really the first time that has happened.

Publishers are sometimes reluctant to engage with new models - margins are lower, audiences are fragmentary, and new delivery mechanisms have flourished (Richard argues that content was never king; distribution always was). And the industry may respond only when it is too late - most newspapers are now running at a loss.

We can, however, learn from elsewhere. The music industry has been informative (terrible at first, much better now), as has the response of television (BBC iPlayer etc.). There are also new distribution formats and new devices (phones, the Amazon Kindle etc.). You all engage personally with many of the new formats, but whether you do so professionally is another issue - publishers certainly aren't doing this enough yet. We also need to engage well with issues surrounding copyright, and you have to engage with audiences on their own terms and release control of your content for remixing (though it is hard to make money from it).

Technology isn't the only consideration. You need to engage with readers regardless of format, get the right type of staff in house, and make sure your board is younger and more engaged. Richard also suggests you should "stop employing people who are building careers rather than businesses", as it takes five years in a high-level job to bring about real change.

Richard concluded by showing the now famous (and excellent) YouTube video by Mike Wesch: The Machine is Us/ing Us.

The new media ecosystem: lessons from newspaper publishing

The last newspaper will be printed on 11th April 2041.

This is the view of one analyst who considers that the transition to online is fatally disruptive to the current model of newspaper publishing. As a librarian at large in that industry, Richard Withey shares with us some of the lessons being learned there.

Very few newspapers can survive on cover price alone - most need (dwindling) advertising subsidies. Will "analogue" exist at all in a few decades? Sam Zell considers that the newspaper industry has been too slow to predict and react to the changes: "the newspaper industry has stood there and watched while other media enterprises have taken our bacon and run with it." Circulation has dropped massively in recent years, and the old techniques for growing circulation (giving things away - everything from CDs to home insurance) are no longer as effective - and they undermine the value of the supposedly core product.

A new media "ecosystem" is emerging, with user-generated content and interactivity adding to the original methods of news publishing. The rise of collaborative publishing and new digital formats (e.g. OLED) - driven by consumers - is changing the face of communication, and it matters to all publishers. Publishers have been reluctant to engage with new models; margins have altered, audiences have fragmented and new delivery mechanisms have flourished - but shareholder expectations have not diminished accordingly.

The music industry seems now to have begun to respond successfully to these changes (following a rocky period). Broadcast media also seems to have recognised and responded to the threat. Publishers must engage with new distribution formats - RSS, social networking - and with others (e.g. ACAP) to find solutions to those sticky problems, such as copyright, which continue to block progress. "Going where the user is" is critical, as is recognising the change from provider-driven to consumer-driven publishing. "Lower the average age of your board by at least 20 years", to ensure that you are led by people who understand these developments and want change to happen, and employ only people who are building businesses rather than careers: it takes time and commitment to make major change happen. Accept that your content will be mashed up - your message is still being disseminated (but you'll need to consider new ways to make money from it).

Bobbi and Robin began by explaining the background to an innovative training scheme undertaken at Missouri River Regional Library, based on the 23 Things programme created by Helene Blowers for Charlotte-Mecklenburg County libraries. Missouri River was the first in the US, and only the second in the world, to implement 23 Things (though they actually did 29 things!)

Tools used were:

gmail (entire staff got one to get blogger access)

blogger

bloglines (this was pre google reader)

del.icio.us (all links from all classes and bundled by class no.)

Odeo (website that allows production of podcasts)

Email (staff were not completely technologically savvy)

Added:

Myspace (it was clear that users were there)

Gmail

Map of Visited States (from Steven Abrahams' 42 things)

Search Engines (remind folks that Google isn't the only player!)

Google Labs (some things in Google Labs)

Subtracted:

Lifelong Learners (they just went ahead and skipped that part - that was probably a mistake!)

How the programme worked for them

Lessons went out on the blog, weekly emails were sent to staff (fitting with their normal use of email), and incentives were added. Everyone who completed the original programme got an MP3 player at the close of the course, and candy, certificates etc. were given out along the way. Incentives included public acknowledgement at staff meetings and little gifts of candy, oranges, toys, pens etc. on desks - peer pressure in a nice way!

Staff demands

We took the ideas from Charlotte Mecklenburg and applied them directly, but that just didn't work. It took staff longer to complete (4-5 hours/week), so the lessons were divided out a little. The proximity of peers and distractions proved tricky, so a dedicated computer space was set up by Bobbi's office where staff could ask questions (though it was supposed to be a self-sufficient programme). The team are also running advanced workshops (e.g. Flickr, MySpace and tagging).

Staff reactions

Good responses: the programme was helpful for keeping up with users, and the method and self-led style worked well for some. Meanwhile, Bobbi and Robin thought it worked amazingly and wondered why they hadn't thought of it themselves!

Lessons Learned

Be flexible (so depending on your library, workload, other issues etc you need to flex to make it work)

Incentives work (the mp3 players were great but even the silly incentives still entertained and pleased people - even cheesy cards etc!)

Library Learning 2.1 next - they didn't want to just be done with this programme (in fact Charlotte Mecklenburg did the same thing too), and have taken what they learned from the first round. Each lesson goes up one at a time, and all are 1-hour micro lessons - all about interesting things, if not all 100% work-focused.

Public classes - Facebook class at end of April

The library goes 2.0 - the library is now on Facebook, Flickr, MySpace etc. and also manages the blog (now with a pool of writers). So the programme continues.

Demo:

Library Learning 2.0
Week-by-week lessons (which look super!) which staff had to blog on each week.

Library Learning 2.1
Is similar, but each week is a new tool - staff only have to comment on each lesson to show participation.

del.icio.us shown - a wide variety of libraries are doing this programme, and many, many libraries are involved (as shown on the originating website: http://plcmclearning.blogspot.com/)

Q&A

Q: In Library Learning 2.1 do you intend to encourage choice - in blogging software, for instance?
A: We encourage staff to ask what they want to learn. We sort of wish we'd done Wordpress this time, but we use the same platforms so we can track use for incentives - an mp3 player again, and a prize draw for a digital camera. Doing step-by-step instruction is harder across multiple platforms, and our staff weren't very tech-savvy, so telling them to go and get a blog would not have worked well. Most of our staff were probably around 1 on a tech-savvy scale of 1 to 10; most had not heard of what they were teaching about. Some attitudes about MySpace dangers etc. changed when they actually learned about the tools. Patrons were coming up and asking questions about how to, for example, block someone on MySpace, but the staff in the computing centre could not answer those questions - so the staff really need to know what's going on.

Library Learning 2.1 is going really well. Incentives from the dollar store are still appreciated and it's working well!

Q: You are doing technical training - what about cultural training?
A: Yup, we're talking about personal identity safety, instant messaging etc. We talk about it a lot face to face as well. In the original programme a lot of the younger staff didn't take part, while some of the older ladies jumped on, adapted really well and went for it. Online privacy is the second lesson in Library Learning 2.1 - including an ad that gives a good idea of the risks: once something's online, it's there. The ad sparked some fascinating discussions online.

Q: In terms of mobile, the first reaction is to ban it and shut it out of our systems. Younger people adopt it regardless of that approach, but as adults we should facilitate new environments - you must take risks and learn to use the tools and environments.

A: We spent a lot of time talking with someone whose library had blocked MySpace (in the US, MySpace scare stories are all over the place). Robin's son wanted a MySpace page, and she took him through a compromise set-up: they created a page for a neighbour's cat and talked through privacy etc. in the process. Bobbi has had personal and professional Flickr accounts - though when she put a foot picture up, all kinds of weird types appeared out of the woodwork.

Q: Increasingly our identity is online, so management of that will have to be really important. Child identity issues etc. can be raised.
A (Bobbi): I really want to teach a class on that, but it's hard to know where to go with it as it's all very new. And you can't teach your kids/users about online safety without knowing about it yourself.

Q: Our IT department wants us to pick a single platform - one blogging software, for instance - which is very difficult.
A: We're very lucky that our director buys into this, and Robin is our IT manager, so we are in a good position. You may just have to pick one and start with something.

Q: We get students complaining about other students using Facebook etc. on institutional PCs, and we have to say that they may be doing work.
A: You do hear about places where VLEs have worked badly, so people are moving to Facebook, Ning, MySpace etc. There is loads of really creative work going on that is definitely educational - e.g. maths equations on MySpace. The more staff know and understand this stuff, the easier it is to mount a defence.

There is 99% more to do: realising the potential of scientific data

Publishers, says Peter Murray-Rust, are not helping in the dissemination of scholarly research: they are destroying it by locking it up or scrambling the data in bad formats (PDFs), and by failing to publish so much of it. "There is 99% more to do," and if publishers aren't doing it then scientists will do it themselves. They're not interested in the human-readable discourse; they're interested in the data behind it. We're still emulating the way the Victorians shared information, barely including any data, or enabling that which is there to be readily extracted and re-used.

Some publishers do publish the data alongside the articles - reams of additional information which help to prove that the ensuing article is scientifically accurate. In some cases the data is protected by copyright, or firewall barriers, which Murray-Rust considers inappropriate. "This is not a work of art to be copyrighted by the publisher; this is a scientific fact and should be freely available ... We have to get away from this culture of restricting the flow of scientific information."

"Young people", however, are not held back from the future and "have no fear of changing the world". Nick Day's Crystal Eye robot searches for crystallography on the web (this data is usually exposed explicitly, and not subject to copyright, but the data is not well structured). Nick has stored his data in RDF so that it can be mashed up with other data - plotted on a map, for example, to demonstrate the changing geographical balance of research output. "We've got to get into the habit of publishing data, as well as text." When the data is not published, we need to resort to text mining. The Oscar tool (written by Cambridge undergrads) "reads" documents and mines them for specific subjects (e.g. chemistry). But, copyright restrictions on scientific literature limit widescale data mining.

Murray-Rust admits to being "slightly polemic" as he accuses publishers of "being desperate" to prevent literature being more widely opened up. Talis and the Royal Society of Chemistry, on the other hand, he praises for the promulgation of open data and their support for some of the activities discussed. Semantic authoring presents a huge opportunity, but "science will be impoverished" until data is widely, openly available.

IOP's Jerry Cowhig (also chair of STM) tries to redress the balance: it would be one thing if everything scientists did were simply given away to each other, but what publishers contribute to the process does incur costs, and all the giant leaps of recent years (the move to online publishing) have been funded by publishers. With costs of $2,000-3,000 per paper, is it reasonable to expect publishers to then give the content away free? Scientists do acknowledge that publishers' role is important. PM-R, in reply, points to the Wellcome Trust's initiative to make data available and pay for publishing upfront: funder, rather than reader, paying. CERN too is funding SCOAP3 to support upfront open publishing. Publishers will be remunerated, but at the opposite end of the process. It's not an easy transition to make, but it is a viable way forward. JC agrees; a wider debate is necessary, but certainly the NIH model of mandated deposit is "nobody pays" rather than reader- or author-pays.

Semantic Open Data in Scientific Publication - Peter Murray-Rust, University of Cambridge

Peter is using HTML, NOT PowerPoint - "PowerPoint destroys information", and apparently PDF does too.

The view is that the journal article is the means of scholarly communication, and it is very much the case that journals are how your work is recognised. Data is really important though, and as far as Peter is concerned, publishers are a problem in terms of data: they mangle and restrict it.

A graph of carbon dioxide growth in the atmosphere (with no annotations) is up on screen, and Peter is explaining why converting this sort of graph to a paper/electronic-paper format is a very inefficient way to do things. There is 99% more to do with scientific publications. Peter is showing (live) a paper from JoVE which is in the form of video. Unfortunately it's not working just at the moment (maybe a publisher nobbled it?!). The video would give an idea of how scientists look at and present their information (video and data).

At the moment journals contain human-readable discourse (or full text) plus facts and tables, usually only available on a subscription basis. Most scientists want the facts and the tables; the readable content is not as useful. Flicking through the electronic version of Nature you can see that it is impenetrable - we emulate the Victorians with our formatting, abbreviations, references etc. You can copy and paste, but that's all. The hard information needed for reproducing experiments is discarded. Some journals require that information to be retained, which makes for a huge journal, but this data helps ensure that you can see how accurate the work is and judge it better. Some publishers reject this extra information, hide it behind a firewall, cover it in copyright notices etc. Peter contends that this material is not a work of art by the publisher and should be freely available to the world - otherwise you spend your whole time photocopying and measuring this information. This is not fantasy: last year a student posted a graph with 10 data points on her blog and got a legal notice from a publisher.

Peter is showing us how young people - who use social networking sites and have no fear of changing the world - are using technology. The example is a robot built by a student to pick up information from across the web: Nick Day's Crystal Eye robot goes out at night and finds crystallography information from tables of contents etc. It is almost maintenance- and cost-free, though changes are needed when the tables of contents and sources being searched change.

Acta Crystallographica have done great work in creating a rich scientific item. We have to get into the habit of publishing data as well as full text. What Nick Day has done with this data is turn it into RDF and do mashups (a mapping video shows one of these).

Text is important to the web. Robots can help filter through this information in PDFs - you can go through a thesis to find relevant information. The only obstacle to analysing and reusing data is the restrictions imposed by publishers.
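The principle behind text-mining robots like Oscar can be sketched very simply: scan free text for domain terms and pull out the passages that mention them. The term list, sample text and function below are purely illustrative - real chemistry miners use sophisticated, chemistry-aware parsing, not a hand-made keyword set.

```python
# Toy sketch of text mining: find sentences that mention known domain terms.
# CHEMICAL_TERMS is an invented example list, not Oscar's actual vocabulary.
import re

CHEMICAL_TERMS = {"benzene", "ethanol", "caffeine"}

def mine(text):
    """Return (sentence, matched_terms) pairs for sentences naming a known term."""
    hits = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        found = {t for t in CHEMICAL_TERMS if t in sentence.lower()}
        if found:
            hits.append((sentence.strip(), sorted(found)))
    return hits

sample = "The thesis describes caffeine extraction. Ethanol was the solvent. No data followed."
for sentence, terms in mine(sample):
    print(terms, "-", sentence)
```

The point of the sketch is that the mining itself is mechanically easy; as the talk argues, the real barrier is legal access to the literature being mined.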

The NIH mandate requires all research to be publicly available for free. However, you can only read it - you can't reuse it or trawl it with robots - and this is due to the destructive force of the publishers.

At the moment bioscientists spend enormous amounts of time and effort annotating journals etc. Project Prospect is the first step in semantic publication - for instance, it shows chemical compounds when you click on the name. Wouldn't it be great if this happened at the authoring stage? There is a huge opportunity in the semantic authoring of papers.

Open data is crucial to this whole process though. Restricted access helps publishers but cripples science. You also need to capture information on the fly and add it to departmental repositories - projects are ongoing to do this.

Funders should be requiring open data. It should not be held by publishers, and you certainly shouldn't give your data to publishers to sell back to you. Use of scientific CC (Creative Commons) licences will go a long way towards this.

Q (from a publisher): There are costs associated with putting data onto the web and with the infrastructures put in place by publishers. If scientists were just exchanging data among themselves, that would be different.

A: Answering this would get us into a long debate, but I would point to an initiative from the Wellcome Trust (who are actually paying for the system). This is a difficult transition, but there is a model where the funder pays, not the reader. CERN is also investing in open publishing in the same sort of way: the SCOAP3 project looks towards funding of publication.

A (response from questioner): To some extent I agree. The weakness of the NIH system is that effectively no-one pays!

Coineyes at the Mostly Red dinner

Web 3.0: how to help users stop reading the web and get on with their work - Geoffrey Bilder, CrossRef

Geoffrey Bilder started by saying that he absolutely hates the terms Web 2.0 and Web 3.0 - but he is using them regardless!

Geoffrey recalled his UKSG talk from 2004, when he talked about mashups, syndication etc. At the time these changes were already occurring and the ways people were using the web were changing significantly. We are beginning to see this again now - people are beginning to go beyond Web 2.0.

It's important to recall the history of the modern internet, which is comparable only to the invention of the printing press: that was a huge change and a huge explosion of content, but what we take for granted in books and printing took centuries to develop. It took the steam press to get industrial printing on a grand scale, and journals took a long time to come into being. Compare this timeline to a timeline of the web and hypertext: if you compare the two (as Geoffrey has on his slide) you can see we still haven't reached our "Martin Luther moment" just yet. For those who say the internet moves faster, it's worth knowing that a format - the incunabula (partly hand-illuminated to look more friendly) - came out to help people deal with the fact that material produced on the printing press was not user-friendly and people didn't like to read it. We are still uploading print-style documents to the web. These are the modern incunabula.

We are surprised now that people skip around and don't read things through, but there is so much information to read and we do not have the skills to sift through it all. Researchers want to publish as much as possible, but this makes their own reading lives difficult as there is so much available. They want to do research, not just read.

Geoffrey defines the original web as (on the whole) read-only. Web 2.0 is read and write (easily) - trails of bookmarks, blogs etc. Web 2.0 can also be defined as helping researchers help each other to work out what else is out there. Blogs aren't all useful, but some can really help the academic community, and there is now a community trying to create consistent metadata, graphics etc. to help you find scholarly communications. Wikis are a tool for collaborative working and information sharing. Social bookmarking and categorisation tools are great and very powerful: they are easy to use and centrally held, so people can be pointed to those bookmarks (and you can access them from anywhere). [See Geoffrey's bookmarks here: http://del.icio.us/gbilder/]

Compare that to, say, emailing out information: what do you say or include? Deciding who to send things to is fraught with danger - you'll miss people, you'll offend people. Sites like CiteULike are really easy for sharing information in a high-bandwidth way. Video and image sharing can also be scholarly.

This means that web services encourage linking through social software, creating a virtuous circle of information. These services effectively allow you to "subscribe to someone's brain". Let's build tools that help each other find and share information using social network tools.

Web 3.0 is, to summarise: read + write + compute + identity. Identity on the web is a growing issue, but I won't focus on it today.

Compute means that the web will help you automatically work out what will be useful for you. What printed matter does is make you read all the data, then use (human) search-and-retrieve systems, then go back to the data again. This is perverse, and you don't have to do it in Web 3.0. Just using consistent metadata online is hugely useful. Data mining through a computer-digestible form of standardised data could be enormously useful, although publishers don't love this.

But we can do more: semantically enrich your data and treat the web as a database. To do this we use RDF, which I will try to explain - Leigh Dodds recently taught it to his six-year-old son, so it must be possible!

RDF helps treat the web as a database. In a database you have rows - naming or describing a thing - and then columns that define the thing. In RDF this might be: this thing on Amazon has an author; this thing on Wikipedia has a subject. You can then query this structure like a database. The web version of SQL is SPARQL, and you can use it to pull out data just as you'd use SQL on a database. This makes the web much more automatically readable: you take lots of linked pages and treat them as lots of linked databases.

When the book took shape, many things we now take for granted (tables of contents, running headers etc.) had to be invented and developed. Our challenge now is to develop the new apparatus in ways that will help researchers find stuff more easily.

Q&A

Q: Looking at your example for social bookmarking, does the technology exist to switch from journal or author focus to article focus, for use on CiteULike, Nature etc.?
A: The short answer is yes. One of the things I didn't address was identity: finding which username matches a specific person is a problem, and there are privacy issues there too. Hopefully we can move to something where we start from identity and move on.

Q: Many of the social bookmarking tools have poor uptake - few tags per person.
A: Only "propellerheads" are using them, not other folks - that's part of the answer. But things like RSS have only really become useable in the last year or two as browsers have supported it (despite RSS being around for years). In order to use these tools effectively you want to subscribe to stuff, not go out and look for it elsewhere: a classic growth pattern is to have stuff sent out to you.

Q (comeback): Many scientists can't see how this will save them any time: they see it as an additional medium and extra work.
A: What I can say is: try it. What is our most high-bandwidth way of transferring information? Face-to-face meetings and events - but you then follow up with resources. The reason I think this technology is useful is that social networks replicate this high-bandwidth communication on the web. If you can show the value of this technology and how it's productive, people will use it.

Translating Geek to English: exploring the possibilities of the Semantic Web

"I've made a good business of translating Geek to English," says Geoff Bilder (who hates buzzwords and can't remember submitting a paper with the title Web 3.0 - mea culpa, possibly).

Back in 2004, Geoff talked at UKSG about mash-ups, syndication, RSS and FOAF. Those were the balmy days when the term Web 2.0 had not been coined and we could talk about these individual technologies - and let them get on with changing the web - without lumping them together in a faceless buzzword bundle.

We can draw analogies between our current situation and the huge explosion of content that occurred shortly after the invention of the printing press. But if you compare the timelines, we're still in the primitive stages of developing our technology - "we haven't reached our Martin Luther moment". And just as we are uploading facsimiles of printed works onto the web, early modern European printers illuminated their incunabula to make them more palatable to an audience bred on monk-y manuscripts.

But we're uploading masses of this stuff. Too much. Who can read the glut of data that is available - and relevant - to them? Researchers are inundated. "People would really like to try to avoid reading," in order to get on with research rather than background tasks. Web 2.0's "read + write" capabilities help researchers to help each other find what's out there. Blogs are ubiquitous and emerging tools are enabling easier distinction between research-related and other postings. Social bookmarking allows us to share with others, quickly and easily, the information we are interested in. Tagging enables filtering of bookmarks; ultimately it's a process of subscribing to a colleague's brain.

Web 3.0 takes us beyond "read + write" to "read + write + identity + compute": it promises that we don't need to strip data out of published articles (extracting HTML from a print facsimile) and analyse it before stuffing it back in ... we'll create consistent metadata, structure it, share it in easily-computer-digestible (standard) forms and make better use of the content that is out there: it's the semantic web. Storing data in formats such as RDF allows for modelling of relationships between data; metadata encoded in this way allows HTML pages to be queried (using SPARQL) to extract metadata NOT by harvesting and parsing (unreliable, prone to error) but by extracting tagged fields: the page is not only human-readable, but also machine-readable. This machine-compatibility is key to the semantic web and to Web 3.0 (whatever that is). Just as tables of contents, page numbers and many other tools were developed - over centuries - to make printed content more accessible and useful, so we are now developing new tools that make our current content formats more accessible, more useful.

Richard Gedye asks whether the technology exists to track how many times an article is bookmarked across multiple social bookmarking sites (answer: yes) and to drill down and explore who has bookmarked it (yes, theoretically, but there are privacy issues).

Mark Ware has been exploring Geoff's del.icio.us page during the presentation, and has picked up an article entitled "Scientists shun Web 2.0". Connotea has 50,000 users averaging fewer than 10 tags per user; Ginsparg's review of social bookmarking shows low uptake. Why? Answer: We had the same reaction to personal computers, to email, and to many other technologies at this stage of their development. [We're still fairly early on the adoption curve]. Some things like RSS have only really become useable in the last year or so, as browsers become more intelligent. Only when technologies mature and people recognise the value they add will there be good uptake. Mark responds that scientists don't see that value yet - no time is being saved, they think. Geoff says it IS more efficient; don't knock it till you've tried it. Our current means of interacting as a community, and sharing information, is going to conferences and networking with our peers. That's a much higher-bandwidth method than sharing content digitally.

Tuesday, April 08, 2008

First timer at UKSG Conference

As a first timer at the UKSG Conference I was unsure what to expect when I attended the opening of the Conference yesterday morning. Paul Harwood, UKSG Chair, introduced the 31st UKSG Conference, admitting that he had been a bit concerned that people might not come to Torquay in the usual numbers. He needn't have worried though: judging by the 754 delegates at the event this year, it seems people have battled with the snow to get to the new seaside location. In contrast to the 14 hours spent at Gatwick Airport by one of the speakers, Kevin Guthrie, my own 4-hour journey from Paddington Station in London was relatively straightforward.

So what did I learn yesterday? I watched and listened with interest to James Gray’s presentation ‘The Digital Supply Chain’ but was a bit worried when the first few slides contained pictures of warehouses and machines in Nashville. Fortunately Gray, CEO of Ingram, was an enthusiastic speaker and skilfully presented an overview of the complex world of electronic content distribution with a focus on integration. I was interested to hear how Ingram are working with Microsoft, who, he said ‘are trying to catch up with Google’. The Google aim had seemed ambitious a few years back but now everyone is interested in indexing all their content and offering it in new formats. The presentation was interesting and covered a wide range of topics from digital printing to widgets under the digital supply chain umbrella. While it seemed initially like a sales pitch for the variety of products and services offered by Ingram, including barges strangely, it successfully offered an insight into the complexity of electronic content distribution and the opportunities offered by the rapidly changing landscape.

Next up was another interesting and entertaining speaker, Muir Gray, NHS National Knowledge Service. He talked about the way medical evidence is presented and expressed in journals and how there are many flaws in the process of research reporting, peer reviewing and editing. He argued that errors due to chance and incomplete reporting of research leads to an unfair positive bias. He joked that he should publish a ‘Journal of Negative Research’ and mentioned that on a recent visit to Google he was the oldest one there by about 80 years…

He also suggested that researchers need more training and highlighted the growing influence of industry in trials leading to constraints on data made available. He also mentioned his work on reducing the carbon footprint of the NHS and its supply chains. See www.knowledgeintoaction.org for further information.

Kevin Guthrie, Ithaka, talked about sustainability and, like the first two speakers, mentioned Google in his presentation. He said that we are now all involved in the academic enterprise and that we use the same tool (i.e. Google) for very different searches.

He spoke of the speed of growth and innovation, which now makes today's value-added feature (which previously would have given an individual or organisation competitive advantage for years) tomorrow's commodity.

He used the newspaper industry as an example and warning of the possible future of all publishing. Newspapers successfully sold advertising, in particular classifieds, and this revenue plus subscriptions ensured that the market was buoyant for a long time. Now, the market for newspapers is shrinking with the growth of the web and print advertising revenue has slumped and even online advertising has slowed down. Some argue that with a vast increase in the different ways we can access news (and be sold to) the newspaper industry is in permanent decline. He argued that the traditional insulators for scholarly publishers were no longer available and that more focus on the end user was the key to survival for the publishing industry. At least I think that's what he argued but with the low lighting and my poor notetaking I could be mistaken...

Grand Champagne

I've been advised that there is a distinct lack of tabloid-type news stories on this blog so far, so here's a little snippet I heard yesterday.

Sunday night at the Grand Hotel saw an early start to the social part of the programme. Several publisher and vendor representatives kept bar staff busy with frequent purchasing of prosecco until the wee hours, some reportedly lasting until after 4am. The group were sustained by the Noel Coward-esque performance of Bowker's Darren Roberts at the bar's grand piano.

The festivities culminated in a certain eBooks supplier running off with the cardboard cut-out of a man from the hotel lobby, at which time it was decided to call it a night... er, morning.

Expanding access to serials and other holdings through Faceted Browse

James Mouw's breakout session gave a fascinating overview of how the University of Chicago's library has launched a new discovery tool that they are calling Lens.

By way of background, he explained that Chicago has 9.5 million volumes in its main library (one of the largest collections under one roof). The library has had multiple interfaces for accessing its various resources, including a clunky catalogue and an underused and somewhat unusable CrossSearch system.

Lens seeks to bring all of the library's resources together and allow users to search and browse across the entire collection in an intuitive manner. The library's procurement process resulted in the choice of MediaLab's AquaBrowser: a system widely used in public libraries but less known in the academic community. James explained how the system was adapted for the academic library, had the initial features required by Chicago added, and had the library's main records (5.3 million MARC records, 58,000 e-journal records) loaded ready for a beta launch - all in a rather speedy four months.

It's worth going to take a look at Lens for yourself to see the searching and browsing features it has to offer. A search from the homepage returns the expected relevance-ranked results, but also gives you a word cloud of related terms and the option to refine your search in all kinds of ways including author, genre, date, format, availability, topic or library call number. A nifty breadcrumb feature reminds you how you've refined your search, but also lets you remove or "lock in" any of your refinements for the rest of your session.

External resources are now being added to Lens, and the next phase of development will see metasearch added. James and his team are looking at the use of the system to help determine its development, and have already found some interesting trends: users rarely click on to the next page of search results; the most commonly used refinement is by year. User testing has shown positive results, with seasoned researchers discovering new content in their disciplines that they didn't previously know was available.
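The refine-and-breadcrumb behaviour described above boils down to two ideas: each refinement filters the current result set by a field value, and a breadcrumb list records each step so it can be displayed or undone. The sketch below illustrates that logic with invented records and field names - it is not Lens's actual schema or code.

```python
# Minimal sketch of faceted refinement with a breadcrumb trail.
# Records and field names are illustrative only.
records = [
    {"title": "Origin of Species", "author": "Darwin", "format": "book", "year": 1859},
    {"title": "Nature vol. 1", "author": None, "format": "journal", "year": 1869},
    {"title": "Voyage of the Beagle", "author": "Darwin", "format": "book", "year": 1839},
]

def refine(results, breadcrumbs, field, value):
    """Narrow results by one facet and record the step as a breadcrumb."""
    breadcrumbs.append((field, value))
    return [r for r in results if r[field] == value], breadcrumbs

# Refine twice, as a user clicking two facets would:
results, crumbs = refine(records, [], "author", "Darwin")
results, crumbs = refine(results, crumbs, "format", "book")
print([r["title"] for r in results])  # → ['Origin of Species', 'Voyage of the Beagle']
print(crumbs)                         # → [('author', 'Darwin'), ('format', 'book')]
```

Removing a breadcrumb simply means dropping that step and re-running the remaining refinements from the full record set, which is what makes the "remove or lock in" interaction cheap to offer.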

Large-scale digitisation: the £22 million JISC programme and the role of libraries - Jean Sykes, London School of Economics and Political Science

The JISC digitisation programme came out of a HEFCE £10 million underspend in 2004. An advisory group was set up and proposals considered. Initially, extant proposals were considered, but a community consultation exercise was also undertaken, leading to a unique open community ranking exercise. This resulted in a mix of extant and new proposals.

6 projects funded:

18th Century British Parliamentary Papers - all existing 18th century papers (over a million pages) digitised using the first robotic scanner in the UK.

Medical Journal Backfiles - joint project with the Wellcome Trust

Online Historical Population Reports - 200,000 records

British Library Archival Sound Recordings - a panel of users helped select items to archive to around 3000 hours.

British Newspapers 1800-1900 - expert panel selected items

Newsfilm Online - this is the only phase 1 project not to have been launched yet. This has been one of the most difficult projects but will bring ITN and Reuters news archives to the web.

Phase 1 of this programme has already been one of the largest digitisation projects in Europe, with an advanced, integrated programme covering a broad range of formats and subjects. IPR frameworks have come out of these projects (especially Newsfilm Online). These projects are exemplars for increased interoperability, including open source. This whole series of work has also helped a JISC digitisation strategy to emerge.

Lessons have already been learned as the project heads into phase 2:

User consultation ("do it!" is the main lesson), procurement, metadata (very important from the start)

IPR, quality assurance, indexing

Project management, risk register

User interface is very important.

Evaluation must take place throughout the project (capturing lessons learned as you go)

In phase 2 there are 16 new projects (out of 48 proposals, of which half were long-listed). The guidelines and criteria for phase 2 draw on experience from phase 1. Again, community ranking was used alongside expert ranking.

Phase 2 Projects

19th Century Pamphlets Online - Polemical voices from the past on the great debates of the 19th century - phase 1 (1 million images from 30,000 pamphlets)

A Digital Library of Core e-Resources on Ireland - Visit a one stop shop for Irish studies e-resources

Archival Sound Recordings 2 - A critical mass of rich audio material from all over the world, at your fingertips

British Governance in the 20th century – Cabinet Papers, 1914-1975 - In its own words: the British government at peace and war

British Cartoon Archive Digitisation (BCAD) project - Browse the largest online archive of cartoons in the UK

British Newspapers 1620-1900 - Read the first three centuries of newspapers from all regions of the British Isles

Digitisation of the Independent Radio News Archive - From Callaghan to Thatcher, a contemporary audio archive from the only UK radio news archive outside the BBC (led by Bournemouth University) - 7000 reel to reel tapes

The East London Theatre Archive - Putting the spotlight on East End music hall heritage

Electronic Ephemera – Digitised Selections from the John Johnson Collection - Discover hidden treasures of everyday life from the 16th century to the 20th

First World War Poetry Digital Archive - Preserving and sharing memories of the Great War through the words of its poets (led by University of Oxford)

In View: Moving Images in the Public Sphere - Watch the key social, political and economic issues of our time unfold (led by British Film Institute)

Pre-Raphaelite Resource Site - Trace a movement that changed the face of English art

UK Theses Digitisation project - Opening access to over 5,000 of the most popular British research theses

Welsh Journals Online - Free online access to the best Welsh periodicals – past, present and future (co-funded with the Higher Education Funding Council for Wales, led by the National Library of Wales)

All phase 2 projects are underway (started 2007) and will be completed in 2009 - very tight timings. There are no detailed usage stats as yet for some of the phase 1 projects, so a wider impact analysis will be used in the next bid to HEFCE. This will take evidence from the July 2007 conference (held in Cardiff) and also from the phase 1 and phase 2 evaluations. Government initiatives and strategies also fit with the type of request to be made.

Phase 3

Underspend from phase 1 will be used to fund some initial thinking. This will include an updated JISC standards catalogue, as well as a gap analysis of what the community needs against the availability in the community of relevant large or significant collections (outside HE as well as within it). Some of the underspend will also go towards capturing the IPR guidelines formulated during the programme. Investigation will also take place into the development of thematic portals to make resources more comparable and usable, possibly extended to cover JISC Collections - it could be great if these were combined, cross-searchable etc. The group also thinks a UK Forum for Digitisation should be established.

Wider landscape and implication for libraries

All Phase 1 and 2 projects will be free at the point of use to UK, HE and FE (via authentication)

Some will also be available to schools and public libraries (via authentication)

A few will be on completely open access

Thematic portals will be a great step forward for enhancing user experience

Major challenges re: access and sustainability

Even finished/finite projects will need to develop to stay current

Many projects will want to continue adding content

Programme may be able to attract income from wider access overseas

Ultimately librarians will probably have to be prepared to pay for licensed access (as with JISC collections)

Jean Sykes concluded that the programme has put the UK at the forefront of global digitisation. HE and FE users will benefit from the phase 1 projects already available, but librarians must be ready to promote - and at some point subscribe to - this type of content.

Q: how do you decide which resources are open access and which require authentication?

A: generally this is decided by the owner of the content, not by JISC.

When HEFCE underspends: a £22 million JISC digitisation project

In 2004, a £10 million HEFCE underspend [crikey Moses!] resulted in a windfall for JISC: Jean Sykes recounts being told to "spend this in 2-3 years on large scale digitisation projects, please."

JISC reviewed a list of extant proposals for content digitisation, but considered it important to consult the community and bring new bids to the table. 6 major projects were selected for Phase 1 - the largest digitisation activity in Europe - ranging from 18th century British parliamentary papers to British Library archival sound recordings. [Was it these chaps who had Charlotte Green in stitches last week?] The latter group set up a user panel to help decide which of the masses of recordings in the archive should be prioritised for digitisation.

Standards had to be agreed across all projects, and multimedia in particular presented a variety of obstacles. But from this, a JISC digitisation strategy is emerging. Lessons were learned:

user consultation (do it - and get some experts in)

procurement (technical and commercial issues)

metadata (metadata, metadata - importance cannot be overstated - build it in from the outset)

quality assurance and evaluation throughout the project

impact assessment (an increasingly big deal - projects now need to build in licences and metrics from the start)

project management - and capturing of lessons learned

interface accessibility

promotion of the finished service.

Phase 2 covers another 16 projects with a further £12m funding from JISC (big up those crazy HEFCE underspends!). Seven thousand reel to reels! Four thousand hours of recordings! Fifteen thousand Giles cartoons! Three thousand high quality Pre-Raphaelite images! Fifteen thousand theatrical objects! Half a million pages of Cabinet Papers! Over one million pages from national, regional and local newspapers! Five thousand university theses! Great War poetry and contextual archive material! [Apologies for all that terribly unliterary exclamation, but really, the breadth and scale of this stuff is staggering - did I already say Three cheers for HEFCE underspends?] Phase 1 and 2 projects will be free at the point of use to UK HE and FE, and some to schools and public libraries.

And now they're already preparing for Phase 3 (and here was I thinking Phase 3s are merely the product of an over-optimistic imagination). Work is underway to assess impact/usage of Phase 1 projects, which unfortunately did not have statistics built in from the outset so some qualitative indicators will need to be used. A gap analysis will be conducted to assess the community's needs, and the development of thematic portals will be investigated to make resources more comparable and usable (these could be extended to cover JISC collections, too). Future sustainability remains a big challenge - keeping digitised content accessible; migrating it to future formats and platforms; updating collections with new content. Ultimately, librarians may need to be prepared to subscribe to this content to ensure its preservation.

Mass digitisation of historical records for access and preservation - Dan Jones, Head of Business Development, National Archives

The National Archives holds 175km of shelving of government records, but increasingly it operates as a digital archive. For each physical document delivered, 100 are delivered online. Over 60 million documents are already available electronically, and the National Archives is at the heart of Government policy on information.

Use technology to add value through indexing, contextualisation, search etc.

They also use the web to add quality to physical site visits, and to segment their stakeholders well

Only wholesale digitisation works for this model.

The National Archives is in competition with, and affected by, the likes of Apple, Google, broadband uptake and web 2.0 (wikis/blogs and the "wisdom of crowds"), as well as the emergence of specialist providers (especially genealogical and military-history providers in the National Archives' case). Thus users want everything now, everywhere, for free.

However, in reality the scale of the collection is vast: over 100 million catalogue entries. The real cost of digitising the whole collection would be around £5 billion. This shows the importance of strategic partnerships with the public and private sectors. The scale may be vast, but attempts must be made to begin this work.

Models of Digitisation

The National Archives exploits a "mixed economy" to develop these access services, with work being internally funded, commercially funded or grant funded.

Services can be free at the point of use or paid for, along the lines of agreed stakeholder segmentation. To address different needs, different solutions are needed:

Strategic partnerships - consistent, repeat, high volume demand

Internal delivery - more manageable resources - specific one-off items, e.g. the Domesday Book

Digital express - you can request digitisation on demand if you can find what you want. But the catalogue is often not at item level.

Strategic Partners - Awarding contracts

Avoid costly time consuming services contracts and tenders

Requirements are "output driven" rather than "activity driven" - specifying the what rather than the how, allowing innovation and flexibility

1911 Census

Scanning takes place on an enormous scale, working with Scotland Online: it's over 0.5 petabytes of data. It is also very commercially attractive, so additional services have been built in - academic, schools, statistical analysis etc. will roll out sequentially, as well as a service for home users. Launches 2009.

Dan outlined the advantages and disadvantages of strategic partnerships with commercial partners: the financial risk is on the commercial partner, access is maximised, data can be re-used in the knowledge economy, and many products can be developed at once; but there is potential loss of control and potential divergence of the parties' interests, the agenda has to be agreed, the user journey can be fragmented, and you do need to invest a lot of money and time up front to approve and develop processes.

Organisational Impact

This type of approach means a sea change in attitude: enabling rather than providing services. You entrust the resources to third parties to preserve. There can be a drain on training resource, supervision etc. You don't spend fewer resources, but you do apply those resources differently.

Cabinet Papers 1916-1976 is an internally delivered project, funded by JISC and delivered via DocumentsOnline - a big project that will launch in 2009.

The National Archives is improving search substantially, to cross-search all its databases and present results more intuitively. It is also using a wiki (Your Archives) to allow individuals and experts to exchange ideas and information, recognising the high expertise of its users.

Future Challenges

The National Archives will continue to digitise collections, but also needs to look to new markets, new technologies and new partners (e.g. maps are a rich resource, but not in high demand at the moment).

Provide expertise online

Continue to develop and apply customer insight tools - Facebook etc. change all the time and we must be able to develop all the time if we want to deliver services well.

Financial sustainability is key to all projects and programmes - the answer may lie in developing cost-effective platforms for delivery, but it's not a simple question by any means.

The impact of the digital archive on National Archives use has been immense: over 80 million documents were delivered digitally in 2007, and the growth has been huge and continuing. 81.5% of users are satisfied or very satisfied with online services (surveyed 2007), and 95% of users are satisfied with the onsite experience. There is global reach, and access is being maximised.

Maximising access to, and understanding of, major archives

Dan Jones owns the Domesday Book.

Well, not quite, but it is housed in the National Archives, where he works. The Domesday Book is just one of the 60 million documents available for immediate electronic download (cripes!). Their approach is driven by changing user behaviour (increasing web literacy and expectations) and the pervasiveness of high bandwidth broadband. But in digitising their archives they must contend with over 175km of shelving and over 10 million catalogue entries; Dan's "back of the fag packet" estimate of costs to digitise all this data is over £5 billion (double cripes!).

Models of digitisation

The Archives' digitisation activities are funded from internal budgets, commercial investment and grants. Segmentation of the target markets [and presumably funders' mandates?] informs decisions about which services are charged for, and which are free at the point of use.

Strategic partnerships

Content is digitised in different ways depending on demand: strategic partners are contracted for high-demand items, and digital assets, once created, are non-exclusive - i.e. available for repurposing within other services. One current project is the 1911 census. 5 scanners are running round the clock to create 40,000 images per day; these are QA'd and transcribed in the Philippines, enabling details of over 35 million individuals to be comprehensively searched. The data will lend itself to use by genealogists, academics, schools and for statistical analysis. The strategic partnership through which the project is operated minimises the risk for the National Archives and allows them (as a facilitator) to simultaneously carry out other work. But there is a potential for the partners' interests to diverge, and the project's agenda has to be balanced to represent the interests of a broader stakeholder group.

Internal delivery

The JISC-funded project to digitise Cabinet Papers from 1916-75 is complex - the papers are handwritten, and don't lend themselves well to digitisation.

Providing context

Autonomy search has been deployed to provide an integrated search function across all databases and websites. Newer archives have been loaded into a wiki-based resource which allows individual experts to contribute their ideas and information; "some of our users are far more expert in particular areas of these holdings than we are ourselves".

Added Value: subscription agents vs publishers

Agents, publishers and added value: how librarians view the performance of subscription agents and journal publishers

Jill Emery, U Texas, Rick Anderson, U Utah

Publishers challenged the value of subscription agents at a US conference and so Rick and Jill got a group together to investigate further: are librarians getting the same service from the big publishers as they would from the subscription agents? Areas covered included customer service overall, management of title list, accuracy of renewal and invoicing, timeliness of renewal and invoicing, administrative metadata, technological services, accurate pricing, correct initial access activation and resolution of access problems.

A total of 179 responses was filtered down to librarians working at university libraries, leaving 77 final responses - 90% had the Elsevier package, and the six main publishers were all covered.

Results show that librarians feel that subscription agents provide much better service. Interestingly there was a 50/50 split on whether librarians go via their subs agents to resolve access issues or go direct to the publishers.

Some of the results took Jill and Rick by surprise. On questions about satisfaction with customer service, subscription agents did better than publishers - of course, that's where their money is - but only about 19% better. Why are agents better at resolving access problems than the publishers themselves? Only a 4% difference, but it is an extra layer to go through, so this seems odd. And if publishers can't produce accurate renewals, who can? Agents handled the nuances of special deals better than the publishers themselves.

"Lack of management" at the publisher is a recurring theme, but there was some criticism of the agents too, so it's not clear-cut. At the Charleston conference (www.katina.info), feedback included lots of support for publishers and vocal discontent with the subscription agents - so there is room for improvement for both parties.

Concluding thoughts:

in terms of Big Deals, things won't get much worse for agents - most of those deals have been done (although coverage of the major publishers ranged from around 40% for T&F to 90% for Elsevier, so room for growth here?)

however, if agents improve their handling of Big Deals, things could gradually get worse for publishers (who may prefer that libraries go direct)

The subscription list findings should be a wake-up call for publishers to get their acts together

Agents are so much better at back office functions and dealing with administrative data; should publishers get out of the subscription management business altogether? They could cede it to a third party - not necessarily an agent - but publishers could consider outsourcing print administration to the agents, stop competing on that service, and focus on the areas where they have competitive advantage.

Q: isn't there a difference between the agents themselves in terms of quality?

A: they didn't specify agents by name, but did for publishers - they will re-do the survey and test differentiation between agents, although most respondents are only dealing with one agent, so there is no comparator there.

Q: most publishers are "relatively" agnostic on whether libraries buy through agents or not - it's just one large publisher that wants them to go direct, and for consortial deals it is difficult for one subscription agent to manage everything. The questioner would love to give up the back office function of this. An EBSCO respondent doesn't think this is the case, and thinks they work successfully with Swets on joint management.

Q: fundamentally disagree with the idea of outsourcing service - this is the core of an organisation; look at it the other way around and question how efficient libraries and agents are at sending us the money.

A: do the survey of publishers and agents to question their perception of how libraries and agents..

Q: was the survey based entirely on e-journals? The service has got worse since the move to e-only and Big Deals, in terms of accurate subscription lists.

A: the print world was very straightforward; now we are in the 4th dimension, making life harder for everybody on both sides.

Q: do you have geographical information on who answered?

A: all US libraries.

Q: you focus on the bigger publishers and agents - how about publishers who don't have the resources? What is the perception of that group, who may be less able to?

A: the problem is getting a representative sample to be able to generalise out to all smaller publishers - use ALPSP and SSP.

Q: there are significant differences between the publishers on the survey questions - two or three seem to have got it reasonably right, while the remainder vary radically: those with most market penetration are not necessarily the ones that are good at the basics, such as getting a correct invoice out. And if the biggest players can't get it right..

Ken Chad started this session by establishing how many in the audience were from libraries and how many from vendors: most were from libraries.

Ken has been doing some work with JISC on the current status of library automation and the growing open source movement in the UK. All attendees were also given a recent article from the CILIP Library + Information Gazette.

Ken wanted to talk about the library systems market as it is, then look at issues raised at the conference, then at open source and its place in the library systems market.

Section 1: The Current Situation

The library function is big business: "...organize the world's information and make it universally accessible and useful..." - Google's mission statement (Ken sees this as clearly a library statement!)

Conventional libraries have competition: Google has huge revenues and over 1m digitised books (it built its library in reverse - search first, then collections); Amazon; AbeBooks; LibraryThing (over 20m books catalogued - "Who would have thought? MARC records used to build a social networking site?!")

After 25 years, the library automation market has matured. At this point basically all systems are the same - the vendors have said as much. It's all a matter of subtle differentiation.

In response to the mature market, the LMS vendors are changing their ownership (indeed almost every one has in the last 2 years) - Ken gave examples from 2005 up to today. NB: in the USA, OCLC (which bought Fretwell-Downing in 2005) is seen as a library vendor, but in the UK it is seen as a commercial enterprise - this is Europe, not the US. Bowker (CIG) now owns Medialab (AquaBrowser)... lots of ownership changes.

The role of private equity is important: it's characteristic of a mature market. Private equity forces companies to be leaner, more effective and growing; investment is only in opportunity. One of the reasons is that LMS vendors have been under-performing (especially some of the cooperatives) and therefore not reinvesting in new product. Ownership will change again as current owners sell on in a few years.

Consolidation is also taking place on a big scale - in UK HE, four companies (Talis, SirsiDynix, Innovative (the most profitable, but with the fewest clients) and Ex Libris) control 90% of the market. In some countries fewer companies hold a bigger share - in Sweden, Ex Libris is the only LMS vendor!

These companies can be characterised as doing vertical search: Encore, Primo, WorldCat Local etc. compete with Google by being more relevant to users. Increasingly, products are more interoperable with other systems. Vendors are also working in new markets - the Middle East, China etc.

Aggregation - a move to SaaS (software accessed over the web) and 'platforms' (a platform is an extension of brand - e.g. Amazon, AbeBooks, Google, OCLC etc.)

Value lies in "context" and "intentional data" (clickstreams), especially in HE - e.g. Amazon gets more useful as you use it more, because it builds up more context and value for you. Libraries don't do this - but could! They sit on a goldmine of data on what's needed, but they don't use it. (Herbert mentioned that CSU are trying a pilot recommender service based on resolver stats.)
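To make the "goldmine of data" point concrete, here is a toy sketch of how resolver clickstream data could drive a recommender. This is purely illustrative - the session names, item identifiers and functions below are invented, and this is not how the CSU pilot was actually implemented; it just shows the basic co-occurrence idea ("people who requested this item also requested...").

```python
from collections import defaultdict
from itertools import permutations

def build_recommender(sessions):
    """Count how often each pair of items is requested in the same resolver session."""
    co_counts = defaultdict(lambda: defaultdict(int))
    for items in sessions:
        # every ordered pair of distinct items in a session counts once
        for a, b in permutations(set(items), 2):
            co_counts[a][b] += 1
    return co_counts

def recommend(co_counts, item, top_n=3):
    """Return the items most often co-requested with `item`, most frequent first."""
    ranked = sorted(co_counts[item].items(), key=lambda kv: -kv[1])
    return [other for other, _ in ranked[:top_n]]

# Hypothetical resolver log: each list is one user session's requested items.
sessions = [
    ["doi:A", "doi:B", "doi:C"],
    ["doi:A", "doi:B"],
    ["doi:B", "doi:C"],
]
cc = build_recommender(sessions)
print(recommend(cc, "doi:A"))  # doi:B co-occurs twice with doi:A, doi:C once
```

In a real service the "sessions" would come from link-resolver logs, and recency weighting and popularity normalisation would matter, but the principle - libraries already hold the usage data needed for this - is the same.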

Universal (Uniform) Resource Management - one system for print and electronic resources etc. ("Ex Libris is talking most about this at the moment")

However is the market failing?

In the US this became the "OPAC sucks" debate, describing why products were not good or competitive. The concept of market failure is crucial to supporting open source - see the 2006 report on open source funded by the Andrew W. Mellon Foundation.

Duke University's Openlib project wants to make a bid to the Andrew W. Mellon Foundation to design an open source LMS.

Section 2: Emerging Trends and Technology

There is a real change in technology, and we need a new strategic approach. Ken recommends the work of Yochai Benkler, Professor of Law at Yale Law School. Benkler talks about the networked information economy: there is a new form of production - social production. The classic example is Linux, built for free! The net can galvanize social production.

"Technology is unleashing a capacity for speaking that before was suppressed by economic constraint. Now people can speak in lots of ways they never before could have, because the economic opportunity was denied to them" - Mother Jones Magazine (website) - interview with Lawrence Lessig

Charles Leadbeater, of the think tank Demos, talks of the rise of the "Pro-Am" - not amateurish, but in the spirit of amateurs. You hear a lot about "The Cathedral and the Bazaar" - the idea that creativity is not only for "special people".

Open access is a hugely different business model. Bands such as Radiohead and The Crimea gave their albums away for free or allowed remixing, but make their money from merchandising. Publishers are not happy with this trend, although scholarly articles are a slightly different ballgame.

Open source software can be the backing of a commercial business - this new business model is possible. See also Charles Leadbeater's (free) e-text: "We Think. The Future is Us" - Profile Books Ltd, 2008.

Breakout Session

Comment: The RadioHead example is interesting: 80% of people paid for the download. Also some restaurants charging what they think it's worth still make money.

KC (Ken Chad): pricing can work on that model - people do have an innate sense of value, so they will pay for things

Comment: open source is written by people you don't know, and they may not be there tomorrow. How do you ensure it's sustainable?

Comment (from vendor!): How sustainable is it when people leave companies all the time though!

KC: sustainability is important. The collaborative nature of open source is crucial. Peer-to-peer collaboration generally means that being modular and having open standards is a characteristic of much open source software. Some open source products have a maintenance agency (e.g. Linux), which deals with some of those issues.

KC defines open source as free like kittens rather than free like beer: the kittens need looking after, but they are still free.

Open source licensing denies anybody the right to exclusively exploit the work. Conventional licensing, by contrast, means:

I cannot copy the work (beyond single personal use)

I cannot make derivatives of the work

I cannot authorize anyone else to do anything with the work

Almost everyone uses open source software in critical applications (e.g. Apache). Mozilla had problems with Netscape's historical licensing (e.g. many elements of something you own may be licensed in ways in which you can't reuse them).

Benefits of Open Source

Bug fixing - "given enough eyeballs, all bugs are shallow"

Security - you find problems far faster!

Customization - much easier to do; you can pay for, or develop in-house, the add-ons you need, making use of the social economy.

Translation - Wikipedia is an example of this, again use of social economy

Avoiding vendor lock-in - formats tend to be standard and interoperable by design. It is also much cheaper to change systems if full retraining and new formats aren't necessary.

Mitigation of vendor/product collapse - code isn't owned

Being part of the community - adopting open source software is almost on "theological grounds": community support and spirit (e.g. Koha).

Impact of OSS

Big business now - adds 263 billion Euros to the European economy

OSS programmers (many in Europe) volunteer at least 800 million Euros' worth of labour

Hugely important to Africa, Asia, developing areas etc.

Discussion

One delegate, from Addis Ababa, commented that his library uses Koha. It has worked well, although in-house work has been required (this has not been expensive to do).

Comment (from vendor): the Royal Homeopathic Library is using open source, and the whole system - including hardware - cost £7,000, and it looks super. They have a contract with a support agent.

"Open Source is about distributed innovation and will become the dominant way of producing software"

OSS LMS Products

Koha

Evergreen

Emilda

Openbiblio

PMB

UNESCO now has a portal for OSS which is worth seeing, and which includes LMS products. New companies are coming along to support and develop OSS.

Libraries are now migrating from commercial LMSs to OSS. However, OSS makes the procurement process tricky: how do you demo software or calculate a price for OSS? It's a real challenge for those processes. Support and development services will be what you need to procure. This prompted discussion of how this works - effectively you bypass some of the usual procurement process in organisations.

Richard Gorman, venture partner at Bay Partners, said in 2007 that it's very difficult as a commercial software vendor to compete with an OSS company - by the time you adjust, it may be too late.

Thus OSS challenges the business model by which LMS vendors operate. Some commercial LMS vendors deliver OSS for libraries - e.g. supporting OSS products from others (VTLS) or delivering their own OSS products (e.g. Talis) - but it's noticeable that these are smaller vendors. Maybe the larger vendors are unconcerned while there are still young markets to sell products into.

Concluding thoughts

Technology, cost and complexity barriers are coming down - hugely important

More open source and open data components and products

Pro-ams in the library sector

An increasing contribution from non "traditional" LMS companies

Copyright clashes will continue - OS is huge threat to IPR

Libraries are at the heart of the wider culture and technology debate.

Further Discussion

A delegate added that OSS really makes you re-examine what the right tool for the job is, and what your role actually is now - librarian, tour guide, etc.

Another delegate responded that some organisations must work within the bounds of their own systems capabilities. Sometimes you just want a systems supplier who can deal with all the detail for you.

KC: yes, that's why Care Affiliates and LibLime have appeared. It's a very significant change - it takes OS out of the "wacky" category and into the mainstream. You need to think of the LMS as a tool: is OS the best way of building and developing software?

Another delegate asked whether OS was used in other contexts: is it used for payroll?

KC: this is a fair argument, but OS is used in VLEs and certainly in other contexts, if not banking and payroll.

Another delegate added that if a product goes wrong you could sue the vendor if needed, though in practice this does not happen very often.

For another delegate, it's important that OS allows much faster and more customised development, more cheaply.

There was further discussion that you can never please everyone! One delegate suggested that you give away the backend and build your own interface - they now use Facebook as their system.

A delegate from Imperial College London is offering Olivia (Online Virtual Information Assistant) for free - a commercial vendor would be charging thousands (Bradford have rolled this out as "Lollypop"). They have produced podcasts and video as resources, but the college is wary of uploading to YouTube!

LiveSerials is operated by the UKSG, a non-profit organisation that connects the information community. It provides realtime coverage of news from the information industry, including reports from the UKSG's Annual Conferences held each April.