Archive for the 'Folksonomy' Category

You’ve got to hand it to those Google guys for coming up with out-of-the-box thinking.

Take Google Image Labeler, for instance. The worst thing about this latest Beta from the World Domination stable of ideas is the name, as John Battelle points out.

As John also points out, what Google call labels the rest of the planet know as tags.

I just wish Google would use the terminology the rest of the web has already settled upon. It’s not a label. It’s a tag. “Tag” means something – an intentional attribute given to an object on the web. That’s what we are doing here. How about we help Google come up with a new name?

So what is it then? It is two things:

An addictive bit of simple fun. You are randomly partnered with someone else, and the two of you then have 90 seconds to agree on at least one label for each of the images [from within Google Image Search] you are presented with. If you both enter the same label, you gain 100 points and another image is presented.

An ideal bit of fun to dip into for a few minutes the next time you fill your coffee cup. Be warned though: you may well still be playing it as you finally drain the cup!

An innovative way of building up folksonomy around the images that Google reference. By harnessing people's natural addiction to this sort of game [as of this moment someone named eGrunt has amassed the staggering total of 1,324,400 points – does this person ever sleep?], they are rapidly building up a human-validated set of search tags for their images – all for free. At the moment there does not seem to be any value, other than kudos, attached to the points gained.
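The game mechanic described above is simple enough to sketch in a few lines. This is purely illustrative: the function name, the case-insensitive normalisation, and the scoring shape are my assumptions, not Google's actual implementation – all the source tells us is that an agreed label earns the pair 100 points.

```python
def match_labels(player_a: set[str], player_b: set[str]) -> tuple[int, set[str]]:
    """Return the points earned and the agreed labels for one image.

    Both players' labels are normalised (lower-cased, stripped) before
    comparison; any overlap counts as agreement and scores 100 points,
    as in the game described above.
    """
    agreed = {label.lower().strip() for label in player_a} & \
             {label.lower().strip() for label in player_b}
    points = 100 if agreed else 0
    return points, agreed


# Two strangers, 90 seconds, one image:
points, agreed = match_labels({"dog", "puppy"}, {"Puppy", "animal"})
```

The interesting design point is that agreement between two independent strangers is what validates a tag – a cheap, human quality filter.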

Google, like many of us who have tried to find relevant images from their Image Search, have identified that just scouring the page [that contains an image] for relevant keywords is not as useful as you would expect in cataloguing the image itself.

One unique advantage Google have in launching such an initiative is their global reach. They launch a new Beta; within hours the Google watchers blog about it, and within a day or so thousands are playing with it.

Would something like this work for tagging your dusty collection? Probably not, as most players would grow old waiting for a partner. But how long before a Google Book Search version appears? In which case the question will be: will Google see this as more secret sauce, or will they provide an open API to it?

“With all due respect to the author, we remain unsure how to categorize this particular work,” said the chair of OCLC’s Editorial Policy Committee.

I bet the social taggers, building up folksonomies, don’t have the same problems. To be fair though, they are not trying to shoe-horn the book into a rigid classification system – mind you, isn’t that the point?

Wikicat’s basic premise is to become the bibliographic catalog used by the Wikicite and WikiTextrose projects. The Wikicite project recognizes that “A fact is only as reliable as the ability to source that fact, and the ability to weigh carefully that source”, and because of this the need to cite sources is recognized in the Wikipedia community standards. WikiTextrose is a project to analyze relationships between texts and is “inspired by long-established theories in the field of citation analysis”.

In simple terms, the Wikicat project is attempting to assemble a bibliographic database [yes, another one] of all the bibliographic works cited in Wikimedia pages.

It is going to do this initially by harvesting records via Z39.50 from other catalogues such as the Library of Congress, the National Library of Medicine, and others as they are added to their List of Wikicat OPAC Targets. Then, when a citation that includes a recognizable identifier such as an ISBN or LOC number is included in a page, the authoritative bibliographic record can be used to create a ‘correct’ citation. Eventually the act of citing a previously unknown [to Wikicat] work should automatically help to populate the Wikicat catalogue – participative cataloguing without needing to use the word folksonomy!
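The lookup-then-populate flow described above can be sketched very simply. This is a toy model, not Wikicat's code: the function names, the in-memory catalogue, and the stand-in for the Z39.50 harvest are all my inventions to illustrate the shape of the idea.

```python
# Hypothetical sketch: resolve a citation by ISBN from a local catalogue,
# falling back to harvesting from an external target for unknown works.
local_catalogue: dict[str, dict] = {}  # ISBN -> bibliographic record


def harvest_from_target(isbn: str) -> dict:
    """Stand-in for a Z39.50 query against an external OPAC target."""
    return {"isbn": isbn, "title": "(record harvested from remote target)"}


def cite(isbn: str) -> dict:
    """Return the authoritative record for a citation's ISBN.

    Citing a previously unknown work populates the local catalogue,
    so the next citation of the same work is served locally.
    """
    record = local_catalogue.get(isbn)
    if record is None:
        record = harvest_from_target(isbn)
        local_catalogue[isbn] = record
    return record


first = cite("0-123-45678-9")   # harvested from the remote target, then cached
second = cite("0-123-45678-9")  # served from the local catalogue
```

The act of citing is itself the act of cataloguing – which is exactly the participative trick the paragraph above describes.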

Putting aside the tempting discussion about whether a Z39.50 target can truly be described as an OPAC, the thing that is different about this cataloguing project is not what they are attempting to achieve but how they are going about it, as the Wikicat home page makes clear.

Reading more it is clear that once the initial objective of creating an automatic lookup of bibliographic records to create citations has been achieved, this could become a far more general open participative cataloguing project, complete with its own cataloguing rules managed by the WikiProject Librarians.

Because they are starting with FRBR at the core of the project, the quality, authority and granularity of the relationships between bibliographic entities could potentially be of the highest order. This could lead to many benefits for the bibliographic community, not least a wikiXisbn service [my name] that is ‘better’ than OCLC’s xISBN.
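The essence of an xISBN-style service is that FRBR groups many manifestations (ISBNs) under one work, so given any one ISBN you can return its siblings. A minimal sketch, with entirely invented identifiers and function names (neither OCLC's nor Wikicat's API):

```python
# Two hypothetical indexes: ISBN -> work, and work -> its ISBNs.
work_of: dict[str, str] = {}
isbns_of: dict[str, set[str]] = {}


def register(work_id: str, isbn: str) -> None:
    """Record that this ISBN is a manifestation of this work."""
    work_of[isbn] = work_id
    isbns_of.setdefault(work_id, set()).add(isbn)


def x_isbn(isbn: str) -> set[str]:
    """Return all other ISBNs belonging to the same work."""
    work_id = work_of.get(isbn)
    return (isbns_of.get(work_id, set()) - {isbn}) if work_id else set()


# Two editions of the same work, catalogued under one FRBR work entity:
register("work:example", "isbn-a")
register("work:example", "isbn-b")
```

The quality of such a service rests entirely on the quality of the work-level relationships – which is why starting from FRBR matters.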

So does the world need yet another cooperative cataloguing initiative? Working for an organisation that has had cooperative cataloguing in its DNA for over thirty-five years, I should be careful how I answer this!

Throwing care to the wind – yes, when you consider that all the other cooperative cataloguing initiatives [including, as of today, the one traditionally supported by Talis] are bounded by project, geographical, institutional, political, subject-area, commercial, exclusive-licensing, or high-financial-barrier-to-entry issues. What is refreshing about Wikicat is that, like Wikipedia, the only barrier to entry, both for retrieving and adding data, is Internet connectivity.

Unlike Wikipedia, where some concerns about data quality are outweighed by the value of its totally participative nature, the Wikicat team are clearly aware that the value of a bibliographic database is directly connected to the quality, consistency and therefore authority of the data that it holds. For this reason, the establishment of cataloguing rules, and training for potential editors overseen by the WikiProject Librarians, is already well detailed in the project’s operational stages roadmap.

Dave Pattern at Huddersfield posts some initial usage figures for the various enhancements and enrichments he has added to his local catalogue, including alternate spellings, ‘also borrowed’ functionality, and more.

Although the figures may not be statistically robust, they provide some interesting pointers to the ways in which actual users are beginning to make use of the enhancements being made available to them.

Perhaps unsurprisingly, users would appear to value the added functionality delivered to them when we actually start working with the data we already hold, and I look forward to seeing more libraries following Dave’s example.

I also remain convinced that the biggest benefits will come when we do more to aggregate these data across libraries; ‘also borrowed’ data drawn from institutions similar to Huddersfield must surely be more relevant to a borrower than data drawn from Huddersfield alone, where circulation is on a scale at which odd edge effects (I borrow this and that, so when you borrow this you are also recommended that) are more likely to surface.
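The aggregation idea above is easy to sketch. This is a toy illustration under my own assumptions: each library contributes (borrower, item) loan pairs, and pooling them before counting co-borrowings dilutes the single-library edge effects just described.

```python
from collections import Counter


def also_borrowed(loans_per_library: list[list[tuple[str, str]]],
                  item: str) -> list[str]:
    """Rank items most often borrowed by people who also borrowed `item`,
    counting across every contributing library's loan data."""
    co_counts: Counter = Counter()
    for loans in loans_per_library:
        # Group each library's loans by borrower.
        by_borrower: dict[str, set[str]] = {}
        for borrower, borrowed_item in loans:
            by_borrower.setdefault(borrower, set()).add(borrowed_item)
        # Count every item that shares a borrower with `item`.
        for items in by_borrower.values():
            if item in items:
                co_counts.update(items - {item})
    return [i for i, _ in co_counts.most_common()]


# Invented loan data from two hypothetical libraries:
lib_a = [("u1", "A"), ("u1", "B"), ("u2", "A"), ("u2", "C")]
lib_b = [("u3", "A"), ("u3", "B")]
ranking = also_borrowed([lib_a, lib_b], "A")
```

With only one library's data, a single borrower's odd pairing can dominate the ranking; aggregated counts need several independent co-borrowings before a recommendation surfaces.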

Do you have any thoughts in this area? Feel free to share them in the TDN…

Richard MacManus recently posted to his Web 2.0 Explorer blog on ZDNet, outlining five features of Web 2.0 that – he suggests – are now so mainstream as to not be that special anymore.

“A lot of the features and functionality of so-called Web 2.0 sites are now common elements in most current web apps and sites. It’s really gone beyond what was labelled ‘Web 2.0’ last year, because so many mainstream websites are now using these elements. It’s no longer a niche trend.”

The features and functions he highlights are:

tagging

aggregation

filters and ranking

syndication

mash-ups

Each of these is certainly more common than last year, but I’d argue that none of them are yet in mainstream deployment across even a significant minority of the sites that might beneficially use them.

Whilst an increasing number of commentators in this space take such fundamental shifts in approach as syndicating content and services and inviting user/customer/audience participation for granted, the reality on the ground remains very much Web 1.0. Just because the places to which we (most readers of this blog, probably) choose to give our attention are rushing to adopt these models, doesn’t mean that we are that far up the adoption curve yet. For example, Michael Arrington does a great job with TechCrunch, one of those blogs I make a point of reading every day, but the constant stream of innovative new companies discussed on his blog can be misleading. There’s a far larger pool of equally innovative companies, for whom transforming the way in which they invite interaction and custom online is perhaps less of a priority. In their particular industry, that different emphasis may well be appropriate… for now.

There’s a long way to go, and a lot of hard work to be done in evangelising about the visionary companies that Michael tracks, and what the changes they have made could mean to those following along behind. It’s not about copying, but about learning what works, what doesn’t, and how any of it helps you to meet the needs of those seeking to gain value from your offerings. We also need a better understanding of the ways in which existing organisations like Talis are adapting and evolving; you don’t need to be a start-up to be – or to do – Web 2.0.

When you live it every day, Web 2.0 is obvious. When you talk about it every day, Web 2.0 is old news. When you observe those to whom you are talking about it, you realise just how radical some of it is. We could all do, perhaps, with remembering how exciting these ideas were the first time we heard them or thought them.

Take a library example. We’re talking about sweeping aside the financial, technical and procedural barriers that make it so hard for libraries to tell both other libraries and library users about what they hold. We’re talking about making a Platform of data and services available, upon which third parties can orchestrate (possibly a better term than mash-up) new services in a way that fundamentally challenges existing software and data supply models in the sector. Once you’ve heard it, it seems blindingly obvious and eminently desirable. But despite that obviousness, no one else has done it.

I’ve bumped into a couple of these so-called personal library services, LibraryThing and Reader², recently.

They are both working on similar functionality: add a book to your personal catalogue, search Amazon/Library of Congress to find it, publish your catalogue for others to see, tag your books to help build up a folksonomy, provide an RSS feed of your catalogue, list your catalogue on your blog page, etc.

Now if we could just link in your account at your local library to automatically add borrowed books to one of these personal catalogues… Or, even better, wrap the folksonomy, tagging, list publishing, etc. into your library system account. Now that would be cool, especially if the aggregated folksonomy could be searched.

What about linking in iTunes, to let you catalogue, rate, tag, and publish your music or favourite podcasts? The possibilities are endless.

As an aside, part of the reason I was attracted to these sites was their use of AJAX-style programming to deliver the functionality directly into the browser. For instance, adding a new entry to Reader² is an interesting experience – as you type words into the search prompt, a list of books previously catalogued [by others], complete with Amazon book-jacket images, dynamically refreshes in a panel on the right of the screen.

The use of AJAX to build sites such as these is becoming the way to do this sort of stuff. Quite right too, because the user experience it delivers is a step up from that of traditional web sites. It is no coincidence that AJAX is a significant part of the UI development work in Talis Research at the moment.

Folksonomies remain in the news, with Jack Schofield in last Thursday’s Guardian reporting back from the Emerging Technology conference in San Diego, California, that “Folksonomy was the big, bad, buzzword”.

Schofield asserts that folksonomies, as used by sites such as Flickr for sharing photos or del.icio.us for web links, would not “impress a librarian”. But “they are also important because this is probably the only viable way of tagging billions of items on the net. No one is going to hire millions of trained librarians to do the job”.

It’s not that librarians haven’t tried. In 1998 OCLC launched the CORC project, turning its vast cataloguing expertise to “taming the Web” with the prospect of a catalogue of Web content on the scale of OCLC’s huge bibliographic database, WorldCat. “Both full USMARC cataloguing and an enhanced Dublin Core metadata mode will be used”, it was announced. More modestly, at Talis we have Talis List, a web-based reading list system that allows academics and/or librarians to “harvest” (in a manner not unlike del.icio.us) and categorise web sites very simply and add them to a course “resource list” for students.

It’s not just “tagging” technology that is challenging librarians. As regular readers of this blog will know, Talis has been engaged in a project around RSS technology that has now expanded to include OpenSearch. Coincidentally OpenSearch was also featured at the Emerging Technology conference by Amazon’s Jeff Bezos. Richard Wallis has discussed our take on RSS and OpenSearch in more detail including his Talis Prism (library catalogue) OpenSearch proof of concept.

So coming back to the first point, librarians may not be impressed with what seem to be simplistic approaches to cataloguing, classification or search. We know the problems are complex. The point, though, is that we can see our comfortable, complex, feature-rich but domain-specific technologies and standards like MARC, Z39.50 etc. being challenged from outside the domain by companies with a bigger problem to solve: Web 2.0. That’s why, at Talis, we take them seriously and get involved.