Monday, March 28, 2005

The Open Wars

With the launch of Yahoo Search for Creative Commons we are reminded once again that in the digital networked world two different publishing models continue to battle for mindshare: one open, the other closed. Which is the more sustainable in the long term? And what relevance does the wider movement for open content have for the debate about Open Access?

Richard Poynder

Last week web search engine Yahoo launched the beta version of a new web search service for locating content made available under Creative Commons (CC) licences. The new service — Yahoo Search for Creative Commons — enables users to limit their web searches to CC content, including text, photographs, songs, web pages, articles etc.

The aim of the new service (and indeed of Creative Commons licences) is to help people distinguish between content that they cannot reuse or adapt without first obtaining the creator's permission, and content for which the creator has given prior permission for reuse.

Although it was already possible to conduct searches on CC content utilising Yahoo's advanced search functions, the search company has created a new interface designed specifically for doing so. In addition, users of the service can further refine their searches to locate CC material designated for specific types of reuse, including content that can be used for commercial purposes, and content that can be modified, adapted, or built upon.

The new service, explains Yahoo on its web site, is designed to help searchers find the works of authors that "have [been] marked as free to use with only 'some rights reserved.'" It adds: "If you respect the rights [these authors] have reserved (which will be clearly marked, as you'll see) then you can use the work without having to contact them and ask. In some cases, you may even find work in the public domain -- that is, free for any use with 'no rights reserved.'"

Significant impact

Launched in 2002, Creative Commons has had a significant impact on the Internet, and today there are an estimated 14 million web pages containing CC-licensed content.

The appeal of Creative Commons lies in the way it separates out the various rights associated with copyright and allows creators to specify those rights they want to keep and those they are happy to waive, but within a set of parameters that they themselves define.

This "some rights reserved" approach — coupled with the ability for creators to give blanket prior permission for certain uses — encourages the sharing and reuse of content, say CC advocates, making it far better suited to the open and co-operative ethos of the Web than the traditional copyright notion of "all rights reserved", and providing a greater stimulus to creative endeavour.

Creators can stipulate, for instance, that anyone can make whatever use they want of a work, so long as the author is credited; that they can make whatever use they want, provided it is not done for commercial purposes; that they can copy and distribute it verbatim, but not create any derivative works; or the work can be offered on a “share-alike” basis, thereby allowing others to make derivative works, but only on condition that the resulting work is then distributed under the same share-alike terms. In total there are 11 CC licences.
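How these conditions combine, and how a search service might filter on them, can be sketched in a few lines of Python. This is only an illustration: it assumes licences are identified by hyphenated codes such as `by-nc-sa` (the abbreviations that appear in Creative Commons licence URLs), and the names `Permissions`, `describe_licence` and `allows` are hypothetical, not part of any Creative Commons or Yahoo tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Permissions:
    attribution_required: bool  # "by": the author must be credited
    commercial_use: bool        # False when the code contains "nc"
    derivatives: bool           # False when the code contains "nd"
    share_alike: bool           # "sa": derivatives must carry the same terms


def describe_licence(code: str) -> Permissions:
    """Translate a code such as 'by-nc-sa' into the permissions it implies."""
    elements = set(code.lower().split("-"))
    return Permissions(
        attribution_required="by" in elements,
        commercial_use="nc" not in elements,
        derivatives="nd" not in elements,
        share_alike="sa" in elements,
    )


def allows(code: str, commercial: bool = False, modifiable: bool = False) -> bool:
    """Return True if the licence permits every kind of reuse requested."""
    perms = describe_licence(code)
    if commercial and not perms.commercial_use:
        return False
    if modifiable and not perms.derivatives:
        return False
    return True


# Keep only the licences that permit commercial reuse (hypothetical input list).
candidates = ["by", "by-nc", "by-nd", "by-nc-sa", "by-sa"]
commercial_ok = [code for code in candidates if allows(code, commercial=True)]
```

Filtering on `modifiable=True` instead would exclude `by-nd`, which is the kind of refinement a rights-aware search offers.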

Once they have chosen a licence that meets their needs, creators place a CC logo on their site. The logo also inserts machine-readable metadata into the HTML of their webpage linking it to a commons deed on Creative Commons' web site that sets out the usage conditions associated with the licence. As a result, both passing surfers and search engines can quickly establish what permissions apply to the content associated with the CC-licensed page.
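To make the idea of machine-readable licence metadata concrete, here is a minimal Python sketch of how a crawler might pick the licence link out of a page. It assumes the licence is marked with a `rel="license"` attribute on an `<a>` or `<link>` element, which is one common convention; the post itself does not specify the markup, and the sample page and the names `LicenceLinkFinder` and `find_licence_urls` are made up for illustration.

```python
from html.parser import HTMLParser


class LicenceLinkFinder(HTMLParser):
    """Collect the href of every <a> or <link> element marked rel="license"."""

    def __init__(self):
        super().__init__()
        self.licence_urls = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated values, or be missing entirely.
        rel_values = (attr_map.get("rel") or "").lower().split()
        href = attr_map.get("href")
        if "license" in rel_values and href:
            self.licence_urls.append(href)


def find_licence_urls(html: str) -> list:
    """Return all licence URLs declared in the given HTML text."""
    finder = LicenceLinkFinder()
    finder.feed(html)
    return finder.licence_urls


# Hypothetical page carrying a CC logo link (assumed markup).
SAMPLE_PAGE = """
<html><body>
  <p>My holiday photographs.</p>
  <a href="http://example.org/about">About me</a>
  <a rel="license" href="http://creativecommons.org/licenses/by-nc/2.0/">
    Some rights reserved
  </a>
</body></html>
"""

licences = find_licence_urls(SAMPLE_PAGE)
```

Once a URL of this kind has been extracted, a search engine can associate the page with the licence's conditions without a human ever reading the deed.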

While the Creative Commons web site already offers its own search engine, this is restricted to searching on CC metadata. What Yahoo brings to the party, explains assistant director of Creative Commons Neeru Paharia, is that it also searches on backlinks in order to "figure out what license is associated with what page." In addition, she adds, Yahoo is "indexing a ton more pages than we are."

The expectation is that in bringing its formidable web presence to bear on CC-licensed content, Yahoo will increase both the credibility and visibility of Creative Commons. As the chair of Creative Commons, Lawrence Lessig, commented to ZDNet when the service was launched: "By giving users an easy way to find content based on the freedoms the author intends, Yahoo is encouraging the use and spread of technology that enables creators to build upon the creativity of others, legally."

Open Access

What relevance, if any, does Yahoo Search for Creative Commons have for the Open Access (OA) movement? For OA publishers like BioMed Central (BMC) and the Public Library of Science (PLoS) the service is of immediate interest, since both publish under CC licences.

This means, for instance, that since all of BMC's OA journals have CC rights information embedded in their articles, BMC papers will be immediately visible to the new Yahoo service.

This, says BMC’s technical director Matthew Cockerill, is good news for OA. "Scientists make increasing uses of Internet search engines such as Yahoo to search the literature. The collaboration between Creative Commons and Yahoo is important as it means that scientists can now easily identify articles (such as those published by BioMed Central) that can be freely downloaded, redistributed and used to create derivative works. Possible applications of this include text mining, specialised subject specific databases, and digital archiving/preservation systems."

The new service will also be of great interest to scientists who publish with OA journals, adds Cockerill, since it will raise their visibility on the Web. "Scientists generally want the articles that they publish to be redistributed as widely as possible," he explains. "We hope that other search engines will follow Yahoo's lead, and perhaps go even further by highlighting Creative Commons content on their main search results listings."

For researchers who provide OA to their papers by self-archiving them, the new service will have less immediate appeal, since self-archiving usually means continuing to publish in traditional subscription-based journals and then depositing the papers in an institutional repository. Since publishers routinely acquire the copyright in the papers they publish, self-archiving authors will not be able to archive them under a CC licence. As such, the papers will not be visible to the new Yahoo service.

However, some believe that the new service has little to offer anyone in the OA movement (or indeed content providers generally), certainly in the long term. Joe Esposito, a management consultant specialising in the intersection of publishing and digital media, for instance, suggests that any benefits will be short-lived. "It temporarily raises something in the search rankings. But search is an arms race. The lead is soon lost."

Decrying the way in which Creative Commons has become "caught up in a quasi-religious fervour that makes it hard to talk about without voices rising to a shout", Esposito characterises the Yahoo move as little more than a cunning marketing ploy, since it was already possible to search on CC content using the advanced search features of the main Yahoo search engine. "Yahoo wades into this religious controversy with all the cunning of a shrewd marketer, for which advocates of free enterprise will be pleased."

Two models

But whatever the impact of Yahoo's service, its launch reminds us once again that in the digital networked world two very different content models continue to battle for mindshare: one open, the other closed. On one side of the barrier are those, like OA publishers, who believe that content must be made freely available. On the other sit those who continue to believe that readers must be made to pay the freight.

In the news business, for instance, a fierce debate continues to rage as to whether content can be charged for on the Web. Thus while an abundance of free news services is now available, powerful print brands like The Wall Street Journal and The New York Times remain adamant that a closed "pay-to-read" model is the only effective way to maintain their "brand value" and ensure their survival in the long term (although the NYT doesn’t lock its articles up until they are a week old).

However, critics point out that this "walled garden" approach means that publishers practising it are all but invisible on the open oceans of the Web, since search engines are unable to index closed sites. In the age of the Web, where openness and visibility are essential, they argue, this is a risky long-term strategy.

Similarly, in the world of scholarly publishing, Reed Elsevier continues to maintain that pay-to-read, subscription-based publishing remains the only stable model for scholarly communication. OA publishers, by contrast, believe that authors (more accurately their funders) should pay to publish, thereby allowing the content to be made freely available on the Web.

Interestingly, Esposito does not dispute the need for more openness. He also believes that the greater willingness of OA publishers to adapt to the Web could see them outflank proprietary publishers. He does not, however, believe that alternative copyright licenses have any meaningful role to play here.

As he puts it: "I advise commercial clients to make content freely available, to syndicate content, to work through networks of resellers. Most of them refuse. If OA publishers are more aggressive about taking advantage of the inherent properties of the Web, then the OA publishers will outstrip the proprietary publishers. But this has nothing or very little to do with CC. It has to do with McLuhan: understanding the properties of a particular medium."

Maybe. But since digital content can so easily be copied, it is no surprise that the struggle between open and closed publishing models is now firmly centred on issues of copyright. After all, on the Web, walled gardens are so easily and so frequently breached that recourse to the law is hard to avoid in the proprietary model.

Refuge of last resort

Indeed, the boundary between acceptable and unacceptable use of content is often so fuzzy now that it is becoming difficult to avoid litigation, and copyright has become the refuge of last resort for proprietary content providers when shipwrecked on the Internet.

The launch of Yahoo Search for Creative Commons, after all, comes hot on the heels of news that fellow search engine Google has been sued by Agence France-Presse (AFP) for allegedly publishing copyrighted content without permission. Google News gathers photos and news stories from around the Web and posts them on its news site, which is free to users.

AFP is seeking more than $17 million in damages and an injunction barring Google from further publishing its photos, news headlines or story leads on the Google News service.

Here is clear proof, were it needed, that copyright has become one of the key areas of conflict on the Internet. And in providing a tool set that better enables content providers to delineate their permissions, Creative Commons is a practical response to the current situation.

But where do we go from here? Commenting on the new Yahoo service on his blog OA researcher Peter Suber says: "As copyright locks down more content more tightly, searchers will want reuse rights almost as much as relevance. Search engines that find both will have an advantage. Conversely, authors and publishers who consent to grant more reuse rights than fair-use alone already provides should make their consent machine-readable for the next generation of search engines."

In short, the hatches are being battened down for a new offensive in the online content business, and copyright has become the weapon of choice for those engaging in battle. What better way, then, to counter the increasingly aggressive "all rights reserved" battleships of proprietary publishers than to flood the seas with fleets of "some rights reserved" destroyers?

The scary thing for publishers is that it is currently difficult to see how the closed model can prevail. If it can't, then the challenge they face (be they scholarly publishers, newspapers, or lone bloggers trying to make a living from their writing) is to find a viable alternative business model in a world where readers are no longer prepared to pay. And it is in wrestling with this conundrum that the OA debate has so much in common with the wider open content movement. It's the Open Wars. Get used to it!

What's your view: Can the closed model of publishing prevail on the Internet? What impact will Yahoo Search for Creative Commons have on the Web in the long term? And what relevance does the wider open content movement have for the debate about Open Access? E-mail me at richard.poynder@journalist.co.uk, or to comment publicly hit the comment button below.

2 comments:

Richard, grateful for your thoughts on this. As you pointed out, there is, and always has been, a great demand to reuse content. However, most publishers tend to see that as a threat rather than an opportunity. Just as Creative Commons makes it simple for people to understand reuse rights, so copyright holders need to implement capabilities that make it easy for people to understand how they may make use of their content without having to resort to elaborate schemes. More thoughts on this at:

Thanks for the comment. I agree with you that "The greater issue to be addressed by Yahoo! and other search providers both public and institutional is how to make their users aware of effective use for ALL content based on its rights structure."

Clearly rights and permissions will increasingly need to be machine readable. In this respect it is perhaps interesting to contrast what Yahoo is doing with what the Open Access movement is doing, particularly with the protocol for metadata harvesting being developed by the Open Archives Initiative (OAI). Here it seems there remains much to be done. While Version 2.0 of the OAI-PMH (http://www.openarchives.org/OAI/2.0/guidelines-rights.htm) addresses the issue of rights in the metadata itself, it has yet to address the issue of rights in the source object (i.e. the scholarly papers that the metadata describes). In other words, the metadata is able to specify the rights applying to itself (using, say, a Creative Commons licence) but it cannot yet specify the rights applying to the scholarly paper that it describes!