The Internet's most evil company?

Information wants to be free? Au contraire, information wants to tell you all about itself, where, how and if you can use it, and it reserves the right to sue the crap out of you if you don't pay attention. Or at least, that seems to be the way a growing number of traditional publishing organisations view it - the internet has been getting a free ride off their backs, they reckon, and if the news business is going to survive, that's got to stop.

Refighting an old war? (Dutch National Archive/flickr commons)

The Associated Press in particular has been mad as hell and not going to take it any more, opening fire on the Drudge Retort last year over its use of snippets from AP stories, and winning itself instant promotion to Great Satan of the Blogosphere status. Eccentrically, the AP's takedowns resulted in calls for a boycott of AP stories, plus a weird combo of boycott and 'solidarity steal' from a noted professor of journalism.

The argument then was over fair use, AP's position being that someone taking the headline and introductory sentences from an AP story without a licence was infringing its copyright. The items it objected to were as little as 33 words long, but from statements at the time it appears that AP was differentiating between quoting from a story (which could still be fair use), and using headline, intro and link as a signpost to a story.

If that was the point, it was lost on the blogosphere - but you can see the possibility that the AP was trying to create a precedent, to evolve or even create copyright law for the Internet.

AP's case against All Headline News (AHN) is worth noting, because here again AP was seeking to make (or remake) copyright law. In 1918 (no, really) AP had persuaded the US Supreme Court to create the "hot news doctrine" to protect its stories. AP had sued a rival wire service that was writing news reports using AP's reports as a source. AP's argument, accepted by the court, was that in recognition of its investment in news gathering, it should be able to stop other news outlets using "its" facts for a period, and that a quasi-property right should surround the information while it retained value.

AP sued AHN over that company's use of AP source material, attempting to re-establish the hot news doctrine as it did so. It didn't get the precedent (this time?) because the case was settled without being tested in court, but a joint statement issued by the two said:

As a result of the lawsuit, and as part of the settlement, the defendants agreed that they would not make competitive use of content or expression from AP stories. Defendants acknowledge there were many instances in which AHN improperly used AP's content without AP's consent. Defendants further acknowledge the tort of "hot news misappropriation" has been upheld by other courts and was ruled applicable in this case by U.S. District Court Judge P. Kevin Castel.

AP may not have got its reaffirmation of the hot news doctrine here, but it secured what from its point of view was a satisfactory result by going after a highly visible target. It hasn't yet got the courts to agree with it on where the boundaries of its ownership lie (not even the New York courts), but it's got AHN to back off, to agree it did wrong, and to publicly not disagree that the hot news doctrine might still be valid.

Ah, but while that might work for larger commercial outfits perceived by AP as living off its back, what about the rest of the internet? This is possibly where AP's "industry initiative" comes in, along with something similar that was baked earlier, ACAP.

In June a group of European publishers, under the auspices of the European Publishers' Council (EPC) and World Association of Newspapers (WAN), hammered out the Hamburg Declaration. This was - we feel, somewhat unfairly - summarised by Ars Technica as "European publishers want a law to control online news access".

As Ars itself notes, shortly after describing the Declaration as "a long-winded rant against the Internet for stealing their news", it's really all about ACAP (Automated Content Access Protocol), a proposed standard for news intended to convey in machine readable language the content owner's policies on access and use. It is, say its originators, "all about making copyright work on the web."

AP's industry initiative, Value Added News (VAN), developed in conjunction with the Media Standards Trust, occupies similar territory, and is currently being tested on AP's own stories. But AP is also one of the members of ACAP.

"Publishers are for the most part content with copyright law as it stands"

AP claims that VAN will "make it easier for readers to find articles from more established news providers amid the ever-expanding pool of content online", and that its tags are "a nutritional label for your news". But VAN does also convey information about copyright and permitted use. ACAP is more explicitly presented as an aid to copyright enforcement. In a blog post the organisation quotes the Hamburg Declaration, and stresses that it is a call for improvements to mechanisms for IP protection on the internet, rather than a call for new laws: "In our experience in ACAP, publishers are for the most part content with copyright law as it stands... ACAP is, in a transparent and open way, creating tools so that existing copyright law - and the licences that depend on its operation - can work in a machine-to-machine way without needing a human somewhere in the middle."

But they would say that, wouldn't they? And tools for enforcement clearly do imply enforcement. But it's still some distance from demanding a law to control online news access - ACAP, the Hamburg signatories and AP are essentially trying to control how their property is used on the Internet, and the tools that they're proposing are in some senses similar in nature and intent to Creative Commons' plans for automating copyright, licensing and ownership details. Is it therefore possible that AP and ACAP are evil and grasping, but that Creative Commons is not? How does that work?

Recent events suggest that Creative Commons is well-meaning, but pushing a fatally flawed system that can't be fixed via the wisdom of crowds, whereas AP and ACAP should at least be able to succeed in producing accurate tags, metadata and RDFa (up to a point - this gets a lot harder when you start claiming to own facts via the hot news doctrine).

But where does that get them? ACAP seems to see the way forward as getting the search engines (read: Google) to recognise it, and getting legislators to apply pressure to the search engines in order to make this happen. This is where the notion of "European publishers" (and AP) wanting a law to control the internet comes in. But as they say, they don't want any new laws, they just want mechanisms that help enforce the laws we've got more effectively.

Ultimately if ACAP was widely adopted by the search engines and medium-to-large web publishers, then we would to an extent have 'HTML that says no', but that wouldn't be DRM, and would still leave scope for outlaws and refuseniks to just ignore the metadata.

This also applies to AP, which is testing some of its technology now, and intends to roll out a news registry in November. The AP release on this hits all the wrong buttons; AP will bundle its stories in "an 'informational wrapper' that will include a built-in beacon to monitor where stories go on the Internet" and the registry "will track electronic tags applied to the stories" and the "beacon will send information to the AP's registry where the cooperative's [AP is a cooperative] content is accessed."

It's almost as if AP is trying to make Jeff Jarvis' head explode

Creeped out yet? It's almost as if AP is deliberately trying to make Jeff Jarvis' head explode. But rewind - this is not DRM, it's metadata, or similar, designed to be read by machines, not humans. Humans can still quote, steal or link to AP words without necessarily having anything to do with any wrapper AP ships them in. There may still be legal consequences depending on what the humans are doing with these words, and how blatantly, but these consequences will not be unleashed by some beacon bug phoning home.

What AP is doing, and what ACAP and, yes, Creative Commons are trying to do, is to build a system that governs how machines, not humans, handle content. AP is aiming for news aggregators, not bloggers, and it even says so. Speaking to the Columbia Journalism Review (our thanks to Angela Gunn of Betanews for spotting this useful piece of actual journalism), AP VP Jane Seagrave said that AP has no intent to nail individual bloggers, but is aiming at sites guilty of "wholesale misappropriation" of AP content, "people who are copying and pasting or taking by RSS feeds dozens or hundreds of our stories". And as Ryan Chittum of CJR comments, "If you haven’t run across these rewrite or sometimes copy-and-paste mills that steal content, you haven’t looked very hard."

That does not, however, necessarily mean that AP is cuddly, even if you discount the scary press releases and the threatening speeches from the High Command. The combination of VAN and the news registry (if this combination is what the system turns out to be) certainly could be seen simply as a neutral tool, as something that "says this is how you can", but alongside this we have the attempt to re-establish the hot news doctrine, and the related contention that headline/link/snippet can equal misappropriation. With these AP is clearly trying to push the boundaries of its intellectual property, and in doing so it cannot help but challenge current accepted definitions of fair use.

How worried you get about that might relate to how deserving a case you think the next site AP takes a pop at is. Is, for example, Drudge Retort just a leech? AP thought so last year. Or is Daylife, a sort of aggregator's aggregator, in which Jeff Jarvis is a partner?

You can get some kind of answer to that question if you consider what might fly as a hot news doctrine case, and what might not. Back in 1918 it was all pretty clear cut - on the one hand you had AP investing large sums of money in reporting World War One, and on the other an outfit diligently rewriting the output of all AP's expensive reporting. The justice of AP's case there is entirely understandable to any journalist who has had a good scoop and then seen dozens of other outlets run off with it - they didn't lift any particular words, and as you don't own the facts there isn't a damn thing you can do about it.

But clear-cut cases are few and far between, given the nature of journalism - take the other extreme as an example. A global news agency has a stringer* in a medium-sized town in Eastern Europe. A large part of the stringer's output derives from reading the local tabloid press, looking out for the odd and eccentric, and filing it into the agency's 'funny old world' section.

Now, when somebody else lifts that story from the agency, in what sense could the hot news doctrine apply, and if it does, might it not apply to the originating tabloid (provided the tabloid didn't lift it from somewhere else), rather than the agency? Similarly, the trad Paris correspondent reads the local press, looks out for local colour stuff he can lift from Le Monde or Le Figaro, files it for tomorrow's paper then heads for the bar-brasserie.

The thing that gets ignored when the newspaper barons huff and puff about wholesale theft is that journalism is by nature heavily derivative (which is the kindest way of putting it), peppered with repetition, follow-ups and plain old-fashioned lifting. Look too hard under that particular stone and it all falls apart. Not that it isn't all falling apart already, of course.

It's really a lot more like Web 2.0 than they'd like you to think, and this limits the extent to which the hot news doctrine, or some kind of metadata description of ownership, could be legally accepted. Try making it stick for something that isn't pretty much clear-cut, and where you might not entirely be the original conduit for the facts, and the court will toss both it and your precedent.

AP, probably, is not that stupid. Where it, and the ACAP members, likely are stupid is in thinking that stopping bunches of low-rent copy'n'paste operations stealing their stuff is going to make their currently unprofitable business models profitable. Now that really is difficult to believe. ®

* A freelance journalist with a relationship with a news organisation, generally operating in an area where the organisation does not have its own correspondent, and generally paid only for submissions that are published. So basically they're based in some hick town where hardly anything ever happens. OK, Colin Barfoot? Happy? (see comments)