What The Associated Press’ tracking beacon is — and what it isn’t

When The Associated Press said last month that it was building a “news registry” of AP content, most reaction focused on the so-called “tracking beacon” that will monitor usage across the web. I use quotation marks because, well, those are metaphors for technology that’s still in development: The AP document we’ve obtained says the registry, set to launch on Nov. 15, will “require capabilities not currently available.”

But there’s nothing particularly magical about the beacon, which will amount to JavaScript embedded in the online feeds that are distributed to clients. So when you read an AP article on the New York Times website, a script running in the background will take note of that usage. (It’s unclear how news organizations like the Times, which is particularly neurotic about the weight of its pages, will feel about the script.)

Tracking readers

The point, of course, is to identify unlicensed uses of AP content and, potentially, member content. So if someone copied an article’s source code onto his own site, by hand or by automation, the beacon would follow along and, according to the document distributed to some AP members, “send reports back to the core database each time the item is clicked on by an end user. The beacon will identify each piece of content, the IP address of the content viewer, the referring Web server and the time of use.”

I immediately flagged “IP address of the content viewer.” In recent years, the recording industry has used the IP addresses of downloaders to pursue legal action against people sharing music online, leading to lots of ill will toward the RIAA. That said, recording such data isn’t all that unusual. Websites using basic analytics software already record the IP addresses of their users.

When I asked the AP’s general counsel, Srinandan Kasi, about it, he said the AP wasn’t interested in monitoring who specifically reads their content on unauthorized sites: “In writing this” — he meant the document — “obviously, theoretically anything is possible. But what you actually make the final available piece is a different thing. This is simply: These are the capabilities that are possible.” Later, he added, “If at some point this business goes there, they’ll be completely transparent about it. There’ll be all the disclosure and compliance issues.”
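To make the four fields in that description concrete, here is a minimal sketch, in Python, of what one beacon report might look like when it reaches the “core database,” along with one plausible check on the referring server. This is not the AP’s code or format: the names BeaconReport, LICENSED_HOSTS and is_licensed, and every value in the example, are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from urllib.parse import urlparse

# Hypothetical list of hosts licensed to carry the content (an assumption).
LICENSED_HOSTS = {"www.nytimes.com"}


@dataclass(frozen=True)
class BeaconReport:
    """The four fields the AP document says the beacon will identify."""
    content_id: str    # which piece of content was viewed
    viewer_ip: str     # IP address of the content viewer
    referrer: str      # URL on the referring web server
    used_at: datetime  # time of use


def is_licensed(report: BeaconReport) -> bool:
    """Return True if the referring host is on the licensed list."""
    host = urlparse(report.referrer).hostname or ""
    return host.lower() in LICENSED_HOSTS


# Made-up example: a copied article viewed on an unlicensed site.
report = BeaconReport(
    content_id="example-article-1",
    viewer_ip="203.0.113.7",
    referrer="http://scraper.example/copied-story.html",
    used_at=datetime(2000, 1, 1, tzinfo=timezone.utc),
)
print(is_licensed(report))  # False
```

Note that the licensing check in this sketch never touches viewer_ip; it needs only the referring server, which squares with Kasi’s point that what is possible and what gets used can differ.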

Removing the beacon

There was another passage in the document that struck me as weirdly written:

Because the News Registry’s active tracking beacon would not be effective if the beacon were removed, the Registry also has a backup enforcement system. Based on current Web behavior, it is safe to assume that some users will intentionally or inadvertently remove the beacon. A “passive” tracking service will crawl the Web searching for AP content and identify the publishing Web page, an image of usage and the time of discovery. Matches will be queried against the active tracking database, and unauthorized uses will be pursued.

By “intentionally or inadvertently remove the beacon,” doesn’t the AP simply mean copying and pasting the text of an article? While there’s been some innovation in the field lately, it’s difficult to imagine how the beacon could survive the magic of Ctrl+C, and that’s an obvious limitation of the tracking system. But referring to that as removing the beacon called to mind the anti-circumvention portions of the Digital Millennium Copyright Act, which criminalize attempts to get around copyright controls, as in burning a copy of an encrypted DVD. Now, I can’t imagine the AP would have a legitimate claim there, but I gave it a go with Kasi, who said, “You may be giving it a lot more heavy reading than” intended.

So why use that language? “We need to worry about that sort of thing because, right, Zach, we’ve seen that happen….If some of these formats get stripped out, including the mythical beacon, then we need to have a way of knowing and being able to address that.” (I had just referred to the beacon as mythical.)
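The “passive” backup the document describes, a crawler that finds AP text and checks it against the active tracking records, could work along these lines. This is a rough sketch in Python, assuming a simple word-overlap comparison; the document doesn’t say how matching is done. The function names, the five-word window and the 0.8 threshold are all assumptions made for illustration.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Break text into overlapping runs of `size` lowercase words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap(registered: str, found: str) -> float:
    """Fraction of the registered article's shingles found on the page."""
    source = shingles(registered)
    if not source:
        return 0.0
    return len(source & shingles(found)) / len(source)


def needs_review(registered: str, found: str, page_url: str,
                 beacon_seen_urls: set, threshold: float = 0.8) -> bool:
    """Flag a crawled page that matches the article but has sent no beacon
    reports (one possible reading of the document's description)."""
    matches = overlap(registered, found) >= threshold
    return matches and page_url not in beacon_seen_urls


# Made-up example: the article pasted into another page, beacon stripped.
article = ("The quick brown fox jumps over the lazy dog "
           "near the quiet river bank today")
page = "Posted by admin: " + article + " Read more here"
print(needs_review(article, page, "http://scraper.example/p", set()))  # True
```

A comparison like this works on the words alone, so it would still catch text that was copied and pasted without the script.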

Rhetoric and reality

I told Kasi that there seemed to be a persistent disconnect between the AP’s rhetoric on copyright and what it actually cares about. He acknowledged that the consortium had not always effectively communicated its intent: “It’s easy to think that, when you read ‘beacon’ and given the issues of some other companies and so on, you can immediately jump to the conclusion, ‘oh, this is a persistent cookie that’s going to track this user across all kinds of sites.’ No.” That was surely a reference to Facebook’s poorly received advertising platform, also known as Beacon.

The AP’s graphic explaining the beacon and a new microformat was easily mocked and labeled “magic beans” by prominent tech blogger John Gruber. In the clip from the actual graphic at right, doesn’t it look like a faceless news consumer will be deposited in a toxic waste receptacle?

But in talking to Kasi, I came away with the same impressions as Columbia Journalism Review’s Ryan Chittum: that the AP isn’t interested in broadly pursuing copyright claims against republication of its content. In fact, they’re hoping to encourage distribution in certain ways, and there’s plenty of innovative stuff in the microformats they’re adopting. The AP just needs to clear up what kind of “rampant unauthorized use of AP content” — that’s from the document — they want to combat.

And that’s the topic of my next post.

This is the third in a series of posts on the AP’s online strategy. Photo of beacon, by Brenda Anderson, used under a Creative Commons license.

“But in talking to Kasi, I came away with the same impressions as Columbia Journalism Review’s Ryan Chittum: that the AP isn’t interested in broadly pursuing copyright claims against republication of its content.”

Either the AP is having extraordinary difficulty in communicating its intent, or it is trying to hornswoggle you and Chittum. Apply the same skepticism to AP as you would to any other business caught saying different things through different people. I’d really like you to get Curley on the record about this.

http://www.tucsonsentinel.com Dylan Smith

The AP doesn’t seem to realize that every AP-hosted article includes a link that allows users to post articles – why aren’t they publicizing that instead of trying to scare off distributors?

But then, AP also tried to fend off the evil bloggers from embedding AP YouTube videos, when that’s the entire point of YouTube in the first place.

http://www.universalhub.com Adam Gaffin

That graphic is just another example of AP cluelessness and inability to tell people in English what they’re really up to (because they don’t know themselves or are trying to obfuscate?). Something that looks like a barrel of sludge is a pretty standard network-design icon for a “database.” Do a Google search on “database icon” and you’ll get zillions of examples. So AP had some artist who normally works on networking schematics do a diagram that to him is perfectly reasonable, but which to most people, yes, looks like a person about to be dumped in toxic waste.

For those wondering, I think it is safe to assume that the “‘passive’ tracking service” referred to in the AP’s document (the one that “will crawl the Web searching for AP content and identify the publishing Web page, an image of usage and the time of discovery. Matches will be queried against the active tracking database, and unauthorized uses will be pursued.”) is Attributor.

Thanks, Gerd. Though the AP hasn’t joined the Fair Syndication Consortium, they’re customers of Attributor, so I think that’s a fair bet. I asked Kasi if the “‘passive’ tracking service” was Attributor, but he wouldn’t say. —Zach
