Publisher vs Aggregator: A case study

Before getting into the nitty-gritty of copyright and patent protection for crosswords (and software), it would be useful to understand why the world of commercial digital content is in such a messy situation, especially given what technology has enabled.

I will illustrate one type of conflict with the curious incident, back in June of this year, involving The New York Times and an iPad newsreader app called Pulse written by two Stanford University students.

I have picked this incident because it combines a number of issues: aggregation, fair use, framing, commercial use of third-party content, the technology genie, consumer expectations, and publisher control of content. People on the old-media publishing side will understand the reasons for the NYT actions perfectly. But the NYT would not have found much understanding or sympathy anywhere else. That gap is a problem for publishers in general.

Pulse is a well-executed newsreader that provides a very convenient browsing and traversal user experience for looking at RSS feeds from multiple content providers.

As widely reported in the media (All Things Digital/WSJ, The Guardian, Wired), the Pulse app was pulled for a while from the App Store in response to a complaint from the legal department of the NYT. It had been the top-selling paid app for iPads for a while. Ironically, this happened on the very day that Steve Jobs mentioned the app in his keynote address at the Worldwide Developers Conference. A technology reviewer for the NYT had also praised the app just a week earlier.

Most would not understand why NYT complained. Much of the Internet audience showed outrage. Surely, it is a clear case of a greedy corporation going after poor developers? After all, from their perspective, it was just a fancy RSS reader incorporating feeds that NYT itself puts out for reading via RSS readers. If people wanted to get more information on a news item, they could click on it and visit the NYT web page in an embedded browser in the app. Isn’t that how RSS feeds work? Where is the problem?

This case is a good example of taking something to its logical extreme and thereby invalidating the very assumptions behind it. In an earlier post, I had mentioned a third-party service that would collect free cheese samples from every grocery store and deliver them to its customers for a fee which it kept for itself. Most will likely see legal problems with this service. Now, imagine extending that into a hypothetical where the service scrapes enough free stuff from grocery stores that it can deliver a complete meal (if not necessarily a healthy one), so that its customers do not have to go to any grocery store at all. Most will think this scenario extreme and infeasible but, if forced to accept the hypothetical, will see legal problems with it.

Why are the perceptions in the digital world so different? There is the very obvious difference that digital content can be copied rather than taken from a limited supply. The latter would be equated to “stealing” while the former would not be. Technology makes it feasible to copy such content from anywhere in the world, combine it, transform it, package it and deliver it anywhere else. Unfortunately, an average person (incorrectly) equates copyright infringement with “stealing” in order to understand it. From that simple perspective, if there is no “stealing”, there is no copyright infringement. So while they would be sympathetic to grocery stores taking action against the “food aggregator” in the hypothetical above, they might not be so sympathetic to a content producer whose content was “just copied”. After all, it was just another copy. What harm could it have done?

Copyright laws already cover that part even if it is not “stealing”. Most consumers seem to think copyrights exist for greedy corporations because enforcement invariably seems to work against consumers (not that they lack justifiable reasons, given music industry practices). That those very copyright protections are what make the content producer's business models possible is difficult for most consumers in the digital world to understand.

It gets even blurrier when content providers offer the content under a limited or restricted context of use, typically captured in Terms of Service that hardly anyone ever reads. Technology makes such content easy to copy, transport and consume outside that context. So the content provider faces two problems. One, most consumers do not understand why transgressing that limited or restricted use may be a bad thing for both sides, especially when it is more convenient for them. Two, the available technology infrastructure is very poor when it comes to fine-grained control of content distribution and often poses problems for valid use of the content (e.g., anti-piracy devices for digital content).

I will not try to justify the NYT action legally from the reported information, which is the only information I have (see this article for a quick analysis from a copyright infringement perspective). It is interesting to note, however, that the main point of the NYT complaint was not copyright infringement but violation of the Terms of Service under which the RSS feed was provided. (Note: Crossword aggregators are in a much more precarious position than RSS readers because they will fail most of the four fair use factors used to assess copyright infringement, in addition to violating the Terms of Service. But that is for a different post.)

Let us look at the broader picture of what leads to situations where one side or the other could be legally right under current law and yet the outcome is bad for both in the long run.

RSS feeds are useful tools for a content producer to push snippets of content as a way to get the attention of its readers, especially when the producer already has a commercial (or monetizable) relationship with the reader. If the monetization depends on the reader visiting the site for the full content (the justification for any free feed), then providing that feed makes sense only if it causes the reader to do so for at least part of the content. But RSS feeds are rather blunt tools for such fine requirements. They have no way of restricting the context of use from a technology perspective, so RSS feeds usually have Terms of Service associated with them. The NYT complaint hinged on the commercial use of its feed violating those Terms of Service.
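To make the mechanics concrete, here is a minimal sketch of what a reader app sees when it parses a feed. The feed below is entirely hypothetical (the publisher name, URLs and wording are made up, and it is not the NYT feed); it only uses standard RSS 2.0 elements. Note that each item carries a snippet and a link back to the publisher, and that the usage terms travel as plain text that software is free to ignore.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed for illustration only; not any real publisher's feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Daily</title>
    <link>https://news.example.com/</link>
    <copyright>Copyright Example Daily. Personal, non-commercial use only.</copyright>
    <item>
      <title>City council approves budget</title>
      <link>https://news.example.com/2010/06/budget</link>
      <description>The council voted on Tuesday to approve...</description>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml):
    """Return the channel's stated terms and a list of item summaries."""
    channel = ET.fromstring(feed_xml).find("channel")
    # The terms are just text in the feed; nothing here enforces them.
    terms = channel.findtext("copyright", default="")
    items = [
        {
            "title": item.findtext("title"),
            "snippet": item.findtext("description"),  # teaser, not the article
            "link": item.findtext("link"),            # the funnel back to the site
        }
        for item in channel.findall("item")
    ]
    return terms, items


terms, items = read_items(SAMPLE_FEED)
for entry in items:
    print(entry["title"], "->", entry["link"])
```

Whether the reader ever follows that link is up to the app and its user, which is why the restriction has to live in the Terms of Service rather than in the feed format.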

Looking at some of the arguments in this case from the Internet audience, they confuse the concept of commercial use with that of a commercial product. For example, they ask: aren’t most people consuming this feed on a computer that is a commercial product? Does that make the computer manufacturer liable? If not, the newsreader app should be legal as well. But the computer is not making commercial use of the feed itself; it has no concept of such feeds in its construction, distribution or marketing. The Pulse app, at the other extreme, was making commercial use of the feed directly (on a side note, the app is currently distributed free), and such feeds are the reason for its existence. In reality, though, the content provider has already lost in the Internet court of opinion by having to make such fine distinctions. The technology genie is already out of the bottle.

The problem with such apps for publishers goes beyond whether they are commercial or not. It just so happens that the current laws are easier to enforce when the use is commercial. The broader problem is the role of the publisher in the digital consumption of content.

The aggregation model breaks the traditional business model in many ways. One, it potentially dilutes the brand when its content is mixed with a wide variety of sources. Two, it prevents cross-selling and up-selling of content by separating out the individual bits of content, and the aggregation platform becomes a horizontal delivery mechanism of its own. Three, it prevents publishers from amortizing costs over all content. That amortization is what spares them from having to bet on which pieces are likely to be more popular, or from dropping some things (say, coverage of Somalia) because very few in their audience select such pieces to read. None of these will be obvious to the consumer. It is like making an argument for climate change. Very few see the long-term effects clearly enough to think differently or to modify their behavior.

… [Pulse app] is an excellent piece of work and one that NYT should be embracing. After all, shouldn’t they have come up with something like this themselves?

Ummm… the main attraction of Pulse comes from the aggregation of content from many sources and the convenient organization of that content, not from the fancy touch capabilities. Aggregation is something NYT itself cannot do without shooting itself in the foot. It is but a single source.

… The application, which costs $4, has been downloaded 35,000 times. It was the top paid app for a while….

… The poor old Times has managed to gain 35,000 subscribers in a few weeks, without doing a thing. Those are pretty good numbers, and you’d think that the paper would be happy about this free exposure, especially as readers will click-through direct to its ad-filled pages…

Anyone familiar with the publishing business would smile (and grimace at the same time) at the naivete of this statement. It assumes that everyone who downloaded the app read NYT feeds and clicked through to its pages, and that they would not have seen or visited the NYT otherwise. The reality is quite different. Think about why this app used the New York Times as the primary image when it first came out. It needed NYT a lot more than NYT needed the app to get visitors. There is data to show that almost half of the readers who use aggregator services never actually click through to the detailed articles even if they read the available snippet from that source. The downside for NYT is the loss of branding and creative control over how its content is organized, as the app, rather than its own app or web site, becomes the portal for too many of its readers. It gets worse if it encourages more me-too apps.

While technically it is still an RSS news reader, the app took RSS feeds, which publishers used primarily to give people a heads-up to follow up on, and used them to create a competent and aesthetic digital magazine of its own, one that had the potential to dilute the content owner’s identity, branding and its own potential as a portal destination. In other words, it converted a mechanism that used to be a funnel bringing readers back to a web site into a destination of its own, where the click-through to the site was the exception rather than the rule.

While technology made it feasible, this logical extension of RSS feeds also invalidated the assumptions under which the publisher provided the original feeds as a funnel to its own site. In addition, the iPad is emerging as a vehicle for digital magazines, and NYT may have its own ideas on how its content is made available on that platform via a different business model than its web site. This requires publishers to prevent overflow of content from one channel (web site) to another (digital magazine) so as not to cannibalize it. I have no idea what the NYT reasoning really was, but it would be my bet that the above is behind the complaint.

Careful readers would note the similarity to the argument made for crosswords in earlier posts: content creators need to control the overflow from free syndicated publication to direct delivery of paid content so that they can establish parallel business models as a matter of survival. Aggregators can subvert this.

The incident demonstrates a number of things relevant to crosswords as a part of that digital content in the publisher channel:

Very few people understand what copyrights mean. Even people in the industry who understand copyrights in their own silos seldom understand them in other areas, and even participate in or promote copyright infringement there. For example, crossword producers would recognize and scream against plagiarism of their own creations but hardly notice or care about such violations in the software they may use or recommend. Software vendors are very conscious about plagiarism of their software (I will have some illustrations of this in the future from the crossword industry) and yet don’t care much about violating the copyrights of the content they provide via their software. Apple itself promotes (until challenged) and benefits from apps that may have copyright violations (as does YouTube, for that matter, with videos) with convenient ignorance, but is quick to protect the copyrights on its own products proactively.

There is an uneasy truce between aggregators and content producers based on fair use. This isn’t a problem until the aggregators become parasitic on the business model of the content producer. At that point, very blunt instruments of law get thrown around, hurting everybody and creating ill will. Not all aggregators are parasitic or need to be parasitic. Making this distinction is crucial for the crossword industry and the digital content industry in general. Unfortunately, most aggregators only look at one side of the equation, the consumption side, and become parasitic on the producer side, intentionally or otherwise. Hopefully, some products that look at both sides will come up in the future to set the right examples.

It would be my hope that, with the issues raised in this blog, there is at least awareness of the needs of all participants in the industry rather than the myopic view within each silo that leads to incidents like this. Such incidents only serve to further polarize producers and consumers, each of whom needs the other for mutual benefit.