Is There Really A 'Piracy' Problem For Newspapers?

from the totally-overblown dept

A few years ago, newspapers were all blaming Craigslist for their own business model problems. Then, of course, it became popular to blame Google. However, there's been an odd shift recently, to a claim that the problem is from "pirates" and "parasites." We see this in the AP's sudden desire to DRM the news by tracking how it's used and going after those it feels are using its content unfairly. We see it in the Marburger brothers' plan to put legal pressure on "parasitic aggregators." The problem, as we discussed, is that these parasitic aggregators are few and far between, and the complaints against them just don't ring true at all.

As a "publisher" of sorts, we actually have some experience in this area. As we've noted many times, there are plenty of "parasitic aggregators" (we usually refer to them as "spam blogs") that copy all our content. We track them, just because they tend to show up in searches, and one thing quickly becomes clear: they get little to no traffic at all, and any advertising revenue they bring in has to be close to nil. The average lifespan of such sites is usually about 3 months before they go away, and the argument that they take money away from us is silly. If anyone sees those copied posts, it doesn't take long to figure out that Techdirt is the originator of the content, and from that to learn it's probably easier/faster/better to just read the content here -- plus, by reading it here, they get to take part in the conversation that's actually happening here. The Marburgers admit that any one of these parasitic aggregators might not bring in that much money, but claim that in aggregate (yes, aggregating the aggregators), they represent a substantial loss. Yet, they offer no evidence of that whatsoever, and as a publisher whose content is regularly used in this manner, I've seen no evidence that this is a real problem at all from a revenue standpoint.

Of course, that's the lowest of the "low" on these parasitic aggregators. But the Marburgers define parasitic aggregators to include sites that don't have reporters on the scene, but still have journalists who write up stories based on others' reporting. Oddly, though, the properties they name, such as Newser and The Daily Beast, are both relatively small -- and both try to position themselves as sort of "premium" sites, rather than (as the analysis implies) ones trying to push down CPM ad rates. If these sites are taking away any traffic from major media sites, it's minimal at best, and it's quite unlikely they're really putting any pressure on newspaper ad rates.

It really just seems like a problem that isn't there.

But adding a bit more fuel to the fire, recently, was an article in the NY Times that read more like a press release from a company called Attributor (which has been banging this misguided drum for years), claiming that a recent study found "publishers were losing $250 million a year from unauthorized copying." This number is creating all sorts of questions and controversy. And, it should. Because the number is bunk. Attributor is pretty cagey about how it came up with the numbers, but it involved looking at how many pages were "copied" from 25 major publications and then extrapolating out to other media sites. Even companies that work with Attributor think the claims are ridiculous. On top of that, even if you grant the premise on these "losses," that still represents a tiny amount of money spread across the entire industry.

But, just as with the music industry and its complaints about "piracy," this is yet another case of people falsely declaring sales not made (or, in this case, ad impressions not loaded) as being "losses." The reality is that you don't know if people would have seen the content otherwise. And you don't know if, having viewed the content at one of these other sites, they aren't later convinced to just go directly to the source. Like music "piracy," the issue isn't "parasites" or aggregators "free-riding." The problem is the originating sites not adding enough value to make it worthwhile to visit them, rather than using one of these other (still tiny) sites. I've said it before, and I'll say it again: if you're a publisher, and someone paraphrasing your content is enough to keep people away from your site, you're not doing a very good job adding enough value on your site to get folks to visit.

This is a problem that just doesn't exist. It's being blown way out of proportion. There is no real problem with "parasites" or "pirates" when it comes to news content. It's a distraction, and publications that spend a lot of time or money on it will find that they're taking their eyes off the real issue: providing value to bring in more users and adapting to the new media marketplace.

Restricted bandwidth strikes again

I'll admit to being someone who uses these aggregators when there's a news piece I really want to read. Quite often the major news sites' pages have far too much stuff extraneous to the news that rapidly uses up what little bandwidth I have free to browse with.

Aggregator sites, on the other hand, use up a LOT less bandwidth. Browsing CNet over three articles cost me nearly 5MB in data once, while even an ad-heavy aggregator rarely costs me more than a few hundred kilobytes per news article.

Techdirt's the 'cheapest' in bandwidth terms for me: I can read everything that gets posted here and still be straining to hit more than a couple of MB by the end of a nine-hour shift.

Gotcha Journalism = First Mover Advantage

Doesn't first mover advantage really define ALL the market benefit of "on the ground" journalism? It's not like bloggers and aggregators aren't mentioning and directly linking to the sites where the "original" news broke. The only places I see that offer minuscule to zero links to original sources are the wire services like AP and Reuters, who only seem to name snippet authors and not the publication they usually call home. How can providing greater exposure and more links (which also create higher PageRank) to your ad-laden content ever be a bad thing? I would agree if it were just all mass-plagiarism, but that's not what's happening at all. Quoting and commentary are defined as fair use precisely because they take steps to define the original source and avoid actual plagiarism. I guess this is just more proof that newsmen make terrible businessmen.

I saw a word used somewhere on the Internet the other day; it may have even been on this site in a post somewhere: "anecdata." I thought it was one of those sort of tongue-in-cheek portmanteau words at first - when you don't have any real data about an issue, just tell a few stories instead and you have some "anecdata." Apparently it has been used in this sort of humorous sense in the past. Unfortunately, the person who used it where I read it was completely serious. I'm not sure if he/she intended to write "anecdota" (i.e., a plural form of anecdote) or whether the use was actually serious (i.e., that there was such a thing as "anecdata").

Here, if I have ever seen one, is an opinion brimming with anecdata, on both sides unfortunately. I always find it funny when Techdirt writers bristle about the lack of foundation for the studies of others, but do no studies of their own. The appropriate (and unfortunately labor-intensive) way to "debunk" a bad study is with...a good study. Just pointing and saying "nuh uh!" or "never seen it!" or "well there's just no way that's true" doesn't really sway me. In fact, as sketchy as some of the cited study methodologies may be (and many of them are indeed sketchy), at least there's a methodology there.

I don't find the comparison between Techdirt's spam-blog problem and the larger issues of aggregation in the journalism and newsgathering fields to be very appropriate, for a number of reasons.

First, as Mike frequently and vociferously points out, he is not a journalist and Techdirt is not a news site. He is not going out and gathering any facts, he is merely reproducing and commenting on the facts gathered by others. Or worse, reproducing and commenting on others' incorrect stories. That any given article here seems just as apt (if not more so) to link to an earlier article on this site as to the sites that provided the facts he relies upon raises the question of whether Techdirt is itself a "parasitic aggregator" of the work of others.

Second, the aggregation situation in the world of newsgathering seems to me to be far more complex than simple spam blogs. If I go to Google News right now and type in "Michael Jackson drugs," there is a link where I can see 5,454 articles on the same subject. I am sure that there are not 5,454 reporters out there gathering Michael Jackson news firsthand. How many of these represent (and are monetized by) the people who actually did the original news reporting, and how many of them are rewrites, copies, or blog posts about the original news reporting? Does Google do anything at all to at least prioritize the newsgatherers' versions of the stories up top? I don't have inside knowledge about how their algorithm works, but I doubt it. Do those 5,454 versions of the story link back to the people who gathered and reported the facts originally, so at least they get some publicity and maybe some ad revenue? Again, probably not. Do we need 5,454 versions of the same story? No. Is having 5,454 versions of the story helping the originators, who did the real work of reporting, to get paid so they can do more good work, or is it simply diluting the attention to the point where nobody is getting paid much at all?

Additionally, I don't really have to click on even one of the 5,454 stories to find out what's going on, since Google has helpfully provided the important parts of the story right there for me. The "inverted pyramid style" of reporting lends itself to this particular abuse because it encourages the writer to put all the important stuff right up front, indeed in the first sentence.

Are these issues comparable to Techdirt's spam blog issue? It seems that they're only tangentially related. If 5,454 people were actively rewriting and reposting, on legitimate non-spam sites, everything Mike wrote, with no links back to Techdirt, and Google was happily putting Techdirt at the 1,843rd position for searches pertaining to the topic, then I wonder if the issue would be so insignificant.

Finally, I have quite a bit more sympathy for a real reporter than any opinion blogger when it comes to aggregation. Real, honest-to-goodness research or reporting is hard work. Reading stuff that interests you and then running your mouth about it, well, isn't. I've done plenty of both. If 5,454 people came and rewrote and disseminated this stupid-ass post I'm writing right now without giving me any sort of credit or link, the most I'm going to be is confused, because why would 5,454 people care that much about my opinion, informed or otherwise? They'd be free-riding on something that just isn't worth much and didn't take that much effort to create. If, on the other hand, 5,454 people came and republished my research results or sold pastiches or near-copies of my books, I would be pretty peeved.

Ultimately, we need better reporting, not worse. We need more facts, not more opinions. We need more data, not more anecdotes. We need more fact-checking and less speculation.

Traditional journalism, for sure, fails to meet this standard sometimes (as is pointed out about weekly here on Techdirt). However, I fail to see how glorifying aggregators and simultaneously dogging on and making life harder for the people who DO deal in reporting and facts and data is going to improve the situation.

I always refer to links

"Ultimately, we need better reporting, not worse. We need more facts, not more opinions. We need more data, not more anecdotes. We need more fact-checking and less speculation."

You said it better than I could, Dr. Strange.

In my writing (which is mostly music and audio oriented: album reviews, show reports, industry news, etc.) integrity is very important. It is, after all, my chosen form of artistic expression (aside from some music production). So it pains me when I see people plagiarize a press release instead of putting their own thought into an article (especially ripping off PR people's words in music reviews).

I also work for a couple of music sites that do act like aggregators; the main page is full of summaries - but there is ALWAYS a link to the original article, from the original source. We never take credit for writing the pieces, we just seek them out, especially if somebody else caught it before our staff. WE NEVER CLAIM TO BE THE WRITER IF WE FOUND THE STORY ELSEWHERE.

People that do, well, you're right: They don't make much and they won't last.

the problem is seo

Part of the problem here for newspapers is SEO. Many of them do stupid things like block Google, move articles or take them offline after a couple of days, or have websites that aren't optimized for search engines.

The only reason that these spam sites actually get any traffic at all is because they're able to pick up the slack and rank in the search engines for phrases that the newspapers "should" rank for, but don't because of their own misguided efforts.

I've run a few of those "parasite" sites in the past. (What better way to learn about something than to do it, right?) They're easy to auto-generate, and about 2-3 minutes' worth of work per site can bring in an average of $5-$10/month with almost no effort on the owner's part.

Obviously it's not much income, but when you scale that over a few thousand sites, it can be pretty decent.

Confessions of a Newser user

As a frequent user of Newser.com I have to tell you that I visit far more of the original sites than I ever would without it.

My normal process is to scan all of the "headlines" first thing in the morning and pull up the brief summary of anything interesting. If it is interesting enough, I will go to the source. I rarely look at the longer Newser summary because it never has enough detail to really satisfy.

I have discovered a variety of original content providers through Newser.com that I would have never found otherwise (e.g. Heeb, The Root, or any of the local news sites). Also, I would never take the time to look at some of the stories on the mainstream sites (NY Times, Bloomberg, BBC, etc.) without the Newser "tease".

As an "average" reader with attractive demographics (upper-middle class, educated professional), I find it hard to understand why content providers would have any problem with Newser.com or similar sites.

aggregators

I do a lot of work at the aggregator in my link (we're trying to get it to scale before we open it to the public). We banned AP domains over a year ago when they went after Drudge, and do you know what happened to our public traffic? Nothing. Our dupe algo continued to run fine, and we still source Bloomberg, Reuters, Conde properties, and dozens of independent sites which put up articles, often better than AP's versions, and often faster.

Before today, I had never been to Newser.com. I went just to see why the traditional news industry was making such a fuss over it (and others). I went in with the preconception that it would be a lot like Slashdot for everyday news. Lo and behold, other than look and feel, it was. Many tech sites realized long ago that being mentioned on /. was a good thing. Years ago, (as an example) TomsHardware would see a 30% increase in traffic due to the Slashdot effect. (Granted, now it's apparently less than 10%.) I'm sure Mike can attest to the increased traffic /. brings in whenever one of his posts makes it onto the front page. I don't think it's hard to see that TomsHardware is where it is today partly because of /., and I'm sure that a large number of other tech sites are in the same position.

Of course there is a huge disparity between traditional news and the tech sites that /. links to. The fact that /. works so well for tech sites doesn't mean a similar service will automatically work for news. However, the fact that the news sites are completely dismissing such services tells me that they haven't even considered that it works for someone else.

SOMEONE has to keep the content available, right?

Anecdote ahead: I run a music fan website that's kind of large, but not large enough to have an effect on the sales of periodicals, necessarily. About ten years ago, I realized that just linking to an article was not sufficient - publishers quit the business, or links change, and the content that was so convenient to hyperlink to disappears into the ether.

So that's when I decided to make backup copies of articles we linked to, and made that database accessible to our readers. When we'd post news to our front page, we'd be linking to the official website of the publisher, but we'd make a copy of the text of the article, headline, byline, author, name of the publication, date of the publication, etc.

Because of this, I've been able to capture content - dozens, if not hundreds of articles - that would otherwise have been completely erased due to the closing of companies and organizations. I don't feel bad about copying everything because I insist on attributing the original source, I don't sell ads on my archive, and I don't promote my archived articles ahead of the original articles. They're just there for later.

I realize (or I hope) I am not the kind of organization the AP is targeting with their really silly DRM (kudos to the scum who talked them into buying that. master salesman!) since mine is a very manual process on a fraction of a percent of news, but for some reason I felt the need to share this.

With the plethora of ways Techdirt gives their content away (RSS, e-mail), I still come to the website regularly throughout the day to get my news & analysis.

Though, I use Adblock Plus, because after years of irresponsible banner ad usage & shady advertisers, I don't appreciate or trust the on-line banner ad market anymore. So, sorry about not giving back with ad monies. But, I'll likely buy into the CwF + RtB program this weekend. I support this site, but not banner ads.

Data Sinks

The reason the whiners are losing share isn't Craigslist, Google, or Pirates. It's SUPER INTELLIGENT TALKING MICE! Yes, that's right boys and girls. Super intelligent talking mice are invading households all over the world, sneaking into dens, bedrooms, family rooms and studies, and whispering all there is to know about everything into the ears of everyone who can hear. I'm not making this up.

It doesn't seem like this would impact the reporters who are just trying to earn a magnanimous living, but think about it. Would you read the news again if you already knew the whole story? Of course not. As a result, pageviews, clickthroughs, and ad sales are at an all-time low all across the mainstream media. Sad, but true.

Congress needs to Act Now, and nip this in the bud. NIP IT! Otherwise, before you know it there won't be any alternative to commercial cat litter, hamster cage nesting material, or fish wrap.

"Piracy' Problem For Newspapers"

Wholesome opinion and kind gesture in every published article demonstrate the capacity of a reasonable human being to acknowledge the value in each person: that you are a worthwhile person in more ways than a million.