RIAA wants to issue unlimited takedowns to Google

Data shows that labels aren't using the current system to its limits, though.

If Google really hated piracy, it would let copyright holders supply takedown lists of unlimited length. That's the view of the recording industry, which yesterday issued a blog post from a top anti-piracy executive that blasted Google in the wake of the company's updated Transparency Report tool. That tool made clear that the major music and video copyright holders were not actually using Google's takedown tools to their full extent; indeed, Microsoft dwarfed any other single rightsholder in using Google's takedown system.

In the last month, for instance, the infringement detection company Marketly LLC topped the takedown list with 380,000 takedowns, all on behalf of Microsoft (read our recent interview with the Marketly CEO). The next three spots on the list belong to NBCUniversal (209,000), the British music trade group BPI (193,306), and a company called Takedown Piracy LLC (133,756). No one else on the list even cracks 100,000 takedowns.

The last month of takedown requests for Google's Web search

Music labels have reasons for this. Generating the takedown lists takes real time and money, and it often feels like Whac-A-Mole—take down a link to a song and another identical copy takes its place.

"In a recent one month period, we sent Google, and the site in question, multiple DMCA notices concerning over 300 separate unauthorized copies of the same musical recording owned by one of our member companies," wrote Brad Buckles, executive vice president for anti-piracy at the RIAA, in a blog post yesterday. "Yet that song is still available on that site today, and we reached it via a search result link indexed by Google. This highlights the futility of the exercise: if 'take down' does not mean 'keep down,' then Google’s limitations merely perpetuate the fraud wrought on copyright owners by those who game the system under the DMCA."

Fundamentally, the labels just don't believe they should have to do this much work for so little in the way of results. Google should do more, they argue. Perhaps it could refuse to link to files that have an identical hash to files already taken down; perhaps it could "prioritize" sites like iTunes and Amazon's music store above more dubious destinations.

These are all legitimate concerns to raise about the sometimes futile nature of DMCA takedowns, and they are debatable responses to the situation. They all provide good reasons for the RIAA to not rely too heavily on DMCA takedown notices—but the trade group says that the real problem with Google is that the search giant simply won't let the labels submit enough DMCA takedowns.

Buckles again:

You can’t notify Google about the scope of the problem if it limits the notices it will accept and process through its automated tool. And that is what Google does. On top of the query limitation, Google also limits the number of links we can ask them to remove per day. Google has the resources to allow take downs that would more meaningfully address the piracy problem it recognizes, given that it likely indexes hundreds of millions of links per day. Yet this limitation remains despite requests to remove it.

This complaint permeates the blog post. ("But Google places artificial limits on the number of queries that can be made by a copyright owner... These limits significantly decrease the utility of Google’s take down tool... The number of queries they allow is miniscule... Yet Google has denied requests to remove this barrier to finding the infringements.") But it's a little hard to square with the actual data.

Yes, Google has limits on take downs. In the picture below, note how one of Google's Webmaster tools allows only for 1,000 URLs per copyrighted file, and allows ten such files per notice (for a per-notice total list of 10,000 URLs).
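To put those per-notice limits in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted in this article (1,000 URLs per file, ten files per notice, and the monthly totals above). It assumes every notice is filled to capacity and ignores the separate per-day and query limits the RIAA mentions, whose exact values are not given here. The function names are made up for illustration.

```python
import math

URLS_PER_FILE = 1_000    # per-file URL limit described in the article
FILES_PER_NOTICE = 10    # copyrighted files allowed per notice


def urls_per_notice() -> int:
    """Maximum URLs a single notice can carry under the stated limits."""
    return URLS_PER_FILE * FILES_PER_NOTICE


def notices_needed(total_urls: int) -> int:
    """Best-case number of notices, assuming each one is completely full."""
    return math.ceil(total_urls / urls_per_notice())


print(urls_per_notice())        # 10000
print(notices_needed(380_000))  # 38 (Marketly's monthly total)
print(notices_needed(193_306))  # 20 (BPI's monthly total)
```

Under these assumptions, even the largest monthly total in the report fits in a few dozen full notices.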

Perhaps limits like these are too low to allow major rightsholders to effectively use the tools, yet what stands out is that the labels aren't even using such tools to their fullest extent. Marketly, which apparently consists of a Seattle-based exec team and some workers in India, blows away the movie and music industries. In the last month, the US music labels aren't even in the top three takedown submitters; when they do appear, they come in far below Marketly and even the British music labels.

While takedowns may be expensive or worthless or both, the labels simply aren't using Google's systems even to their current maximums, and they aren't even using them as extensively as other copyright holders. Complaints that this is a central problem with Google just look like a misguided method of bashing Google over broader frustrations with the working—or "not working"—of the DMCA's takedown system.

Sob, cry, think of the starving artists who are not going to get paid because of these illegal links.

OH...Wait. They don't get paid already.

Cry me a river RIAA, you had your chance to update your monopolistic archaic practices, yet you fought the internet every inch of the way. If you had decided to provide your customers the materials in the format they want, when they want and how they want, instead of going on a lawsuit rampage this may not have happened.

I win at Whack-a-mole by not playing. If they made the legit avenues easier and more convenient than finding an illegal copy then the problem solves itself. I think they are going down that road anyway with iTunes and Amazon, removing DRM, etc.

"In a recent one month period, we sent Google, and the site in question, multiple DMCA notices concerning over 300 separate unauthorized copies of the same musical recording owned by one of our member companies," wrote Brad Buckles, executive vice president for anti-piracy at the RIAA, in a blog post yesterday. "Yet that song is still available on that site today, and we reached it via a search result link indexed by Google. This highlights the futility of the exercise: if 'take down' does not mean 'keep down,' then Google’s limitations merely perpetuate the fraud wrought on copyright owners by those who game the system under the DMCA."

So first the RIAA issues a take down notice for the actual infringing file, and then they issue take down notices for the URLs linking to that specific file. Is this how it works? If the file hasn't been taken down off the site yet, is it really surprising that Google is simply going to automatically index it again? Seems like he is complaining that there needs to be tougher ways to treat the symptom without curing the source of the problem.

Oh yeah.... let's remove a phone number from the phonebook so no one could ever call that number again!!! And while we are at it, let's hold the telephone directory publisher liable for having that phone number in it.

Well, I mean, asking Google NOT to reindex something matching a given hash of a legit takedown is not something I consider terribly out of the realm of reasonability.

While I have no specific desire to help the *AA's, reasonable options like this do not seem terribly onerous or unacceptable.

To do that they would basically have to hash every single file on the Internet. Do you know what kind of time and processing that would take? Google would go from indexing daily to indexing once every couple months or so. Talk about slow response times.

I'm confused. If google is just linking to non-google domains, they aren't breaking any laws, are they?

OTOH, if the copyrighted material is actually hosted on google anywhere, then they should take it down after a real person verifies the use is not a protected one (spoofs, fair use, etc.).

If anyone reports a non-infringing file as infringement, then that entity should be cut off for an hour (or a day or a week) from reporting. The bogus reports are breaking the law just as much as hosting copyrighted work without permission. I'll never believe that 100,000+ infringing link submissions daily have been verified. md5sums for similarly named files, I can believe, but when it becomes a random filename, with the same md5sum, then a human needs to listen to at least a few seconds before calling it "infringing."

I must not understand.

Google and the government are not responsible for finding other people's copyright infringement any more than I expect the government to track down and fine someone for borrowing entire articles from my posts here and elsewhere. That is part of the cost of doing business for every copyright holder.

Well, I mean, asking Google NOT to reindex something matching a given hash of a legit takedown is not something I consider terribly out of the realm of reasonability.

While I have no specific desire to help the *AA's, reasonable options like this do not seem terribly onerous or unacceptable.

Because you have no concept of algorithmic complexity.

Google is notoriously secretive about how they crawl, but it doesn't take an engineering degree to understand that checking if you should index something AND THEN indexing it will always take longer than just indexing it.

Let's say, for argument and this is a very, very inaccurate example, that the time to retrieve or insert one record is C. It's not but let's just say about google that it is.

And then let's say that the time to find a record is Cn^2. And then let's say that the time to index a site containing records is Cn.

Adding in the check would make the time to index a page Cn^3.

Meaning google's indexing happens a lot slower and with a lot more searches. Potentially enough searches that google would either need to reduce its indexing (reducing its value to the customers), or increase its searching capacity (increasing its margin cost).

RIAA pays google nothing to do any of this. Nothing. Google is just doing what it is obligated by law to do, so they will do the minimum that the law stipulates.
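For what it's worth, the cost of the lookup itself depends heavily on the data structure. The sketch below is not a description of how Google crawls or indexes anything. It only illustrates that if the digests of taken-down files were already known and kept in a hash set, checking one record against them is a single average constant-time membership test, so filtering n records stays roughly linear in n. The names `digest` and `filter_blocked` are hypothetical, SHA-256 stands in for whatever hash would actually be used, and the sketch assumes the crawler already has the file bytes in hand.

```python
import hashlib


def digest(data: bytes) -> str:
    """Hex digest of the content (SHA-256 chosen here only for illustration)."""
    return hashlib.sha256(data).hexdigest()


def filter_blocked(records, blocked_digests):
    """Keep records whose digest is not in the blocklist.

    `blocked_digests` is a set, so each membership test is O(1) on average
    and the whole pass is linear in the number of records.
    """
    return [r for r in records if digest(r) not in blocked_digests]


# Hypothetical blocklist built from an earlier takedown.
blocked = {digest(b"taken-down file")}

crawl = [b"some page", b"taken-down file", b"another page"]
kept = filter_blocked(crawl, blocked)
print(kept)  # [b'some page', b'another page']
```

The expensive part in this sketch is computing the digest, which requires reading every byte of every file, rather than the lookup.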

So they think Google should do their job for them, while bearing the cost of having a take down system, and they still aren't doing enough? Anything more than zero is a huge favor to the RIAA. Google is not obligated to bend to their wishes. Google is immune to the DMCA due to section 230 third party liability protection. Google could rightly tell them to stuff it, but they are being nice enough to provide a take down notification. They should be thanking Google for cooperating beyond what the law requires. Instead, they're bitching that Google isn't doing enough. The RIAA should just cease to be. They've proven to be nothing but an annoyance and completely fail to accomplish what they set out to do.

Well, I mean, asking Google NOT to reindex something matching a given hash of a legit takedown is not something I consider terribly out of the realm of reasonability.

While I have no specific desire to help the *AA's, reasonable options like this do not seem terribly onerous or unacceptable.

Because you have no concept of algorithmic complexity.

Google is notoriously secretive about how they crawl, but it doesn't take an engineering degree to understand that checking if you should index something AND THEN indexing it will always take longer than just indexing it.

I think you're missing the true problem with hashing a file before adding it to the index: to create an MD5 hash, Google first has to download the entire file. Currently, they do nothing of the sort; they just crawl the HTML *list* of files on TorrentFreak et al. If they then had to download the files via bittorrent, however many GB that might entail, plus run a hashing function over it... forget about ever getting around to building a search index.

There's no way Google would contemplate spending billions of dollars on bandwidth, storage, and CPU to download and compute hashes on the many yottabytes of files available on the Internet.

Article hits it on the head on the summary. RIAA is just reaching for complaints now, because they can't even use their preferred model of "blast takedown requests" as effectively as other groups. They really are bad at everything they do.

Wouldn't allowing the labels unlimited DMCA takedown lists also mean more DMCA takedown challenges, additional expense and personnel for Google, and higher legal-system costs at taxpayers' expense if those takedowns were challenged in court?

Why don't the labels simply contract a company to do it for them like Microsoft did? It does not seem that Microsoft had any problem with filing takedowns with Google.

I guess the RIAA and labels want to keep it all in house, so their dirty little secrets, like their overly inflated numbers, will not be known.

Well, I mean, asking Google NOT to reindex something matching a given hash of a legit takedown is not something I consider terribly out of the realm of reasonability.

While I have no specific desire to help the *AA's, reasonable options like this do not seem terribly onerous or unacceptable.

Because you have no concept of algorithmic complexity.

Google is notoriously secretive about how they crawl, but it doesn't take an engineering degree to understand that checking if you should index something AND THEN indexing it will always take longer than just indexing it.

Let's say, for argument and this is a very, very inaccurate example, that the time to retrieve or insert one record is C. It's not but let's just say about google that it is.

And then let's say that the time to find a record is Cn^2. And then let's say that the time to index a site containing records is Cn.

Adding in the check would make the time to index a page Cn^3.

Meaning google's indexing happens a lot slower and with a lot more searches. Potentially enough searches that google would either need to reduce its indexing (reducing its value to the customers), or increase its searching capacity (increasing its margin cost).

RIAA pays google nothing to do any of this. Nothing. Google is just doing what it is obligated by law to do, so they will do the minimum that the law stipulates.

Epic naive-CS troll... well played!! I'm sure you'll get some more bites on it from CS 101 flunkies.

I think you're missing the true problem with hashing a file before adding it to the index: to create an MD5 hash, Google first has to download the entire file. Currently, they do nothing of the sort; they just crawl the HTML *list* of files on TorrentFreak et al. If they then had to download the files via bittorrent, however many GB that might entail, plus run a hashing function over it... forget about ever getting around to building a search index.

That would also make Google the mother of all infringers themselves. But let's say Google gets to be an exception to the law... you think they'd be willing to seed?

Well, I mean, asking Google NOT to reindex something matching a given hash of a legit takedown is not something I consider terribly out of the realm of reasonability.

While I have no specific desire to help the *AA's, reasonable options like this do not seem terribly onerous or unacceptable.

Oh and putting the file into a ZIP or RAR archive won't change the hash right? Seems like you could easily write a tool to inject a known set of bytes to create a wholly new hash for the same file and then just have a remove extra bytes tool on the other end to decode out the original file. Whack-a-Mole-2.

As I've mentioned before, it doesn't seem right to expand DMCA take down powers until there are REAL penalties for filing bad take downs. I have no (serious) personal issue with the take down system for protecting rights holders. But the number of malicious or bad-faith take downs, and of take downs of material the filer doesn't hold the rights to, just keeps going up every day.

How about a three-strikes rule for take downs? Once you've issued three bad take downs you get locked out of the system for a year. If not that, then a fine for every bad take down request. (Money split between the processing company and the actual rights holder.) Then give the **AAs unlimited take down requests and see what happens.
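A three-strikes rule like that is simple to express in code. The Python sketch below is purely illustrative: the `Submitter` class, its fields, and the choice to reset the strike count once a lockout begins are all assumptions, and how a takedown gets judged "bad" in the first place is left out entirely.

```python
from datetime import datetime, timedelta

MAX_STRIKES = 3                  # bad takedowns allowed before a lockout
LOCKOUT = timedelta(days=365)    # "locked out of the system for a year"


class Submitter:
    """Hypothetical record of one takedown submitter's standing."""

    def __init__(self, name: str):
        self.name = name
        self.strikes = 0
        self.locked_until = None

    def record_bad_takedown(self, now: datetime) -> None:
        """Count one bad takedown; the third one starts a year-long lockout."""
        self.strikes += 1
        if self.strikes >= MAX_STRIKES:
            self.locked_until = now + LOCKOUT
            self.strikes = 0  # assumption: the count restarts after a lockout

    def may_submit(self, now: datetime) -> bool:
        """True unless the submitter is currently inside a lockout window."""
        return self.locked_until is None or now >= self.locked_until


# Example: the third bad takedown triggers the lockout.
start = datetime(2012, 6, 1)
s = Submitter("example-submitter")
for _ in range(3):
    s.record_bad_takedown(start)
print(s.may_submit(start))                        # False
print(s.may_submit(start + timedelta(days=366)))  # True
```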

I think you're missing the true problem with hashing a file before adding it to the index: to create an MD5 hash, Google first has to download the entire file. Currently, they do nothing of the sort; they just crawl the HTML *list* of files on TorrentFreak et al. If they then had to download the files via bittorrent, however many GB that might entail, plus run a hashing function over it... forget about ever getting around to building a search index.

All that and it STILL won't do shit.

Because all the pirates have to do is upload a trivially-different version of the file. Add a millisecond of silence to the end of the song? New hash. Add a single character to the metadata? New hash. Flip a single damn bit anywhere in the file? New hash.
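That claim is easy to check. In the Python sketch below, the byte string merely stands in for a real audio file and SHA-256 stands in for whatever hash a filter might use (the same holds for MD5). Flipping one bit or appending one byte produces a completely different digest.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


# Stand-in for the contents of a music file (made-up bytes).
original = b"\x00\x01\x02 pretend this is audio data"

# Variant 1: flip the lowest bit of the first byte.
flipped = bytearray(original)
flipped[0] ^= 0x01

# Variant 2: append a single padding byte to the end.
padded = original + b"\x00"

print(sha256_hex(original))
print(sha256_hex(bytes(flipped)))
print(sha256_hex(padded))
print(sha256_hex(original) == sha256_hex(bytes(flipped)))  # False
```

An exact-match blocklist keyed on the first digest would match neither variant.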

Google wouldn't just be spending billions on bandwidth, processing, and storage -- they'd be spending it on an antipiracy measure that is trivial to circumvent.

(Or can we all just take it as read that "antipiracy measure that is trivial to circumvent" is a tautology at this point?)

Anyhow, I have a solution:

Let the RIAA issue an unlimited number of takedowns.

And make the penalty for a fraudulent takedown notice the same amount of money the RIAA claims as per-song damages from piracy.

Well, I mean, asking Google NOT to reindex something matching a given hash of a legit takedown is not something I consider terribly out of the realm of reasonability.

While I have no specific desire to help the *AA's, reasonable options like this do not seem terribly onerous or unacceptable.

To do that they would basically have to hash every single file on the Internet. Do you know what kind of time and processing that would take? Google would go from indexing daily to indexing once every couple months or so. Talk about slow response times.

That is how the entertainment execs want it, as that was how it was before the net. If you did not kiss their pinky ring and beg for the privilege to publish, you were a heretic to be burned at the stake.

I win at Whack-a-mole by not playing. If they made the legit avenues easier and more convenient than finding an illegal copy then the problem solves itself. I think they are going down that road anyway with iTunes and Amazon, removing DRM, etc.

How hard is it to go to Amazon.com, search for your song, and click one button? It is that easy if you already have an amazon account.

Well, I mean, asking Google NOT to reindex something matching a given hash of a legit takedown is not something I consider terribly out of the realm of reasonability.

While I have no specific desire to help the *AA's, reasonable options like this do not seem terribly onerous or unacceptable.

Because you have no concept of algorithmic complexity.

Google is notoriously secretive about how they crawl, but it doesn't take an engineering degree to understand that checking if you should index something AND THEN indexing it will always take longer than just indexing it.

Let's say, for argument and this is a very, very inaccurate example, that the time to retrieve or insert one record is C. It's not but let's just say about google that it is.

And then let's say that the time to find a record is Cn^2. And then let's say that the time to index a site containing records is Cn.

Adding in the check would make the time to index a page Cn^3.

Meaning google's indexing happens a lot slower and with a lot more searches. Potentially enough searches that google would either need to reduce its indexing (reducing its value to the customers), or increase its searching capacity (increasing its margin cost).

RIAA pays google nothing to do any of this. Nothing. Google is just doing what it is obligated by law to do, so they will do the minimum that the law stipulates.

It's also beyond trivial to defeat the "match a given hash" on a webpage with something like <!--#config timefmt="--" --> <!--#echo var="DATE_LOCAL" --> and any number of trivial methods, but very nontrivial to remove trivially added salt without stripping the content to get "a simple hash". Google doesn't do something stupid like that because it would work for about a day (if that) till people realized & trivially coded around it. The SSI DATE_LOCAL isn't even the best method, just an obvious one.

Well, I mean, asking Google NOT to reindex something matching a given hash of a legit takedown is not something I consider terribly out of the realm of reasonability.

While I have no specific desire to help the *AA's, reasonable options like this do not seem terribly onerous or unacceptable.

This is actually COMPLETELY unreasonable. Infringement is not solely the existence of something, it's the unauthorized use of it. This means that it must be evaluated in the context of who is making it available. You can't just match the hash with any other file you find because you may end up removing a copy that was shared legally.

I win at Whack-a-mole by not playing. If they made the legit avenues easier and more convenient than finding an illegal copy then the problem solves itself. I think they are going down that road anyway with iTunes and Amazon, removing DRM, etc.

Please stop with this nonsense, legal content has never been easier to find than it is now.

I win at Whack-a-mole by not playing. If they made the legit avenues easier and more convenient than finding an illegal copy then the problem solves itself. I think they are going down that road anyway with iTunes and Amazon, removing DRM, etc.

Please stop with this nonsense, legal content has never been easier to find than it is now.

This is not nonsense. Just because it's easier now than it has been does not mean the entertainment industry didn't totally f**k up the transition. They still have not adapted completely, but they have made progress.

I win at Whack-a-mole by not playing. If they made the legit avenues easier and more convenient than finding an illegal copy then the problem solves itself. I think they are going down that road anyway with iTunes and Amazon, removing DRM, etc.

Please stop with this nonsense, legal content has never been easier to find than it is now.

I guess I wasn't clear. I think that the rise of iTunes and Amazon's MP3 service has made it easy enough to make it more attractive than pirating, especially once they removed DRM. The RIAA shouldn't waste time and money on this because these are the people who won't buy it anyway.