Thanks To Copyright, We Already Know How Aggressive Content Moderation Works: And It's A Disaster

from the lessons-learned dept

One of the reasons why I'm so adamant about the negative impacts on free speech from making internet platforms liable for the speech of their users, or even just by pushing for greater and greater content moderation by those platforms, is that this is not a theoretical situation where we have no idea how things will play out. We have reams and reams of evidence, just in the US alone (and plenty outside of the US) by looking at the copyright world. Remember, while CDA 230 makes platforms immune from a civil lawsuit regarding content posted by users or for any moderation choices regarding that content, it exempts intellectual property law (and criminal law). On the copyright side, we have a different regime: DMCA 512. Whereas CDA 230 creates a broad immunity, DMCA 512 creates a narrower "safe harbor" where sites need to meet a list of criteria in order to be eligible for the safe harbor, and those criteria keep getting litigated over and over again. Thus, we have quite a clear parallel universe to look at concerning what happens when you make a platform liable for speech -- even if you include conditions and safe harbors.

And it's not good.

Corynne McSherry, EFF's legal director, has put together an excellent list of five lessons on content moderation from the copyright wars that should be required reading for anyone looking to upset the apple cart of CDA 230. As someone who lives and breathes this stuff every day, I find it incredible how frequently we hear from people who haven't looked at what happened with copyright, and who seem to think that the DMCA's regime is a perfect example of what we should replace CDA 230 with. But... that's only because they have no idea what a complete and total clusterfuck the copyright/DMCA world has been and remains. Let's dig into the lessons:

1. Mistakes will be made—lots of them

The DMCA’s takedown system offers huge incentives to service providers that take down content when they get a notice of infringement. Given the incentives of the DMCA safe harbors, service providers will usually respond to a DMCA takedown notice by quickly removing the challenged content. Thus, by simply sending an email or filling out a web form, a copyright owner (or, for that matter, anyone who wishes to remove speech for whatever reason) can take content offline.

Many takedowns target clearly infringing content. But there is ample evidence that rightsholders and others abuse this power on a regular basis—either deliberately or because they have not bothered to learn enough about copyright law to determine whether the content they object to is unlawful. At EFF, we’ve been documenting improper takedowns for many years, and highlight particularly egregious ones in our Takedown Hall of Shame.

There are all different kinds of "mistakes" embedded here as well. One kind is the truly accidental mistake, which happens quite frequently when you're dealing with billions of pieces of content. Even if those mistakes only happen 0.01% of the time, you still end up with a ton of content taken down that should remain up. That's a problem.
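
To put rough numbers on that, here is a back-of-the-envelope sketch. The volumes are hypothetical, chosen only to show scale, and are not figures from any platform. The function name `expected_wrongful_takedowns` is made up for this illustration.

```python
def expected_wrongful_takedowns(items_reviewed: int, error_rate: float) -> float:
    """Expected number of items wrongly removed at a given per-item error rate."""
    return items_reviewed * error_rate


# 0.01% expressed as a fraction.
ERROR_RATE = 0.0001

# Hypothetical volumes (assumptions for illustration only).
for items in (1_000_000_000, 5_000_000_000, 10_000_000_000):
    wrongful = expected_wrongful_takedowns(items, ERROR_RATE)
    print(f"{items:>14,} items -> roughly {wrongful:,.0f} wrongful takedowns")
```

Even at a rate that sounds negligible, a billion items works out to something on the order of a hundred thousand pieces of content removed by accident.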

But the larger issue, as Corynne points out, is abuse. If you give someone a tool to take down content, guess what they do with it? They take down content with it. They often don't care -- even a little bit -- that they're using the DMCA for non-copyright purposes. They just want the content to come down for whatever reasons, and the DMCA provides the tool. Imagine that, but on steroids, in other contexts.

2. Robots aren’t the answer

Rightsholders and platforms looking to police infringement at scale often place their hopes in automated processes. Unfortunately, such processes regularly backfire.

For example, YouTube’s Content ID system works by having people upload their content into a database maintained by YouTube. New uploads are compared to what’s in the database and when the algorithm detects a match, copyright holders are informed. They can then make a claim, forcing it to be taken down, or they can simply opt to make money from ads put on the video.

But the system fails regularly. In 2015, for example, Sebastien Tomczak uploaded a ten-hour video of white noise. A few years later, as a result of YouTube’s Content ID system, a series of copyright claims were made against Tomczak’s video. Five different claims were filed on sound that Tomczak created himself. Although the claimants didn’t force Tomczak’s video to be taken down they all opted to monetize it instead. In other words, ads on the ten-hour video could generate revenue for those claiming copyright on the static.

Again, this has been an incredibly popular claim -- most frequently from people who are totally ignorant either of how technology works (the "well, you nerds can nerd harder" kind of approach) or of what a complete clusterfuck existing solutions have been. ContentID is absolutely terrible and has a terrible error rate, and it is, by far, the best solution out there.

And that's just in the narrow category of copyright filtering. If you expect algorithms to be able to detect much more amorphous concepts like bullying, terrorist content, hate speech and more... you're asking for much higher error rates... and a lot of unnecessary censorship.

3. Platforms must invest in transparency and robust, rapid, appeals processes

With the above in mind, every proposal and process for takedown should include a corollary plan for restoration. Here, too, copyright law and practice can be instructive. The DMCA has a counternotice provision, which allows a user who has been improperly accused of infringement to challenge the takedown and, if the sender doesn’t go to court, the platform can restore the content without fear of liability. But the counternotice process is pretty flawed: it can be intimidating and confusing, it does little good where the content in question will be stale in two weeks, and platforms are often even slower to restore challenged material. One additional problem with counter-notices, particularly in the early days of the DMCA, was that users struggled to discover who was complaining, and the precise nature of the complaint.

The number of requests, who is making them, and how absurd they can get has been highlighted in company transparency reports. Transparency reports can both highlight extreme instances of abuse—such as in Automattic’s Hall of Shame—or share aggregate numbers. The former is a reminder that there is no ceiling to how rightsholders can abuse the DMCA. The latter shows trends useful for policymaking. For example, Twitter’s latest report shows a 38 percent uptick in takedowns since the last report and that 154,106 accounts have been affected by takedown notices. It’s valuable data to have to evaluate the effect of the DMCA, data we also need to see what effects “community standards” would have.

Equally important is transparency about specific takedown demands, so users who are hit with those takedowns can understand who is complaining, about what. For example, a remix artist might include multiple clips in a single video, believing they are protected fair uses. Knowing the nature of the complaint can help her revisit her fair use analysis, and decide whether to fight back.

Another point on this is that, incredibly, those who support greater content policing are frequently very much against greater transparency. In the copyright space, the legacy companies have often hit back against transparency and complained about things like the Lumen Database and the fact that Google publishes details of takedown demands in its transparency reports -- with bogus claims about how such transparency only helps infringers. I fully expect that if we gut CDA 230, we'll see similarly silly arguments against transparency.

4. Abuse should lead to real consequences

Congress knew that Section 512’s powerful incentives could result in lawful material being censored from the Internet without prior judicial scrutiny. To inhibit abuse, Congress made sure that the DMCA included a series of checks and balances, including Section 512(f), which gives users the ability to hold rightsholders accountable if they send a DMCA notice in bad faith.

In practice, however, Section 512(f) has not done nearly enough to curb abuse. Part of the problem is that the Ninth Circuit Court of Appeals has suggested that the person whose speech was taken down must prove to a jury the subjective belief of the censor—a standard that will be all but impossible for most to meet, particularly if they lack the deep pockets necessary to litigate the question. As one federal judge noted, the Ninth Circuit’s “construction eviscerates § 512(f) and leaves it toothless against frivolous takedown notices.” For example, some rightsholders unreasonably believe that virtually all uses of copyrighted works must be licensed. If they are going to wield copyright law like a sword, they should at least be required to understand the weapon.

This gets back to the very first point. One of the reasons why the DMCA gets so widely abused any time anyone wants to take down anything is that there's basically zero penalty for filing a bogus report. There is merely the theoretical penalty found in 512(f), which the courts have effectively killed off in most (hopefully not all) cases. But, again, it's even worse than it seems. Even if there were a better version of 512(f) in place, how many people really want to go through the process of suing someone just because they took down their cat videos or whatever? In many cases, the risk is exceptionally low because even if there were a possible punishment, no one would even pursue it. So if we're going to set up a system that can be abused -- as any of these systems can be -- there should be at least some weight and substance behind dealing with abusive attempts to censor content.

5. Speech regulators will never be satisfied with voluntary efforts

Platforms may think that if they “voluntarily” embrace the role of speech police, governments and private groups will back off and they can escape regulation. As Professor Margot Kaminski observed in connection with the last major effort to push through new copyright enforcement mechanisms, voluntary efforts never satisfy speech censors:

Over the past two decades, the United States has established one of the harshest systems of copyright enforcement in the world. Our domestic copyright law has become broader (it covers more topics), deeper (it lasts for a longer time), and more severe (the punishments for infringement have been getting worse).

… We guarantee large monetary awards against infringers, with no showing of actual harm. We effectively require websites to cooperate with rights-holders to take down material, without requiring proof that it's infringing in court. And our criminal copyright law has such a low threshold that it criminalizes the behavior of most people online, instead of targeting infringement on a true commercial scale.

In addition, as noted, the large platforms adopted a number of mechanisms to make it easier for rightsholders to go after allegedly infringing activities. But none of these legal policing mechanisms have stopped major content holders from complaining, vociferously, that they need new ways to force Silicon Valley to be copyright police. Instead, so-called "voluntary efforts" end up serving as a basis for regulation. Witness, for example, the battle to require companies to adopt filtering technologies across the board in the EU, free speech concerns be damned.

Over and over we've tried to point this out to the platforms that have bent over backwards to please Hollywood. It's never, ever enough. Even when the platforms basically do exactly what was asked of them, those who wish to censor the networks always, always, always ask for more. That's exactly what will happen if we gut CDA 230.

Again, it's worrisome that the debates over changing intermediary liability rules don't seem to even acknowledge the vast amount of empirical evidence we have in the copyright context. And that's maddening, given that the copyright world shows just how broken the system can become.

Reader Comments

But not at all surprising

Again, it's worrisome that the debates over changing intermediary liability rules don't seem to even acknowledge the vast amount of empirical evidence we have in the copyright context.

Much like the 'It is difficult to get a man to understand something...' quote with regards to someone's job, it is hardly surprising that those pushing for ever more stringent laws and control ignore the historical evidence showing just how bad their ideas are, and how damaging they stand to be.

Acknowledging those points would severely undermine their arguments, so of course they'd ignore them and pretend that it's all 'hypothetical' and 'fear-mongering over nothing'. To do otherwise would be highly counter-productive to the narrative they're trying to spin about how, with just a few harmless tweaks to the law, things would be ever so much better for everyone.

Because

_it is hardly surprising that those pushing for ever more stringent laws and control ignore the historical evidence showing just how bad their ideas are, and how damaging they stand to be._

Because the people pushing this aren't the ones who will be damaged. And because the people pushing for more restrictive laws have more money than the people who would benefit from more open laws. (The Common Good is underfunded.)

Re: Hey guys, you kinda messed up something in your quote block...

Why is it that all discussion of copyright laws and enforcement is being driven by those who buy a minute amount of copyrighted works to be published, and who benefit most from those works, which were created by others?

Why should the commercial interests of the legacy labels, studios and publishers be allowed to eliminate the possibility for individual creators to make money from their works, without using the legacy industries' services?

A sensible examination of copyright requirements would ask what copyright rules maximize the output of new works, rather than what rules maximize the ability of the legacy players to maximize their profits, and their control over markets.

Re: Re: "what copyright rules maximize the output of new works"?

I gather from the rest you wrote that what you actually want is some way to legalize copytheft so you can enjoy for free the work of others.

Why do you assume that, when I explicitly stated that the push to strengthen copyright law is damaging the ability of many creators to make money, and trying to limit the ability to make money to the publishers, labels and studios who are notorious for short changing the artists whose works they gain control over?

Re:

Why should the commercial interests of the legacy labels, studios and publishers be allowed to eliminate the possibility for individual creators to make money from their works, without using the legacy industries' services?

Because, as disgusting as it is, from the view of politicians they're the ones throwing the money around at the politicians, and so long as they get their 'donations' now then the public can hang for all they care.

A sensible examination of copyright requirements would ask what copyright rules maximize the output of new works, rather than what rules maximize the ability of the legacy players to maximize their profits, and their control over markets.

Mostly true, though I'd also add in 'and ensures that said works make it into the public domain as soon as possible', as the entire point of copyright law is to serve the public, and a robust and ever expanding public domain (which leads to more private works, which feed into the public domain, which leads...) is the best means to do that.