Erroneous DMCA notices and copyright enforcement, part deux

A few weeks ago, I wrote about a deluge of DMCA notices and pre-settlement letters that CoralCDN experienced in late August. This article actually received a bit of press, including MediaPost, ArsTechnica, TechDirt, and, very recently, Slashdot. I’m glad that my own experience was able to shed some light on the more insidious practices that are still going on under the umbrella of copyright enforcement. More transparency is especially important at this time, given the current debate over the Anti-Counterfeiting Trade Agreement.

Given this discussion, I wanted to write a short follow-on to my previous post.

The VPA drops Nexicon

First and foremost, I was contacted by the founder of the Video Protection Alliance not long after this story broke. I was informed that the VPA has not actually developed its own technology to discover users who are actively uploading or downloading copyrighted material, but rather contracts out this role to Nexicon. (You can find a comment from Nexicon’s CTO to my previous article here.) As I was told, the VPA was contracted by certain content publishers to help reduce copyright infringement of (largely adult) content. The VPA in turn contracted Nexicon to find IP addresses that are participating in BitTorrent swarms of those specified movies. Using the IP addresses given them by Nexicon, the VPA subsequently would send pre-settlement letters to the network providers of those addresses.

The VPA’s founder also assured me that their main goal was to reduce infringement, as opposed to collecting pre-settlement money. (And that users had been let off with only a warning, or, in the cases where infringement might have been due to an open wireless network, informed how to secure their wireless network.) He also expressed surprise that there were false positives in the addresses given to them (beyond said open wireless), especially to the extent that appropriate verification was lacking. Given this new knowledge, he stated that the VPA dropped their use of Nexicon’s technology.

BitTorrent and Proxies

Second, I should clarify my claims about BitTorrent’s usefulness with an on-path proxy. While it is true that the address registered with the BitTorrent tracker is not usable, peers connecting from behind a proxy can still download content from other addresses learned from the tracker. If their requests to those addresses are optimistically unchoked, they have the opportunity to even engage in incentivized bilateral exchange. Furthermore, the use of DHT- and gossip-based discovery with other peers—the latter is termed PEX, for Peer EXchange, in BitTorrent—allows their real address to be learned by others. Thus, through these more modern discovery means, other peers may initiate connections to them, further increasing the opportunity for tit-for-tat exchanges.

Some readers also pointed out that there is good reason why BitTorrent trackers do not just accept any IP address communicated to them via an HTTP query string, but rather use the end-point IP address of the TCP connection. Namely, any HTTP query parameter can be spoofed, leading to anybody being able to add another’s IP address to the tracker list. That would make the victim susceptible to receiving DMCA complaints, just as we experienced with CoralCDN. From a more technical perspective, the victim’s machine would also start receiving unsolicited TCP connection requests from other BitTorrent peers, an easy DoS amplification attack.

That said, there are some additional checks that BitTorrent trackers could do. For example, if the IP query string or X-Forwarded-For HTTP headers are present, only add the network IP address if it matches the query string or X-Forwarded-For headers. Additionally, some BitTorrent tracker operators have mentioned that they have certain IP addresses whitelisted as trusted proxies; in those cases, the X-Forwarded-For address is used already. Otherwise, I don’t see a good reason (plausible deniability aside) for recording an IP address that is known to be likely incorrect.
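As a rough illustration of the checks described above, here is a minimal Python sketch of how a tracker might decide which address to record. The function name, the TRUSTED_PROXIES allow-list, and its contents are assumptions made for this example; they are not taken from any real tracker implementation.

```python
import ipaddress

# Hypothetical allow-list of proxies the tracker operator trusts.
TRUSTED_PROXIES = {ipaddress.ip_address("192.0.2.10")}


def address_to_register(socket_ip, query_ip=None, forwarded_for=None):
    """Return the peer address to record, or None to record nothing."""
    source = ipaddress.ip_address(socket_ip)
    hints = []
    if query_ip:
        hints.append(query_ip)
    if forwarded_for:
        # X-Forwarded-For may carry a comma-separated chain; by convention
        # the first entry is the address reported for the original client.
        hints.append(forwarded_for.split(",")[0])
    try:
        claimed = [ipaddress.ip_address(h.strip()) for h in hints]
    except ValueError:
        return None  # malformed hint: do not guess
    if not claimed:
        return str(source)  # no hints at all: use the TCP endpoint
    if forwarded_for and source in TRUSTED_PROXIES:
        return str(claimed[-1])  # trusted proxy: believe its header
    if all(addr == source for addr in claimed):
        return str(source)  # hints agree with the TCP endpoint
    return None  # mismatch: the endpoint is probably a proxy, skip it
```

The design choice here is to record nothing when the hints disagree with the TCP endpoint, rather than record an address that is likely to be wrong.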

Best Practices for Online Technical Copyright Enforcement

Finally, my article pointed out a strategy that I clearly thought was insufficient for copyright enforcement: simply crawling a BitTorrent tracker for a list of registered IP addresses, and issuing an infringement notice to each IP address. I’ll add to that two other approaches that I think are insufficient, unethical, or illegal (or all three), yet have been bandied about as possible solutions.

Wiretapping: It has been suggested that network providers can perform deep-packet inspection (DPI) on their customers’ traffic in order to detect copyrighted content. This approach probably breaks a number of laws (either in the U.S. or elsewhere), creates a dangerous precedent and existing infrastructure for far-flung Internet surveillance, and yet is of dubious benefit given the move to encrypted communication by file-sharing software.

Spyware: By surreptitiously installing spyware/malware on end-hosts, one could scan a user’s local disk in order to detect the existence of potentially copyrighted material. This practice has even worse legal and ethical implications than network-level wiretapping, and yet politicians such as Senator Orrin Hatch (Utah) have gone as far as declaring that infringers’ computers should be destroyed. And it opens users up to the real danger that their computers or information could be misused by others; witness, for example, the security weaknesses of China’s Green Dam software.

So, if one starts from the position that copyrights are valid and should be enforceable—some dispute this—what would you like to see as best practices for copyright enforcement?

The approach taken by DRM is to try to build a technical framework that restricts users’ ability to share content or to consume it in a proscribed manner. But DRM has been largely disliked by end-users, mostly in the way it creates a poor user experience and interferes with expected rights (under fair-use doctrine). But DRM is a misleading argument, as copyright infringement notices are needed precisely after “unprotected” content has already flown the coop.

So I’ll start with two properties that I would want all enforcement agencies to take when issuing DMCA take-down notices. Let’s restrict this consideration to complaints about “whole” content (e.g., entire movies), as opposed to those DMCA challenges over sampled or remixed content, which is a legal debate.

For any end client suspected of file-sharing, one MUST verify that the client was actually uploading or downloading content, AND that the content corresponded to a valid portion of a copyrighted file. In BitTorrent, this might be that the client sends or receives a complete file block, and that the file block hashes to the correct value specified in the .torrent file.
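To make the hash check concrete, here is a minimal Python sketch. In the original BitTorrent metainfo format, the "pieces" value is a concatenation of 20-byte SHA-1 digests, one per piece, so it is a complete piece (rather than a smaller wire-level block) that can be checked against the .torrent file. The function names are made up for this example.

```python
import hashlib

SHA1_LEN = 20  # each piece digest in the "pieces" value is 20 bytes


def split_piece_hashes(pieces_field: bytes):
    """Split the concatenated digests from a .torrent into a list."""
    if len(pieces_field) % SHA1_LEN != 0:
        raise ValueError("pieces field is not a multiple of 20 bytes")
    return [pieces_field[i:i + SHA1_LEN]
            for i in range(0, len(pieces_field), SHA1_LEN)]


def piece_matches(piece_data: bytes, index: int, pieces_field: bytes) -> bool:
    """True if the transferred piece hashes to the value in the .torrent."""
    hashes = split_piece_hashes(pieces_field)
    if not 0 <= index < len(hashes):
        return False
    return hashlib.sha1(piece_data).digest() == hashes[index]
```

A match shows only that the bytes belong to the content the .torrent describes; it says nothing by itself about what that content is.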

When issuing a DMCA take-down notice, the request MUST be accompanied by logged information that shows (a) the client’s IP:port network address engaged in content transfer (e.g., a record of a TCP flow); (b) the actual application request/response that was acted upon (e.g., BitTorrent-level logs); and (c) that the transferred content corresponds to a valid file block (e.g., a BitTorrent hash).
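One way to picture the logged information in (a) through (c) is as a single record attached to each notice. The following Python sketch is only an illustration: the class name, the field names, and the sufficiency check are assumptions for this example, not an established evidence format.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class TransferEvidence:
    observed_at: str        # ISO 8601 timestamp of the observation
    peer_ip: str            # (a) network address of the client
    peer_port: int
    info_hash_hex: str      # identifies the torrent in question
    wire_log: List[str]     # (b) application-level messages exchanged
    piece_index: int        # (c) which piece was transferred
    observed_sha1_hex: str  #     digest of the bytes actually seen
    expected_sha1_hex: str  #     digest listed in the .torrent file

    def is_sufficient(self) -> bool:
        """Check that all three kinds of evidence are present."""
        has_flow = bool(self.peer_ip) and 0 < self.peer_port < 65536
        has_app_log = len(self.wire_log) > 0
        hash_ok = (self.observed_sha1_hex.lower()
                   == self.expected_sha1_hex.lower())
        return has_flow and has_app_log and hash_ok

    def to_json(self) -> str:
        """Serialize the record so it can accompany a notice."""
        return json.dumps(asdict(self), sort_keys=True)
```

A notice generator could refuse to send anything for which is_sufficient() returns False.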

So my question to the readers: What would you add to or remove from this list? With what other approaches do you think copyright enforcement should be performed or incentivized?

Comments

The post correctly recognizes that “wiretapping” and “spyware” raise serious legal and ethical questions. What continues to concern me is that when it comes to copyright infringement, the concern for due process seems to simply evaporate. So here we have a situation where terrorists intending to destroy the US have greater legal protections than a person who infringes!

To phrase my position a bit differently, we seem to accept the premise that a content creator can reach out and “trespass” on anyone for the sake of protecting their content. By implication then, I would have a universal unquestionable right to break into anyone’s house, on a whim, just to check if they may have taken something from me. While infringement may be deplorable, the content holders should not have a supreme right to overrule (abuse) the rights of others.

An issue that periodically surfaces is that some content is in the public domain. A corollary to that is that copyright will expire and that content will enter the public domain. How do we add the concept of identifying content that is in the public domain?

We already have seen cases where some assert a copyright privilege that they don’t even possess.

While this post asks for thoughts on “copyright enforcement”, I think that those who abuse their copyright privilege should be provided with negative incentives, such as payments (fines) to those wrongfully “accused”.

As it stands now, accusations can be made with little repercussion. Power that is not realistically restrained will result in corruption and abuse.

As I wrote, it occurred to me that we are using the current copyright regime as the “playing field”. With that in mind, one of the “enforcement” questions really relates back to reducing the perceived “rights” that the content holders seem to believe that they have. (Please note that copyright has become “stronger” over time, so why should the existing situation be considered correct?) Why should I help with formulating an “enforcement” strategy based on existing onerous laws? For example, we can reduce the enforcement burden by eliminating concepts such as region coding and allowing users who bought (yes, bought) a CD to copy the content to another device as legitimate fair use.

Corollary? It’s not a given that copyright will expire. The 105th Congress passed both the DMCA and the Copyright Term Extension Act by unanimous consent within weeks of each other, and the Supreme Court upheld serial extension of the copyright term in Eldred v. Ashcroft.

On the legal side, I’d like to see stricter standards for verification of copyright complaints, and harsher, more readily applied penalties for those who file incorrect complaints. Let’s balance the stakes so it’s not so easy to make ungrounded complaints.

The DMCA currently lets someone complain about erroneous takedown (under sec 512(f)), but requires them to show “knowing material misrepresentation” — a difficult standard to meet, though we did so in the Diebold case. Thus a copyright claimant can be careless without penalty, as long as he didn’t know he was wrong. The DMCA provides for costs and attorneys’ fees to the person who can prove that knowing misuse, but no further penalty. As a result, not enough people complain about DMCA misuse or impose costs to deter it.

Allowing people to sue for “material misrepresentation” that caused them harm (removing the “knowing” element), and offering a penalty including statutory damages (why not $200-$150,000, to balance the statutory damages for infringement?) would enable more people to bring claims for misuse, and perhaps cause claimants to investigate more carefully before filing copyright claims.

“(why not $200-$150,000, to balance the statutory damages for infringement?)”

I think the fine should be multiplied by the relative difference between the wrongly accused’s per capita income and that of the accusing organisation and of the rights holder.

That is, suppose you pursue an 18-year-old with a $10,000/year income who is wrongly accused by, say, the VPA. Then the per capita income of the VPA and that of the rights holder should be added together (say, for argument’s sake, $500,000,000) and the resulting multiplier of 50,000 applied.
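The arithmetic in that example can be written out as a short Python sketch. The function name is made up, and the split of the $500,000,000 combined figure between the accusing organisation and the rights holder is an assumption, since only the total is given.

```python
def fine_multiplier(accused_income, accuser_income, rights_holder_income):
    """Ratio of the accusers' combined income to the accused's income."""
    if accused_income <= 0:
        raise ValueError("accused income must be positive")
    return (accuser_income + rights_holder_income) / accused_income


# Hypothetical split of the $500,000,000 combined figure.
multiplier = fine_multiplier(10_000, 100_000_000, 400_000_000)
print(multiplier)        # 50000.0
print(200 * multiplier)  # 10000000.0
```

Applied to the $200 lower bound quoted above, a multiplier of 50,000 would give a $10,000,000 fine.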

This would have the effect of causing the industry to actually pursue the real, serious infringers, not the individuals who have no ability to defend themselves against corporate in-house lawyers who specialise in this area of law.

Importantly, that level of expertise is not available to the individual (it’s nearly all in-house in the organisations); such high fines would encourage independent lawyers to specialise in the area and thus balance the playing field in two dimensions.

You mention that “Wiretapping…is of dubious benefit given the move to encrypted communication by file-sharing software.” I’d want to see a much more careful analysis of this before concluding that encryption will be of any help against ISPs in this sort of situation.

ISPs are perfectly placed to execute man-in-the-middle attacks on encrypted connections, and these can be defeated only by authenticating the partner with whom you’re communicating. (I go into detail on this issue in an item in issue 25.87 of the RISKS Digest.)

Sharing information with anonymous peers obviously poses some difficulty when it comes to authentication.

Encryption need not necessarily apply solely to the connection, but to the file being transferred as well. No DPI solution will break a PGP-encrypted file, irrespective of whether or not the client connection is encrypted. In a circumstance where the file itself is encrypted, a man-in-the-middle attack is useless, and it also raises serious privacy issues when dealing with personal internet banking and other allegedly secure sites.

You also lack understanding of the hard cryptographic problem that is the exchange of data between two nodes without a pre-existing shared secret.

PGP (and pgp-like schemes) require either an out-of-band communication of a shared secret (a password, trivially) or a trusted introducer – a signing agent already known to and trusted by the sending party.

Now, it is possible to impose such a scheme externally – https already does so, by using a “trusted CA” model, and a private tracker could provide a trusted signing key (by which a new node could “prove” it is the owner of an IP; the requirement would be that the tracker present its public key when a client registers, AND provide a signature for the client’s own key via an authenticated https connection, said connection requiring the username and password of a registered user) but no client currently supports this, it would remove the option of distributed peer databases (so no trackerless discovery, although you could have an edge tracker whose only role is to introduce a small set of the clients, and then THOSE clients introduce nodes in the distributed database) and in general – there is no easy solution to the “hard” problem of key distribution, no matter how much cryptographers wish there were 🙂

“PGP (and pgp-like schemes) require either an out-of-band communication of a shared secret (a password, trivially) or a trusted introducer – a signing agent already known to and trusted by the sending party.”

Yes, and the man in the middle doesn’t know whether you have done that or not. If they start widely deploying MitM attacks, they will eventually get caught. One day they’ll MitM between two people who have actually cross-signed keys and then those people will know that someone is trying to do something naughty.

I’m not saying people should stop having keysigning parties, but I do think MitM is an overstated risk. Widely deployed MitM attacks aren’t going to happen, and if they do happen, the cat will soon be out of the bag. Wiretappers lose if users fight back.

“For any end client suspected of file-sharing, one MUST verify that the client was actually uploading or downloading content, AND that the content corresponded to a valid portion of a copyrighted file. In BitTorrent, this might be that the client sends or receives a complete file block, and that the file block hashes to the correct value specified in the .torrent file.”

Of course, if you received a complete file block from a distributor duly authorized by the copyright holder, you have a strong defense against an accusation of infringement. If from a distributor not authorized, the distributor is potentially on the hook for copyright infringement. Ditto transmitting: to an authorized distributor, you’re sending a copy TO THE COPYRIGHT HOLDER, can that really be infringing?; to an unauthorized one, they are as guilty as you are.

I don’t think that your suggestions about the verification of content are sufficient.

Verification of the hash only confirms that the block/piece received is part of the intended content of the .torrent file. It doesn’t verify that the content described by the .torrent is actually what the .torrent’s title says that it is. A more appropriate test would involve a more extended procedure:
first, verify that the block/piece received matches the hash; then
complete the download (from any source), and
verify that the assembled file is actually infringing.
If the hash were correct, but the completed torrent turned out to be the works of William Shakespeare repeated over and over again, it wouldn’t matter whether the .torrent were named Copyrighted-Movie.torrent, or Artist-Discography.torrent: sharing that piece of it still wouldn’t be infringing.

In the spirit of conducting better file sharing investigations, researchers at the University of Colorado recently published a paper called “BitStalker: Accurately and Efficiently Monitoring BitTorrent Traffic” (paper available here) that evaluates the feasibility of collecting more concrete evidence of file sharing.

Their approach relies upon the investigator actively exchanging BitTorrent control messages with the list of suspected peers obtained from the tracker server. BitTorrent handshake messages are requested (which implies that the BitTorrent protocol is running at the expected IP address and port), followed by BitTorrent bitfield requests (which list all of the pieces that the peer currently possesses), followed by a block request and verification that the block is correct.
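For readers unfamiliar with the wire format, the three message types mentioned above can be sketched in Python as follows. This is not the BitStalker code, just a minimal illustration of the standard peer-wire layout with no network I/O; the function names are made up for the example. Note that in the standard protocol a peer normally sends its bitfield on its own right after the handshake, rather than in response to an explicit request.

```python
import struct

PSTR = b"BitTorrent protocol"  # 19-byte protocol identifier


def build_handshake(info_hash: bytes, peer_id: bytes) -> bytes:
    """68-byte handshake: length byte, pstr, 8 reserved bytes, hash, id."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must each be 20 bytes")
    return bytes([len(PSTR)]) + PSTR + bytes(8) + info_hash + peer_id


def parse_bitfield(payload: bytes, num_pieces: int):
    """Return the indices of pieces the peer claims to have."""
    if len(payload) * 8 < num_pieces:
        raise ValueError("bitfield too short for this torrent")
    # The high bit of the first byte corresponds to piece 0.
    return [i for i in range(num_pieces)
            if payload[i // 8] & (0x80 >> (i % 8))]


def build_request(index: int, begin: int, length: int = 16384) -> bytes:
    """Request message: 4-byte length prefix (13), id 6, three integers."""
    return struct.pack(">IBIII", 13, 6, index, begin, length)
```

An investigator following this approach would still need to assemble the returned blocks into a full piece and compare its SHA-1 digest with the .torrent file before treating the exchange as evidence.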

This active method, while not entirely immune from the possibility of false identification of file sharers, is a significant improvement over the current investigative approach consisting of tracker list queries.

It’s a simple, straightforward mathematical fact that Digital Copyright enforcement simply does not and cannot be made to work. There isn’t any escaping from this fact, but that has not stopped the corporations from trying, because they believe that they stand to lose money by not trying to restrict access.

Once they wake up and realise that they should make it easy to pay, and affordable, they will begin to gain the respect of the viewers. You only have to look at the Japanese “Anime” film community to realise that this model does actually work. In that community, it is realised by the viewers that the Director of the films simply will not be able to make the films that they want to see, unless they pay for them to be made. So, the entire community SUPPORTS the production of the films.

Yes, they copy the films using bittorrent – but only so that they can watch them, collect them… and collaborate in providing translations and even voice-over tracks in several different languages!

The moment that the film comes out on “DVD”, the USERS THEMSELVES pull the bittorrent, and they all go out and buy the DVD!

So, instead of putting “Piracy. It’s A Crime” at the beginning of DVDs, how about putting “If you have not paid for this film, please consider doing so by visiting http://www.name-of-film.com/pay”. Then, if any parent catches their child downloading a film illegally, then after watching it themselves, they can make their child squirm by getting out their credit card and meaningfully glaring at them whilst handing over a small sum of about $0.50 to $1.00.

If the film corporations could actually be bothered to get with the 21st Century, I would be quite happy to help write a protocol which could actually be embedded into the Digital format itself which, when distributed (legally or illegally) could be decoded by viewing software, and the software itself present the viewer with the *OPTION* to pay very reasonable sums of money.

It’s a completely different kind of approach – one where, god help us, the Film Industry actually TRUSTS people, rather than actively treating them as criminals.

Likewise to be an effective P2P node you have to store “blocks” of various files as a payment in kind for using the service.

If blocks are encrypted and the identifiers are just tag numbers then it is difficult for the P2P node owner to know what each block holds.

Thus the node owner who might say use the P2P network for downloading public domain / Open Source / Shareware / Open Copyright material would end up having illegal content unknowingly as part of the P2P usage contract.

Then of course there is the question of malware.

All of which makes saying a specific person has downloaded a movie problematical at best.

All you can actually say is that one or more blocks of data have been sent to the host IP address not much more.

It would be very easy to see how a P2P network could be designed where it was possible to prove that the node owners could not know what was in a file block, and further that each time you made a request for which tagged blocks you required for a particular piece of copyrighted material, it would be different. The blocks could also change dynamically.

Thus looking at or for blocks would be a pointless exercise.

Thus you would have to look at the node responsible for converting a particular file name into a particular tag list.

Again, this can be done in a manner where it can be shown that a node owner could not know what they had on their system.

This is a problem for the legal system which, although it deals mainly with intangible information (in documentary evidence), relies 100% on it being in a physically tangible and traceable form (the document).

The legal profession has been working on this problem for hundreds of years in one form or another, and so far it has not come up with a consistent and practical solution.

For a P2P network as I loosely described it, it would not be possible to say where the “information” that is subject to any particular copyright is, only that it went in and has come out at a particular point on a specific requesting machine. Thus the information is both on all machines and on none of them.

The Senator Orrin Hatch solution is to say “they are all guilty”, that is, they must have conspired together; however, there is no evidence to support this.

Further, it is possible to include completely unknowing parties in such a scheme, in that, for argument’s sake, you could use one or more US Census documents to make a “difference file”.

The only sensible way to deal with this is at the copyrighted material’s entry into the network and where it comes out.

If as is possible there is no one key piece of information required to generate the file you cannot use the information about a specific block to say “this is part of X” as it is also part of “A to Z” as well.
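One possible reading of the “difference file” idea, sketched in Python: a stored block is combined by XOR with a per-file difference block, so the same stored block contributes to unrelated files. The byte strings below are toy stand-ins, and this is an assumption about what such a scheme might look like, not a description of any deployed network.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Combine two equal-length byte strings with XOR."""
    if len(a) != len(b):
        raise ValueError("inputs must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# A block some node happens to store (say, a slice of a public document).
stored_block = bytes(range(16))

# Two unrelated 16-byte "works", used here only as toy placeholders.
work_x = b"copyrighted work"
work_a = b"public domain!!!"

# Whoever publishes a work derives its difference block from the stored one.
diff_x = xor_bytes(stored_block, work_x)
diff_a = xor_bytes(stored_block, work_a)

# The same stored block reconstructs either work, depending on the
# difference block it is paired with.
assert xor_bytes(stored_block, diff_x) == work_x
assert xor_bytes(stored_block, diff_a) == work_a
```

Under this reading, holding stored_block alone reveals nothing about which work, if any, it will be used to reconstruct.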

It is all a little problematical and is a situation that will get worse with time.

Simply saying possession of the P2P tool is a crime is going to lead to further problems.

The solution to which is almost bound to be worse than finding another solution to such problems (the question is what…).

Presuming intent is widespread — there are different renderings, all meaning something very like “A person is presumed to intend the reasonably foreseeable consequences of his voluntary act”. It’s at least a significant part of the basis for outlawing “fire!” in a theater and fighting words in a bar, or driving drunk: you know what happens next. For that matter, it’s why we can outlaw pulling a trigger.

It has to be direct: if there are other responsible parties closer to the criminal consequences, it’s them, not you.

But participating in BT with content you have no reason to believe you have authority to distribute is, I think, as direct as it gets: if you’re running a torrent titled ~Avatar~ via BT and it turns out to be the movie “Avatar”, I think that’s enough to presume intent.

BT doesn’t fetch any content you didn’t ask for: if you have the torrent file at all, you asked for it.

Freenet does fetch and store everything you’re on a path for (it’s like an auto-configuring, anarchic, encrypted USENET), but that’s an entirely different kettle of fish.

The problem is there will always be a cost proportional to the accuracy of the test.

If you want to catch enough actual pirates with very few false positives, it will cost more than anything it might hope to prevent.

Libraries are pervasive, but kindles are selling very well. And what of Radio?
And I would really not like to have to edit, clone, move my MP3s (mostly audiobooks from CDs).
Now with 3G and Wifi, they should just have services where I can do my own playlists and keep everything in the cloud. I’d pay monthly access, and not have to hoard the files though I might cache things for offline use. It will happen, but it may take a while for the eLuddites to go away from the scene to allow it.

Even assuming they could end peer-to-peer tonight, tomorrow the college dorms would just start swapping and hoarding recordable DVDs full of the same files. And hoarding is the term: most people with huge collections can’t possibly listen to a fraction of the collection. They might want the choice only available in a vast hoard of files, and currently the only way is to create the hoard yourself.

Copyright is a legality which creates a monopoly where you can attempt to charge monopoly rents. But that doesn’t work well in a digital world.

Instead of trying to preserve their monopoly rents by enforcing copyrights, they need to find a way to make it more economical and practical to become media providers.

It was said Steve Jobs told the record companies their competition was not Apple, but the P2P networks. And they listened enough for the $1/track. But they need to go the next step and have movie companies follow.

Interesting post, you raise many great points that could and can be proven valid in a court system that has an intimate knowledge of technology. However, as we all know the court systems and lawyers involved have a vague understanding of what is really happening.

One argument I would like to raise is that you outline that a file block verification must be performed. I agree with this statement, but I do not agree with the fact that you limit it to only one block for verification. Depending on the block size in proportion to the total file size, this could be entirely inaccurate and meaningless. Media owners copyrighted the content in its entirety; they do not have the copyright on the file blocks themselves. Making a statement that one file block is sufficient evidence for prosecution is completely obtuse. Unless the media can be deciphered in a manner such that it can be distinguished as copyrighted material (can be played or viewed, etc.), then it is not copyright infringement. Until these file blocks can be played or viewed or read, they are meaningless and not the rights of any person.

Just as it is the case that words that may appear in the center of a book, such as “The man walked” will not be granted a copyright, neither should/could ‘1010001000100100010’ (example means nothing don’t read into it) be granted this right. Bad example but anyone that has done computer forensics professionally understands that data in transit and in slack space on the disk may appear to be incriminating however they also could be benign. As for why the TCP connection was established, well that is another topic I will avoid but it is not criminal to establish connections with internet facing services and producing requests.

Like I said, I could argue many other points but this one is the one I felt like talking about tonight. I mean someone should talk on how wiretapping is more accepted in Media Rights Infringement cases than in attempts to catch criminal/terrorists?

“Only one thing is impossible for God: To find any sense in any copyright law on the planet.”

Freedom to Tinker is hosted by Princeton's Center for Information Technology Policy, a research center that studies digital technologies in public life. Here you'll find comment and analysis from the digital frontier, written by the Center's faculty, students, and friends.