Posted by samzenpus on Monday January 28, 2013 @02:00PM from the never-again dept.

Frequent contributor Bennett Haselton writes "With the announcement of Verizon's 'six strikes plan' for
movie pirates (which includes reporting users to the RIAA and MPAA), and content companies continuing to
sue users en masse for peer-to-peer downloads, I think it's inevitable that we'll see the rise of p2p
software that proxifies your downloads through other users. In this model, you would not only download
content from other users, but also use other users' machines as anonymizing proxies for the downloads,
which would make it impossible for third parties to identify the source or destination of the file
transfer. This would hopefully put an end to the era of movie studios subpoenaing ISPs for the identities
of end users and taking those users to court." Read below for the rest of Bennett's thoughts.

Now, I'm not advocating the creation of software that enables piracy. And I don't
mean that in a nudge-wink kind of a way, I'm serious: I think people should reward movie studios
for making content that they like, if only because that means studios will make more
of that type of content. For my last cross-country flight I paid
an honest-to-God four dollars to download a movie from Amazon Unbox to watch on the plane,
even though I fondly like to think of myself as smart enough that I could have figured
out how to find and download the movie for free. (Well, not all that smart; the
movie was Lockout.)

However, the idea of users anonymizing each other's downloads is so elementary that I literally mean
it's inevitable that we will see the rise of such software. Whether I'm in favor of it or not,
it's going to happen. In fact, under certain assumptions, there's really only one logical
direction that it can evolve in.

First, some background.
Under the current BitTorrent protocol -- with no built-in support for anonymization -- some server S
makes a large file available for download. When the first downloader, say user D1, requests
a copy of the file, they have to begin the process of downloading it from S.
But when the next downloader, say user D2, requests a copy of the same file while user D1 is
still downloading, the BitTorrent server S
tells D2 to start downloading the file from D1 instead of from S directly. (D1 is required at this
point to share out the file for download, in order to earn enough "credits" to continue downloading
from S.) Subsequent downloaders are
similarly told to download from other downloaders instead of from the original server S. In
this way, the server S avoids incurring massive bandwidth charges (since S only actually served the
file one time), and each user on average only has
to share out the file once in return for downloading it themselves.

Note that this still means that in order to initiate the download, the server S has to serve out
the whole file at least once, to the first downloader
-- and if the file is being distributed without the copyright owner's
permission, then the operators of server
S can be taken to court. This legal pressure was the reason that the Pirate Bay switched from serving BitTorrent
files to serving magnet links, which enable
users to download content purely from each other, without the Pirate Bay ever actually serving the
content themselves. But with both BitTorrent and magnet links, users who are downloading content from other
users can see those other users' IP addresses -- and they know that those other users are serving
the content from files stored on their own hard drives. This means that if you're the copyright owner of that content,
you can subpoena the identities of the users behind those IP addresses, and take them to court for
unauthorized possession and distribution of copyrighted material.

So what would a protocol look like with built-in support for anonymization? In my first draft of an idea,
I thought that each download could take place using one intermediate user as a proxy, so that instead of
server S telling D2 to download from D1, the server would tell D2 to use downloader D3 as a proxy, and tell D3
to proxy the connection from D1. (As with BitTorrent, the downloader D3 would be required to allow their
machine to be used as a proxy, in order to earn credits to continue with their own download.) So D1
would not be able to see the IP address of user D2 downloading from them, and D2 would not be able to see
the IP address of user D1 that they were downloading from. Both of them would be able to see the IP address
of user D3 which is acting as the proxy between them,
but as long as it's not against the law to simply proxy a connection for someone else,
that would not be grounds to subpoena user D3's identity. And D3 would be able to see the IP addresses
of D1 and D2, but if D1 and D2 are communicating using a shared encryption key, then D3 would have no
idea what content is flowing between D1 and D2, even as it proxies the connection between them. So even
if one of D1, D2 or D3 were an "adversary" (i.e. a copyright holder intent on suing illegal file sharers),
none of the three would be able to see the IP address of another user that they knew was either downloading
particular content, or serving it out.

Of course you could also argue that if
D3 is among the users that server S is making available to others as an anonymizing proxy, then that constitutes
proof that D3 must be downloading something else from S (otherwise, D3 wouldn't need to earn credits by
acting as an anonymizing proxy), and if either D1 or D2 is an adversary, they can see D3's IP address and reason
that D3 must be guilty of some copyright violation. Similarly, if D3 is the adversary, they can see
D1 and D2's IP addresses and reason that both of them are probably guilty of some copyright infraction,
even if D3 can't actually see what they're trading. Basically, anybody could be considered "guilty by association"
simply by virtue of being in the community of users being coordinated by server S.
But (1) that accusation could be deflected if some of the files being served by S were
in fact legal and being distributed with the copyright holder's permission; and (2) in any case, the
Digital Millennium Copyright Act
requires you to claim that your specific copyrighted content is being distributed by a user, before
you can unmask that user's identity; it's not enough to claim that the user is part of a network that
distributes "some" copyrighted content illegally. D3 may be proxying a connection between D1 and D2 in order
to earn credits so that D3 can download some content for themselves, but even though D1 and D2 can both
see D3's IP address, there's no way for them to know what D3 could be downloading.

Unfortunately, this three-user-chain idea is not secure, because an adversary could still create a large number
of users coordinated through server S, and sooner or later, a chain would arise where both the proxy and
the downloader are controlled by the adversary, and at that point, they would know the IP address of the user
serving out the copyrighted content. In other words, eventually you'll get a situation where D2 is downloading
content from D1 by going through proxy D3 -- but where D2 and D3 are both controlled by the adversary. So D2 knows
the content that's being downloaded via D3, and D3 knows the IP address of D1 that's actually serving out
the content -- at which point they can subpoena the identity of user D1, and sue them.
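
To get a feel for how quickly "sooner or later" arrives, here is a rough sketch. It rests on assumptions that are not part of the scheme as described: that the adversary controls a fixed fraction of all users, and that server S assigns the downloader, proxy and source roles independently and uniformly at random. The function names are made up for illustration.

```python
def exposure_probability(adversary_fraction: float) -> float:
    """Chance that a single three-user chain exposes an honest source.

    Assumption (not from the original scheme): each role is filled
    independently, and is adversarial with probability `adversary_fraction`.
    Exposure needs an adversarial downloader AND an adversarial proxy
    sitting in front of a source the adversary does not control.
    """
    f = adversary_fraction
    return f * f * (1 - f)


def chance_of_any_exposure(adversary_fraction: float, chains: int) -> float:
    """Chance that at least one of `chains` independent chains is exposed."""
    p = exposure_probability(adversary_fraction)
    return 1 - (1 - p) ** chains
```

Under these assumptions, an adversary running 10% of the users exposes an honest source in a little under 1% of chains, so after a thousand downloads across the network at least one exposure is all but certain.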

So consider this idea instead: When user D1 sends a request to server S to download a file, server S gives them
the IP address of another user, D2, from which they can download the file. Now, 40% of the time, user D2
actually does have the file on their hard drive and is serving it to user D1, with no proxying. The
other 60% of the time, user D2 is told by S to proxy the connection from D1 and connect to a third user, D3.
Now in 40% of these cases, D3 actually does have the file and is serving it out directly; the other 60%
of the time, D3 is proxying the connection for yet another user, D4...
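
The hand-off procedure above can be sketched in a few lines. This is only an illustration under simplifying assumptions: server S picks each next peer at random, the 40% decision is an independent coin flip at each hop, and the question of which peers really hold the file is ignored. `build_chain` and `SERVE_PROBABILITY` are names invented for this sketch, not part of any real protocol.

```python
import random

SERVE_PROBABILITY = 0.4  # from the text: 40% of peers serve the file directly


def build_chain(requester, peers, rng):
    """Return the list of nodes from the requester to the node that serves.

    Assumes `peers` contains at least two distinct users; otherwise the
    loop below could never find a valid next hop.
    """
    chain = [requester]
    while True:
        peer = rng.choice(peers)
        if peer == chain[-1]:
            continue  # a node cannot hand the connection to itself
        chain.append(peer)
        if rng.random() < SERVE_PROBABILITY:
            return chain  # this peer serves the file; the chain ends here
        # otherwise this peer only proxies, and S picks another hop
```

For example, `build_chain("D1", ["D1", "D2", "D3", "D4"], random.Random())` returns a list that starts with `"D1"` and ends with whichever peer the coin flip designated as the source.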

So you end up with chains of varying length, with longer chains having a progressively smaller probability of forming:

40% of chains will be of length 1 (one user downloads directly from another)
60% x 40% of chains (24%) will be of length 2
60% x 60% x 40% of chains (14.4%) will be of length 3
60% x 60% x 60% x 40% of chains (8.64%) will be of length 4
etc.

These proportions of course sum to 1, and a little math shows that the average chain length is 1/0.4 = 2.5, which corresponds to an average of 3.5 nodes.
The number of downloads in a chain -- the connections between users -- is one less than the number of nodes
in the chain, so this means that to complete one download, the content will have to be transferred an average
of 2.5 times -- compared to being transferred only once, when one user downloads from another directly. In order
to ensure that users contribute as much to the system as they take from it, that means that in order to download
a file, users would be required to provide enough "proxying" to support the equivalent of 2.5 full downloads of
that same file.
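
The arithmetic above can be checked exactly. The sketch below treats chain length as a geometric distribution with a 40% chance of stopping at each hop, which is how the percentages in the list are built; the function names are invented for illustration.

```python
from fractions import Fraction

SERVE = Fraction(2, 5)  # 40%: the peer serves the file itself
PROXY = 1 - SERVE       # 60%: the peer passes the request along


def chain_length_probability(length: int) -> Fraction:
    """Probability that a chain has exactly `length` transfers (length >= 1)."""
    return PROXY ** (length - 1) * SERVE


def probability_at_least(length: int) -> Fraction:
    """Probability that a chain has `length` or more transfers."""
    return PROXY ** (length - 1)


def expected_transfers() -> Fraction:
    """Mean of the geometric distribution: 1 / (chance of stopping per hop)."""
    return 1 / SERVE


def expected_nodes() -> Fraction:
    """A chain has one more node than it has transfers."""
    return expected_transfers() + 1
```

Here `chain_length_probability(3)` comes out to 18/125, or 14.4%, `expected_transfers()` is 5/2 and `expected_nodes()` is 7/2, matching the figures in the text. `probability_at_least(4)` is 27/125, or 21.6%.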

These chains have a useful property: any time you're downloading content "from" another user, there's only a 40%
chance that user is serving content off of their own hard drive, and a 60% chance that they're proxying the
connection from somewhere else (another node that may in turn be proxying the connection from yet another node, etc.).
So even if the adversary controls three nodes D1, D2, and D3, and D1 is downloading from D2 who is downloading from D3
who is downloading from D4 (and D4 is not controlled by the adversary), from the adversary's point of view there's
only a 40% chance that D4 is actually originating the content. This is always true no matter how many nodes in
the chain the adversary controls -- in the end, if they want to nail someone for serving out copyrighted content,
they have to download the content from some node that they don't control, and there will only be a
40% chance that the user is actually serving the content from their hard drive.
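
The reason the figure stays at 40% no matter how deep into the chain the adversary sits can be worked out directly. The sketch below assumes, as in the description above, that each node independently serves with probability 40% and proxies otherwise; `origin_probability_at_hop` is a name invented for this illustration.

```python
from fractions import Fraction

SERVE = Fraction(2, 5)  # 40%: the node serves from its own hard drive
PROXY = 1 - SERVE       # 60%: the node is only relaying


def origin_probability_at_hop(hop: int) -> Fraction:
    """P(the node at position `hop` is the real source | the chain reached it).

    `hop` counts nodes after the requester, starting at 1. The chain only
    reaches this node if every earlier node chose to proxy.
    """
    reaches_hop = PROXY ** (hop - 1)           # all earlier nodes proxied
    stops_at_hop = PROXY ** (hop - 1) * SERVE  # ...and this node serves
    return stops_at_hop / reaches_hop
```

The `PROXY` factors cancel, so the result is 2/5 for every hop: knowing that three adversary-controlled nodes all proxied tells the adversary nothing about whether the fourth node is the source.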

And the 40% number was deliberately chosen in order to weaken the adversary's legal grounds for subpoenaing the identity
of the user they're downloading from -- even if they can show that they downloaded content from another
user's IP address, it's more likely than not that the other user was not actually hosting the content.
(Of course, there might be other details in context that render that probability calculation useless. For example,
if the server S only links to one downloadable file, then all users coordinated by that server S are presumably
downloading that same file, and anybody that server S connects you to, can be presumed guilty of downloading and
sharing that file, 40% figure be damned.)

At this point you might also wonder: Why not just connect over a protocol like Tor, which provides secure anonymity for
all transactions, and then use BitTorrent or some other file-sharing system on top of that? The answer is that Tor's
connection is likely to be much slower, for at least two reasons. First,
Tor servers are a limited resource, and the more people use
them (especially for large file trading), the slower they are likely to become. (By contrast, in the peer-to-peer
proxying model outlined above, every new downloader can also be made to act as a proxy for other users, so additional
users don't slow down the system because they contribute as much as they take out of it.) Second, Tor always
routes your connection through multiple servers to guarantee secure anonymity, which means it would be slower on average
than the variable-length chains described above, where only about 22% of chains are of length 4 or more.

The key difference is that Tor provides true anonymity whereas the protocol above only provides plausible
deniability. In high-risk settings where Tor is often used, it would not be acceptable if there were a 40% chance
of your IP address being revealed to your adversary. But for file sharing, the 40% figure might be acceptable
if it's just low enough to stave off a subpoena. This trade-off makes it possible to use shorter chains, resulting
in faster downloads and less total bandwidth consumption.

You also already have the option today of using a VPN service to download files through an anonymous third-party connection,
which renders the rest of these issues moot. But users have to jump through several hoops (and pay some money) to set this
up as an option, which means that most users will not be using VPNs any time soon, leaving plenty of naive users for
the RIAA and MPAA to go after. The use of peer-proxying links would mean that all users downloading through the
system would be protected.

At the moment, the major impediment to a peer-proxying system like this would be that the chained downloads would still consume an
average of 2.5 times as much bandwidth as direct peer-to-peer downloads. Even with today's high-speed connections,
this increase in inconvenience is great enough that some users might just prefer to use plain old BitTorrent
to download files directly from peers, and run the (admittedly small) risk of getting in trouble. But as bandwidth
speeds continue to grow literally exponentially, eventually the difference in inconvenience will be so small
that users would be foolish not to use proxified downloads if they provided free legal protection.

Note that the viability of this system does depend on the ISP's attitude towards it. In particular, if your
ISP only goes after pirates because of legal pressure from content holders, and the ISP's users are using
this peer-proxying protocol instead of a direct download protocol like BitTorrent, then the ISP can quite
truthfully claim that they don't have any hard evidence to disconnect any particular users or turn over
their identities (because the ISP doesn't know
which users are actually storing pirated files and which users are just acting as proxies).
On the other hand, if your
ISP sincerely wants to stop piracy
because your ISP is also a content company (Comcast, for example), then
they might also try to squelch the use of any protocol that enables piracy, even if they can't prove that
any particular users are using it for anything illegal. Thus Comcast might try to slow the use of the peer-proxy
protocol. But in that case they could be forced by Net Neutrality regulations to stop throttling it, in the same way that
the FCC ordered Comcast
to stop throttling BitTorrent.

As long as those conditions hold true -- content owners continue cracking down on file sharers, but proxying
remains legal and bandwidth keeps getting cheaper, and ISPs are restrained from blocking the protocols themselves
-- I think that p2p will have to evolve into something like the
chained-download system described above, to provide plausible deniability to users, without resorting to the
long chains (and consequently slower downloads) required by full-anonymity systems like Tor.

But again, I'm just saying it's inevitable, not that it's right.
I actually do wish that people would pay
the studios' prices for the movies that they watch; part of it is that I think most blockbusters
are actually pretty good and deserve to make money.
When you refuse to pay for movies, you're casting a vote against fun, big-budget movies that are made
for the purpose of getting lots of people to come see them and enjoy them, and instead voting
in favor of excruciatingly boring low-budget films that are made primarily so that the director could whine that the cheese-puff-snarfing
American public wouldn't know great art if it bit them on their big bloated behind and subsequently
didn't even buy enough tickets for the director to pay off the lien he took out on his Honda
Civic to get the movie produced.
Forget prosecution and civil suits; just make movie pirates sit through The Brown Bunny.