BitTorrent census: about 99% of files copyright infringing

A survey of files being shared over BitTorrent showed that the vast majority …

It has never been a secret that the majority of files being shared over BitTorrent are movies and music that are likely being shared illegally. (Sorry, Linux distro nerds.) Princeton senior Sauhard Sahi confirmed this recently after setting out to survey the content available on BitTorrent and, although there are caveats to his findings, they highlight the relationship DRM has with illegal file sharing. As in: the more DRM there is on the legit versions of the content, the more popular it is on P2P.

Sahi chose a random sample of 1,021 files from the trackerless Mainline DHT and classified them by file type, language, and apparent copyright status. He found that nearly half (46 percent) of files were nonpornographic movies and TV shows—the largest single category of content. 14 percent of the files were porn, tied with the 14 percent dedicated to games and software. Just 10 percent of the files were classified as music, and one percent were books and guides.

Sahi also analyzed whether the content was infringing, checking to see if it was in the public domain, freely available via legitimate channels, or user-generated content. Based on this study, 100 percent of the movie/TV show sample was found to be infringing, as were all of the music torrents. Seven of the 148 files in games/software were found to be noninfringing (two were Linux distros), and one of the 145 porn files was given the benefit of the doubt as noninfringing. Overall, about one percent of the total files were categorized as "likely noninfringing."
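The counts quoted above can be sanity-checked with a few lines of arithmetic. This is only a sketch: the variable names are illustrative, and it uses just the figures stated in the article. Because no noninfringing counts are given for books or unclassified files, the resulting noninfringing share should be read as a lower bound.

```python
# Figures quoted in the article; the variable names are illustrative.
TOTAL_FILES = 1021
GAMES_SOFTWARE_FILES = 148
PORN_FILES = 145
NONINFRINGING_GAMES_SOFTWARE = 7
NONINFRINGING_PORN = 1


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


games_share = percent(GAMES_SOFTWARE_FILES, TOTAL_FILES)
porn_share = percent(PORN_FILES, TOTAL_FILES)

# Lower bound: only the noninfringing files explicitly counted in the article.
known_noninfringing = NONINFRINGING_GAMES_SOFTWARE + NONINFRINGING_PORN
noninfringing_share = percent(known_noninfringing, TOTAL_FILES)

print(f"games/software share: {games_share:.1f}%")
print(f"porn share:           {porn_share:.1f}%")
print(f"known noninfringing:  {known_noninfringing} files, {noninfringing_share:.2f}%")
```

Both category shares round to the 14 percent quoted earlier, and the eight explicitly counted files come to a little under one percent of the sample.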

So, people largely use P2P to pirate stuff: big surprise. What's telling, however, is which types of files are shared and in what ratios, because that shows us why people share media illegally. Music was once the only reason to use P2P networks, and the record industry long feared that going DRM-free would only aid in a massive explosion of illegal file sharing. That has obviously not been the case: P2P users can now share their DRM-free MP3s more easily than ever, and yet this category is one of the smallest of all files shared. And it makes sense: why would you bother going to BitTorrent, which may have misnamed and poorly encoded MP3s, when you could easily spend less than a dollar and get exactly what you want from a place that you trust?

Movies and TV shows, on the other hand, are hugely popular on BitTorrent, a trend that seems to mysteriously coincide with the heavy DRM and restrictions that come with that kind of content. DVD encryption, browser restrictions, DRM on downloads from iTunes or Amazon: there's effectively no way for consumers to buy this content without restrictions, so they're turning to P2P to get it.

XKCD says it best:

Of course, Sahi's results are only from Mainline and may not reflect the P2P ecosystem as a whole. His data also includes all files being shared, some of which may not be getting any downloads, while others are being downloaded en masse. Still, it's reasonable to assume that most users share what's in demand, and what's in demand right now is heavily DRMed movies, movies, TV, and movies.

It is not surprising, because at least in the US everything that is created is copyrighted by default. So everything on there is going to be copyrighted. With copyrights lasting a century (roughly), again it is not surprising, since practically speaking nothing ever enters the public domain. Certainly nothing created in your lifetime (including most things culturally important to you) will enter the public domain in your lifetime.

Another thing could be that BitTorrent, while perfect for larger files or large collections of files at once, is plain idiotic for music, unless one is dealing in whole discographies of established artists.

For individual singles, there are established systems, many of which grew out of the Napster way of doing things and then went on to get serverless/agnostic tracking and searching bootstrapped on top of them.

But then there is also the increasing availability of custom music services, with Spotify being the best known. And Spotify can be had on most smartphone platforms too. So why download when any open Wi-Fi hotspot turns into an infinite-selection jukebox?

Who should be really worried are the local radio stations, which have degenerated into basically being ad/news/major-studio music pushers, where music may well be seen as a subset of the ad section.

I'd wager that a majority of those TV shows are only technically infringing, in that they're broadcast over the air freely. You can't really fault people for creating online libraries of that stuff if you won't take the trivial steps to do it yourself.

Originally posted by D_Homerick: A comment on the linked article brings up an interesting point: aren't legitimate torrents much more likely to use a traditional tracker, rather than using DHT?

That's what I was wondering too. While I'm familiar with the concept, I've never used DHT. How does it work in practice? In most clients, do you explicitly have to add a file that you are sharing to the DHT, or are all shared files automatically added? I.e., if you download something that is hosted with a private tracker (like a Linux distro), would the client also create a DHT entry for it?

While the results are not unexpected, the sample space is rather too restrictive. And what about content availability? Many torrents are devoted to media that has no legit distribution in a region but is brought over by the efforts of fansubbers.

As for DHT over trackers, I think it depends on the host country's legal track record.

Originally posted by ShadowNode: I'd wager that a majority of those TV shows are only technically infringing, in that they're broadcast over the air freely. You can't really fault people for creating online libraries of that stuff if you won't take the trivial steps to do it yourself.

No one wants to deal with your stupid flash player, NBC.

Here's the crazy part about the TV shows: in the U.S., with terrestrial TV, you can freely purchase TVs and antennas to watch broadcasts. It is not some gray market; it is quite actively encouraged. And until last year you could also purchase VCRs and tapes to record said content for time-shifting, which is also very legal, except maybe for the NFL, but let's concentrate on non-sporting events.

Due to the DTV transition VCRs will still work, but not quite right; however, computers work even better than before since all the broadcasts are digital. So now people record free OTA DTV that has no DRM or anything and strip out the commercials and distribute it, but only the distribution is illegal. If everyone were to record their own shows and not distribute them it would be perfectly legal.

The only reason I turn to such means is that I haven't been able to buy a commercially available PVR that does OTA. Someone make one already!

There are several products available from TVIX that can record OTA digital broadcasts. I had an older model that kind of sucked, but the new ones have much better video decoding, from what I hear.

You won't have a huge amount of luck with support, though. Witness the slogan on their support page:

I think it's important to point out (as the summary on Freedom to Tinker does) that this distribution measures the percentage of works which are *available* on Bittorrent, not the percentage of downloads which are illegal. I don't doubt that the percentage of works infringed is in the same ballpark, but it's not what this study actually measured.

I think the important thing to take away from this is to recognize that, right now, very little of what's out there *is* public domain / freely licensed. There's a good argument here that we download copyrighted work in the absence of anything else being available. If more free culture was available it would certainly be on Bittorrent.

If you're going to take a practical lesson away from this, it should be that we need to produce more free content, not that we need to pirate less copyrighted content.

That is an extremely narrow sample, but from what I have seen on all the torrent sites I have been to, legal and not-so-much, the overall numbers look about right. I'm sure some of the percentages would change a bit and always be in flux with the current demand or the latest release, but overall most of the content is illegal.

As in: the more DRM there is on the legit versions of the content, the more popular it is on P2P.

Except he didn't look at the popularity of the downloads, so there's no real way to draw this conclusion. The relatively small number of music torrents could have drawn a high portion of the downloads, and thus be more popular.

quote:

Technically, I think they could claim copyright for that little home video under the Berne Conventions

The study looked at infringed copyright though, so depending on the content you could draw the conclusion that the particular file was intended for general distribution. If we were looking at copyright, we could say 100% of the material is copyrighted, which is true within the error margin of this type of study. Basically just a few old books in the public domain, everything else that would be shared is within its copyright.

Also, with such a high amount of "could not classify," this could swing quite a bit. Is that "unable to classify" because he couldn't tell the media type, or because he could not identify the copyright status?

He also neglects all the viruses. I'd argue that anything virus infected is non-copyrighted. You are transforming the original work into something that can be distributed. The virus is the new work, and obviously its creator intended to allow distribution.

Originally posted by ShadowNode: I'd wager that a majority of those TV shows are only technically infringing, in that they're broadcast over the air freely. You can't really fault people for creating online libraries of that stuff if you won't take the trivial steps to do it yourself.

No one wants to deal with your stupid flash player, NBC.

Actually... I wouldn't mind putting up with stupid flash players, even an ad in the beginning. As a perfect example, just a few minutes before posting this comment I went to CBS (via Google) to watch "Two and a Half Men," season 7, episode 13. After taking time to load etc., it then gave me the beautiful message that the stream is denied to me because the copyright gods have decided the country I live in does not qualify me to watch the latest episode of one of my favorite TV shows for at least a few months.

Fuck that. Two minutes later I had my download going from a torrent site, in HD no less, and on my 100 Mbps Swedish connection I had it in a couple of minutes.

It really was my fault: I thought at least some of the idiots had grown up and tried to adapt to the internet age. Sadly, I was mistaken.

Legal content will most likely use a dedicated tracker, but I don't see any reason why it wouldn't also use DHT in addition to the tracker. I just checked; the Ubuntu torrents are not marked as private. In fact, I'm pretty sure most of the torrents with DHT disabled will be illegal content.
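For anyone who wants to check the "private" marking themselves, here is a minimal sketch. It assumes the convention from BEP 27, where a torrent opts out of DHT by carrying an integer `private` key set to 1 inside its `info` dictionary. The `bdecode` and `is_private` helpers and the two sample byte strings are made up for illustration; they are not taken from any particular client or real torrent.

```python
def bdecode(data: bytes, pos: int = 0):
    """Decode one bencoded value starting at pos; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":  # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":  # list: l<items>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":  # dictionary: d<key><value>...e
        pos += 1
        mapping = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            value, pos = bdecode(data, pos)
            mapping[key] = value
        return mapping, pos + 1
    # byte string: <length>:<bytes>
    colon = data.index(b":", pos)
    length = int(data[pos:colon])
    start = colon + 1
    return data[start:start + length], start + length


def is_private(torrent_bytes: bytes) -> bool:
    """True if the info dictionary carries private=1 (hypothetical helper)."""
    meta, _ = bdecode(torrent_bytes)
    return meta.get(b"info", {}).get(b"private") == 1


# Hand-written, hypothetical metadata; real .torrent files hold many more keys.
PRIVATE_SAMPLE = b"d4:infod4:name4:demo7:privatei1eee"
PUBLIC_SAMPLE = b"d4:infod4:name4:demoee"

print(is_private(PRIVATE_SAMPLE), is_private(PUBLIC_SAMPLE))
```

To try it on a real file, read the file in binary mode and pass the bytes to `is_private`.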

1000 might seem like a small sample, and maybe it is; it depends on the overall number of active DHT torrents. That number will be fairly hard to measure, although I guess you could estimate it by checking what percentage of a (larger) random sample of DHT nodes is active. Should be a fairly accurate estimate in fact, all thanks to using a strong random hash function. FWIW, The Pirate Bay reports 2.2 million torrents; I assume they don't count dead torrents.
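On the sample-size question, a rough way to see how much a sample of about 1,000 can pin down is to put a confidence interval around the observed proportion. The sketch below uses the standard Wilson score interval and plugs in 8 noninfringing files out of 1,021, which is only the count explicitly itemized in the article (an assumption, since the full tally is not given). It also assumes a simple random sample, in which case the margin of error is driven mainly by the sample size rather than by the total number of torrents.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p_hat = successes / n
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# 8 of 1,021 is the itemized noninfringing count, used here as an assumption.
low, high = wilson_interval(8, 1021)
print(f"noninfringing share: between {100 * low:.2f}% and {100 * high:.2f}%")
```

Under those assumptions the upper end of the interval stays below 2 percent, so the headline conclusion does not hinge on the sample being larger.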

DHT is a pretty amazing technology once you wrap your head around it; the central feature, if I understand it correctly, is that it only takes a logarithmic number of steps to get to the peer that essentially serves as the tracker replacement.
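The logarithmic behaviour can be illustrated with a toy simulation. Mainline DHT is based on Kademlia, which measures the distance between 160-bit IDs with XOR. The sketch below is a deliberate simplification and not how a real client works: every node keeps a single contact per bucket (real clients keep up to eight), lookups are done hop by hop inside one process with no networking, and the target is the ID of an existing node. All function names are illustrative.

```python
import random

ID_BITS = 160  # Mainline DHT node IDs and infohashes are 160-bit values


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance metric: bitwise XOR interpreted as an integer."""
    return a ^ b


def bucket_index(a: int, b: int) -> int:
    """Index of the highest bit in which a and b differ (requires a != b)."""
    return (a ^ b).bit_length() - 1


def build_routing_tables(node_ids):
    """Toy routing tables: each node keeps one contact per non-empty bucket."""
    tables = {}
    for node in node_ids:
        table = {}
        for other in node_ids:
            if other != node:
                table.setdefault(bucket_index(node, other), other)
        tables[node] = table
    return tables


def lookup(tables, start, target):
    """Greedy lookup: keep hopping to the known contact closest to target."""
    current, hops = start, 0
    while current != target:
        best = min(tables[current].values(), key=lambda c: xor_distance(c, target))
        if xor_distance(best, target) >= xor_distance(current, target):
            break  # no known contact is closer; give up
        current, hops = best, hops + 1
    return current, hops


rng = random.Random(0)
node_ids = [rng.getrandbits(ID_BITS) for _ in range(512)]
tables = build_routing_tables(node_ids)
start, target = node_ids[0], node_ids[-1]
found, hops = lookup(tables, start, target)
print(f"reached target: {found == target}, hops: {hops}, nodes: {len(node_ids)}")
```

Each hop lands on a node that shares at least one more leading bit with the target, which is why the hop count grows with the logarithm of the network size rather than with the size itself.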

I wonder why the researcher didn't take the time to further differentiate between TV shows and movies, since he was already taking a fairly close look, beyond simply looking at the file type. The file size would also be interesting, and obviously the results would get shifted further away from music and probably towards movies if file size instead of torrent count were considered. Then again, maybe he did look at those things, and it's just not in the summary article (which oddly doesn't link to more detailed data/papers).

Originally posted by mitEj: What about the things that do not show on normal P2P sites and shares but use BitTorrent to distribute their content?

I.e., WoW and its updates, and other MMOs that are moving to a BitTorrent distribution model?

These count as P2P traffic but would not show up in LimeWire- or uTorrent-type download lists.

THIS. This is why the study is BS. No one is accounting for all the legal BitTorrent traffic that legit companies generate ALL the time. Blizzard/Activision is just one of them. When that traffic is accounted for in the study, then I will pay attention. Until then it is just part of an incomplete picture.

I don't have any problem with using legit streaming sites like Hulu and... Hulu... and Hulu... Any show that isn't on Hulu or gets linked to some external site (look for Smallville or Leverage on Hulu for examples of this nonsense) just gets a great big pfft and I stream it elsewhere, although Syfy's isn't too bad. I've yet to see a legit streaming site other than Hulu not fall somewhere between a huge inconvenience and downright annoying. I don't have access to the BBC streaming site and don't know if it's any good.

How much of this content can successfully be downloaded in reality? When I've tried to use bittorrent links, I get download times measured in years, modem-like trickles of data, etc. I can barely download a 10MB comic book in several days - do people really download entire movies? I gave up on bittorrent because it was unusable.

You're just a bittorrent 'tard. Don't worry scrote. There are plenty of 'tards out there living really kick ass lives. My first wife was 'tarded. She's a pilot now.

The study didn't focus on BT traffic, but BT content. There's a difference.

You're just a bittorrent 'tard. Don't worry scrote. There are plenty of 'tards out there living really kick ass lives. My first wife was 'tarded. She's a pilot now.

ROFL!!! Oh shit that was funny.

quote:

Originally posted by aquasub: The study didn't focus on BT traffic, but BT content. There's a difference.

What? Patches and software are not content? How are they not content?

Who cares where the so-called content comes from? Comcast and AT&T all see it as the same thing.

You think that if all the American ISPs get legal permission to cut off ALL BitTorrent traffic they are going to discriminate between legal and illegal traffic? Hell no, they are not.

They are just going to drop a proverbial cluster bomb on it all and cut it ALL out.

Originally posted by trencher93ish: How much of this content can successfully be downloaded in reality? When I've tried to use bittorrent links, I get download times measured in years, modem-like trickles of data, etc. I can barely download a 10MB comic book in several days - do people really download entire movies? I gave up on bittorrent because it was unusable.

I regularly get speeds of between 1 and 2 megabytes per second unless the torrent is very unpopular. Either you are configuring your client in the worst way possible, your network config is all screwed up, your ISP connection stinks, or you are getting your stuff from terrible sources.

Depends on what you're trying to get ... if you're getting something obscure it may take forever like that, but if it's something popular with a lot of people sharing you'll get it very quickly.