The author is a Forbes contributor. The opinions expressed are those of the writer.

Costya Perepelitsa, Software Developer / Distributed Systems Engineer

The BitTorrent protocol has downloaders help send the file to other downloaders, reducing the burden on the original owner. This usually results in all participants downloading the file faster.

The way this is achieved might be best explained with an illustrated example using the worst illustrations ever made (I'm not a graphic designer).

Moe, Larry, and Curly each want to download a copy of a 120-megabyte file. Each of their computers can download 3 megabytes (MB) per minute and can upload at the same rate. Currently, the file exists only on Shemp's computer, which downloads and uploads at the same speeds.

The traditional approach is for each of Moe, Larry, and Curly to download the full file from Shemp. One way Shemp can do this is to upload the entire file exclusively to Moe, then to Larry, and then to Curly. Since Shemp can upload at 3 MB/min and each of them can download at 3 MB/min, it takes 40 minutes to get the file to Moe, another 40 minutes to get it to Larry, and a final 40 minutes to get it to Curly. The entire process takes two hours. Another way is for Shemp to upload the file to all three simultaneously, splitting his 3 MB/min connection into 1 MB/min for each of them. The end result is the same: it takes two hours before all three have a copy.
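The arithmetic for both traditional options can be checked with a quick sketch. The function names here are my own, and the numbers are simply the ones from the example:

```python
FILE_MB = 120          # size of the file
RATE_MB_PER_MIN = 3    # Shemp's upload speed (same as everyone's download speed)
NUM_CLIENTS = 3        # Moe, Larry, and Curly

def sequential_minutes(file_mb, upload_rate, num_clients):
    # One client at a time, each at Shemp's full upload rate.
    return num_clients * file_mb / upload_rate

def simultaneous_minutes(file_mb, upload_rate, num_clients):
    # All clients at once, each getting an equal share of the upload rate.
    per_client_rate = upload_rate / num_clients
    return file_mb / per_client_rate

print(sequential_minutes(FILE_MB, RATE_MB_PER_MIN, NUM_CLIENTS))    # 120.0
print(simultaneous_minutes(FILE_MB, RATE_MB_PER_MIN, NUM_CLIENTS))  # 120.0
```

Either way, the last client finishes after 120 minutes, because Shemp's 3 MB/min upload is the only source of data.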

There's something important to notice about this traditional case: a significant amount of bandwidth is wasted. Compare the total download and upload bandwidth available with how much is actually being used, and you'll see that the network isn't being used to its fullest. Since each of the four machines can download and upload at 3 MB/min, together they have 12 MB/min of download bandwidth and 12 MB/min of upload bandwidth. In both of the traditional cases above, only 3 MB/min of each is in use at any time. The network saturation is 25% of download bandwidth and 25% of upload bandwidth.
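As a rough sketch of that saturation figure (the helper name `saturation` is just something I made up for illustration):

```python
MACHINES = 4   # Shemp, Moe, Larry, Curly
RATE = 3       # MB/min each, in both directions

def saturation(used_mb_per_min, machines, rate):
    # Fraction of the network's total capacity that is actually in use.
    return used_mb_per_min / (machines * rate)

# In the traditional case only Shemp's 3 MB/min upload is active,
# and the clients download a combined 3 MB/min.
print(saturation(3, MACHINES, RATE))  # 0.25
```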

Seeing this, Shemp decides on the following plan:

Shemp splits the file into four pieces.

First, he will send the first piece to Moe, the second piece to Larry, and the third piece to Curly.

Then, he will send out the fourth piece to each of them, but also instruct them to get the other two pieces from the other two participants at the same time.
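Shemp's plan can be written down as two lists of transfers. This is only a sketch of the plan in this example; the `(sender, receiver, piece)` tuples and the `build_plan` name are my own invention, not anything from the protocol:

```python
CLIENTS = ["Moe", "Larry", "Curly"]
PIECES = [1, 2, 3, 4]

def build_plan(clients, pieces):
    # Phase 1: Shemp sends a different piece to each client.
    phase1 = [("Shemp", client, piece) for client, piece in zip(clients, pieces)]

    # Phase 2: Shemp sends the one remaining piece to everyone...
    last_piece = pieces[len(clients)]
    phase2 = [("Shemp", client, last_piece) for client in clients]

    # ...while each client forwards its Phase 1 piece to the other two.
    for sender, piece in zip(clients, pieces):
        for receiver in clients:
            if receiver != sender:
                phase2.append((sender, receiver, piece))
    return phase1, phase2

phase1, phase2 = build_plan(CLIENTS, PIECES)
print(len(phase1), len(phase2))  # 3 9
```

Phase 1 has three transfers, all from Shemp. Phase 2 has nine running at once: three from Shemp and two from each client.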

Let's see what happens when they put this plan into action:

Phase 1:

Shemp sends each of the three a different 30 MB piece. He splits his 3 MB/min upload bandwidth between the three of them, so each piece uploads at 1 MB/min.

Shemp isn't downloading anything (because he already has the file), but his upload connection is going full blast: he is uploading to each of Moe, Curly, and Larry at 1 MB/min, for a total upload speed of 3 MB/min (his maximum).

Moe is only downloading at 1 MB/min from Shemp, and he isn't uploading anything. So he has 2 MB/min of download bandwidth and 3 MB/min of upload bandwidth going to waste. The same goes for Curly and Larry.

Phase 1 takes 30 minutes. When Phase 1 finishes, Shemp has the full file, Moe has piece #1 of 4, Larry has piece #2 of 4, and Curly has piece #3 of 4.
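Here is a small sketch of the Phase 1 numbers, using only the figures from the example (the variable names are mine):

```python
FILE_MB = 120
NUM_PIECES = 4
RATE = 3          # MB/min, each machine, each direction
NUM_CLIENTS = 3

piece_mb = FILE_MB / NUM_PIECES        # 30.0 MB per piece
per_client_rate = RATE / NUM_CLIENTS   # Shemp's upload split three ways: 1.0 MB/min
phase1_minutes = piece_mb / per_client_rate

# What each client leaves unused during Phase 1.
idle_download = RATE - per_client_rate  # 2.0 MB/min
idle_upload = RATE                      # 3 MB/min, since clients upload nothing yet

print(phase1_minutes, idle_download, idle_upload)  # 30.0 2.0 3
```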

Phase 2:

Shemp sends each of the three the fourth 30 MB piece, and each of the three also sends the piece they got in Phase 1 to the other two.

As before, Shemp isn't downloading anything, but continues to upload at top speed: 1 MB/min to each of the three clients, for his maximum of 3 MB/min total.

Moe already has piece #1 of 4. He is downloading piece #4 from Shemp at 1 MB/min, piece #2 from Larry at 1 MB/min, and piece #3 from Curly at 1 MB/min. At the same time, he is uploading the piece he already has (piece #1) to Curly and Larry at 1 MB/min each. So Moe's connection is going at almost full blast: he is downloading at his maximum of 3 MB/min and uploading at 2 MB/min.

Larry already has piece #2 of 4, and he's receiving the other three pieces at 1 MB/min each while sending piece #2 to Moe and Curly at 1 MB/min each.

Curly already has piece #3 of 4, and he's receiving the other three pieces at 1 MB/min each while sending piece #3 to Moe and Larry at 1 MB/min each.

Phase 2 takes 30 minutes. When it finishes, everyone has the full file.
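To double-check Phase 2, here is a minimal simulation sketch. It assumes every transfer runs at 1 MB/min, as in the example, and the data structures are just my own way of writing the situation down:

```python
from collections import defaultdict

RATE = 3   # MB/min cap per machine, per direction
LINK = 1   # MB/min used by each individual transfer

# Who holds which pieces when Phase 2 starts.
holdings = {"Shemp": {1, 2, 3, 4}, "Moe": {1}, "Larry": {2}, "Curly": {3}}

# (sender, receiver, piece) for every transfer running during Phase 2.
transfers = [
    ("Shemp", "Moe", 4), ("Shemp", "Larry", 4), ("Shemp", "Curly", 4),
    ("Moe", "Larry", 1), ("Moe", "Curly", 1),
    ("Larry", "Moe", 2), ("Larry", "Curly", 2),
    ("Curly", "Moe", 3), ("Curly", "Larry", 3),
]

upload = defaultdict(int)
download = defaultdict(int)
after = {name: set(pieces) for name, pieces in holdings.items()}

for sender, receiver, piece in transfers:
    # A machine can only send a piece it already had before the phase began.
    assert piece in holdings[sender], "sender must already hold the piece"
    upload[sender] += LINK
    download[receiver] += LINK
    after[receiver].add(piece)

for name in holdings:
    print(name, "up", upload[name], "down", download[name], "pieces", sorted(after[name]))
```

Nobody goes over the 3 MB/min cap in either direction, and at the end of the phase all four machines hold pieces 1 through 4.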

Both phases took a total of one hour, which is half the time the traditional approach took.

By splitting up the file, sending different parts to clients, and having them help share the pieces, the time was cut to half that of the traditional approach. The clients don't need to be able to download or upload any faster; the network is simply better saturated. During Phase 2, it runs at 75% of download bandwidth (9 MB/min out of an available 12 MB/min) and 75% of upload bandwidth (9 MB/min out of an available 12 MB/min).
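Putting the final numbers side by side, as a sketch with the per-machine rates copied straight from the Phase 2 description:

```python
MACHINES = 4
RATE = 3                     # MB/min per machine, per direction
capacity = MACHINES * RATE   # 12 MB/min in each direction

# Phase 2 rates: Shemp uploads 3 and downloads 0; each client uploads 2 and downloads 3.
phase2_upload = 3 + 2 + 2 + 2
phase2_download = 0 + 3 + 3 + 3

upload_saturation = phase2_upload / capacity
download_saturation = phase2_download / capacity

traditional_minutes = 120
split_minutes = 30 + 30      # Phase 1 plus Phase 2
speedup = traditional_minutes / split_minutes

print(upload_saturation, download_saturation, speedup)  # 0.75 0.75 2.0
```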

And this is what BitTorrent does, except with much smaller pieces, redundancy in case some clients vanish without uploading their pieces, and unequal numbers of pieces sent to each client depending on how fast its download and upload speeds appear to be.
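To give a feel for the "unequal amounts" idea, here is a toy sketch that hands out pieces in proportion to each client's observed speed. To be clear, this is my own simplification and not how real BitTorrent clients decide things; the `allocate_pieces` function and the speeds below are made up for illustration:

```python
def allocate_pieces(num_pieces, speeds):
    # Give each client a share of the pieces proportional to its speed.
    total_speed = sum(speeds.values())
    shares = {name: num_pieces * speed / total_speed for name, speed in speeds.items()}

    # Start with the whole-number part of each share...
    counts = {name: int(share) for name, share in shares.items()}

    # ...then hand leftover pieces to the clients with the largest remainders.
    leftover = num_pieces - sum(counts.values())
    by_remainder = sorted(shares, key=lambda name: shares[name] - counts[name], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts

# Hypothetical speeds in MB/min: Moe is the fastest, Curly the slowest.
print(allocate_pieces(12, {"Moe": 3, "Larry": 2, "Curly": 1}))
# {'Moe': 6, 'Larry': 4, 'Curly': 2}
```

The faster a client looks, the more pieces it is trusted to pass along, which keeps a slow client from becoming the bottleneck for everyone else.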