BitTorrent: ISOs for Everyone — Fast!

Have you ever tried to download the latest ISO images for your favorite Linux distribution during the first week that it’s available? If so, you’ve probably even had trouble finding an up-to-date mirror that’d let you in, and after finding one, you were probably disappointed to see a 20 KB/sec download speed (or worse) on your cable modem or DSL line that normally downloads at 10 times that speed. And as Linux becomes more popular, the problem’s only getting worse.

The traditional solution is to add more mirror sites or pony up the cash for a distributed hosting service like Akamai. However, bandwidth for all those mirrors isn’t free, and neither is Akamai’s service. It’s difficult for a company like Red Hat or Mandrake to justify spending more money so that thousands of users can download software for free. After all, someone has to pay for the plumbing.

Luckily, Bram Cohen has been working on a project called BitTorrent for over two years. BitTorrent fundamentally changes the way large-scale file transfers work. Rather than the traditional model of many computers downloading the file from a very small number of congested sources, BitTorrent uses a peer-to-peer model. It transforms the massive network of downloading nodes into a constantly changing web of uploading and downloading peers (see Figure One).

Figure One: BitTorrent transforms the download network into a network of sharing peers

From a user’s point of view, BitTorrent means faster downloads. In fact, the more popular a particular download is, the more likely you’ll be able to get it quickly. In other words, BitTorrent scales incredibly well and doesn’t cost an astronomical amount of money to host.

If this sounds more than a little like the Napster of a few years ago, don’t worry. The similarities are only skin deep. Unlike Napster, there’s no central repository of available downloads. While that’s great for dodging the law if you’re into swapping videos, it means that finding what you’re looking for isn’t as easy as typing “Fedora Linux ISO” into a BitTorrent search box. However, you can usually track down what you need in a matter of seconds using everybody’s favorite search engine, Google.

Once you’ve found a site that lists the download you’re looking for, you simply need to find the reference to the appropriate torrent file (a file with a .torrent extension) and feed it to your BitTorrent client. The BitTorrent client uses the data in the torrent file to find peers that have copies of the data you’re interested in.
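If you’re curious what a client actually reads out of a torrent file, here’s a rough Python sketch. Torrent files are stored in a simple format called bencoding, and the tracker’s address lives under the announce key. The bdecode function is written just for this illustration, and the tracker URL, file name, and size in the sample are made up; a real torrent file carries more fields than this.

```python
def bdecode(data, i=0):
    """Decode one bencoded value from bytes, starting at index i.

    Returns (value, index just past the value).
    """
    c = data[i:i + 1]
    if c == b"i":                      # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":                      # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":                      # dictionary: d<key><value>...e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            value, i = bdecode(data, i)
            result[key] = value
        return result, i + 1
    # Otherwise a byte string: <length>:<bytes>
    colon = data.index(b":", i)
    length = int(data[i:colon])
    start = colon + 1
    return data[start:start + length], start + length


# A made-up, stripped-down stand-in for the contents of a torrent file.
url = b"http://tracker.example/announce"   # hypothetical tracker address
sample = (b"d8:announce%d:%s4:infod6:lengthi1024e4:name8:demo.isoee"
          % (len(url), url))

meta, _ = bdecode(sample)
print("tracker:", meta[b"announce"].decode())
print("file:   ", meta[b"info"][b"name"].decode())
```

To try it on a real download, read the torrent file in binary mode and pass the bytes to bdecode in place of sample.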

There are GUI and console BitTorrent clients for Linux, and you’ll find clients for Mac OS X and Windows, too. All clients offer the same basic set of features, while a few provide more bells and whistles. Choosing a client is a matter of personal preference.

It still sounds like a decentralized Napster, doesn’t it? Not quite. Unlike Napster, BitTorrent tries to download the file you need from multiple peers simultaneously. And to avoid wasted effort, it makes sure to download different pieces of the file from different peers. In other words, you’re downloading in parallel. So, even if you only find four peers that each have an upload capacity of 32 KB/sec, you’re still downloading at an effective rate of 128 KB/sec.
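The arithmetic is easy to check for yourself. The Python sketch below hands out pieces to peers in simple round-robin order and adds up the peers’ upload rates. The function names and peer labels are invented for this example, and a real client chooses pieces with more care than this, so treat it as a toy model of the idea rather than a picture of BitTorrent’s actual logic.

```python
def assign_pieces(num_pieces, peers):
    """Spread piece numbers across peers so no piece is requested twice."""
    plan = {peer: [] for peer in peers}
    for piece in range(num_pieces):
        # Piece 0 goes to the first peer, piece 1 to the second, and so on,
        # wrapping around when we run out of peers.
        plan[peers[piece % len(peers)]].append(piece)
    return plan


def effective_rate(upload_rates_kb):
    """Parallel downloads add up: the total is the sum of the peers' rates."""
    return sum(upload_rates_kb)


peers = ["peer-a", "peer-b", "peer-c", "peer-d"]   # hypothetical peers
plan = assign_pieces(8, peers)
rate = effective_rate([32, 32, 32, 32])

for peer, pieces in plan.items():
    print(peer, "sends pieces", pieces)
print("effective rate:", rate, "KB/sec")
```

With four peers at 32 KB/sec each, the total comes out to the 128 KB/sec mentioned above, and each peer is asked for a different quarter of the file.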

There’s one other piece to BitTorrent: the tracker. A tracker is the BitTorrent server that tracks activity — who’s downloading and who’s able to upload pieces of which files. Your BitTorrent client will initially need to contact a tracker before it can download anything. In keeping with the BitTorrent spirit, trackers are also decentralized. There is no list of trackers. New trackers are coming online (and going off-line) every day.
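To get a feel for the tracker’s bookkeeping, consider this toy Python model. The ToyTracker class, its method names, and the addresses are all invented for illustration. A real tracker talks to clients over the network and records more than this, but the core job is the same: remember who is working on which download and introduce newcomers to everyone else.

```python
from collections import defaultdict


class ToyTracker:
    """In-memory stand-in for a tracker: maps each download to its peers."""

    def __init__(self):
        # download name -> set of peer addresses currently taking part
        self._peers = defaultdict(set)

    def announce(self, download, peer):
        """Register a peer and return the other peers it can contact."""
        others = sorted(self._peers[download] - {peer})
        self._peers[download].add(peer)
        return others

    def leave(self, download, peer):
        """Forget a peer that has gone off-line."""
        self._peers[download].discard(peer)


tracker = ToyTracker()
print(tracker.announce("fedora-iso", "10.0.0.1:6881"))  # first peer: nobody yet
print(tracker.announce("fedora-iso", "10.0.0.2:6881"))  # learns about the first
tracker.leave("fedora-iso", "10.0.0.1:6881")
print(tracker.announce("fedora-iso", "10.0.0.3:6881"))  # only the second remains
```

Notice that the tracker never touches the file itself. It only keeps the list of peers, which is why running one costs so little bandwidth compared with a mirror.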

But Does It Work?

Okay, it’s really not Napster after all. But isn’t it chaotic? With all this decentralization and random machines contacting each other, how can it possibly work? It just does. Really!