BitTorrent clients for fun and profit

Henry Stanley

This is a work in progress.

…well, not really for profit. The world doesn’t need another BitTorrent client. But it’s a great project for learning about an interesting peer-to-peer protocol, exploring the networking stack, getting to grips with Wireshark and debugging, and seeing how things work under the hood.

Reading a .torrent

The torrent files you might find somewhere like The Pirate Bay are encoded with bencode (pronounced “B-Encode”, but I’m still gonna say “ben-code”), a terse file format for representing a few simple data structures (strings, ints, dicts and lists).

A bencoded file might look like this:

d4:infol11:hello dave!d3:foo3:bar4:fizz4:buzzei42eee

Tasty. Let’s add newlines for effect:

d
4:info
l
11:hello dave!
d
3:foo3:bar
4:fizz4:buzz
e
i42e
e
e

Elements in a bencoded structure start with a letter denoting what they are: d for dicts, l for lists, i for ints. All of those are terminated with a trailing e (so the int 42 is written i42e). Strings are a little different: they start with the length of the string in bytes, then a colon, followed by the string itself (e.g. 3:foo), with no terminator.

In the above example, the root of the structure is a dict with one key (“info”) whose value is a list containing a string, a dict and an int.

Writing a decoder for this is an interesting exercise in itself; in the end, I used jackpal/bencode-go.
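For a flavour of what that exercise involves, here is a minimal sketch of a recursive decoder using only the standard library. It is not how jackpal/bencode-go works internally; the decode function, its signature and the Go types it returns are all assumptions for illustration. It is also more lenient than a strict decoder should be (it accepts ints with leading zeros and dicts with unsorted keys, for instance).

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"strconv"
)

// decode parses one bencoded value starting at data[pos]. It returns the
// value and the offset of the first byte after it.
// Hypothetical mapping: ints -> int64, strings -> string,
// lists -> []interface{}, dicts -> map[string]interface{}.
func decode(data []byte, pos int) (interface{}, int, error) {
	if pos >= len(data) {
		return nil, pos, errors.New("unexpected end of input")
	}
	switch c := data[pos]; {
	case c == 'i':
		// i<digits>e
		rel := bytes.IndexByte(data[pos+1:], 'e')
		if rel < 0 {
			return nil, pos, errors.New("unterminated int")
		}
		end := pos + 1 + rel
		n, err := strconv.ParseInt(string(data[pos+1:end]), 10, 64)
		if err != nil {
			return nil, pos, err
		}
		return n, end + 1, nil
	case c == 'l':
		// l<items>e
		list := []interface{}{}
		pos++
		for pos < len(data) && data[pos] != 'e' {
			v, next, err := decode(data, pos)
			if err != nil {
				return nil, pos, err
			}
			list = append(list, v)
			pos = next
		}
		if pos >= len(data) {
			return nil, pos, errors.New("unterminated list")
		}
		return list, pos + 1, nil
	case c == 'd':
		// d<key><value>...e, where every key must be a string
		dict := map[string]interface{}{}
		pos++
		for pos < len(data) && data[pos] != 'e' {
			k, next, err := decode(data, pos)
			if err != nil {
				return nil, pos, err
			}
			key, ok := k.(string)
			if !ok {
				return nil, pos, errors.New("dict key is not a string")
			}
			v, next, err := decode(data, next)
			if err != nil {
				return nil, pos, err
			}
			dict[key] = v
			pos = next
		}
		if pos >= len(data) {
			return nil, pos, errors.New("unterminated dict")
		}
		return dict, pos + 1, nil
	case c >= '0' && c <= '9':
		// <length>:<bytes>
		rel := bytes.IndexByte(data[pos:], ':')
		if rel < 0 {
			return nil, pos, errors.New("string length has no colon")
		}
		start := pos + rel + 1
		n, err := strconv.Atoi(string(data[pos : pos+rel]))
		if err != nil {
			return nil, pos, err
		}
		if n > len(data)-start {
			return nil, pos, errors.New("string runs past end of input")
		}
		return string(data[start : start+n]), start + n, nil
	default:
		return nil, pos, fmt.Errorf("unexpected byte %q at offset %d", c, pos)
	}
}

func main() {
	input := []byte("d4:infol11:hello dave!d3:foo3:bar4:fizz4:buzzei42eee")
	v, _, err := decode(input, 0)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("%v\n", v)
}
```

Returning the next offset alongside the value is what lets lists and dicts recurse: each nested call reports where it stopped, and the caller picks up from there. Real torrent files contain raw binary strings (the piece hashes), so a production decoder would probably want []byte rather than string for string values.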

Here’s what the decoded contents of a real torrent file might look like as JSON (I’m using an Ubuntu image torrent):