Trackerless BitTorrent

2005-05-20

Yesterday I wrote a post about the problems with BitTorrent and trackers and today I find out that BitTorrent has gone trackerless.

The basic idea is that the list of peers downloading the same torrent doesn’t need to be held by one or more central trackers; it can be stored inside the network of peer downloaders itself. The algorithm used for this is a Distributed Hash Table. The key is a checksum of some of the info in the .torrent file, and the value corresponding to this key is the list of peers. These key–value pairs form the hash table. The table is distributed according to the following principle: the node(s) whose node address is closest to the key hold the corresponding value, in this case the list of peers.
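To make "closest to the key" concrete, here is a minimal sketch in Python. It assumes the Kademlia-style XOR metric that the BitTorrent DHT is based on; the function names and the one-byte IDs are made up for illustration (real node IDs and keys are 160 bits long).

```python
def xor_distance(a: bytes, b: bytes) -> int:
    """Distance between two equal-length IDs: XOR them and read the result as an integer."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def closest_nodes(key: bytes, node_ids: list, k: int = 1) -> list:
    """Return the k node IDs nearest to the key; these nodes would hold the peer list."""
    return sorted(node_ids, key=lambda node_id: xor_distance(node_id, key))[:k]


# Toy example with one-byte IDs (hypothetical values).
nodes = [b"\x00", b"\x0e", b"\xf0"]
holder = closest_nodes(b"\x0f", nodes)[0]
```

In this toy network the node with ID 0x0e ends up responsible for the key 0x0f, because their XOR is the smallest of the three.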

So, originally the BitTorrent client would look at the .torrent file to find the tracker URL inside it. It would then contact this tracker, receive a list of peers, and start downloading the file. Now, the client downloads the .torrent file, computes the key from it, uses this key to contact the node that has the peer list (the Distributed Hash Table lookup), receives the peer list and starts downloading. Look Ma, no hands, no tracker necessary!
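The "computes the key" step can be sketched as follows. As far as I know the key is the torrent's info hash, which is the SHA-1 digest of the bencoded "info" dictionary in the .torrent file. The tiny encoder and the sample dictionary below are my own illustration, not code from the BitTorrent client.

```python
import hashlib


def bencode(value) -> bytes:
    """Minimal bencoder for ints, byte strings, text, lists and dicts."""
    if isinstance(value, int):
        return b"i" + str(value).encode() + b"e"
    if isinstance(value, bytes):
        return str(len(value)).encode() + b":" + value
    if isinstance(value, str):
        return bencode(value.encode("utf-8"))
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be written in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda pair: pair[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(info: dict) -> bytes:
    """20-byte key used for the Distributed Hash Table lookup."""
    return hashlib.sha1(bencode(info)).digest()


# Hypothetical, heavily trimmed "info" dictionary.
sample_info = {"name": "example.iso", "piece length": 262144, "length": 1048576}
key = info_hash(sample_info)
```

Because the key depends only on the info dictionary, every client holding the same .torrent file computes the same key and ends up asking the same part of the network for peers.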

It may look perfect, but there is one problem: how does the BitTorrent client find the node where the peer list resides? The good thing about a Distributed Hash Table lookup is that you do not need to know every single node on the network, but you do need to know at least one of them. So how does BitTorrent find at least one of the peer downloaders?
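Why one known node is enough can be shown with a toy lookup. This is a deliberately simplified greedy walk over an in-memory table, not the real protocol: an actual Kademlia-style lookup sends network queries, asks several nodes in parallel and keeps buckets of contacts. The routing table and the small integer IDs are invented for the example.

```python
def lookup(key: int, start: int, routing: dict) -> int:
    """Walk from a single known node towards the node closest to the key.

    routing maps a node ID to the set of node IDs that node knows about,
    standing in for "ask this node which nodes it knows near the key".
    """
    current = start
    while True:
        known = routing.get(current, set()) | {current}
        nearest = min(known, key=lambda node_id: node_id ^ key)
        if nearest == current:
            # Nobody this node knows is closer, so the walk ends here.
            return current
        current = nearest


# Hypothetical network: we only know node 1, the key is 13.
routing = {1: {4, 8}, 4: set(), 8: {12}, 12: {13, 14}, 13: set(), 14: set()}
found = lookup(13, start=1, routing=routing)
```

Starting from node 1, the walk hops to 8, then 12, then 13, each time moving to a node whose XOR distance to the key is smaller, without ever needing a full list of nodes.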

I generated a .torrent file with the 4.1.0 Beta BitTorrent client to find out. The program hung, but it got far enough to create a torrent file. It turns out the answer is router.bittorrent.com. There is a new piece of info attached to the torrent file, called node, and it contains the address router.bittorrent.com, port 6881. Your client will contact this server to find its way into the Distributed Hash Table. router.bittorrent.com is not a tracker and it won’t participate in any downloading or uploading at all. It’s just a well-known node in the Distributed Hash Table.

I’m not sure if this is a good idea. The tracker is gone, but instead of multiple trackers, each of them serving different torrents, we now have a single point of failure for all “trackless automatic” torrents, not just the ones confined to a certain tracker. I guess that’s why there’s a “trackless node” option, where you can fill in a “first contact” node yourself. This means that there are three ways to set up a torrent: the old-fashioned way with a tracker, the new way with a “first contact” node of your own, or you could rely on router.bittorrent.com.
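The three ways of setting up a torrent can be sketched as one small function. This is an illustration under assumptions, not the client's actual code: "announce" is the usual tracker key in a .torrent file, the "node" key is what I saw in the generated file, and the [host, port] layout of its value as well as the function and parameter names are my own guesses.

```python
DEFAULT_ROUTER = ("router.bittorrent.com", 6881)


def make_torrent_meta(info: dict, tracker_url=None, first_contact=None) -> dict:
    """Build the top-level torrent dictionary for one of the three setups."""
    meta = {"info": info}
    if tracker_url is not None:
        # 1. The old-fashioned way: a tracker hands out the peer list.
        meta["announce"] = tracker_url
    else:
        # 2. A "first contact" node of your own, or
        # 3. fall back to the well-known router node.
        host, port = first_contact or DEFAULT_ROUTER
        meta["node"] = [host, port]  # assumed layout of the value
    return meta


# Hypothetical examples of the three setups.
info = {"name": "example.iso"}
with_tracker = make_torrent_meta(info, tracker_url="http://tracker.example.org/announce")
with_own_node = make_torrent_meta(info, first_contact=("dht.example.org", 6881))
automatic = make_torrent_meta(info)
```

Only the third setup depends on router.bittorrent.com being reachable, which is exactly the single point of failure I am worried about.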

In conclusion, trackerless BitTorrent makes BitTorrent easier for peer-to-peer file sharing, but it will not make a difference for reliably serving large content from a shared webhost. The seeding problem remains: you will have to keep seeding your torrent for as long as you want it to stay alive. The tracker part is gone, but that was the easiest one to implement on a shared webhost, because the tracker protocol is just HTTP. Seeding is more difficult from a webhost: it uses long-lived connections and a more complex protocol. I can’t find any PHP seeders at the moment.