openSUSE download redirector

History

To ease the download of openSUSE, the first version of the download redirector was developed in late 2005 and presented at FOSDEM 2006. This first proof of concept didn't include any of the features that are present in today's redirector, accessible at download.openSUSE.org.

The first two iterations of the redirector were written in PHP. They were superseded by the current implementation, written in C as an Apache module. There was also a prototype implementation in Ruby, which can be integrated via FastCGI, but the Apache module is what we use.

The current implementation is a combined effort by several people, notably Jürgen Weigert and Martin Polster (scanner, mirrorprobe), Peter Poeml (mod_mirrorbrain, mirrorprobe), and Marcus Rueckert (the Ruby prototype). Many other people have contributed with their input, in particular Christoph Thiel, who implemented the first two iterations of the download redirector. (Those aren't used anymore, but the current implementation has its roots in their essential technical design.)

What does it do?

The goal of download.openSUSE.org is to provide automatic and transparent mirror selection that fits each user best, based on their location (GeoIP) and on mirror performance. To achieve this, there is an entire framework that forms a kind of "mirror brain". It keeps a mirror database as a "state cache" for every file on every mirror. This database is updated continuously by the mirror "scanner", which can crawl mirrors via FTP, HTTP and rsync. Another essential part of the framework is the monitoring: a daemon that periodically probes the mirrors with HTTP requests to check their online status, so that redirects work at any time. The key part of the framework is the redirector itself, an Apache module called mod_mirrorbrain. It uses MaxMind's GeoIP, a free database that maps IP addresses to countries and regions, to figure out the location of the requester. It then queries the mirror database for a list of potential mirrors and chooses the best one.

Most openSUSE downloads (download.opensuse.org, software.opensuse.org, ftp.opensuse.org) are handled in this way. Some are not: for security reasons, certain files (like signatures) are delivered directly from download.opensuse.org. Tiny files are another exception: an HTTP redirect would be about the same size as the file itself, so delivering the file directly saves the client a further round trip. In addition, files that are not present on any mirror yet, or are not intended to be mirrored at all, are sent out directly. All in all, these exceptions result in about 50% of requests being redirected to mirrors.
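The exceptions above can be sketched as a small decision function. This is only an illustration in Python, not the code of mod_mirrorbrain: the function name, the list of suffixes and the size threshold are all assumptions made for the example.

```python
# Hypothetical values, chosen for illustration only; the real
# redirector's configuration may differ.
NO_REDIRECT_SUFFIXES = (".asc", ".sig", ".sha256", ".md5")  # assumed "signature-like" files
MIN_REDIRECT_SIZE = 2048  # bytes; assumed cutoff for "tiny" files


def should_redirect(path: str, size: int, mirror_count: int) -> bool:
    """Return True if a request should be redirected to a mirror."""
    if path.endswith(NO_REDIRECT_SUFFIXES):
        return False  # security-relevant files are delivered directly
    if size < MIN_REDIRECT_SIZE:
        return False  # a redirect would cost about as much as the file itself
    if mirror_count == 0:
        return False  # not on any mirror yet, or not meant to be mirrored
    return True
```

A large ISO image that is known on several mirrors would be redirected, while its signature file would be served by the download server itself.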

More information about the technical implementation, as well as background documentation, can be found on the MirrorBrain project page.

So how does it work?

The way mirrors are selected might be easiest to follow when looking at some pseudocode. The algorithm goes like this:
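What follows is a rough sketch in Python rather than the module's actual code. It assumes that each mirror carries a country code, a region code and a numeric score, that mirrors in the requester's country are preferred over mirrors in the same region, and that the choice within a group is random and weighted by score. The Mirror class, its field names and the select_mirror function are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Mirror:
    base_url: str
    country: str    # e.g. "de"
    region: str     # e.g. "eu"
    score: int      # assumed weight: higher means chosen more often
    online: bool    # as last reported by the mirror probe
    has_file: bool  # as recorded by the scanner for the requested file


def select_mirror(mirrors: List[Mirror], client_country: str,
                  client_region: str,
                  rng: Optional[random.Random] = None) -> Optional[Mirror]:
    """Pick a mirror for one request, or None if no mirror is usable."""
    if rng is None:
        rng = random.Random()
    # Only mirrors that are online and known to have the file qualify.
    candidates = [m for m in mirrors
                  if m.online and m.has_file and m.score > 0]
    if not candidates:
        return None
    same_country = [m for m in candidates if m.country == client_country]
    same_region = [m for m in candidates if m.region == client_region]
    # Prefer the closest non-empty group.
    pool = same_country or same_region or candidates
    # Weighted random choice within the group.
    return rng.choices(pool, weights=[m.score for m in pool], k=1)[0]
```

Returning None corresponds to the case where the server has to deliver the file itself.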

Once a mirror is selected, the redirector returns an HTTP status code 302 (Found) and includes a Location header with the redirection URL, which makes the requester go there. If no mirror is known for a given file, the server will simply deliver the file itself.
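The response described above can be illustrated as follows. This is a minimal sketch: the function name and the way the mirror's base URL is joined with the requested path are assumptions, and the status 200 only stands in for "the server delivers the file itself".

```python
from typing import Dict, Optional, Tuple


def build_response(path: str,
                   mirror_base_url: Optional[str]) -> Tuple[int, Dict[str, str]]:
    """Return (status code, headers) for a request to the redirector."""
    if mirror_base_url is None:
        # No mirror is known for this file: deliver it directly.
        return 200, {}
    # Join base URL and path with exactly one slash between them.
    location = mirror_base_url.rstrip("/") + "/" + path.lstrip("/")
    return 302, {"Location": location}
```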

There are some important exceptions. For certain files it is hard to make sure that they are current on all mirrors, because they change frequently. Thus, the server doesn't redirect for such files.

Mirror "Stickiness"

In the past, we used "mirror stickiness":

Once a client had been redirected to a certain mirror, it was redirected to
the same mirror again on the next request, and not to another randomly
chosen one.

This configuration proved to have no benefit over just randomly assigning mirrors, so it is no longer active (since about early 2008).

Built-in Metalink support

The redirector generates metalink files (see http://metalinker.org). Metalink-enabled clients can automatically fail over to another mirror in case of problems, or even download from several mirrors in parallel. A metalink is returned whenever ".metalink" is appended to the URL of a file to be downloaded.
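As a small illustration, the metalink URL for a file can be derived from its download URL like this. The helper name is hypothetical, and keeping a query string intact is an extra assumption of the sketch.

```python
from urllib.parse import urlsplit, urlunsplit


def metalink_url(file_url: str) -> str:
    """Append ".metalink" to the path component of a download URL."""
    parts = urlsplit(file_url)
    if parts.path.endswith(".metalink"):
        return file_url  # already a metalink URL
    return urlunsplit(parts._replace(path=parts.path + ".metalink"))
```

For a file served as http://download.opensuse.org/some/path/image.iso (a made-up path), this yields the same URL with ".metalink" appended.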

The redirector supports injection of verification hashes and PGP signatures into the metalinks, and it includes them for most larger files, like ISO images.

More info about using metalinks to download openSUSE can be found here.

Other tidbits

The "central" manner of distributing files has an interesting implication: it allows us to gather data about which files are requested, which we otherwise couldn't do. For this purpose we have an additional Apache module in place. It collects statistics about downloads of individual Build Service packages. In principle, this module splits up the path and file name and either logs the components or increases counters in an SQL database. The sources can be found here:
https://forgesvn1.novell.com/svn/opensuse/trunk/tools/download-stats/mod_stats/.
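The idea of splitting a request path and increasing a counter can be sketched like this. It is not the code of the statistics module: the table layout, the function names and the use of SQLite are assumptions made to keep the example self-contained.

```python
import sqlite3
from typing import Tuple


def open_stats_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the (hypothetical) statistics database and create its table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS downloads ("
        " directory TEXT NOT NULL,"
        " filename TEXT NOT NULL,"
        " hits INTEGER NOT NULL,"
        " PRIMARY KEY (directory, filename))"
    )
    return db


def split_request_path(path: str) -> Tuple[str, str]:
    """Split '/dir/sub/file.rpm' into ('/dir/sub', 'file.rpm')."""
    directory, _, filename = path.rpartition("/")
    return directory, filename


def count_download(db: sqlite3.Connection, path: str) -> None:
    """Increase the download counter for one requested file."""
    directory, filename = split_request_path(path)
    # Create the row on first sight, then bump the counter.
    db.execute(
        "INSERT OR IGNORE INTO downloads (directory, filename, hits)"
        " VALUES (?, ?, 0)",
        (directory, filename),
    )
    db.execute(
        "UPDATE downloads SET hits = hits + 1"
        " WHERE directory = ? AND filename = ?",
        (directory, filename),
    )
    db.commit()
```

Calling count_download once per served request would leave one row per file, with the number of requests in the hits column.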