Analysis Machine

A Perl script calls various plugins that sniff around on FTP and HTTP servers run by the major Linux distributions to discover when Fedora, Debian, and other distros update their packages.

The software in this Perl column is, to some extent, commissioned work: The editors of the German edition, Linux-Magazin, needed a tool that would crawl the update servers of major Linux distributions to find release dates for a cover story. Because Perl has a reputation for being good at complex queries and string manipulation, this column is devoted entirely to that tool. The objective was a script that accepts a query on the command line, searches several targets, filters out the matching package names, collects their dates, and finally outputs a table of the results.
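The flow described above (query in, plugins searched, matches filtered, table out) can be pictured with a minimal sketch. It is not the script from this column: the plugin names `distro_a` and `distro_b`, the `search` function, and the listings are hypothetical placeholders, and real plugins would fetch their data from FTP or HTTP servers instead of returning canned entries.

```perl
use strict;
use warnings;

# Hypothetical plugin registry: each plugin returns [package, date] pairs.
# The listings are made-up placeholder data, not real release dates.
my %plugins = (
    distro_a => sub { ( [ 'foo-1.0', '2000-01-01' ], [ 'bar-2.1', '2000-01-02' ] ) },
    distro_b => sub { ( [ 'foo-1.1', '2000-02-01' ] ) },
);

# Run every plugin and keep the entries whose package name contains the query.
sub search {
    my ($query) = @_;
    my @rows;
    for my $distro ( sort keys %plugins ) {
        for my $entry ( $plugins{$distro}->() ) {
            my ( $pkg, $date ) = @$entry;
            push @rows, [ $distro, $pkg, $date ] if $pkg =~ /\Q$query\E/;
        }
    }
    return @rows;
}

# Take the query from the command line; fall back to a sample query.
my $query = @ARGV ? $ARGV[0] : 'foo';

# Print one table row per match: distribution, package, date.
printf "%-10s %-12s %s\n", @$_ for search($query);
```

Keeping each distribution behind its own code reference means a new target only needs one more entry in the registry; the filtering and table output stay untouched.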

The difficulty is that Linux distributions offer their release information in different formats. For example, packages for Ubuntu and Debian reside on an FTP server – not all in one directory, but split up into subdirectories whose names consist of the first letter of the package name (Figure 1).
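To make the directory layout concrete, here is a small sketch of how a script might derive the subdirectory from a package name. The helper name `pool_subdir` is hypothetical. The special case for library packages, which Debian-style pools group under `lib` plus the following character, goes beyond what the text above states and should be treated as an assumption to verify against the actual server.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the subdirectory that holds a package.
sub pool_subdir {
    my ($name) = @_;

    # Assumption: library packages live under "lib" plus the next
    # character (for example "libx" for libxml2).
    return $1 if $name =~ /^(lib.)/;

    # Everything else is filed under the first letter of its name.
    return substr( $name, 0, 1 );
}

# Show the relative path for a few sample package names.
for my $pkg (qw(perl apache2 libxml2)) {
    printf "%-10s %s/%s\n", $pkg, pool_subdir($pkg), $pkg;
}
```

A crawler can then limit itself to the one subdirectory that can contain the requested package instead of walking the whole tree.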
