E. History

LinkController was originally inspired by MOMspider, and having the
MOMspider code available was very useful when I started creating
this kit. However, it shares almost no code with MOMspider other than
what has come to it through the LibWWW-Perl library.

Philosophically, the MOMspider heritage is obvious in the wish to handle
big jobs efficiently. In working practice there are far more
differences than similarities, partly because of changes in the Perl
language.

I decided to completely separate the exploration of the local
infostructure, looking for links to be checked, from the actual checking
process. This means that checking can be spread over a large number of
days and still run efficiently.
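
The separation described above can be illustrated with a minimal sketch.
It is written in Python for brevity and is not the kit's own Perl code;
the names extract_links, build_schedule and run_checks, the per-run
budget and the 30 day recheck interval are all assumptions made for the
illustration. The check function is passed in, so the sketch makes no
network requests itself.

```python
import heapq
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect href targets from anchor tags in one page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(pages):
    """Phase 1: scan local pages (name -> HTML text), return the set of links."""
    found = set()
    for html in pages.values():
        parser = LinkExtractor()
        parser.feed(html)
        found.update(parser.links)
    return found


def build_schedule(links, start_day=0):
    """Queue every link as (day_due, url); all are due on start_day at first."""
    schedule = [(start_day, url) for url in sorted(links)]
    heapq.heapify(schedule)
    return schedule


def run_checks(schedule, today, budget, check, recheck_after=30):
    """Phase 2: check at most `budget` due links, then reschedule each one."""
    results = {}
    while schedule and budget > 0 and schedule[0][0] <= today:
        _, url = heapq.heappop(schedule)
        results[url] = check(url)
        # Hypothetical policy: look at the same link again after a fixed gap.
        heapq.heappush(schedule, (today + recheck_after, url))
        budget -= 1
    return results
```

Because phase 2 only ever looks at the head of the schedule, each run
can stop as soon as its budget is used up and pick up where it left off
on a later day, without repeating the exploration of the pages.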

The basic aim of this link checking kit is to handle a link checking
job of any size efficiently. At the bottom end we have
checking new pages as they are written. Here we want to use
information from previous checks to avoid having to check every link on
each page every time. At the other end we have massive infostructures
(sites) which contain many thousands of links that could not possibly
all be checked in one day. For this latter case the aim is to
spread the link checking load efficiently across all available low
usage periods.

My primary aim in writing this was not to write very efficient code for
the small scale case (takes minimum time to do everything), but rather
code which would scale well. If your system can check 1000 links in two
days, it will hopefully be able to check almost 7000 links in two weeks.
I'm trying to make sure all data structures which grow with the number
of links are kept on disk.
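
As an illustration of keeping link data on disk rather than in memory,
here is a minimal sketch using Python's standard dbm module. It is not
the kit's own storage code: the function names, the JSON record layout
and the choice of dbm are assumptions made for the example.

```python
import dbm
import json


def record_result(db_path, url, status, day):
    """Store one link's latest check result in an on-disk key/value file."""
    record = json.dumps({"status": status, "day": day})
    with dbm.open(db_path, "c") as db:  # "c": create the file if missing
        db[url.encode("utf-8")] = record.encode("utf-8")


def lookup_result(db_path, url):
    """Fetch one link's record without loading the rest of the table."""
    key = url.encode("utf-8")
    with dbm.open(db_path, "r") as db:
        if key not in db:
            return None
        return json.loads(db[key].decode("utf-8"))
```

Each call touches only the record for one link, so memory use stays
roughly constant however many links the file holds.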

Esoterica Internet Portugal

Esoterica provided me with full access to the Internet in Portugal and
free use of their computers, which allowed me to keep up work on both
this software and the Linux Access HOWTO. In particular I'd like to
thank all of the members of staff who helped me so much. These people
include Mario Francisco Valente (the instigator of Mini Linux), who
first agreed to me using their kit, set me up to use their machines and,
along with Luis Sequeira, provided a sounding board for some ideas.
Luis also provided the odd lift home in the evening. Thanks also to
Martim de Magalhaes Pereira and Mr Mendes. See them all on

IPPT PAN Poland

Thanks go to IPPT PAN (part of PAN - Polska Akademia Naukowa) in Poland
and in particular Piotr Pogorzelski who allowed me use of facilities for
testing this software, provided a willing victim for having his web
pages tested and made a number of suggestions which have been
incorporated into the software.

The Tardis Project

Supported by the Computing Science department of the University of
Edinburgh, the Tardis project provides an experimental framework in
which students, former students and other related people can do their
own work on fully Internet connected Unix and Linux hosts.

The use of the facilities of the Tardis Project has made it much easier
for me to develop software like this. In particular, the large amount
of disk space the administrators have allowed me to use is very useful.

Other Free Software Authors

It is through the software provided by the Free Software Foundation
(such as the gcc C compiler, Emacs and the file utilities), the
authors of the various packages which make up a working Linux system
(Linux by Linus Torvalds, Alan Cox, etc.; filesystems and support by
Theodore Ts'o, Stephen Tweedie, etc.; Linux-Libc by H.J. Lu, based on
GNU glibc from the FSF; the list goes on) and the authors
of Perl and its modules, especially Gisle Aas and Martijn Koster for
LibWWW-Perl, that I was able to set this up.

I'd particularly like to thank Tim Goodwin, the author of the Perl CDB
module, who made and accepted a number of alterations to it at my
request. These alterations made this package simpler to write and
easier to maintain.