Arquivo.pt is a research infrastructure that preserves millions of files collected from the web since 1996 and provides a public search service over this information. It contains information in several languages.
Periodically it collects and stores information published on the web. Then, it processes the collect data to make it searchable, providing a “Google-like” service that enables searching the past web (English user interface available at www.archive.pt).
This preservation workflow is performed through a large-scale distributed information system and can also accessed through API.