WebReaper

Web crawler or spider, which can work its way through a website, downloading pages, pictures and objects that it finds

editor's review

WebReaper is a tool that can scan and download the content of a website.

The user interface of the program is plain and simple. You can start the crawling process by entering a URL.

You can then view a list of discovered links, along with the URL or title, size, type, date of last modification and status information (e.g. "Filtered: Server depth").

You can also add filters based on crawl depth, date of last modification, time since last modification and download time.

Moreover, you can open a filter wizard to choose whether to download from any directory or only from the root's subdirectories, and whether to download pages from any server or domain or only from the server specified in the root URL.
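To make the two scope choices concrete, here is a minimal Python sketch of how such a check could work. It is not WebReaper's actual code: the function name `in_scope` and its parameters are hypothetical, and it assumes that "the root's subdirectories" means paths under the directory of the root URL on the same server.

```python
from urllib.parse import urlparse
import posixpath


def in_scope(root_url, candidate_url, same_server_only=True, subdirs_only=True):
    """Return True if candidate_url passes the scope rules (illustrative only)."""
    root = urlparse(root_url)
    cand = urlparse(candidate_url)
    same_host = cand.netloc.lower() == root.netloc.lower()

    # "Download only from the server specified in the root URL"
    if same_server_only and not same_host:
        return False

    # "Only download files in the root's subdirectories"
    if subdirs_only:
        # Assumption: a path on another server can never count as a
        # subdirectory of the root.
        if not same_host:
            return False
        # Directory of the root URL, e.g. "/docs/" for "/docs/index.html".
        root_dir = posixpath.dirname(root.path)
        if not root_dir.endswith("/"):
            root_dir += "/"
        if not (cand.path or "/").startswith(root_dir):
            return False

    return True


root = "http://example.com/docs/index.html"
print(in_scope(root, "http://example.com/docs/a/b.html"))
print(in_scope(root, "http://example.com/other/x.html"))
print(in_scope(root, "http://example.com/other/x.html", subdirs_only=False))
```

With both options switched off, every link is in scope, which corresponds to the "any directory, any server or domain" choices in the wizard.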

In the same wizard you can specify keywords or phrases to skip, limit the download to recently changed objects, select which file formats will be downloaded, and set the number of link levels to follow as well as the maximum sizes for HTML and binary files.
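The wizard's remaining options can be pictured as one combined accept/reject test per link. The sketch below is an assumption-laden illustration, not the program's implementation: the class `DownloadFilter` and all of its field names are hypothetical, keywords are matched against the URL (the program may match page text instead), and the format list is applied only to non-HTML files.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional
from urllib.parse import urlparse
import posixpath


@dataclass
class DownloadFilter:
    skip_keywords: tuple = ()          # links containing any of these are skipped
    allowed_extensions: tuple = ()     # empty tuple means "allow every format"
    max_depth: int = 3                 # number of link levels to follow
    max_age: Optional[timedelta] = None        # "only recently changed objects"
    max_html_bytes: Optional[int] = None
    max_binary_bytes: Optional[int] = None

    def accepts(self, url, depth, size, is_html, last_modified=None, now=None):
        if depth > self.max_depth:
            return False

        lowered = url.lower()
        if any(k.lower() in lowered for k in self.skip_keywords):
            return False

        # Assumption: the format list restricts binary files, not pages.
        if not is_html and self.allowed_extensions:
            ext = posixpath.splitext(urlparse(url).path)[1].lower()
            if ext not in self.allowed_extensions:
                return False

        # Unknown sizes and dates are let through rather than rejected.
        limit = self.max_html_bytes if is_html else self.max_binary_bytes
        if limit is not None and size is not None and size > limit:
            return False

        if self.max_age is not None and last_modified is not None:
            now = now or datetime.now()
            if now - last_modified > self.max_age:
                return False

        return True


f = DownloadFilter(skip_keywords=("logout",), allowed_extensions=(".jpg", ".png"),
                   max_depth=2, max_age=timedelta(days=7),
                   max_html_bytes=1000, max_binary_bytes=5000)
print(f.accepts("http://example.com/a.html", depth=1, size=500, is_html=True))
print(f.accepts("http://example.com/big.jpg", depth=1, size=6000, is_html=False))
```

Accepting links whose size or modification date is unknown is a deliberate choice in this sketch, since a server does not always report those values.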

Furthermore, you can assign batch jobs, clear the log, URL history and links, exclude links from the download process, open a link using the locally stored copy, and set the currently selected node as the root.

In addition, you can create and organize a favorites list, add a new filter to the filter tree, save, load, import and export filters, and more. The program used a moderate-to-high amount of system resources and didn't freeze or crash during our tests.

However, there is no help file available, the size and date of last modification couldn't be retrieved for some files, and the interface could use some improvements. Novices would most likely have a hard time learning how to use all the features provided by WebReaper. We mainly recommend it to experienced users.

This version adds smart ordering when downloading: images and similar objects are given priority, so more pages should be complete if a site download is stopped before it finishes. There are also options to replace .ASP/.PHP/.ASPX file extensions with .HTML, which works better for local browsing, and to automatically add an .HTML extension to files that had none. A new option includes the download date in the local folder name. Support for downloading images embedded in CSS files has also been added. This version is compiled with Visual Studio.Net.
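The ordering and renaming behaviour described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `DownloadQueue`, `local_name` and the set `PRIORITY_EXTS` are hypothetical names, the set of prioritized types is a guess (the notes only say "images, etc."), and `local_name` expects a plain file name rather than a full path.

```python
import heapq
import posixpath
from itertools import count
from urllib.parse import urlparse

# Assumption: which types count as "images, etc." is not documented.
PRIORITY_EXTS = {".gif", ".jpg", ".jpeg", ".png"}
SCRIPT_EXTS = {".asp", ".php", ".aspx"}


class DownloadQueue:
    """Pops prioritized objects first; ties keep their insertion order."""

    def __init__(self):
        self._heap = []
        self._order = count()

    def push(self, url):
        ext = posixpath.splitext(urlparse(url).path)[1].lower()
        priority = 0 if ext in PRIORITY_EXTS else 1
        heapq.heappush(self._heap, (priority, next(self._order), url))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


def local_name(filename):
    """Map a remote file name to a name that opens well in a local browser."""
    base, ext = posixpath.splitext(filename)
    if ext == "" or ext.lower() in SCRIPT_EXTS:
        return base + ".html"
    return filename


queue = DownloadQueue()
for url in ("http://example.com/index.html", "http://example.com/logo.png",
            "http://example.com/about.html"):
    queue.push(url)
while queue:
    print(queue.pop())

print(local_name("page.php"), local_name("readme"), local_name("pic.png"))
```

Because images already queued are fetched before further pages, an interrupted download is more likely to leave the saved pages with their pictures intact.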