Is there any way to download only certain pages (for instance, several parts of an article that is spread over several HTML documents)?
– don.joey Feb 18 '13 at 9:16

@Private Yes, although it's probably easier to use Python or something to fetch the pages (depending on the layout/URL). If the URLs of the pages differ only by an incrementing number, or you have a list of the pages, you could probably use wget in a bash script.
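A minimal sketch of the bash-loop idea, assuming a hypothetical URL pattern that differs only by a page number (substitute the real base URL and range). The actual wget call is left commented out so you can inspect the URL list first:

```shell
#!/bin/sh
# Fetch the parts of an article whose URLs differ only by a page number.
# (Hypothetical URL pattern -- adjust "base" and the seq range for the real site.)
base="https://example.com/article?page="
for n in $(seq 1 5); do
    url="${base}${n}"
    echo "would fetch: $url"
    # wget --wait=2 "$url"   # uncomment to actually download;
                             # --wait pauses between retrievals, which
                             # is friendlier to the server
done
```

Run it once as-is to confirm the generated URLs look right, then uncomment the wget line.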
– Vreality Apr 10 '13 at 22:00


You might consider using the --wait=seconds argument if you want to be more friendly to the site; it will wait the specified number of seconds between retrievals.
– belacqua Feb 7 '14 at 21:27

HTTrack allows you to download a World Wide Web site from the Internet to a local directory, building recursively all directories, getting HTML, images, and other files from the server to your computer. HTTrack arranges the original site's relative link-structure.
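HTTrack also ships a command-line tool alongside the GUI. A sketch of an equivalent invocation, assuming a hypothetical target URL (the command is echoed rather than run, since mirroring a real site takes time and bandwidth):

```shell
#!/bin/sh
# Command-line equivalent of a basic HTTrack mirror (hypothetical URL --
# substitute the site you want to copy).
# -O sets the output directory; the "+" filter keeps the mirror on one site.
cmd='httrack "https://example.com/" -O ./mirror "+example.com/*"'
echo "$cmd"
# run with:  eval "$cmd"     (requires the httrack package)
```

The mirrored pages land under ./mirror with their relative links rewritten for offline browsing.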

WebHTTrack Website Copier is a handy tool to download a whole website onto your hard disk for offline browsing. Launch the Ubuntu Software Center and type "webhttrack website copier" (without the quotes) into the search box. Select and install it from the Software Center onto your system. Start WebHTTrack from either the launcher or the start menu; from there you can begin enjoying this great tool for your site downloads.

I don't know about sub-domains (i.e. sub-sites), but wget can be used to grab a complete site. Take a look at this Super User question.
It says that you can use -D domain1.com,domain2.com to download different domains in a single run. I think you can use that option to download sub-domains, e.g. -D site1.somesite.com,site2.somesite.com
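Putting the flags together, a sketch of a recursive grab spanning two sub-domains (the host names are the hypothetical ones from above; substitute the real site). The command is built and echoed rather than executed, so nothing is downloaded until you run it yourself:

```shell
#!/bin/sh
# -r recurses, -H allows wget to span hosts, and -D restricts the hosts
# that -H may span to -- so only the listed sub-domains are followed.
# (Hypothetical sub-domains; replace with the real ones.)
domains="site1.somesite.com,site2.somesite.com"
cmd="wget -r -H -D $domains --wait=1 http://site1.somesite.com/"
echo "$cmd"
# run with:  eval "$cmd"
```

Without -D, the -H flag would let the recursion wander off to any host the pages link to, so the two options belong together.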

I use Burp - the spider tool is much more intelligent than wget, and can be configured to avoid sections if necessary. The Burp Suite itself is a powerful set of tools to aid in testing, but the spider tool is very effective.

from the licence: WARNING: BURP SUITE FREE EDITION IS DESIGNED TO TEST FOR SECURITY FLAWS AND CAN DO DAMAGE TO TARGET SYSTEMS DUE TO THE NATURE OF ITS FUNCTIONALITY. TESTING FOR SECURITY FLAWS INHERENTLY INVOLVES INTERACTING WITH TARGETS IN NON-STANDARD WAYS WHICH CAN CAUSE PROBLEMS IN SOME VULNERABLE TARGETS. YOU MUST TAKE DUE CARE WHEN USING THE SOFTWARE, YOU MUST READ ALL DOCUMENTATION BEFORE USE, YOU SHOULD BACK UP TARGET SYSTEMS BEFORE USE AND YOU SHOULD NOT USE THE SOFTWARE ON PRODUCTION SYSTEMS OR OTHER SYSTEMS FOR WHICH THE RISK OF DAMAGE IS NOT ACCEPTED BY YOU.
– Kat Amsterdam May 8 '12 at 18:00

For what it does, the price tag is amazingly cheap - I would recommend buying it for a wide range of security testing. And it is very easy to configure it to test exactly as you want - safer than AppScan in some instances :-)
– Rory Alsop May 8 '12 at 18:22


@KatAmsterdam Regarding specifically the compatibility question: According to Wikipedia, Burp Suite is a Java application, so it should run fine on Ubuntu.
– Eliah Kagan Apr 10 '13 at 22:21

Kat - it runs just fine on various flavours of Linux. The warning on the licence is the same as any tool you can use for security assessments.
– Rory Alsop Aug 24 '16 at 15:42

If speed is a concern (and the server's wellbeing is not), you can try puf, which works like wget but can download several pages in parallel. It is, however, not a finished product: it is unmaintained and horribly undocumented. Still, to download a website with lots and lots of smallish files, it might be a good option.
