If a website is offline or restricts how quickly it can be crawled, downloading from someone else’s cache may be necessary.
In previous posts I discussed using Google Translate and Google Cache to help crawl a website.
Another useful source is the Wayback Machine at archive.org, which has been crawling and caching webpages since 1996.
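
As a rough illustration of how a cached copy might be fetched, here is a minimal sketch that queries the Wayback Machine availability endpoint at `https://archive.org/wayback/available` and downloads the closest snapshot it reports. The function names (`availability_url`, `closest_snapshot`, `download_from_wayback`) are my own for this example, and the JSON layout is assumed from the endpoint's documented response, so treat it as a starting point rather than a finished downloader.

```python
import json
import urllib.parse
import urllib.request

# Availability endpoint of the Wayback Machine
AVAILABILITY_API = 'https://archive.org/wayback/available'


def availability_url(url, timestamp=None):
    """Build the availability query for a URL.

    timestamp is an optional string such as YYYYMMDD; the API
    then reports the snapshot closest to that date.
    """
    params = {'url': url}
    if timestamp:
        params['timestamp'] = timestamp
    return AVAILABILITY_API + '?' + urllib.parse.urlencode(params)


def closest_snapshot(response):
    """Return the URL of the closest snapshot in a parsed API
    response, or None if no snapshot is available.
    """
    closest = response.get('archived_snapshots', {}).get('closest')
    if closest and closest.get('available'):
        return closest.get('url')
    return None


def download_from_wayback(url, timestamp=None):
    """Download the archived copy of a URL, or return None if the
    Wayback Machine reports no snapshot for it.
    """
    with urllib.request.urlopen(availability_url(url, timestamp)) as f:
        response = json.load(f)
    snapshot_url = closest_snapshot(response)
    if snapshot_url is None:
        return None
    with urllib.request.urlopen(snapshot_url) as f:
        return f.read()
```

Calling `download_from_wayback('amazon.com', '19990101')` should then return the raw bytes of the snapshot nearest that date, assuming one exists and the service is reachable.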

Here is the list of downloads available for a single webpage, amazon.com: