Blocking bad bots and site rippers (aka offline browsers)

The definition of a "bad bot" varies depending on who you ask, but most
would agree they are spiders that do far more harm than good on your
site (e.g., an email harvester). Site rippers, on the other hand, are
offline browsing programs that a surfer may unleash on your site to crawl
and download every one of its pages for offline viewing. In either case,
your site's bandwidth and server resources get jacked up as a result,
sometimes to the point of crashing your server. Bad bots typically ignore
the wishes of your robots.txt file, so you'll want to ban them using other
means, such as .htaccess. The trick is to identify a bad bot.
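One common way to do this is to match on the User-Agent header with
mod_rewrite. Below is a minimal sketch of such .htaccess rules; the
user-agent strings shown (EmailSiphon, HTTrack, WebZIP, and so on) are
illustrative examples of well-known harvesters and rippers, not a
definitive blocklist:

```apache
# Enable the rewrite engine (requires mod_rewrite).
RewriteEngine On

# Match common bad bots and site rippers by User-Agent, case-insensitively.
# Extend this alternation with any other agents you want to ban.
RewriteCond %{HTTP_USER_AGENT} (EmailSiphon|EmailWolf|HTTrack|WebZIP|Teleport) [NC]

# [F] sends a 403 Forbidden; [L] stops processing further rules.
RewriteRule .* - [F,L]
```

Keep in mind that the User-Agent header is trivially spoofed, so this
stops only bots honest enough to identify themselves.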

Bots banned this way will all receive a 403 Forbidden error when trying
to view your site. The resulting bandwidth savings and reduction in
server resource usage can be significant in many cases.