How To Block Bots, Ban IP Addresses With .htaccess

Friday, November 5th, 2010 at 12:47 am

Got a spambot or scraper constantly showing up in your server logs? Or maybe there’s another site that’s leeching all your bandwidth? Perhaps you just want to ban a user from a certain IP address? In this article, I’ll show you how to use .htaccess to do all of that and more!

Identifying bad bots

So you’ve noticed a certain user-agent keeps showing up in your logs, but you’re not sure what it is, or whether you want to ban it? There are a few ways to find out:

Once you’ve determined that the bot is something you want to block, the next step is to add it to your .htaccess file.

Blocking bots with .htaccess

This example, and all of the following examples, can be placed at the bottom of your .htaccess file. If you don’t already have a file called .htaccess in your site’s root directory, you can create a new one.
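A minimal sketch of such a rule, assuming mod_rewrite is enabled on your server. "BadBot" and "go.away" are the placeholder names used in the explanation that follows:

```apache
# Turn on the rewrite engine (requires mod_rewrite)
RewriteEngine On
# Match any user-agent string that starts with "BadBot"
RewriteCond %{HTTP_USER_AGENT} ^BadBot
# Redirect every matching request to the non-existent site go.away
RewriteRule ^ http://go.away/ [R,L]
```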

So, what does this code do? It’s simple: the above lines tell your webserver to check for any bot whose user-agent string starts with "BadBot". When it sees a bot that matches, it redirects them to a non-existent site called "go.away".

Now, that’s great to start with, but what if you want to block more than one bot?
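Here is a sketch of how that might look, again assuming mod_rewrite is available. The three bot names are made-up placeholders, so swap in the user-agents from your own logs:

```apache
RewriteEngine On
# Each condition tests the user-agent; [OR] chains it to the next one
RewriteCond %{HTTP_USER_AGENT} ^BadBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^EvilScraper [OR]
# The last condition in the list takes no [OR]
RewriteCond %{HTTP_USER_AGENT} ^FakeUser
RewriteRule ^ http://go.away/ [R,L]
```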

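The code above shows the same thing as before, but this time I’m blocking 3 different bots. Note the "[OR]" option after the first two bot names: by default the server requires every condition to match, so "[OR]" tells it that matching any one of them is enough. The last bot in the list doesn’t need it.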

Blocking Bandwidth Leeches

Say there’s a certain forum that’s always hotlinking your images, and it’s eating up all your bandwidth. You could replace the image with something really gross, but in some countries that might get you sued! The best way to deal with this problem is simply to block the site, like so:
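A minimal sketch, assuming mod_rewrite is enabled and that the images you want to protect end in .gif, .jpg, .jpeg or .png (adjust the list of extensions to suit your site):

```apache
RewriteEngine On
# Match requests referred from somebadforum.com, with or without "www."
RewriteCond %{HTTP_REFERER} ^https?://(www\.)?somebadforum\.com [NC]
# Refuse image requests from that referrer with 403 Forbidden
RewriteRule \.(gif|jpe?g|png)$ - [F,NC]
```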

This code will return a 403 Forbidden error to anyone trying to hotlink your images on somebadforum.com. The end result: users on that site will see a broken image, and your bandwidth is no longer being stolen.

Banning An IP Address

Sometimes you just don’t want a certain person (or bot) accessing your website at all. One simple way to block them is to ban their IP address:

order allow,deny
deny from 192.168.44.201
deny from 224.39.163.12
deny from 172.16.7.92
allow from all

The example above shows how to block 3 different IP addresses. Sometimes you might want to block a whole range of IP addresses:

order allow,deny
deny from 192.168.
deny from 10.0.0.
allow from all

The above code will block any IP address starting with "192.168." or "10.0.0." from accessing your site.
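The Order, Allow and Deny directives used here come from Apache 2.2. On Apache 2.4 they only work through the mod_access_compat module, and the newer Require syntax is preferred. A rough equivalent of the two IP examples above, assuming Apache 2.4 with mod_authz_core and mod_authz_host loaded:

```apache
<RequireAll>
    # Let everyone in...
    Require all granted
    # ...except these individual addresses
    Require not ip 192.168.44.201
    Require not ip 224.39.163.12
    Require not ip 172.16.7.92
    # ...and these whole ranges (a partial address matches by prefix)
    Require not ip 192.168
    Require not ip 10.0.0
</RequireAll>
```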

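Finally, here’s the code to block a specific ISP by hostname. The server does a reverse DNS lookup on each visitor’s IP address and denies access if the resulting hostname matches or ends with one of the names listed: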

order allow,deny
deny from some-evil-isp.com
deny from subdomain.another-evil-isp.com
allow from all
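If your server runs Apache 2.4, where the Require directive replaces Order, Allow and Deny, the hostname block could be sketched like this, assuming mod_authz_core and mod_authz_host are loaded:

```apache
<RequireAll>
    Require all granted
    # Deny visitors whose reverse DNS name matches or ends with these hosts
    Require not host some-evil-isp.com
    Require not host subdomain.another-evil-isp.com
</RequireAll>
```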

Final notes on using .htaccess

As you can see, .htaccess is a very powerful tool for controlling who can do what on your website. Because it’s so powerful, it’s also fairly easy for things to go wrong. If you have any mistakes or typos in your .htaccess file, the server will return a 500 Internal Server Error page instead of showing your site, so be sure to back up your .htaccess file before making any changes.

If you’d like to learn more about writing .htaccess files, I recommend checking out the Definitive Guide to Mod_Rewrite. This book covers everything you need to know about Apache’s .htaccess rewrite system.

PS: If your webhost doesn’t support .htaccess, it’s time to get a better one!
