A few months ago, I wrote up a little tutorial on stopping hotlinking (or hot-linking, also known as bandwidth theft) called “Selective hotlinking prevention through .htaccess.” The idea was simple: prevent random users from stealing bandwidth while allowing defined directories to be hotlinked, e.g. for posting images on a message board. The technique described in the previous entry is still valid, but I’d like to describe an improved and more efficient approach to hotlinking prevention.

My “policies” on hotlinking

Most webmasters are content to simply prevent hotlinking and save bandwidth. I do a bit more. Here are my “policies” on hotlinking:

By default, an image cannot be hotlinked.

Users who link directly to an image hosted on my website are redirected to a descriptive Web page.

Specified directories allow hotlinking.

All anti-hotlinking rules are contained in a single .htaccess file. (As you’ll note below, this is a change from my previous tutorial.)

The regular expressions used to defeat hotlinking are (hopefully!) as efficient as possible.

Redirecting hotlinked images to a descriptive Web page

You’ve probably noticed that many webmasters choose to redirect hotlinked images to something funny or intentionally offensive, or simply to drop the request at the Web server level (accomplished with the “F” [forbidden] flag). I decided to do something different. I haven’t seen this anywhere else, so maybe it’s an original idea… But this is the Internet, where everything’s already been done somewhere by someone, so I highly doubt that I thought of it first.

My idea was to redirect HTTP requests for hotlinked images to a Web page giving context to the image. Of course, if you redirect a request for a JPEG to an HTML page, a user attempting to load the image via an IMG tag is going to get the ol’ red X—we’ve got a serious content-type mismatch. (This is already the behavior of anti-hotlinking sites that deny that request altogether.) But hotlinking can also be thought of in a looser sense. When other users link directly to content on your site without providing its context, it can also be a form of hotlinking. For example, by linking directly to images on your site without linking to the parent page that describes the pictures, another website can hijack your bandwidth.

Thus, I decided to redirect requests for images not originating from my Web site to a “container” Web page, which provides a link to my site, explains that the image is hosted at underscorebleach.net, and looks prettier than the image by itself. Here is an example image of Carrot Top looking like the Ultimate Warrior:

Lines 2 through 6 allow hotlinking from my site, Bloglines, Google, and cached items. I also allow requests with a null HTTP_REFERER value to obtain the image; this occurs in the case of bookmarks, some proxies, some browser settings, some third-party privacy plugins, etc. If you try to get tricky and force users to have a referrer from your own domain, you’re likely to get yourself in trouble. Trust me.

The last line redirects users to an SHTML page. Notice that I pass the value of the REQUEST_URI as a parameter in the URL to view_image.shtml. In the source, I then use a simple SSI directive to output the image. Here is the source of view_image.shtml. (file has .txt extension but put it on your website as .shtml) [updated 9/14/05 for clarity]

A quick note about this redirection technique: You do need to actually redirect, that is, use the “R” flag on the RewriteRule. If you don’t, you’re going to feed the browser an HTML page when it’s expecting an image, and it’s liable to get confuzzled.
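A minimal sketch of the kind of rule set described above, not the original listing. The five RewriteCond lines correspond to what the text calls lines 2 through 6. The exact referrer patterns for Bloglines, Google, and cached pages are assumptions, as is the image extension list.

```apache
RewriteEngine On

# Let requests with an empty Referer through (bookmarks, proxies, privacy tools).
RewriteCond %{HTTP_REFERER} !^$
# Let the site's own pages load their images.
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?underscorebleach\.net(/|$) [NC]
# Let Bloglines and Google through (host patterns are assumptions).
RewriteCond %{HTTP_REFERER} !^https?://([^/]+\.)?bloglines\.com(/|$) [NC]
RewriteCond %{HTTP_REFERER} !^https?://([^/]+\.)?google\.[a-z.]+(/|$) [NC]
# Let cached copies through (pattern is an assumption).
RewriteCond %{HTTP_REFERER} !search\?q=cache [NC]
# Any other request for an image is redirected (R flag) to the container page,
# with the requested path passed along as the query string.
RewriteRule \.(gif|jpe?g|png)$ /view_image.shtml?%{REQUEST_URI} [NC,R,L]
```

Because the conditions are all negated and ANDed together, the redirect only fires when the referrer matches none of the allowed sources. The container page itself does not end in an image extension, so it is never caught by the rule.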

And voila! Now, when users link to your images, they get the image, but they also get an unavoidable little advertising pitch from you, the webmaster and payer of bandwidth bills.

(Note for technical users: I’ve enabled extensionless URLs on my Web site, so “view_image.shtml” is actually “view_image” and is still server-parsed. Also, if you write PHP, feel free to convert the example view_image.shtml container page to PHP and post a link to it in the comments.)

Consolidating directives into a single .htaccess file

The power and the frustration of .htaccess is its overriding sway. A rule applied to the directory /example/ applies to /example/subdirectory/ on down. It can be “undone,” but it’s a little tricky. In my previous tutorial, I recommended inserting a “dummy rule” in a subdirectory’s .htaccess to undo rules from its parent directory. I no longer recommend this technique, because it’s too unwieldy to maintain in the case of many subdirectories. Besides, it’s inelegant.

The superior approach is to place all anti-hotlinking rules in a single .htaccess file at the root level of the website. Here is the key point: Rules for subdirectories must be higher (executed first) in the file. Combined with the “L” (last) flag that can be applied to a RewriteRule, we create a situation simulating an “if” conditional in a programming language married to a “break” statement. Here is an example for a directory /hotlinking-allowed:
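A minimal sketch of such a file, assuming it sits in the site root and reuses the container-page redirect from earlier. The referrer pattern and the extension list are illustrative assumptions, not the original listing.

```apache
RewriteEngine On

# First rule set: requests under /hotlinking-allowed/ are left untouched ("-"),
# and the L flag stops rule processing, so the block below never runs for them.
RewriteCond %{REQUEST_URI} ^/hotlinking-allowed/
RewriteRule ^ - [L]

# Second rule set: everywhere else, image requests that carry a foreign
# Referer are redirected to the container page.
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?underscorebleach\.net(/|$) [NC]
RewriteRule \.(gif|jpe?g|png)$ /view_image.shtml?%{REQUEST_URI} [NC,R,L]
```

The order matters: if the general rule came first, it would redirect the request before the exemption was ever checked.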

So here’s what we’ve accomplished. With the above .htaccess, we’ve disallowed hotlinking from every directory except /hotlinking-allowed/. When an HTTP request for the hotlinked image http://underscorebleach.net/hotlinking-allowed/take_it.jpg comes in, it matches the first set of rules in the .htaccess file. Apache applies the rule and does not step through the remaining rules. However, when a request for the hotlinked image http://underscorebleach.net/tsk_tsk.jpg arrives, it won’t match the first set of rules. Then the second set of rules will smack it down and re-direct it to our cute, little container page.

Allowing hotlinking from multiple directories

Want to apply the above to multiple directories? No problemo. For the directories tacobell, bestfood, and ever, we’d use this:
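A sketch of the exemption for several directories at once, assuming the same "pass through and stop" pattern as the single-directory case. It would go above the general anti-hotlinking rule in the root .htaccess.

```apache
# Parentheses group the alternatives: the path must start with one of
# the three directory names.
RewriteCond %{REQUEST_URI} ^/(tacobell|bestfood|ever)/
# Leave the request unchanged and stop processing further rules.
RewriteRule ^ - [L]
```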

(Note: This tutorial originally had a typo in the RewriteCond above, showing brackets ‘[’ instead of parentheses ‘(’. This was an important error in regexp terms: brackets define a character class, so the faulty RewriteCond matched on a single character rather than on the three directory names.)

Consolidating multiple .htaccess files

Already have a bunch of .htaccess files and want to consolidate them as described above? Maybe you can’t even find them anymore? Well, time to pull out our ol’ friend find.

cd to your home Web directory.

Execute: find . -name .htaccess

Clean up.

Parting words, closing thoughts, hopes and dreams

I hope that inspires all of you Googling webmasters. If anything’s unclear, drop a comment on this page so everyone will benefit from your question. There is a LOT of information in the full comments; please read through what’s already been covered before posting a comment. You’ll get an answer more quickly, and you won’t waste my time. Many people are trying to accomplish the same ends. Thanks!

Recently, a few of my images started ranking highly on Google Image Search for terms like "icon" or "bullet point". An unfortunate side effect of this newfound popularity has been dozens of people hotlinking my images on their websites without permission.

At first, I didn’t really mind, but it reached a point where 9 out of 10 hits on my website were for hotlinked images. So I decided I had to do something about it.

The solution was to write an .htaccess file to block hotlinks, and place it in my photos directory. The code looks something like this:

Basically, this code filters out image requests based on the site that sent the request (aka the "referer"). It only affects images with the following extensions: jpeg, jpg, gif, or png.

So, how does it work? The first line turns the rewrite engine on, which allows us to redirect requests. The second line allows viewing images from blank referers; this is important because some browsers won’t send referers, even if the image is linked on your own website. The next three lines allow my own site, and two other sites, to link to my images. The final line redirects anyone else to "bad.jpg" on example.org.
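The file described line by line above could be sketched like this. The three allowed hostnames are hypothetical placeholders. Only the "bad.jpg" on example.org target and the extension list come from the text.

```apache
# Line 1: turn the rewrite engine on.
RewriteEngine On
# Line 2: let requests with a blank referer through.
RewriteCond %{HTTP_REFERER} !^$
# Lines 3-5: let my own site and two friendly sites through (hypothetical names).
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?mysite\.example(/|$) [NC]
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?friend-one\.example(/|$) [NC]
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?friend-two\.example(/|$) [NC]
# Final line: send everyone else to a replacement image hosted elsewhere.
RewriteRule \.(jpeg|jpg|gif|png)$ http://example.org/bad.jpg [R,L]
```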

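Keep in mind, if you’re going to redirect someone to a different image, that image must not be caught by the same rule (the simplest way is to host it on a different server), or the redirected request will be redirected again and you will create an infinite loop!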

Alternatively, you can simply block the hotlinked request by changing the last line to the following:

RewriteRule \.(jpeg|jpg|gif|png)$ - [F]

Instead of being redirected, the user will just see a broken image.

You can also use this code to block things besides images (MP3s or Zip files, for example). Just add the file extension into the last line, separated by a pipe character, like so:
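For instance, a sketch based on the blocking variant of the last line shown earlier, with MP3 and Zip files added to the list:

```apache
# Refuse hotlinked requests for images, MP3s, and Zip files with a 403.
RewriteRule \.(jpeg|jpg|gif|png|mp3|zip)$ - [F]
```

This line replaces the final RewriteRule. The RewriteCond lines above it stay as they are.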

Got a spambot or scraper constantly showing up in your server logs? Or maybe there’s another site that’s leeching all your bandwidth? Perhaps you just want to ban a user from a certain IP address? In this article, I’ll show you how to use .htaccess to do all of that and more!

Identifying bad bots

So you’ve noticed a certain user-agent keeps showing up in your logs, but you’re not sure what it is, or if you want to ban it? There’s a few ways to find out:

Once you’ve determined that the bot is something you want to block, the next step is to add it to your .htaccess file.

Blocking bots with .htaccess

This example, and all of the following examples, can be placed at the bottom of your .htaccess file. If you don’t already have a file called .htaccess in your site’s root directory, you can create a new one.

So, what does this code do? It’s simple: the above lines tell your webserver to check for any bot whose user-agent string starts with "BadBot". When it sees a bot that matches, it redirects them to a non-existent site called "go.away".
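A minimal sketch matching that description, assuming mod_rewrite is available. "BadBot" and "go.away" are the placeholder names used in the text.

```apache
RewriteEngine On
# Match any user-agent string that starts with "BadBot".
RewriteCond %{HTTP_USER_AGENT} ^BadBot
# Redirect every request from that agent to a site that does not exist.
RewriteRule ^ http://go.away/ [R,L]
```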

Now, that’s great to start with, but what if you want to block more than one bot?

The code above does the same thing as before, but this time I’m blocking 3 different bots. Note the "[OR]" option after the first two bot names: conditions are normally combined with AND, so [OR] tells the server that matching any one name in the list is enough.
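A sketch of the three-bot version. The second and third bot names are hypothetical, chosen only for illustration.

```apache
RewriteEngine On
# [OR] chains the conditions: a match on any one of them triggers the rule.
RewriteCond %{HTTP_USER_AGENT} ^BadBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^EvilScraper [OR]
# The last condition in the list carries no [OR].
RewriteCond %{HTTP_USER_AGENT} ^FakeUser
RewriteRule ^ http://go.away/ [R,L]
```

Leaving [OR] on the last condition, or forgetting it on an earlier one, changes which requests match, so it is worth double-checking when adding a name to the list.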

Blocking Bandwidth Leeches

Say there’s a certain forum that’s always hotlinking your images, and it’s eating up all your bandwidth. You could replace the image with something really gross, but in some countries that might get you sued! The best way to deal with this problem is simply to block the site, like so:

This code will return a 403 Forbidden error to anyone trying to hotlink your images on somebadforum.com. The end result: users on that site will see a broken image, and your bandwidth is no longer being stolen.
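A sketch of a rule with that behaviour, assuming the same image extensions as in the earlier examples:

```apache
RewriteEngine On
# Match referers coming from somebadforum.com, with or without "www.".
RewriteCond %{HTTP_REFERER} ^https?://(www\.)?somebadforum\.com [NC]
# Answer image requests from that site with 403 Forbidden.
RewriteRule \.(jpeg|jpg|gif|png)$ - [F]
```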

Banning An IP Address

Sometimes you just don’t want a certain person (or bot) accessing your website at all. One simple way to block them is to ban their IP address:

order allow,deny
deny from 192.168.44.201
deny from 224.39.163.12
deny from 172.16.7.92
allow from all

The example above shows how to block 3 different IP addresses. Sometimes you might want to block a whole range of IP addresses:

order allow,deny
deny from 192.168.
deny from 10.0.0.
allow from all

The above code will block any IP address starting with "192.168." or "10.0.0." from accessing your site.

Finally, here’s the code to block a specific ISP by hostname (Apache looks up the visitor’s hostname from their IP address to make the comparison):

order allow,deny
deny from some-evil-isp.com
deny from subdomain.another-evil-isp.com
allow from all
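The order/deny/allow directives above are the Apache 2.2 style. On Apache 2.4 and later they only work through the mod_access_compat module. If your host runs 2.4, a rough equivalent, assuming .htaccess overrides for authorization are permitted, might look like this (addresses and hostname reused from the examples above):

```apache
<RequireAll>
    # Start from "everyone is allowed"...
    Require all granted
    # ...then exclude a single address,
    Require not ip 192.168.44.201
    # a whole range, given as a partial address,
    Require not ip 10.0.0
    # and clients whose hostname falls under this domain.
    Require not host some-evil-isp.com
</RequireAll>
```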

Final notes on using .htaccess

As you can see, .htaccess is a very powerful tool for controlling who can do what on your website. Because it’s so powerful, it’s also fairly easy for things to go wrong. If you have any mistakes or typos in your .htaccess file, the server will spit out an Error 500 page instead of showing your site, so be sure to back up your .htaccess file before making any changes.

If you’d like to learn more about writing .htaccess files, I recommend checking out the Definitive Guide to Mod_Rewrite. This book covers everything you need to know about Apache’s .htaccess rewrite system.

PS: If your webhost doesn’t support .htaccess, it’s time to get a better one!
