Using mod_rewrite to control access

This document supplements the mod_rewrite reference documentation. It describes
how you can use mod_rewrite to control access to
various resources, and other related techniques.
This includes many examples of common uses of mod_rewrite,
including detailed descriptions of how each works.

Note that many of these examples won't work unchanged in your
particular server configuration, so it's important that you understand
them, rather than merely cutting and pasting the examples into your
configuration.

The following technique forbids the practice of other sites
including your images inline in their pages. This practice is
often referred to as "hotlinking", and results in
your bandwidth being used to serve content for someone else's
site.

Solution:

This technique relies on the value of the
HTTP_REFERER variable, which is optional. As
such, it's possible for some people to circumvent this
limitation. However, most users will experience the failed
request, which should, over time, result in the image being
removed from that other site.

There are several ways that you can handle this
situation.

In this first example, we simply deny the request if it did not
originate from a page on our site. For the purposes of this example,
we assume that our site is www.example.com.
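A minimal ruleset for this might look like the following. It assumes the hostname www.example.com from above, and the set of image extensions (gif, jpg, png) is an illustrative choice; adjust both to match your site:

```apache
# Allow requests with no Referer header at all (the header is optional),
# then forbid image requests whose Referer is not our own site.
RewriteCond "%{HTTP_REFERER}" "!^$"
RewriteCond "%{HTTP_REFERER}" "!www.example.com" [NC]
RewriteRule "\.(gif|jpg|png)$" "-" [F,NC]
```

The first RewriteCond exempts empty referers, so clients that send no Referer header (and direct visits) still receive the image; only requests that claim to come from some other site are denied.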

In this recipe, we discuss how to block persistent requests from
a particular robot, or user agent.

The standard for robot exclusion defines a file,
/robots.txt, that specifies those portions of your
website where you wish to exclude robots. However, some robots
do not honor these files.

Note that there are methods of accomplishing this which do
not use mod_rewrite. Note also that any technique that relies on
the client's USER_AGENT string can be circumvented
very easily, since that string can be changed.
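One such mod_rewrite-free approach, sketched below, uses mod_setenvif together with the Apache 2.4 authorization directives. The robot name and directory path are placeholders:

```apache
# Tag any request whose User-Agent matches the robot (case-insensitive).
SetEnvIfNoCase User-Agent "NameOfBadRobot" bad_bot

<Directory "/var/www/secret/files">
    <RequireAll>
        # Admit everyone except requests carrying the bad_bot mark.
        Require all granted
        Require not env bad_bot
    </RequireAll>
</Directory>
```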

Solution:

We use a ruleset that specifies the directory to be
protected, and the client USER_AGENT that
identifies the malicious or persistent robot.

In this example, we are blocking a robot called
NameOfBadRobot from a location
/secret/files. You may also specify an IP address
range, if you are trying to block that user agent only from the
particular source.
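A ruleset along these lines might look like the following sketch; the address range 192.0.2.* is a placeholder drawn from the documentation address block, and the second RewriteCond may be dropped if you want to block the user agent regardless of source:

```apache
# Block the named robot, but only when it comes from this address range.
RewriteCond "%{HTTP_USER_AGENT}" "^NameOfBadRobot"
RewriteCond "%{REMOTE_ADDR}"     "=192\.0\.2\.[1-9]"
RewriteRule "^/secret/files/"    "-" [F]
```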

As noted above, this technique is trivial to circumvent by simply
modifying the USER_AGENT request header. If you
are experiencing a sustained attack, you should consider blocking
it at a higher level, such as at your firewall.
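A related recipe is to maintain a blacklist of hosts, rather like hosts.deny, and deny access to any client listed in it. The map file shown below is consulted by a ruleset such as the following sketch; the file path is an assumption to be adjusted to your layout:

```apache
RewriteEngine on

# Look up clients in the map file shown below; each entry carries a
# dummy "-" value because RewriteMap requires key/value pairs.
RewriteMap hosts-deny "txt:/path/to/hosts.deny"

# First condition: match on the client IP address.
RewriteCond "${hosts-deny:%{REMOTE_ADDR}|NOT-FOUND}" "!=NOT-FOUND" [OR]
# Second condition: match on the resolved hostname
# (only useful with HostNameLookups On -- see the Discussion below).
RewriteCond "${hosts-deny:%{REMOTE_HOST}|NOT-FOUND}" "!=NOT-FOUND"
RewriteRule "^" "-" [F]
```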

##
## hosts.deny
##
## ATTENTION! This is a map, not a list, even when we treat it as such.
## mod_rewrite parses it for key/value pairs, so at least a
## dummy value "-" must be present for each entry.
##

193.102.180.41 -
bsdti1.sdm.de -
192.76.162.40 -

Discussion:

The second RewriteCond assumes that you have HostNameLookups turned
on, so that client IP addresses will be resolved. If that's not the
case, you should drop the second RewriteCond, and drop the
[OR] flag from the first RewriteCond.
