Welcome to the Perishable Press “Blacklist Candidate” series. In this post, we continue our new tradition of exposing, humiliating, and banishing spammers, crackers, and other worthless scumbags.

Imagine, if you will, an overly caffeinated Bob Barker, hunched over his favorite laptop, feverishly scanning his server access logs. Like some underpaid factory worker pruning defective bobble heads from a Taiwanese assembly line, Bob rapidly identifies and isolates suspicious log entries with laser focus. Upon further investigation, affirmed spammers, scrapers and crackers are swiftly blacklisted from future access. For the most heinous offenders, we suddenly hear Rod Roddy’s booming voice echo throughout the room:

“Candidate number 2008-03-09, COME ON DOWN!! — you’re the next contestant to get blacklisted from the site!”

Every week, I dig through my access and error logs to learn from the spammers, scrapers, and other cracker whores. Typically, attempts to exploit potential security vulnerabilities demonstrate the following characteristics:

- indexed URLs targeted via attack strings
- multiple URLs tested for each attack
- attacks occur quickly, usually within seconds
- multiple IPs used for each attack
- IPs vary widely, even randomly
- many attacks originate from Latin American, Asia Pacific, and RIPE networks

These trends are associated with a large majority of attacks, occurring frequently enough to be dismissed without further investigation. Attacks that deviate significantly from these familiar patterns are of particular interest, especially those involving a single IP address, enduring for longer time periods, or employing unusual attack methods. Such attacks pose a greater risk by demonstrating premeditation, threatening performance and compromising security. These more serious types of attacks are investigated fully and subsequently featured in the monthly Blacklist Candidate series. In this edition of the series, we expose, humiliate, and banish blacklist candidate #2008-03-09: IP address 87.248.163.54!

Synopsis

On March 4th, 2008, an attacker identified with IP address 87.248.163.54 attempted to access a series of nonexistent URLs, each consisting of the site’s blog root ( https://perishablepress.com/press/ ) followed by a character string matching the following pattern:
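The pattern sample itself is not preserved in this copy of the post. Based on the Joomla paths mentioned later in the comments (‘/components/’ and ‘/administrator/’), the requested URLs presumably looked something like the following hypothetical reconstruction:

```
https://perishablepress.com/press/administrator/...
https://perishablepress.com/press/components/...
```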

Using variations of these URLs, the attacker hit my server approximately 100 times over the course of four minutes (from 15:01 to 15:05), averaging an attack every 2.4 seconds. Most likely, the attacker employed an automated script to execute the requests. Further, given the uniformity of the target URL and the similarity of the appended attack strings, this attack seems to be targeting a specific software platform that is not installed on the Perishable Press domain. This indicates that the attack was not specifically targeted at my site, but rather happened as a random vulnerability check. To prevent further attacks, the associated IP address was blocked on March 5th via htaccess. No similar incidents have occurred since.

Further, the attacker employed a blank (unidentified) user-agent for every recorded attack.

Discussion

Although probably random, this attack was deliberate, automated, and hostile. Crackers trying to access URLs containing the term “administrator” are not your friends, and should be blocked immediately and dealt with accordingly. Too many people have grown accustomed to such attacks, easily dismissing them as “normal” or even “expected” activity on the Web. Wake up, folks! These mindless cracker whores are attacking your personal assets and deserve to be hunted down and punished as criminals. Would you casually dismiss someone trying to break into your car 100 times? I don’t think so.

Details

Here are the first and last log entries for the attack. As discussed, the entries omitted between them are all similar:
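The log excerpts themselves are not preserved in this copy. Given the details stated above (IP 87.248.163.54, blank user-agent, nonexistent URLs requested during the 15:01–15:05 window on March 4th), an entry in Apache’s combined log format would have looked roughly like this illustrative, reconstructed line (the exact path, timestamp, and timezone are hypothetical):

```
87.248.163.54 - - [04/Mar/2008:15:01:02 -0800] "GET /press/administrator/ HTTP/1.1" 404 - "-" "-"
```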

Unfortunately, although this method would prevent further attacks, it would also block any legitimate URLs containing instances of the target strings. [ Update: Don from rants.thenexus.tk has confirmed that this first method will prevent Joomla users from accessing certain pages. ] Thus, for this particular blacklist candidate, we are better served by simply denying the attacker’s unique IP address:
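The rule itself is not reproduced here; in Apache 2.2-style htaccess, denying a single IP address generally takes a form like this sketch (not necessarily the exact directives used on the site):

```apache
# Blacklist candidate 2008-03-09: deny all requests from this address
<Limit GET POST PUT>
	Order Allow,Deny
	Allow from all
	Deny from 87.248.163.54
</Limit>
```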

great article again dude;
a note on the Joomla side of using this trick: don’t :p
The ‘/components/’ string will muck about and possibly stop components from working (tested on the only live component I had, PUArcade).
And the ‘/administrator/’ string will stop you getting in to your back end.

Yes, that’s what I figured. It almost seems as if the attack was intended for Joomla, although other platforms may use similar structures. In either case, thanks for the confirmation; I will definitely advise against the first method of blocking via htaccess.

[disclaimer:] Consider me a n00b who has spent quite some time reading up on your posts here before posting this.

Background: I have recently migrated my site from Expression Engine to WordPress and – as expected – I’ve been deluged by spam from the second I took the new site live. I had a couple of top spots on Google with the old site that I redirected permanently to the new site. They’re being harvested one by one and … well, you know the rest.

Ever since then – I wrote a post on it on my site – I’ve been reading around I don’t know how many sites to find a viable solution for my problem, a solution that’s “future-proof”.

Your blacklisting of attempts via a predefined and extendable blacklist seems to be the most future-proof, meaning I can work my way into your way of doing things and not waste any time trying to understand other plugins and whatnot … which might eventually disappear from the face of this earth.

In short, I think if I follow your excursions into swampy territories here, try to learn as I go along and adapt to the ever-changing tactics of the spam league, I might be able to transfer that knowledge CMS-independently in the future. You know, time-saving and all of that.

So, before I become a regular around here and while hoping that you won’t abandon your Don-Quixote-like fight against spam ;) , I have two (simple?) questions to get me going:

You started out with a blacklist which you then offered up (forgive me if I misread that) in a more compressed and efficient format. If, again, I understood correctly, all the subsequent posts (like this one) then offer up tidbits that I could add to the .htaccess file to build upon my/our defence(s). Am I correct? That means I would take the compressed blacklist file and add on everything that you’ve posted on this issue since then?

Secondly, I already have a .htaccess file several hundred lines long. There was no way I could rewrite the former links into the new ones (they didn’t really have a common denominator … or I was too dumb), so I have a bit more than one hundred permanent redirects in there that will be thrown out once Google has figured out that what was once there isn’t there anymore (or is elsewhere).
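For reference, permanent redirects of the kind described are typically single-line directives, one per moved URL; a hypothetical example (the paths are placeholders, not the commenter’s actual URLs):

```apache
# 301 = moved permanently; search engines update their index accordingly
Redirect 301 /old-page/ https://example.com/new-page/
```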

So, last question: If I tack your blacklist and snippets onto the file, will that have a major (minor I don’t care about) effect on the loading time of my site’s pages? The redirects didn’t seem to hurt much.

Thanks for bearing with me and for taking the time to read this.
I hope I wasn’t too much of a nuisance.

First question: yes, you read correctly; the first blacklist was replaced with an optimized (compressed) version that was also updated with new bots and other villains. I recommend using the compressed version over the original. As for combining these different htaccess strategies (i.e., the “ultimate blacklist” and the “2G blacklist”), you are correct. For example, within my site’s root htaccess file, I placed a copy of my compressed blacklist after the other (non-blacklist) directives. That placement isn’t strictly necessary, but I thought I would mention it. Then, after the first set of blacklist directives, and as a distinct and separate set of rules, I included a copy of the 2G blacklist. Beyond these two lists, I have several rules blocking specific user agents, followed by a final set of individual “deny from” directives.

So, to generalize this strategy (and to answer your question more directly), you should be able to add as many spam-blocking and other security-related rules as necessary to your root htaccess file. They are designed and intended to operate independently of one another in a perpetual state of symbiotic bliss. ;) As long as the code blocks are kept distinct and accurate, all should be fine.
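The layered arrangement described above can be sketched as follows. This is only an outline of the ordering, with placeholder comments standing in for the actual blacklist contents and a hypothetical user-agent name:

```apache
## non-blacklist directives (rewrites, redirects, etc.)
RewriteEngine on

## compressed "ultimate" blacklist goes here, as its own block

## 2G blacklist goes here, as a distinct and separate block

## rules blocking specific user agents (agent name is hypothetical)
SetEnvIfNoCase User-Agent "EvilScraper" keep_out

## final set of individual "deny from" directives
Order Allow,Deny
Allow from all
Deny from env=keep_out
Deny from 87.248.163.54
```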

And this leads very nicely into the response to your second question, which is also a resounding “yes”. As with any script, processing time is related to the number of executed steps. htaccess is no different; however, I have seen (and used) htaccess files with many more directives than we are talking about here. Of course, depending on the server and available resources, your mileage may vary. To check the performance for yourself, upload the htaccess file you intend to use and use a tool such as a free website speed test to determine the average load time. Then, remove the rules in question and run the test again. Comparing averages should give you a good indication of the effect on loading time. Incidentally, if you decide to run such a test, I would love to hear the results! :)
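As a do-it-yourself alternative to an online speed test, here is a minimal Python sketch (not a tool mentioned in the article) that averages the time taken by repeated runs of any action, such as fetching a page; the URL in the usage comment is a placeholder:

```python
import time
from urllib.request import urlopen

def average_seconds(action, runs=5):
    """Run the given zero-argument callable several times and
    return the mean elapsed wall-clock time in seconds."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        action()
        total += time.perf_counter() - start
    return total / runs

# Usage sketch: time page loads before and after editing .htaccess,
# then compare the two averages (URL is a placeholder):
# baseline = average_seconds(lambda: urlopen("https://example.com/").read())
```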

Now, I have a question for you. I read through the comments in your intense WordPress: Spam Magnet article, and saw that Perishable Press itself had been blacklisted from your workplace’s filtering system. What’s up with that, anyway? I always thought of this site as the exact antithesis to all things spam-related on the Internet ;) No worries though — I understand the intricate complexities of administrative security policies (sarcasm). In any case, I hope this information helps! Let me know if I may be of any further assistance.

Thanks for your detailed reply. I’m glad I got it for once. For me this is all a totally new way of approaching things, and I will finally dare to jump in and learn how all these commands work in detail. I always stayed away from .htaccess because messing things up there usually screws up everything else as well. So, I’ll follow your instructions here (starting next weekend) and see what happens. I’ll also report back once I have things in place and let you know if I have any speed issues.

The comment about your site having been blacklisted came from another visitor who never returned, so I have no idea how that came about. But, and I’m just guessing here, if someone approaches spam fighting by giving some plugin free rein without keeping it in check, the same thing will happen that has been developing on my site … eventually, just about everything and everyone gets blocked, deterred, or blacklisted. You then need to take action to reverse some of the steps the plugin took to protect you from spam.

It’s the latter that had me thinking about searching for another solution. I’d like to be in charge without leaving the thinking work to a plugin.

Secondly, if I invest lots of time into figuring out what works, I’d like to invest that time in a method that’s future-proof, meaning that I can use and adapt the .htaccess method and migrate it to just about any other setup, should I decide to change CMS down the line.

So, thanks again … and I’ll be a regular around here to see what’s cookin’ …

Ohmygosh… So this is how people try to get into sites. A couple of weeks ago my friend’s site was hacked by someone. I went to go check the logs on his site and sure enough it was hit over 200 times by the same IP address the day the site was hacked.
>.< why am I so blind to these things?

About the site

Perishable Press is the work of Jeff Starr, professional developer, designer, author, and publisher with over 10 years of experience.

Fun fact: Perishable Press has been online since 2005, and features over 800 articles and more than 11,000 comments.