After a series of posts about Google Image poisoning campaigns that used hot-linked images as their main trick to reach top positions in search results, I’d like to describe a different Google Image poisoning attack that affects WordPress blogs and uses self-hosted images.
I found 4,358 self-hosted WordPress blogs that contained many (usually more than 100) doorway pages that redirected visitors coming from Google Image search to fake AV sites.

Those doorway pages can be easily identified:

Doorway URL pattern

They have the following URL pattern: hxxp://<hacked-wordpress-blog.com>/?[a-f]{3}=<keywords> , where [a-f]{3} is a combination of three letters “a” through “f” and <keywords> is a hyphen-separated combination of keywords that contains either the word picture or pictures. Here are some examples:
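As an illustration that is not one of the original examples, the URL pattern described above can be checked with a short Python sketch. The function name `doorway_keywords` and the `example.com` host are hypothetical, and the check assumes that the doorway parameter is the only query parameter and that the doorway sits at the site root, as the pattern suggests.

```python
import re
from urllib.parse import urlsplit

# Three letters from "a" through "f", then "=", then the keyword string.
DOORWAY_QUERY = re.compile(r"^([a-f]{3})=([^&=]+)$")


def doorway_keywords(url):
    """Return the keyword list if the URL matches the doorway pattern, else None."""
    parts = urlsplit(url)
    # Assumption: doorways live directly under the site root.
    if parts.path not in ("", "/"):
        return None
    match = DOORWAY_QUERY.match(parts.query)
    if not match:
        return None
    keywords = match.group(2).lower().split("-")
    # The keywords must include the word "picture" or "pictures".
    if "picture" in keywords or "pictures" in keywords:
        return keywords
    return None


if __name__ == "__main__":
    print(doorway_keywords("http://example.com/?abc=new-england-railroad-pictures"))
    print(doorway_keywords("http://example.com/?p=123"))
```

A match only means the URL has the same shape as the doorways; it says nothing about whether a given site is actually compromised.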

At the top of the images you can see an inscription: the domain name of the hacked site. This way the criminals put their stamp on the images to make them look like original content of that site rather than stolen images. At the same time, this artifact can help you identify poisoned image search results and avoid clicking on them.

The image files contain the following string inside: CREATOR: gd-jpeg v1.0 (using IJG JPEG v62), quality = 100. This means that they were created using the GD Graphics library.
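A minimal sketch of how a webmaster might look for this string in image files on their own server is shown below. The function names are hypothetical. Keep in mind that any image legitimately produced with GD (for example, thumbnails made by a theme or plugin) carries the same comment, so a hit is only a hint, not proof.

```python
from pathlib import Path

# Stable prefix of the comment that GD's JPEG writer embeds in its output.
GD_MARKER = b"CREATOR: gd-jpeg"
JPEG_MAGIC = b"\xff\xd8"  # JPEG start-of-image marker


def has_gd_marker(data):
    """True if the bytes look like a JPEG that carries the gd-jpeg comment."""
    return data.startswith(JPEG_MAGIC) and GD_MARKER in data


def gd_jpegs_in(directory):
    """Return the .jpg/.jpeg files under `directory` that carry the marker."""
    hits = []
    for path in Path(directory).rglob("*"):
        if path.is_file() and path.suffix.lower() in (".jpg", ".jpeg"):
            if has_gd_marker(path.read_bytes()):
                hits.append(path)
    return sorted(hits)
```

Running `gd_jpegs_in` on a directory you did not expect to contain generated images is the interesting case; the usual uploads folder will often produce harmless matches.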

In my understanding, hackers use a PHP script to fetch top-rated images (returned by Google Images search), resize them to “thumbnail-size” (width: 200-300 pixels) and to “full-size” (some random size that may even be larger than the original image), and finally add the domain name stamp.

Timestamps

At the very bottom of the HTML code of the doorway pages you can see comments like this:

<!-- 7/24/2011 4:30:03 PM --><!-- new england railroad pictures -->

These comments contain the timestamp and the targeted keywords (they match the <keywords> part of the URL). This way you can easily see when the doorway was generated.
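The trailing comments can be extracted automatically. The sketch below assumes the comments always have the shape shown in the sample above and that the date is in month/day/year order (which the sample implies, since 24 cannot be a month). The function name is hypothetical.

```python
import re
from datetime import datetime

# Two adjacent HTML comments at the very end of the page:
# first the timestamp, then the targeted keywords.
TRAILER = re.compile(
    r"<!--\s*(\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M)\s*-->"
    r"\s*<!--\s*(.*?)\s*-->\s*$"
)


def parse_doorway_trailer(html):
    """Return (generated_at, keywords) or None if the trailer is absent."""
    match = TRAILER.search(html)
    if not match:
        return None
    generated_at = datetime.strptime(match.group(1), "%m/%d/%Y %I:%M:%S %p")
    return generated_at, match.group(2)


if __name__ == "__main__":
    page = "<html>...</html>\n<!-- 7/24/2011 4:30:03 PM --><!-- new england railroad pictures -->"
    print(parse_doorway_trailer(page))
```

The time zone of the timestamp is not stated in the comment, so the parsed value is left as a naive datetime.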

Redirects

The doorway pages rank quite well for some keywords both in Google Web search and Google Images search (especially when you are searching for exact phrases). However, the malicious redirects occur only when you click on Google Images search results, which proves that Google Images poisoning is the main goal of this black-hat SEO campaign.

The redirects have two stages. The first redirect goes to an intermediary server (TDS) that, in turn, redirects to a landing page that pushes a fake anti-virus tool (I’ve seen two different variations of the fake AV pages).

As you can see, the TDS server receives information about the keywords, source, and the actual referring URL.

The intermediary domains change every day. They actually belong to other hacked sites (mostly WordPress blogs).

Here are just a few domain names of the intermediary TDS sites used in this attack:

video.bywhy .com

ppopo2.bget .ru

awalstudios .com

demo.hireindians .net

www.privatepilot .hu

footballgirdles .tk

The domain name of a landing page consists of a .in domain that changes every day and some random “updateNN” or “scanNN” subdomain, e.g. “update82.yourscan .in” or “scan73.moomles .in”.
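The landing host naming scheme described above can be expressed as a small matcher. This is a sketch under two assumptions: that NN is always exactly two digits (all the examples in this post are), and that host names are written in the defanged style used here (a space before the dot). The function names are hypothetical.

```python
import re

# "update" or "scan", two digits, then a second-level .in domain.
LANDING_HOST = re.compile(r"^(update|scan)(\d{2})\.([a-z0-9-]+\.in)$")


def refang(indicator):
    """Undo the defanging used in this post: 'hxxp' scheme and spaces before dots."""
    cleaned = indicator.strip().replace("hxxp://", "http://")
    return cleaned.replace(" .", ".").lower()


def parse_landing_host(host):
    """Split a landing host into its parts, or return None if it does not match."""
    match = LANDING_HOST.match(refang(host))
    if not match:
        return None
    return {
        "kind": match.group(1),
        "number": int(match.group(2)),
        "domain": match.group(3),
    }


if __name__ == "__main__":
    print(parse_landing_host("update82.yourscan .in"))
    print(parse_landing_host("scan73.moomles .in"))
```

Because the second-level domains rotate daily, the subdomain shape is the more durable part of the pattern, but it is also generic enough to match unrelated hosts.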

Here are a few .in domains of the fake AV sites used in this attack:

spelleit .in

svernick .in

senerino .in

moomles .in

klopster .in

bastandro .in

waspeeds .in

yourscan .in

x-scan .in

Most of the .in sites point to the 193 .105 .154 .31 IP address (United Kingdom, Ars Tolerantia, with Latvian contact information).

Detection rate

The fake AV sites push scareware .exe files with names like InstallSecurityScanner_NNN.exe, e.g. InstallSecurityScanner_225.exe. These files are repackaged every day and their detection rate (according to VirusTotal) is quite low. The typical detection rate for currently served files is 8/43 (18.6%). It usually improves to 35%-50% by the time the malicious file is no longer in use and a new file with a low detection rate is being served by the fake AV server.

3 domain(s) appear to be functioning as intermediaries for distributing malware to visitors of this site, including hireindians .net/, awalstudios .com/, bywhy .com/.

Update (Aug 10, 2011):

There are more than 4,358 hacked WP sites (found a few more)

After I sent them my lists and additional information, Google removed those doorway pages on hacked sites from its index (both web search and image search).

Hackers have injected hidden links that point to the doorway pages on the same sites into legitimate web pages of the hacked WP blogs. The links are cloaked (visible to search engine crawlers only). I believe they are injected into WordPress theme files (most likely footer.php).

I still haven’t received a single response from either the webmasters of the hacked sites or their hosting providers. Hey, your information can help thousands of WordPress bloggers!
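One way to look for the cloaked links mentioned in the update above is to save the same page twice, once as a search engine crawler sees it and once as a regular browser sees it, and compare the links. How the cloaking decides who is a crawler (user agent, IP address, or something else) is not known here, so obtaining the two versions is left out; the sketch below only does the comparison, and its names are hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href values of all <a> tags in a document."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.add(value)


def extract_links(html):
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


def crawler_only_links(crawler_html, browser_html):
    """Links present in the crawler's copy but missing from the browser's copy."""
    return extract_links(crawler_html) - extract_links(browser_html)
```

Any link that shows up only in the crawler copy deserves a closer look, especially if it points to a URL on the same site that you did not create.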

Hiccups in serving malware

On Tuesday, the TDS generated redirects that were missing the domain name part in the URLs of the fake AV sites. E.g.

hxxp://scan15./index.php?QwrhS9RYbcxGhnpcM45NtCWyBT…RcrnQq2F4MWHYQ==

As you can see, everything else is in place, except for the domain name. It looked as if the criminals had run out of domain names (most of them were registered on July 20th) or forgot to specify a new domain for a new day.

Nonetheless, on Wednesday, the URL generation process was restored. However, the landing pages wouldn’t open (at least for me). At the same time, when I opened the root page of the fake AV site (e.g. hxxp://scan36.bastandro .in, hxxp://bastandro .in or even simply hxxp://193 .105 .154 .31) the malicious download would start automatically.

On Friday, I see a different hiccup. The TDS redirects to a newly registered domain (August 4th)

hxxp://update82.yourscan .in/index.php?Q+Xh59RmbVNGM3p…fnk6164ISHXQ==

that points to a different IP address (46 .4 .161 .228) and that server seems to be down. At the same time, the 193 .105 .154 .31 server still automatically starts malicious downloads if you visit it, but the download size is 0 bytes.

I wonder if all these hiccups have to do with the crisis of the fake AV industry that Brian Krebs describes in his recent post.

“During the past few weeks, some top fake AV promotion programs either disappeared or complained of difficulty in processing credit card transactions for would-be scareware victims”

If this is true, we should expect that the hacked sites will eventually try to monetize search traffic in some different way.

How the hack works

At this point I haven’t found cooperative webmasters of the hacked WordPress blogs who would share internal details of the hack. Nonetheless, my black box testing approach allows me to draw some conclusions.

Narrowing down

The hacked sites belong to different people and are hosted by different hosting providers. They are all WordPress blogs, and other sites (both WP and non-WP) on the same servers are not affected. Many of them are up-to-date (run the latest version of WordPress). So it’s neither a server-wide hack nor an intrusion via stolen site credentials (otherwise we’d see many non-WP sites). At the same time, it is not a core WP hack. In my experience, this usually means that hackers used some backdoor script.

Actually, this is where webmasters of compromised sites can help me. Usually a log analysis plus a server scan can provide very reliable information about the attack vector: vulnerable files and backdoor scripts. Please contact me if you have raw access logs for July.

.htaccess

Sometimes, I saw two different blogs on the same server (and most likely under the same user account) with the same doorway pages. Moreover, while blogs themselves looked different, the doorway pages used a template of only one of those sites and had links to that site only.

I think this happens because hackers created a .htaccess file with rewrite rules above the site root (quite a prevalent trick with .htaccess hacks). The rewrite rules map the doorway URLs to some .php script.
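Since the exact rogue rules are not known, the sketch below does not try to recognize them. It simply lists every rewrite directive found in .htaccess files in the site root and in each directory above it, so that a webmaster can review them by hand. The function name is hypothetical, and on shared hosting some parent directories will not be readable, which the code silently skips.

```python
from pathlib import Path

# mod_rewrite directives worth reviewing by hand.
DIRECTIVES = ("rewriteengine", "rewritecond", "rewriterule")


def find_rewrite_rules(site_root):
    """List rewrite directives in .htaccess files at site_root and in every parent."""
    root = Path(site_root).resolve()
    findings = []
    for directory in [root, *root.parents]:
        htaccess = directory / ".htaccess"
        try:
            text = htaccess.read_text(errors="replace")
        except OSError:
            continue  # file is missing or not readable
        for number, line in enumerate(text.splitlines(), start=1):
            stripped = line.strip()
            if stripped.lower().startswith(DIRECTIVES):
                findings.append((htaccess, number, stripped))
    return findings


if __name__ == "__main__":
    for path, number, line in find_rewrite_rules("."):
        print(f"{path}:{number}: {line}")
```

A normal WordPress install has its own rewrite block in the site root, so the directives to question are the ones you don't recognize, particularly in files above the site root.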

Caching

All doorway pages and images are cached somewhere on the server. Unlike other SEO poisoning attacks that I wrote about, they are not generated on the fly. If you specify some different keywords in the URL, you will get a 404 error. Moreover, this 404 error will be different from the normal 404 error pages of the hacked sites.

Further evidence that the spammy content is cached and not injected at run time into live WordPress pages comes from the timestamps at the bottom of the HTML code and the old articles in the “Recent Posts” section. On some sites, instead of a real site template, they use a pre-built empty Kubrick template with a fingerprint that doesn’t change from site to site (WordPress 2.3.1, 22 queries, 0.912 seconds).

Rounding up: If I were a webmaster of one of those hacked sites, I would start looking for rogue rules in .htaccess files in the site root and above the site root directory. The rewrite rules should point to a doorway script. Then the script should point to a cache directory with all the html and jpg files. Then I would try to analyze access logs and scan files on the server to find backdoor scripts and security holes.

To webmasters

Regularly check statistics for suspicious requests. In this particular case you can even use JavaScript-based services like Google Analytics, since hackers don’t remove your scripts from page templates and not all user requests get immediately redirected to malicious sites. For example, people may actually open the doorway pages when they click on Google web search results (not image search results). However, raw logs will show more accurate information.
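For raw logs, a rough sketch of such a check follows. It assumes the common Apache "combined" log format, and it assumes that clicks from Google Images arrive with a referrer whose path is /imgres; both are assumptions about the environment, not details confirmed for this attack. The function names are hypothetical.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Request target and referrer from a "combined" format log line.
REQUEST = re.compile(r'"(?:GET|HEAD) (\S+) [^"]*" \d{3} \S+ "([^"]*)"')
# /?[a-f]{3}=<hyphen-separated keywords containing picture or pictures>
DOORWAY = re.compile(r"^/\?[a-f]{3}=(?:[^&=-]+-)*pictures?(?:-[^&=-]+)*$")


def referrer_kind(referer):
    """Classify a referrer as Google Images, other Google, or anything else."""
    try:
        parts = urlsplit(referer)
    except ValueError:
        return "other"
    host = parts.hostname or ""
    if host.startswith(("www.google.", "images.google.")):
        # Assumption: image search clicks carry an /imgres referrer path.
        return "google-images" if parts.path == "/imgres" else "google-other"
    return "other"


def doorway_hits(log_lines):
    """Count requests for doorway-shaped URLs, grouped by referrer kind."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST.search(line)
        if match and DOORWAY.match(match.group(1)):
            counts[referrer_kind(match.group(2))] += 1
    return counts
```

To use it, pass an open log file, for example `doorway_hits(open("access.log", errors="replace"))`. Any non-zero count for URLs you never created is worth investigating.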

You should also check Google Webmaster Tools for suspicious search queries and indexed pages.

Make sure your WordPress is up-to-date and that all themes and plugins come from trusted sources and don’t contain known security holes (check their websites, google them). If your themes and/or plugins use the timthumb.php file, consider updating this file (its developers are currently actively improving the security).

##
If you have any details about internals of this attack and especially the security hole, please leave your comment below or contact me directly. It would also be interesting to hear your thoughts about the hiccups of this attack (and whether they are really hiccups).

That 46 address seems to be temporary. “yourscan .in” now points to the 193 address again. So does “x-scan .in” (both registered on Aug 4th).

By the way, today they didn’t change the domains. Instead, the landing page addresses no longer have a subdomain (which didn’t resolve yesterday): hxxp://x-scan .in/index.php?Q0jhfN…
This means that they also changed the TDS code (it would always add random subdomains).

There was a recent vulnerability with TimThumb’s image resizing where it was not checking MIME. Could this be the result of exploiting TimThumb, which is used as a standalone plugin and also used in many themes?

[...] as secure as a platform as I have ever seen, however, it’s main weaknesses lie in plugins. Unmask Parasites briefly touches on how up-to-date WordPress blogs can be compromised: the TimThumb vulnerability, [...]

I believe I was victim of what you are discussing. In fact, I recently received a letter saying we are in copyright infringement for using photos. This is how I found out about the url on my site that I didn’t create.

hxxp://www.e…d.com/?eed=pictures-of-tallahassee’s-new-state-capitol

Is there a way to track who hacked my site and determine when it was done?

I cannot find the source of this page to even delete it off of my FTP or wordpress blog.

[...] Once a site is infected, it’s not always easy to remove all the malicious code. Denis Sinegubko, the Russian researcher who discovered the WordPress attack used to poison Google Image results, has advised webmasters of compromised sites to look for rogue rules in the .htaccess files in the site root and above the site root directory. He has more here. [...]