Max Romantschuk writes "This morning while doing my usual log review of reader activity on my weblog, I discovered some rather strange sites, porn sites, which were linking to me. Closer inspection revealed that they weren't linking to me at all, but that someone had falsified the HTTP referrer header to inject the links into my logs." (Read more below.)

Max Romantschuk continues: "It took a moment to realize what was going on, but then it dawned on me: I was being spammed through my referrer logs! A quick Google search on the words "referrer spam" confirmed my suspicions. This was indeed a widespread practice, and not new at all. In fact, Wired had an article on the subject dating almost a year back. It turns out the spammers aren't after blog authors; what they are actually doing is targeting people who publish their referrer logs on their sites automatically. Fortunately, I don't.

I run a very small site, and get about 20 to 50 visits a day, and I don't publish my logs. Not exactly a likely target, am I? Clearly these spammers do this in volume, and the phenomenon is bound to increase as email spamming is becoming increasingly hard. With email spam, IM spam, Windows Messaging spam (NET SEND popups) and HTTP referrer spam, how long will it take until every open technology has to be locked down? I hate to say it, but I doubt Wikis and similar systems will stay open for very long if things keep going in this direction."

The idea behind a Wiki is that anyone can maintain it. The more people maintaining something (as with Linux), the more people there are to remove nasties. In this case the nasties just happen to be spam. As long as copies of the Wiki are kept after every N changes all should be good, just in case a spammer deletes everything...

Because anyone can maintain it, spammers can add to it and change it too. Now, can any number of people find and delete spam in a Wiki faster than however many bots the spammers decide to throw at it?

I'm contributing to Wikipedia [wikipedia.org], and we have some ways to deal with vandalism. We haven't (yet!) been victims of determined spammers with bots, so it's theoretical, but here are things we can use:

first, all changes appear in a special page, so anyone can see them, and switch back to a previous version in history. Anyone can in one click see differences with the previous version

all contributions of users (anonymous or not) are easily viewable by anyone, thus cleaning after finding a spammer is made easier
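The two mechanisms described above (a full history per page, and the ability to undo everything one contributor did) can be sketched roughly as follows. This is a minimal illustration, not how Wikipedia's software works; the `Page` and `Revision` classes and the `revert_author` method are hypothetical names made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    author: str
    text: str


@dataclass
class Page:
    # Full edit history, oldest first. Nothing is ever deleted.
    revisions: list = field(default_factory=list)

    def edit(self, author, text):
        self.revisions.append(Revision(author, text))

    def current(self):
        return self.revisions[-1].text if self.revisions else ""

    def revert_author(self, bad_author, reverted_by="cleanup"):
        """Restore the newest revision that bad_author did not write."""
        if not self.revisions:
            return
        for rev in reversed(self.revisions):
            if rev.author != bad_author:
                # Only add a revision if the page is currently spammed.
                if rev is not self.revisions[-1]:
                    self.edit(reverted_by, rev.text)
                return
        # Every revision was written by bad_author: blank the page.
        self.edit(reverted_by, "")


page = Page()
page.edit("alice", "Good text")
page.edit("spambot", "BUY PILLS")
page.revert_author("spambot")
print(page.current())
```

Because the revert is itself stored as a new revision, the spam stays visible in the history, and a mistaken revert can be undone the same way.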

Personally I don't like people tracking my referrer links. Mind your own business. If you want to see who is linking you, you can do that with google. I know people disagree, since your website is your business. But I don't like being monitored that closely.

Just leave your damn referrer blank then. I suppress the referrer through Opera everywhere, and only enable it on sites which are foolish enough to believe I want to leech their images, and on those maybe one or two sites where I know they use my referrer info for something useful.

But don't set it to some bogus info, or you're no better than these crimina^H^H^H^H^H^H^H spammers.

1) Blank
2) Constant value
3) Same URL that is being retrieved
4) "base" URL of the site being accessed -- i.e. if you were accessing http://www.yahoo.com/some/path/some/file.html the referer would be "http://www.yahoo.com/"
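As a rough sketch, the four options above could be computed like this. The function name `referer_for`, the policy names, and the placeholder constant URL are assumptions made for this example, not settings from any particular browser.

```python
from urllib.parse import urlsplit


def referer_for(policy, target_url, constant="http://example.invalid/"):
    """Return the Referer value to send for target_url, or None for no header."""
    if policy == "blank":
        return None
    if policy == "constant":
        return constant
    if policy == "same":
        # Pretend the page linked to itself.
        return target_url
    if policy == "base":
        # Keep only the scheme and host of the site being accessed.
        parts = urlsplit(target_url)
        return f"{parts.scheme}://{parts.netloc}/"
    raise ValueError(f"unknown policy: {policy}")


url = "http://www.yahoo.com/some/path/some/file.html"
print(referer_for("base", url))  # http://www.yahoo.com/
```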

> sites which are foolish enough to believe I want to leech their images

Clearly you don't run a site yourself. That happens. There is nothing foolish about checking for it.

It costs me hundreds of MB per month if I don't keep an eye on my logs. If my bandwidth use suddenly goes up and I start seeing the same forum showing in my log hundreds of times, going to one of the URLs inevitably shows some asshat using an image from my site in his avatar or sig.
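For what it's worth, that kind of log check can be automated. The sketch below assumes the access log is in Apache "combined" format and adds up the bytes served for image requests, grouped by the outside host in the Referer field. The function name `hotlink_bytes` and the list of image extensions are choices made for this example.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Request path, response size and referer from an Apache "combined" log line.
LOG_RE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+)[^"]*" '
    r'\d{3} (?P<size>\S+) "(?P<referer>[^"]*)"'
)
IMAGE_EXTS = (".png", ".jpg", ".jpeg", ".gif")


def hotlink_bytes(lines, own_host):
    """Total image bytes served per external referring host."""
    totals = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # not a combined-format line
        path = m["path"].split("?", 1)[0].lower()
        if not path.endswith(IMAGE_EXTS):
            continue
        host = urlsplit(m["referer"]).hostname
        if host is None or host == own_host:
            continue  # blank referer, or our own pages
        size = m["size"]
        totals[host] += int(size) if size.isdigit() else 0
    return totals
```

Calling `hotlink_bytes(open("access.log"), "www.mysite.example").most_common(10)` would then list the ten hosts costing the most bandwidth; both the file name and the host name here are placeholders.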

> going to one of the URLs inevitably shows some asshat using an image from my site in his avatar or sig.

I had no idea that referrer IDs and URLs were embedded in pictures. Not that I have a sig or an avatar (a what?) but it's an interesting bit of information for me.

At what point are we going to start tracking our pee after it's in the ocean?

Snooping on people is not really the problem. I don't really care if people blank out their referrer or put something bogus instead. The problem is that by having your logs constantly spammed, your log data becomes useless. If you're using a log analysis program like webalizer, your total hits, visits, etc. are way out of whack because only 1 out of every 3 or so hits is legitimate. You can't get an accurate picture of how many hits your site is actually getting. I don't know how it happened, but my site

... check out what the Herbalife and other MLM scumbags are doing to Monster.com. This phenomenon appears to be spreading over the entire net.

I have used Monster.com [monster.com] on several occasions, and even found a contract there a couple of times, and I was even considering advertising on their site. In just the last week or so, however, I have noticed a new trend that is rapidly rendering Monster.com completely worthless. Seems that my current job search agents (for C++/C#/Java programming) are returning dozens

I was having the same problem: getting literally thousands of hits to my site from referrers for all kinds of porn and other random domain names. I did a Google search and found this site: http://www.spywareinfo.com/articles/referer_spam/ [spywareinfo.com]. It shows how to use mod_rewrite with Apache to block the most frequent domains. I took Mike's blacklist and created this page [resynthesize.com], which automatically creates the .htaccess file for you. The problem is that they seem to be registering tons of new domain names, so it's hard to keep up a decent blacklist.
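A generator like that can be quite small. The sketch below is not the code behind the linked page; it is an assumed reconstruction that turns a list of blacklisted domains into mod_rewrite rules answering 403 Forbidden to any request whose Referer points at one of them. The function name `build_htaccess` is hypothetical, and the input is assumed to be plain hostnames (letters, digits, hyphens, dots).

```python
def build_htaccess(domains):
    """Return .htaccess text that forbids requests referred by the given domains."""
    domains = sorted({d.strip().lower() for d in domains if d.strip()})
    if not domains:
        return ""
    lines = ["RewriteEngine On"]
    for i, domain in enumerate(domains):
        # Conditions are chained with OR; the last one must not carry it.
        flags = "[NC,OR]" if i < len(domains) - 1 else "[NC]"
        # Match the domain itself and any subdomain; escape the dots.
        pattern = r"^https?://([^/]+\.)?" + domain.replace(".", r"\.") + r"(/|:|$)"
        lines.append(f"RewriteCond %{{HTTP_REFERER}} {pattern} {flags}")
    # "-" means no substitution, F means respond 403 Forbidden.
    lines.append("RewriteRule .* - [F]")
    return "\n".join(lines) + "\n"


print(build_htaccess(["bad.example", "worse.example"]))
```

This still has the weakness mentioned above: the list is only as good as its last update, so newly registered domains get through until someone adds them.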

Web sites can be defaced. This is typically thought of as illegal. Does the level of security on that site affect the legality of the defacement? Just because a wiki is more easily editable than an otherwise non-secure site should not automatically allow hijacking of that site for purposes other than those intended by its owner. Would the appearance of 'specific wording' on the site make enforcement of this easier?

Except that one guy's defacement could be another man's legitimate posting. Take a look at the average message board. People make trolling, yet related, comments every day. Who is to say what is or isn't vandalism? Perhaps a better course of action would be to limit the number of posts in a given day. I would think 10 wiki posts (they should be insightful) would be more than enough. Sure, bots could trash the site, but it would be too slow and painful for the average spammer.
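A per-day cap like the one suggested above could be sketched as follows. The class name `DailyPostLimiter` is made up for this example, and how a "user" is identified (account, IP address, something else) is left open, which is of course the hard part against bots.

```python
from collections import defaultdict
from datetime import date


class DailyPostLimiter:
    """Allow each user at most max_per_day posts per calendar day."""

    def __init__(self, max_per_day=10):
        self.max_per_day = max_per_day
        self._counts = defaultdict(int)  # (user, day) -> posts so far

    def allow(self, user, today=None):
        key = (user, today or date.today())
        if self._counts[key] >= self.max_per_day:
            return False
        self._counts[key] += 1
        return True


limiter = DailyPostLimiter(max_per_day=10)
day = date(2004, 1, 1)
results = [limiter.allow("10.0.0.1", day) for _ in range(11)]
print(results.count(True), results.count(False))  # 10 1
```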

I don't publish the logs on my very small, low-traffic site and I get quite a bit of this as well. All of the non-legitimate referrers on my site seem to be weblogs. No porn so far. I just ignore them. Referrer stats are the least useful part of my logs anyway.

As for solving the issue of false referrers, why not just modify where the referrer ends up based on whether the specified referring page actually has a link to you or not. The distributed effects of zillions of bloggers all spamming the spam site with automated HTTP requests should be enough to dissuade the spammers from continuing:)

OK... I'll bite.. How do you tell if the page actually has a link to you without trying to fetch the page and seeing if there's a link? This gets particularly interesting when you deal with content generated on the fly - there's a very good chance that my Slashdot page has links on it that aren't on yours, for instance, and which also won't be on the page your proposed automatic verifier will get if it blindly chases the Referer: URL back.

I would think that it would be easy enough to send a spider to the referrer page and search for the referred page. If you don't find it, delete it from the log. In fact, you wouldn't even need the spider because the link should be the exact page anyway.

This also becomes a means to maintain the blacklists others have mentioned.
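The check described above might look something like the sketch below: parse the HTML of the claimed referring page and see whether any link on it points at your host. The names `links_to_host` and `fetch` are hypothetical. As the earlier reply points out, a dynamically generated page may not show the verifier the same links the visitor saw, so a failed check is only a hint, not proof of spam.

```python
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class _LinkCollector(HTMLParser):
    """Collect absolute URLs of all <a href="..."> links on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self.links.add(urljoin(self.base_url, href))


def links_to_host(html, referer_url, my_host):
    """True if the page at referer_url contains a link to my_host."""
    collector = _LinkCollector(referer_url)
    collector.feed(html)
    return any(urlsplit(link).hostname == my_host for link in collector.links)


def fetch(url, timeout=10):
    """Download a page as text (network access; not exercised below)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


page = '<p><a href="http://blog.example/post/1">nice post</a></p>'
print(links_to_host(page, "http://ref.example/links.html", "blog.example"))  # True
```

Referrers that fail the check repeatedly are the ones worth feeding into a blacklist.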

Thanks for pointing out that this is spam! I also get these "referers".

The sad thing is that nowadays it is half-criminal to do a ping/traceroute to a certain host (considered preparing an attack), but these spammers can generate their high-volume(!) traffic, outside the bounds of every RFC, and don't get into trouble at all.

"A quick google search on the words "referrer spam" confirmed my suspicions, this was indeed a widespread practice, and not new at all. In fact, Wired had an article on the subject dating almost a year back."

That's not clue enough that maybe your lack of knowing about this isn't newsworthy?

My lack of knowing about it may also be an indication of this being a legitimate issue despite being less than common knowledge. By managing to get this article published I may have raised public awareness of this

Unfortunately, as long as everyone has rights to post on the internet (in one way or another), somebody is going to abuse that privilege. Since when are HTTP Referrer logs considered good content? You don't really have control over what goes into those logs anyway. We need to find more ways of filtering out bad content. It's like free speech. Ham radio and usenet have their share of nuts. Most people don't turn to those places for news. We tend to filter information on our own.

I'm in violent agreement about needing ways to filter bad content, but not about filtering info on our own. Usenet not only has its share of nuts, its signal/noise ratio is awful. So isn't Slashdot a way of addressing that shortcoming, by having top-level stories moderated by trusted users, and comments trusted by experienced ones, allowing people to filter not just by reading stuff but by automatically avoiding stuff?

Lest anyone think this is offtopic, let me point out that Slashdot has in some sense