Trackback Spam

The flood of truly vile Trackback spam today sadly confirms what I was worried about months ago: if we lock down commenting (with things like WP Hashcash and TypeKey) it’ll just push the spammers to Trackback. All the CAPTCHAs in the world won’t fix Trackback. Our last line of defense is the content-based filters like Spam Karma, Spaminator, and Three Strikes.

83 thoughts on “Trackback Spam”

One solution for TrackBack might be just to stick everything in a moderation queue. This should be less disruptive than doing so for comments because comments may be in response to each other, whereas TrackBacks are basically just a statement of linkage.

I find TrackBacks of very limited use anyway – I tend to skip straight over them and scroll straight to the comments, as most TrackBacks are along the lines of “Matt Mullenweg says X…” which I know because I’ve just read the post!

I’m not down with the trackback spec, but surely it’d be easier to trace a trackback, and a lot harder to fake one consistently? Couldn’t you just reverse DNS it (plugins for which already exist) and if it hasn’t come from a real website then just delete it? I’m actually quite happy with it all because I knew it would happen! Finally, I predicted something with accuracy.
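A rough sketch of the kind of origin check suggested above, in Python. Note the swap: instead of a reverse DNS lookup on the sender, it does a forward lookup on the host named in the trackback's URL and compares the result with the sending IP. The function name and the injectable `resolve` parameter are made up for illustration. A legitimate blog can ping from a different address than the one its site is served from, so a mismatch is probably better treated as a reason to moderate than to delete.

```python
import socket
from urllib.parse import urlsplit


def sender_matches_claimed_site(sender_ip, claimed_url,
                                resolve=socket.gethostbyname_ex):
    """Return True if sender_ip is one of the addresses of the claimed host.

    `resolve` is injectable (hypothetical design choice) so the check can be
    exercised without touching the network.
    """
    host = urlsplit(claimed_url).hostname
    if not host:
        return False  # no usable host in the claimed URL
    try:
        _name, _aliases, addresses = resolve(host)
    except OSError:
        return False  # the claimed host does not resolve at all
    return sender_ip in addresses
```

In a real receiver, `sender_ip` would be the remote address of the HTTP request that delivered the ping.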

How is pingback different from trackback with advanced moderation that’s defaulted to off? If you’ve got pingback to notify you by e-mail (which you logically would have) then it looks like a pretty sweet way of getting spam back into e-mails.

I got the initial whiff of their systems testing my blog out with gibberish trackbacks yesterday, but I had no way to reach you to warn you. I have turned off pings in phpMyAdmin for now. I’ll have to switch from the stopgap method to Spam Karma. Boy, it was good while the stopgap worked, wasn’t it?

I was hit by the same junk this morning, very disturbing. In fact, they’re still coming.

Unfortunately, I don’t think that content-based filtering will work, as all the comment spam I’ve been receiving lately has been complete gibberish. I don’t really know what they’re trying to achieve. Who searches for uioouisdajkljkflds?

It would be nice to be able to mark all trackbacks & pingbacks as ‘moderate automatically’.

Also, it would be nice to have a ‘delete’ link in *all* comment notifications, successful or not. Keep up the fight, Matt.

I have just upgraded Spam Karma to handle this type of TB spam….
Although it is not ideally suited to stop it (lots of its filters are not relevant and I do not have the time right now to add new filters that would do an efficient job), it should do an all right job of stopping the bulk of TB spams (we’ll just have to work a bit tighter on the master blacklist). Anyway, it’s version 1.15 for those interested.

One important thing, though, is that there is no way to stop the notification email from being sent, even if the spam is deleted (other than disabling all notifications). I just commented out the lines in my code, but can’t really write a tutorial for all versions of WP now… it’d be nice if somebody could do it.

In WP 1.3/1.5, one just needs to comment out the following lines in wp-includes/function-post.php:

Otherwise, I think we need to seriously phase out TBs in favour of PBs. Or at the very least have a few serious checks built into the TB mechanism (blog has to be linked from the URL provided, anti-IP-spoofing, whatever…) since these cannot be easily implemented in a “comment-spam-plugin” way…

I was thinking along the same lines, Simon… I wouldn’t see a problem with automatically forcing all trackbacks and pingbacks into moderation. To go along with something Matt said, if the site is in the WP links, let it go through. While some people also do this for comments, I think it’s a bad policy on comments. TB and PB are a different story if you ask me… it doesn’t matter all that much if a TB or PB doesn’t show up on the blog for a few hours, whereas a comment could really add to the discussion on a particular post.
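The policy described above (moderate every trackback and pingback unless the sending site is already in the blog's links) could look roughly like this. It is a Python sketch rather than WordPress code; the function name, the status strings, and the `trusted_hosts` set are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


def initial_status(comment_type, source_url, trusted_hosts):
    """Return 'approved' or 'moderation' for a newly received item.

    trusted_hosts is a hypothetical set of host names taken from the
    blog's own links list, stored without a leading 'www.'.
    """
    if comment_type not in ("trackback", "pingback"):
        return "approved"  # ordinary comments are left to the other filters
    host = (urlsplit(source_url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return "approved" if host in trusted_hosts else "moderation"
```

Ordinary comments pass straight through here because, as noted above, holding them back hurts the discussion far more than holding back a TB or PB for a few hours.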

I’ve sent a patch to Ryan to allow for comments to be preprocessed by a plugin before getting into the meat of wp_new_comment. I’ve already modded Spaminator to use the new hook. That way, *all* comment postings can be dealt with in the raw, by the content filter plugins.

Matt, thanks for taking the conditional out and making it work better for the others. It was more a solution I took because I was too busy today (still am, actually) to install a spam plugin. A tourniquet, it was, is all.

I also got a token trackback spam from the “Type Key Spammer”, who I suppose is someone simply pointing out how easy trackback spam is. It wouldn’t be a surprise at all if spammers moved to TrackBack. At least with MT’s implementation of it, it would be much easier to get things through but at the same time, there is far less of a payout for spammers since they can only put in one link (the excerpt is stripped of links). In any case, the problem of locking the front door and having them come in the back is neither a new concept nor one that was unconsidered.

What I’m wondering is whether the big surges in Trackback spam are pan-weblog. I’ve been trying to trace down people talking about Trackback spam today and am finding it to be predominantly WordPress blogs. There have been some MT bloggers, but my own installations have hardly been hit. Of course, I’m using MT-DSBL which trashes submissions from open proxies, so perhaps that’s why.

If they were hitting predominantly WP blogs, that would be a new development. Can’t really see why, though, considering how open a default installation of MT is to these things. Thoughts, Matt?

@alvin, that’s been around for a long time, especially on free sites like Blogger. I wrote a short piece called Blogs as tools of spam a while ago, because I noticed while using the “random site” feature of Blogger that a preponderance of the sites were total spam!

This would be a good time to address this little issue. That would make things a little easier for any helpful dev that comes up with a genius idea.

Of course, the method of removing the incentive would help as well. The “redirect to site via google” trick would at least make doing this less appealing, as it would gain the spammer nothing.

However, I hate it when I leave a comment and take the time to enter a url only to discover that I’ve gotten nothing back from it (in the form of a little itty-bitty page rank increase). But here’s an idea…

Since the comment spam filters have become fairly effective (stopgap extreme is working most effectively, btw) you could display comments, tracks, and pings separately – tracks and pings with a “redirect” url, and comments without – at least until we have better tools for fighting, what could be, the next big blogspam issue.

If what you say about the TypeKey-authenticated spammer is true, then we don’t have to wait long for spammers to do the blog postings themselves!!!!! Just imagine today’s post title: texas holdem rocks! Content: Need money? Gamble? Visit bla bla. Scary, but not a distant future. Someone please tell me I am wrong.

Oh, sadly, they already are. There are spammers who are running blogs (and no, I’m not linking to them). I’ve already run across a handful. They don’t all seem to be doing it for the same reasons, though. Some are talking about their techniques. Some are filled with links to (ostensibly) their clients’ sites. Others are just running normal blogs where 99% of the content is non-spammish, but every now and then a link is thrown in for PageRank. Although the latter are harder to identify, at least they are just regular blogs and their comments on my site and others are actually on topic.

The line between spammers and bloggers is getting hazy. That’s no good for anyone.

I would have to think that we (as a community) should stop thinking about being reactive, and start being proactive. People are going to jail these days for email spam…. I would like to think that there is enough brainpower within this group to “trackback” the offenders to their source, the open relays, or the ISPs that OPENLY allow the problem to perpetuate. This problem is obviously created by a handful of unholy people who we should OPENLY VOW to track down and prosecute. If you got an IP… U CAN BE TRACKED DOWN!
Just my 2cents…

One of the things I find intriguing about Drupal is the fact that it handles trackbacks just like most other content submissions, and therefore allows its users to apply the same Bayesian filter pool to them. http://www.b19s.org had a fair bit of trackback spam lately, and all but one got caught in the spam filter. Another idea would be to parse the page at the sending URL for an occurrence of the trackbacked URL.

This all started for me late last week. On Sunday it was out of control and I killed the trackback feature altogether in the trackback php file. The fact is that trackback, as another user mentioned, is bullshit anyway and a remnant of early blogging days when you wanted to prove that somebody was linking to you. And I suppose it was some sort of cornball community courtesy (sorry!). In fact I find it annoying to go to someone’s comments and find that it’s just a bunch of trackbacks quoting the exact same stuff over and over. The question on my mind is who is experimenting with this spam engine? I just got these inane messages with no links to anything.

I think what we really need is a distributed real-time blacklist. Take Jay Allen, and +1. As soon as a spam outbreak takes place, it should be able to compensate, shield and notify the blog owner that they got spammed (if they were unlucky and one of the early ones), and flag them.

Until we do that, this won’t stop.

Would be good if Google was in on the blacklist, and could make sure they get blacklisted from their index.

I followed the link and read the posts by Tom Raftery but need a KISS explanation.

How do trackbacks, pings, and the spam actually work? I understand that both are a scourge but not the mechanics, and not being a programmer, some of the posts on this page and Tom’s page just go woosh over my head. I know how to ping another computer but that’s about it…. Can anyone explain to me the basic mechanics of the above in language that even an idiot could understand? P l e a s s s e ?

Apologies if I presumed too much knowledge in my posts. If you are using a blogging application like WordPress, there is a facility called Trackback. When you are making a post in your blog and you refer to a post someone else made in their blog, you can add the trackback URI of their post (normally displayed at the end of their post) to your blogging software, and it will send a notification (called a trackback) to them.

When their blogging software receives this notification (Trackback), it displays the relevant part of the post in the comments section of the site – as my trackback (no. 67 above) is displayed.

Spammers post faked trackbacks directly to the blogging software, pretending someone has posted about one of your posts, hoping your blogging software will automatically display their spam on your site (thinking it is a legitimate comment).
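To make the mechanics a little more concrete, here is a rough Python sketch of what a ping looks like on the wire. As I understand the TrackBack specification, a ping is an ordinary HTTP POST of form-encoded fields (`url`, `title`, `excerpt`, `blog_name`) to the trackback URI, and the receiver answers with a small XML document whose `error` element is 0 on success. The function names are made up for illustration, and nothing here touches the network.

```python
from urllib.parse import urlencode, parse_qs
import xml.etree.ElementTree as ET


def build_ping_body(url, title="", excerpt="", blog_name=""):
    """Form-encode the fields a sender would POST to a trackback URI."""
    fields = {"url": url, "title": title,
              "excerpt": excerpt, "blog_name": blog_name}
    # Empty optional fields are simply left out of the body.
    return urlencode({k: v for k, v in fields.items() if v})


def parse_ping_body(body):
    """What the receiving blog sees: plain fields, nothing proving origin."""
    return {key: values[0] for key, values in parse_qs(body).items()}


def ping_succeeded(response_xml):
    """True if the receiver's XML reply carries <error>0</error>."""
    error = ET.fromstring(response_xml).findtext("error")
    return error is not None and error.strip() == "0"
```

The point to notice is that `parse_ping_body` hands the receiver nothing but text the sender chose. Anyone who can send an HTTP POST can claim any `url` and `blog_name`, which is exactly what the spammers are doing.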

I hope this clarifies the situation for you – I will post this on my blog, as well as here, and if you require further clarification, feel free to contact me.

I actually went a couple steps further… I tracked down the trackback spammers to their hosts and ISPs and reported them, logging everything they were doing and every IP# they were using and bugging the crap out of the hosts and ISPs until something was done.

Lo and behold, I haven’t seen hide nor hair of 5 trackback spammers that were abusing the privilege severely. Sucking up my bandwidth hasn’t been an issue for the past month, after this phenomenon had been happening to my blog. I may be going to extremes, but it works, instead of going in and fiddling with any code.

I have even e-mailed them, pretty much threatening to report them to their ISPs and hosts, and gotten such nice letters asking me to e-mail them URLs so I can get off the lists and my site isn’t spidered. I have usually reported them anyway, so it makes no difference to me, but they can get real nice, real fast when you play hardball. 😉

I was hit with 2 trackbacks from a WordPress blogger that reference similar content on his blog, but when I checked to see if my blog’s link was there, it was not.

He probably entered a manual ping to my blog and didn’t put my URL in. But I wonder if someone could automate this to try and make it appear like a legitimate trackback.

For example:
1) I write about Skype in my blog
2) A blogger (who spams) also writes a blog about Skype.
3) Blogger spammer does a Google query on “Skype” and pings all the URLs listed in the search query using the excerpt of his “Skype” blog entry, so it appears legit.
4) Blogger gets trackback “links” that appear to be legitimate, but he doesn’t “reciprocate” the link.

Perhaps this has even happened already or is happening? If so, that makes it much more difficult to find the trackback spam.

Note that a couple of researchers at Rice University have released a Trackback spam blocker plugin for WordPress: their Trackback Validator Plugin checks to make sure that the sites which send you Trackback pings really do link to you, a test that most Trackback spammers fail.
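The link-back test is simple enough to sketch. The Python below is not the Rice plugin's code, just an illustration of the idea under some assumptions: the page at the trackback's `url` has already been fetched into a string, only `<a href>` links count, and URLs are compared after dropping fragments and trailing slashes. The names `page_links_to` and `_LinkCollector` are made up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class _LinkCollector(HTMLParser):
    """Collect the absolute targets of all <a href> links on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links, then drop any #fragment.
                absolute, _fragment = urldefrag(urljoin(self.base_url, href))
                self.links.add(absolute.rstrip("/"))


def page_links_to(source_html, source_url, target_url):
    """True if the sender's page really contains a link to target_url."""
    collector = _LinkCollector(source_url)
    collector.feed(source_html)
    return urldefrag(target_url)[0].rstrip("/") in collector.links
```

A trackback whose source page fails this test could be deleted or sent to moderation. It would also catch the Skype scenario described above, since that spammer never links back.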