How to Remove Semalt from Google Analytics

Google Analytics may be reporting a new website that is sending visitors your way: Semalt. Sadly, Semalt isn’t sending real human visitors; it is a keyword research programme or robot that is being counted incorrectly in your statistics.

If you go to your Google Analytics and look at Acquisition > All Referrals, you will see Semalt referral traffic there.

In the report, every visit from Semalt is new, and every visit from Semalt has a 100% bounce rate.

You may notice that I do not link to the Semalt service in this post. Semalt is selling a keyword ranking service that I am NOT recommending. You may find this review of Semalt very interesting. Services like Semalt should NOT appear in your statistics, in the same way any other automatic indexing service should not appear.

Semalt typically sends small numbers of visitors each month, and usually isn’t a significant referral. Having this robot data in your statistics does, however, skew the averages, reporting a higher than average share of new visitors and higher than average bounce rates. You may also find that your statistics are reporting artificial page views due to Semalt’s visits.
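To see how a handful of bot visits shifts the averages, here is a small worked example. The visit counts and the 40% baseline bounce rate are made-up figures for illustration, not measurements from any real site, and `blended_bounce_rate` is a hypothetical helper.

```python
def blended_bounce_rate(real_visits, real_bounce_rate, bot_visits):
    """Overall bounce rate when every bot visit bounces (100%)."""
    real_bounces = real_visits * real_bounce_rate
    total_bounces = real_bounces + bot_visits  # each bot visit counts as a bounce
    return total_bounces / (real_visits + bot_visits)

# Hypothetical month: 1,000 human visits at a 40% bounce rate, plus 50 Semalt visits.
rate = blended_bounce_rate(1000, 0.40, 50)
print(f"Reported bounce rate: {rate:.1%}")
```

With these assumed numbers, 50 robot visits are enough to push the reported bounce rate from 40% to roughly 43%.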

And if you are using a hosted blogging service like WordPress then you will find WordPress.com has now blocked Semalt. If you, like us, use WordPress on your own server (rather than the hosted version) then you will need to block the Semalt visits.

Step 5: Amend the fields so they match the screenshot below. Make sure that ‘Exclude’ is selected and ‘semalt.com’ is entered into the Filter Pattern field. The filter will also block all subdomains of Semalt, such as 34.semalt.com, as well as the main domain.

Click Save and that’s it! You have now excluded Semalt from your referral traffic data.

Once you have your filter live, keep in mind that it will only filter the data from this point forward. It does not retrospectively filter the visits out.

Whilst here, you may wish to add another filter that excludes your own IP address, since your own visits can also skew the statistics. You may also need to keep an eye out for other spammy referral domains that are skewing your statistics and, preferably, block them from entering your website.

For more hints and tips, please see the comments posted below. Feel free to post a comment or contact us for dedicated Google Analytics consultancy.

67 responses to “How to Remove Semalt from Google Analytics”

Hi Susan – I’m no expert, but I believe that although your filter pattern (“semalt.com”) will work, it should be a POSIX regular expression with the dot escaped by a backslash (so “semalt\.com”). The dot on its own means “any character”, so technically your filter will also block a referrer called “semaltacom” or “semaltocom”, not that they’re likely to exist.
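The difference between the escaped and unescaped pattern can be checked with a short sketch. Python’s `re` module is used here as a stand-in; Google Analytics applies its own regular-expression engine, so treat this as an illustration of the dot behaviour rather than of the filter itself.

```python
import re

unescaped = re.compile(r"semalt.com")   # "." matches any single character
escaped = re.compile(r"semalt\.com")    # "\." matches a literal dot only

referrers = ["semalt.com", "34.semalt.com", "semaltacom"]

for ref in referrers:
    # Print whether each pattern is found anywhere in the referrer string.
    print(ref, bool(unescaped.search(ref)), bool(escaped.search(ref)))
```

Both patterns catch the real domain and its subdomain; only the unescaped one also catches the made-up “semaltacom”.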

Let me tell you about Semalt.
Semalt bots harvest statistics for a web analytics service and cause no harm. Those crawler bots have 100% bounce rate and don’t click on advertising banners (cpc, cpa, cpm systems) or extend links. All the visits are automatic and random.
If you want to exclude your site from Semalt database, please follow this link: http://semalt.com/project_crawler.php

Thanks for the information, Nataliyia, but I am NOT suggesting our clients submit their web address for exclusion.

I would like to highlight to our readers that:
– your spider is taking up our server bandwidth without our permission
– your spider does not appear to comply with instructions in robots.txt
– your spider irresponsibly distorts our Analytics data

I find your suggestion that our readers should have to GIVE you their web address to stop the crawling to be unethical and inappropriate. Personally, I’m not happy to give you my web address for exclusion.

Thanks for the filter instructions; it doesn’t seem to work though; the domain still appears in the referrals list.
The best way to block them properly is probably with code on your website using your .htaccess file or with PHP. That way you’ll also prevent them from using up your site’s bandwidth and possibly scraping your site’s content. They seem the type that would do that.

The filters haven’t worked for me & I am using a google blog & was told you can’t change or edit the .htaccess file on a google blog. If that’s true, does that mean there is no way for me to block them?

Tony, it’s probably best in that case to use the IP exclusions instead of the filters. You can find out your IP by literally googling “what’s my IP” and Google or any number of sites will tell you. Then exclude this by following the instructions here: https://support.google.com/analytics/answer/1034840?hl=en-GB

Does the new “Block spiders and crawlers” checkbox in Google Analytics not work to block crawlers like semalt? I wrote a quick post on how to do this but I heard from one person that this wasn’t working. Does anyone else have any feedback on this?

Hi Brent, we have set up a new view for the Hallam account with the “Bot filtering” box ticked in the Admin area “View settings” and one view without it ticked. We will let you know the results as soon as possible.

Hi Peter, are you looking at historic data? It will not remove any old Semalt visits but will filter out any future visits from now on.
If that isn’t the case please send us a screenshot so we can confirm the settings you have.

easy and straight to the point. I was tired of seeing these guys distort my analytics and I was not willing to go to some random website and submit any information. So glad to knock this all out within Google’s trusted website. Thanks for the tutorial Susan.

I find it fascinating that you can get penalized for the least little thing due to Google’s increasingly strict and unpredictable guidelines (claiming it’s all about pure results when we know there’s more to it than that) yet they can’t stop Semalt from showing up and skewing our stats? It’s not like this just started yesterday. It’s possible that something else is going on – some kind of relationship perhaps. It goes on all the time. That’s why you’re seeing all the big boys at the top of the serps these days. Think it’s just coincidental?

I tried out the filter in both Custom and Predefined mode and get the same message: “This filter would not have changed your data. Either the filter configuration is incorrect, or the set of sampled data is too small.”

I have a few sites; some with lots of traffic and some not, and I’ve double checked my inputs so I don’t know which part of the response from Google is correct.

We get the same message from Google when trying to verify the filter, we saw this yesterday in fact, from a site with 20-30 referrals from Semalt.com recorded each day on average.
After a full day of implementing the filter you should see all future referral traffic from Semalt.com drop to zero which proves it works.

Thanks for the article Susan. I have been exploring different options on blocking Semalt. I found that not too long ago Google Analytics released the option to Block Spiders and Crawlers, but with a few accounts I manage, it appears that they still sneak in. I have also set up the suggested Filters method (haven’t tested this one yet). I have also found a few other ways to block Semalt from referral traffic. Someone mentioned going to their website and opting out. I don’t like that option, because I have a sneaking suspicion that Semalt sells your domain to others once they know you have a valid website that is monitored. I also have been adding Semalt to the Referral Exclusion List, which seems to be working rather well. I also like the option of blocking them via .htaccess, though it’s kind of sad that you’d have to go to that extreme to block them. Have you or someone else tried some of these other methods?

Susan, Jonathan, et al – thank you for this info. But my experience is that this does *not* work. Not only does the “Verify this filter” warn that it won’t alter data, saving it results in no change – I still have semalt.com “referrals” all over my results a month after setting up the filter per your instructions.
BTW, the suggestion to use the text “semalt.com” is rejected by Google – it gives an error msg about a field containing invalid data.
Any suggestions?

Will this also work for other weird referral sites? I have:
buttons-for-website.com
econom.co (this one is bringing me a LOT of fake traffic and I want it gone!)
forum.topic44207784.darodar.com
make-money-online.7makemoneyonline.com

Yes this will work for any referral URL.
I would just block the root domain, for example “7makemoneyonline.com”, instead of the sub-domain, for example “make-money-online.7makemoneyonline.com”, if it’s causing issues to totally block the traffic.
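As a rough sketch of why the root domain is enough: a pattern for the root domain is found anywhere in the referrer string, so it also catches every subdomain. The `is_filtered` helper below is hypothetical and uses Python’s `re.search` to mimic a “contains” style match; the referrers come from the list in the question above.

```python
import re

# Root-domain pattern, with the dot escaped so it matches a literal ".".
ROOT_PATTERN = re.compile(r"7makemoneyonline\.com")

def is_filtered(referrer: str) -> bool:
    """Return True if the referrer contains the blocked root domain."""
    return ROOT_PATTERN.search(referrer) is not None

print(is_filtered("make-money-online.7makemoneyonline.com"))  # subdomain
print(is_filtered("7makemoneyonline.com"))                    # root domain
print(is_filtered("econom.co"))                               # unrelated referrer
```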

Thanks! Now it seems like iloveitaly.com is causing problems! I googled and it seems like a lot of people are talking about something called .htaccess. Why wouldn’t they just use this easy filtering process in GA that we are using??

The .htaccess file allows you to block or redirect traffic entirely, in case a bot such as Semalt’s is slowing your server down or flooding it with requests. The Google Analytics filter described in this post only hides the traffic from the statistics, but it doesn’t require any development.

Semalt has become a real annoyance, they no longer just operate semalt.com but multiple domains and subdomains turning this into a cat and mouse game.

While filtering Analytics is a good first step to keep your own data clean, I believe that blocking this traffic via .htaccess is a better way to go if you have the technical know-how. Blocking this traffic takes an active step to show that this is unacceptable behavior on the Internet.

As long as you allow their traffic on your site, even if it’s out of sight via a Google Analytics filter, you provide them with the data that they run their business on.

I recently went as far on one website to display a ‘Semalt not welcome’ alternate content page if their referral was detected.

Interesting. I started seeing Semalt in my statistics and thought, “Oh cool! They send lots of referrals.” So I wanted to see if I could find info about it. And I found all of this useful info. Sad that they aren’t real. They comprise over 15% of my traffic.
So I made the filter and will see what happens.

Good article, but I am with those that take the .htaccess fix. Why blind yourself to the stats but leave yourself open to someone abusing your website/server/bandwidth?

On a slightly different note… has anyone had forum.topic58506415.darodar.com (Russian-based) showing up in their referrals? This is a bit of a special case all in itself, as it appears to be a referral spam attack designed purely to hit at GA itself.

I’ve had quite a few in the past weeks to my site www photographer-kettering co uk. It looks to be a little bit of code ‘attack’ crossed with a bit of a social hack.

THIS DOES NOT WORK! I’ve done THIS in analytics, updated the robots file and the .htaccess. Absolutely nothing works. No matter what I do, it keeps showing up, over and over and over again and there is absolutely no way to stop it at all.

Does anyone have an actual solution which has worked to any degree in any capacity? I need those numbers to be accurate right now and I can’t keep going through this day by day and trying to do the math to figure out what’s real and what’s just this idiot from Russia.

I had this for a while and they always seemed to get through somehow but I signed up recently for CloudFlare and that stopped this referral spam instantly, which rather makes me wonder why Google can’t seem to do it themselves.

This Semalt problem wasn’t the reason I initially signed up for CloudFlare but it certainly brought a smile to my face when I checked CloudFlare’s analytics and saw they had been blocked.

Thank you all for participation and adding value to the online community. For those who are interested in getting this issue resolved, I’d recommend the .htaccess method.

1) If you’re not sure how to make an .htaccess file, just open any text editor (such as Notepad for Windows users or TextEdit for Mac users) and save the file as .htaccess. The file has no name, just a file extension: where you would normally have filename.pdf or filename.html, this one is simply .htaccess.

2) Paste in this code. Notice the pattern of the pipe (|), which means “or”, and the backslash (\), which escapes the dot (.) since it has its own meaning in Apache directives. The “.” is a special character that normally matches any single character. If you see any suspicious URL, simply add it to your line along with the rest of the URLs, with each . escaped as \.
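As an illustration of the pipe-and-backslash pattern described above (not the commenter’s .htaccess code), the sketch below builds such an alternation in Python from a few domains mentioned in this thread. The `build_pattern` and `is_spam_referrer` names are hypothetical, and this is not Apache configuration; it only shows how the escaping and the “or” fit together.

```python
import re

SPAM_DOMAINS = ["semalt.com", "buttons-for-website.com", "darodar.com"]

def build_pattern(domains):
    """Escape each dot and join the domains with | (meaning "or")."""
    return "|".join(d.replace(".", r"\.") for d in domains)

def is_spam_referrer(referrer, pattern):
    # Case-insensitive search, so "Semalt.com" is caught as well.
    return re.search(pattern, referrer, re.IGNORECASE) is not None

pattern = build_pattern(SPAM_DOMAINS)
print(pattern)
print(is_spam_referrer("http://34.semalt.com/crawler", pattern))
print(is_spam_referrer("https://www.google.com/", pattern))
```

Adding another suspicious domain is then a matter of appending it to the list, which mirrors adding it to the single line with its dots escaped.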