Google Analytics Short Tail/Long Tail Segmentation

I was just pulled into a conversation with Paul and James about developing a report in Google Analytics which segments short tail traffic (keywords 1-2 words long) and long tail traffic (3 words or more). The advantage is that you can then compare the two and use the ratio to estimate the long tail gains you would get from going after a short tail keyword.

For example, say you are currently getting 1,000 visits from short tail keywords and 5,000 visits from long tail keywords. This means you are getting five times as much long tail traffic as short tail traffic. Using this ratio you can estimate how much traffic you are likely to receive by going after a keyword. For instance, if you target “wooden chairs” you would notice it has 5,400 searches in the UK (according to the Google AdWords Keyword Tool), and if you were to get a number 1 position you might get, say, 10% of this. So you are likely to get 540 visits from that one keyword, but the long tail traffic would be five times greater than this: 2,700 visits (based on the ratio above). That gives a total potential of 3,240 visits.
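The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the estimate described in this post: the function name `estimate_visits` and its parameters are made up for the example, and the 10% share for a number 1 position is the same rough assumption used above.

```python
def estimate_visits(short_tail_visits, long_tail_visits, monthly_searches, ctr):
    """Estimate head and long tail visits for one target keyword.

    All names here are illustrative. ctr is the assumed share of searches
    that a number 1 position captures (0.10 means 10%).
    """
    # Ratio of long tail to short tail traffic from existing analytics data.
    ratio = long_tail_visits / short_tail_visits
    # Visits expected from the target keyword itself.
    head = round(monthly_searches * ctr)
    # Long tail visits, assuming the existing ratio carries over.
    tail = round(head * ratio)
    return head, tail, head + tail


head, tail, total = estimate_visits(1000, 5000, 5400, 0.10)
print(head, tail, total)  # 540 2700 3240
```

Swapping in a different client's short tail and long tail visit counts changes the ratio, which is why a single ratio should not be reused across clients.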

This may not seem that great, but it makes all the difference when you are considering the ROI of going after a particular keyword in an SEO campaign. We have noticed that the long tail / short tail ratios differ dramatically between clients, so it’s a bad idea to use one ratio across industries or clients. You may also want to exclude brand searches, because a strong brand will skew the data a lot.

Update: These have been updated, based on a regular expression made by Ben Gott.

If you would prefer the regular expression, here it is. It matches all the keywords which contain two or more spaces (three or more words), so include it to get the long tail and exclude it to get the short tail of one or two word phrases:

David Whitehouse

David has honed his skills as an SEO and has become the "go to guy" for Google Analytics. He's a passionate Internet marketeer and he enjoys helping clients maximise their potential.

24 Comments

Dave

Another thing I like to do is create a ‘brand’ segment to go along with these and then exclude brand terms from the other segments. This gives a deeper picture by not having brand-related terms in the head and tail segments, which can often mess up the head term data (for 1-2 word brands). So it’s worth doing.

David Whitehouse

Yeah, obviously we couldn’t share a ‘brand’ segment, because that would be unique to each client. But yeah, it definitely has a big impact (as I mentioned).

Vipul - http://twitter.com/vipulg

This is great stuff. Segmentation is of great help as we can analyse data from different angles.

Dave - http://www.djb31st.co.uk

Very useful!

It’s easy to forget just how much power you get with Google Analytics.

Great example of a useful segment.

Is there a database of common useful segments? It appears there would be a niche need for this, but Google returns nothing.

Alex - http://www.analyticsseo.com/

I always prefer to use SEO tools that handle that part too, like http://www.analyticsseo.com/. I have recently tried out this website tool. It allows me to monitor my website activity without adding any JavaScript code to our website.

Andrew@BloggingGuide - http://webuildyourblog.com

Great analysis… very enlightening!

Clement Mazen - http://www.autoquake.com

Thanks for this – I had been using a similar syntax but found a way, thanks to an article by Ben Gott on SEL, to apply a more economical and robust one and get a better result, using “\s” to denote a space instead of using “(a-z,0-9)” to denote all other characters.

At least 3 words simply becomes:
Include: \s.*\s

At most 2 words being:
Exclude: \s.*\s

Modifying the number of “\s” tokens lets you change the number of words required in the search term. Beyond the simplicity of this syntax, I quite like the fact that it “takes care” of words formed with any non-standard characters, in addition to a to z and 0 to 9.
On that note, if you are interested in GA filters and would like to know more, I recommend the O’Reilly Regular Expression Pocket Reference, as there are many other “shortcuts”. Has anyone got a link to a useful online resource for RegEx?
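For anyone who wants to try the \s.*\s pattern outside GA, here is a minimal Python sketch. It assumes the pattern is applied as a partial match anywhere in the keyword (hence `re.search`); the names `AT_LEAST_THREE_WORDS` and `is_long_tail` are made up for this example and are not part of GA.

```python
import re

# Two whitespace characters with anything (or nothing) between them.
AT_LEAST_THREE_WORDS = re.compile(r"\s.*\s")


def is_long_tail(keyword):
    """Return True when the keyword contains at least two whitespace characters."""
    return AT_LEAST_THREE_WORDS.search(keyword) is not None


for kw in ["chairs", "wooden chairs", "cheap wooden chairs", "wooden  chairs"]:
    print(repr(kw), is_long_tail(kw))
```

Note that the last keyword has a double space, so it is counted as long tail even though it only has two words. Extra spaces are a limitation of this simple form.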

DangerMouse

Out of interest, why did you select 10% as your figure for a number 1 ranking for the exact match on that term?

David Whitehouse - http://www.david-whtehouse.org/

@DangerMouse, it was just an example: a rough estimate of what some rankings may get at position 1. It also made the maths much simpler. I didn’t base it on an average of our data from Webmaster Tools or anything like that!

We have done comparisons between the number of visits and the exact match search volumes, and it was often in the range of 5-12% (from what I can remember) at position 1.

David Whitehouse - http://www.david-whtehouse.org/

@Clement Mazen, that is a much better way of doing it. I’m going to change the filters to that and link to him. Certainly an improvement.

David Whitehouse

I’ve updated the filter, as there were a few minor issues that needed ironing out. Anyone who has commented may wish to update.

Clement Mazen - http://www.autoquake.com

Thanks David for the update. Two more things about the (\s|\+) version suggested by Ben, though:

1. If you want to use the filters for paid search data and use the new Broad Match Modifiers, the use of “+” signs as modifiers will completely throw the filters off:
“+seo”, for example, would then be considered a two-word search term.

2. Even though some searchers occasionally separate words in their query with a “+” rather than a space, it is rare enough to be ignored, and when searchers do use a “+”, they tend to use it as in [search + engine], not [search+engine], in which case the (\s|\+).*(\s|\+) filter does not work as expected, capturing this as a long tail term anyway.

Maybe a better way to tackle the “+” signs would be to set up a profile with a custom filter replacing “(\+|\s\+\s)” by “\s”, then “(^\s|\s$)” by nothing and “\s\s” by “\s” to remove extra spaces (roughly equivalent to TRIM() in Excel).
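The clean-up idea can be sketched in Python to show the intended result. This is a simplified variant rather than the exact GA filter patterns above: the function name `normalise_keyword` is made up for this example, and it treats every “+” (with any spaces around it) as a word separator.

```python
import re


def normalise_keyword(keyword):
    """Turn plus signs into spaces, collapse repeated spaces and trim the ends."""
    # Treat a plus sign (with any spaces around it) as a single space.
    keyword = re.sub(r"\s*\+\s*", " ", keyword)
    # Collapse any remaining runs of whitespace into one space.
    keyword = re.sub(r"\s+", " ", keyword)
    # Drop leading and trailing spaces, like TRIM() in Excel.
    return keyword.strip()


for kw in ["+seo", "search + engine", "search+engine", "  wooden   chairs "]:
    print(repr(kw), "->", repr(normalise_keyword(kw)))
```

Once keywords are cleaned this way, a plain space-counting expression is enough to split short tail from long tail.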

It is very exciting to see how much GA functionality is “below the surface”, especially with those RegEx and all the segments/custom filters you can use them on. David, a proper tutorial on GA RegEx would be very useful 😉 Websites out there tend to target programmers rather than web analysts and are not always very helpful when tackling GA.

David Whitehouse - http://www.david-whtehouse.org/

@Clement – In reply to your comment.

1. I see what you are saying, the filter I have created is purely for organic traffic – I did originally intend it for both PPC and SEO – but under Avinash’s advice I limited it to just organic.

2. I see what you mean. This is certainly a problem, and I think we could fix it by using some of your suggestions.

Making (\s|\+).*(\s|\+) into (\s|\+)+.*(\s|\+)+ would allow multiple spaces and +’s in a row.

I’d like to stay away from custom filters and just improve or use multiple regular expressions. But I like your suggestions of using ^ and $ – I’ll have a think and see if I can improve it a bit further, see if I can solve the trailing and leading slashes.

David Whitehouse

@Clement, again I’ve updated it:

[^\s\+]+(\s|\+)+[^\s\+]+(\s|\+)+[^\s\+]+

Basically it is one or more characters that are not a space or plus sign, followed by one or more spaces or plus signs, followed by one or more characters (no spaces or plus signs), followed by… etc.

My head hurts now.
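To check how the updated expression above behaves on the awkward cases raised in this thread, here is a small Python sketch. It assumes a partial match anywhere in the keyword (`re.search`); the names `THREE_PLUS_WORDS` and `tail_segment` are made up for this example.

```python
import re

# Three runs of non-separator characters, split by runs of spaces or plus signs.
THREE_PLUS_WORDS = re.compile(r"[^\s\+]+(\s|\+)+[^\s\+]+(\s|\+)+[^\s\+]+")


def tail_segment(keyword):
    """Label a keyword as long tail (3 or more words) or short tail (1-2 words)."""
    return "long tail" if THREE_PLUS_WORDS.search(keyword) else "short tail"


examples = [
    "wooden chairs",
    "cheap wooden chairs",
    "+seo",
    "search + engine",
    "wooden  chairs",
    "search+engine+optimisation",
]
for kw in examples:
    print(repr(kw), "->", tail_segment(kw))
```

Only “cheap wooden chairs” and “search+engine+optimisation” come out as long tail here. The plus-prefixed, plus-separated and double-spaced two word keywords stay in the short tail.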

Clement Mazen - http://www.autoquake.com

Great stuff – that should really nail it!

vdouda

Cool tip. Works great, but could you explain this regex a bit more?
What does the ‘s’ stand for?

The \s denotes a space, if you don’t escape it with a “\” it will just be included as a space.

David Whitehouse - http://www.david-whtehouse.org/

Sorry, it will just be counted as an s if you don’t escape it (it is only counted as a space if you escape it).

Daniel Sim - http://www.pluginseo.com

Interesting way to predict visitors from long tail terms.

I do find juggling lots of segments still a bit unwieldy in GA. It’d be great if GA grouped keywords automatically to give as good a picture as we get from AdWords.

Andy Beard - http://andybeard.eu

It is all great in theory, but out in the wild you can’t determine longtail in that way.

As an example, recently my blog was on the first page of results for “gmail” here in Poland, as high as #4.

I talked to Dave about my “wacky stuff” about a year ago. I didn’t see a gain in search traffic, and this just meant I lost more relevant “long tail” traffic to a blog post that was relevant 3 years ago.

However interestingly Google didn’t win (so far) the gmail.pl domain in Poland, and that domain has really messed up SEO.

Roy Olders - http://www.seo-sharkx.com/

I thought that there is a rule that a 3 word keyword phrase indicates people that are more willing to buy (and are more specific) than 1 or 2 word keyword phrases.
