The Advanced Guide to Keyword Clustering

The author's views are entirely his or her own (excluding the unlikely event of hypnosis) and may not always reflect the views of Moz.

If your goal is to grow your organic traffic, you have to think about SEO in terms of “product/market fit.”

Keyword research is the “market” (what users are actually searching for) and content is the “product” (what users are consuming). The “fit” is optimization.

To grow your organic traffic, you need your content to mirror the reality of what users are actually searching for. Your content planning and creation, keyword mapping, and optimization should all align with the market.

Why bother with keyword grouping?

One web page can rank for multiple keywords. So why aren’t we hyper-focused on planning and optimizing content that targets dozens of similar and related keywords?

Why target only one keyword with one piece of content when you can target 20?

The impact of keyword clustering on acquiring more organic traffic is not only underrated, it is largely ignored. In this guide, I'll share the proprietary process we’ve pioneered for keyword grouping so you can not only do it yourself, but also maximize the number of keywords your amazing content can rank for.

It’d be foolish to focus on only one keyword, as you’d lose out on 90%+ of the opportunity.

Here's one of my favorite examples of all of the keywords that one piece of content could potentially target:

Let’s dive in!

Part 1: Keyword collection

Before we start grouping keywords into clusters, we first need a dataset of keywords to group.

In essence, our job in this initial phase is to find every possible keyword. In the process of doing so, we'll also be inadvertently getting many irrelevant keywords (thank you, Keyword Planner). However, it's better to have many relevant and long-tail keywords (and the ability to filter out the irrelevant ones) than to only have a limited pool of keywords to target.

For any client project, I typically say that we'll collect anywhere from 1,000 to 6,000 keywords. But truth be told, we've sometimes found 10,000+ keywords, and sometimes (in the instance of a local, niche client), we've found fewer than 1,000.

I recommend collecting keywords from about 8–12 different sources. These sources include:

Your competitors

Third-party data tools (Moz, Ahrefs, SEMrush, AnswerThePublic, etc.)

Your existing data in Google Search Console/Google Analytics

Brainstorming your own ideas and checking against them

Mashing up keyword combinations

Autocomplete suggestions and “Searches related to” from Google

There's no shortage of sources for keyword collection, and more keyword research tools exist now than ever did before. Our goal here is to be so extensive that we never have to go back and “find more keywords” in the future — unless, of course, there's a new topic we are targeting.

The prequel to this guide will expand upon keyword collection in depth. For now, let’s assume that you’ve spent a few hours collecting a long list of keywords, you have removed the duplicates, and you have semi-reliable search volume data.
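
Two of the collection steps above, mashing up keyword combinations and removing duplicates, are easy to script. This is only a minimal sketch: the modifier and head-term lists are made-up examples, and the function names `mash_up` and `dedupe` are my own, not part of any tool.

```python
from itertools import product


def mash_up(modifiers, heads):
    """Return every 'modifier head' combination as a lowercase keyword."""
    return [f"{m} {h}".lower() for m, h in product(modifiers, heads)]


def dedupe(keywords):
    """Normalize case and whitespace, then drop duplicates in first-seen order."""
    seen = {}
    for keyword in keywords:
        normalized = " ".join(keyword.lower().split())
        if normalized:
            seen.setdefault(normalized, None)
    return list(seen)


# Hypothetical seed lists, for illustration only.
candidates = mash_up(["best", "natural"], ["protein powder", "deodorant"])
keywords = dedupe(candidates + ["Best  Protein Powder", "natural deodorant"])
print(len(keywords))  # 4
```

Mashed-up combinations still need to be checked against real search volume, since many of them will be phrases nobody actually searches for.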

Part 2: Term analysis

Now that you have an unmanageable list of 1,000+ keywords, let’s turn it into something useful.

We break each keyword apart into its component terms, so we can see which terms occur most frequently.

For example, the keyword “best natural protein powder” is made up of four terms: “best,” “natural,” “protein,” and “powder.” Once we break apart all of the keywords into their component parts, we can more readily analyze and understand which terms (as subcomponents of the keywords) recur the most in our keyword dataset.

Here’s a sampling of 3 keywords:

best natural protein powder

most powerful natural anti inflammatory

how to make natural deodorant

Take a closer look, and you’ll notice that the term “natural” occurs in all three of these keywords. If this term is occurring very frequently throughout our long list of keywords, it’ll be highly important when we start grouping our keywords.

Paste your list of keywords into a word frequency counter, click submit, and you'll get something like this:

Copy and paste your list of recurring terms into a spreadsheet. You can obviously remove prepositions and terms like “is,” “for,” and “to.”
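
If you'd rather count terms in code than in a web tool, here is a minimal sketch using the three sample keywords from above. The stop list is only the three example terms mentioned here and is an assumption; which terms are safe to drop depends on your niche.

```python
from collections import Counter

# Illustrative stop list only; adjust it per project.
STOP_TERMS = {"is", "for", "to"}


def term_frequencies(keywords, stop_terms=STOP_TERMS):
    """Count how often each term occurs across all keywords."""
    counts = Counter()
    for keyword in keywords:
        counts.update(t for t in keyword.lower().split() if t not in stop_terms)
    return counts


keywords = [
    "best natural protein powder",
    "most powerful natural anti inflammatory",
    "how to make natural deodorant",
]
term_counts = term_frequencies(keywords)
top_term, top_count = term_counts.most_common(1)[0]
print(top_term, top_count)  # natural 3
```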

You don’t always get the most value by just looking at individual terms. Sometimes a two-word or three-word phrase gives you insights you wouldn’t have otherwise. In this example, you see the terms “milk” and “almond” appearing, but it turns out that this is actually part of the phrase “almond milk.”

To gather these insights, use the Phrase Frequency Counter from WriteWords and repeat the process for phrases that have two, three, four, five, and six terms in them. Paste all of this data into your spreadsheet too.
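
The same phrase counts can be approximated in code. This sketch assumes a phrase never spans two keywords, which may or may not match how the WriteWords tool treats line breaks; the sample keywords are invented for illustration.

```python
from collections import Counter


def phrase_frequencies(keywords, n):
    """Count every run of n consecutive terms inside each keyword."""
    counts = Counter()
    for keyword in keywords:
        terms = keyword.lower().split()
        for i in range(len(terms) - n + 1):
            counts[" ".join(terms[i:i + n])] += 1
    return counts


keywords = ["almond milk recipe", "best almond milk", "almond milk vs oat milk"]

# Repeat the count for phrases of two through six terms.
phrases = {n: phrase_frequencies(keywords, n) for n in range(2, 7)}
print(phrases[2]["almond milk"])  # 3
```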

A two-word phrase that occurs more frequently than a one-word phrase is an indicator of its significance. To account for this, I use the COUNTA function in Google Sheets to show me the number of terms in a phrase:

=COUNTA(SPLIT(B2," "))

Now we can look at our keyword data with a second dimension: not only the number of times a term or phrase occurs, but also how many words are in that phrase.

Finally, to give more weighting to phrases that recur less frequently but have more terms in them, I put an exponent on the number of terms with a basic formula:

=(C4^2)*A4

In other words, take the number of terms and raise it to a power, and then multiply that by the frequency of its occurrence. All this does is give more weighting to the fact that a two-word phrase that occurs less frequently is still more important than a one-word phrase that might occur more frequently.

As I never know just the right power to raise it to, I test several and keep re-sorting the sheet to try to find the most important terms and phrases in the sheet.
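
To see how the choice of power changes the ranking, here is a small sketch of the same weighting: number of terms raised to a power, multiplied by frequency. The phrases and counts are hypothetical, and `weighted_score` and `rank` are names I made up for this example.

```python
def weighted_score(phrase, frequency, power=2):
    """(number of terms ** power) * frequency, mirroring =(C4^2)*A4."""
    num_terms = len(phrase.split())
    return (num_terms ** power) * frequency


def rank(rows, power):
    """Return the phrases sorted by weighted score, highest first."""
    ordered = sorted(rows, key=lambda r: weighted_score(r[0], r[1], power), reverse=True)
    return [phrase for phrase, _ in ordered]


# Hypothetical (phrase, frequency) rows, for illustration only.
rows = [("natural", 120), ("almond milk", 40), ("where can i buy", 9)]

for power in (1, 2, 3):
    print(power, rank(rows, power))
```

With these made-up numbers, a power of 1 keeps the single term on top, a power of 2 puts the two-word phrase first, and a power of 3 favors the four-word phrase, which is why it is worth re-sorting with a few different values.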

When you look at this now, you can already see patterns start to emerge and you're already beginning to understand your searchers better.

In this example dataset, we are going from a list of 10k+ keywords to an analysis of terms and phrases to understand what people are really asking. For example, “what is the best” and “where can i buy” are phrases we can absolutely understand searchers using.

I mark off the important terms or phrases. I try to keep this number under 50, with a hard maximum of around 75; otherwise, grouping will get hairy in Part 5.

This exercise provides us with a handful of the most relevant and important terms and phrases for traffic and relevancy, which can then be used to create the best content strategies — content that will rank highly and, in turn, help you reap traffic rewards for your site.

When developing your hot words list, we identify the highest frequency and most relevant terms from a large range of keywords used by several of your highest-performing competitors to generate their traffic, and these become “hot words.”

When working with a client (or doing this for yourself), there are generally 3 questions we want answered for each hot word:

Which of these terms are the most important for your business? (0–10)

Which of these terms are negative keywords (we want to ignore or avoid)?

Any other feedback about qualified or high-intent keywords?

We narrow down the list, removing any negative keywords or keywords that are not really important for the website.

Once we have our final list of hot words, we organize them into broad topic groups like this:

The different colors have no meaning, but just help to keep it visually organized for when we group them.

One important thing to note is that word stems play an important part here.

For example, consider that all of these words below have the same underlying relevance and meaning:

blog

blogs

blogger

bloggers

blogging

Therefore, when we're grouping keywords, to treat “blog,” “blogging,” and “bloggers” as part of the same cluster, we'll need to use the word stem “blog” for all of them. Word stems are our best friend when grouping. Synonyms, which are basically two different ways of saying the same thing (with the same user intent), such as “build” and “create” or “search” and “look for,” can be organized in a similar way.
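
One simple way to express stems and synonyms in code is a pattern per cluster label. This is only a sketch under my own assumptions: it uses plain prefix matching rather than a real stemming library, and the `STEM_PATTERNS` table is a hypothetical example.

```python
import re

# Each label maps to a pattern covering its stem variants and synonyms.
STEM_PATTERNS = {
    "blog": re.compile(r"\bblog"),              # blog, blogs, blogger, blogging
    "build": re.compile(r"\b(?:build|create)"),  # synonyms folded into one label
}


def stems_in(keyword):
    """Return the labels whose pattern occurs in the keyword."""
    lowered = keyword.lower()
    return [label for label, pattern in STEM_PATTERNS.items() if pattern.search(lowered)]


print(stems_in("best blogging platforms"))  # ['blog']
print(stems_in("how to create a blog"))     # ['blog', 'build']
```

Prefix matching can over-match (a stem like “art” would also hit “article”), so it is worth spot-checking the results for short stems.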

Part 4: Preparation for keyword grouping

Now we're going to get ourselves set up for our Herculean task of clustering.

To start, copy your list of hot words and transpose them horizontally across a row.

List your keywords in the first column.

Now, the real magic begins.

After much research and noodling around, I discovered the function in Google Sheets that tells us whether a stem or term is in a keyword or not. It uses RegEx:

=IF(RegExMatch(A5,"health"),"YES","NO")

This simply tells us whether this word stem or word is in that keyword or not. You have to individually set the term for each column to get your “YES” or “NO” answer. I then drag this formula down to all of the rows to get all of the YES/NO answers. Google Sheets often takes a minute or so to process all of this data.

Next, we have to “hard code” these formulas so we can remove the NOs and be left with only a YES if that term exists in that keyword.

Copy all of the data and “Paste values only.”

Now, use “Find and replace” to remove all of the NOs.
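
For anyone who prefers a script to a spreadsheet, the same grid can be sketched in a few lines. This assumes, as with the formula above, that each hot word is treated as a case-sensitive regular expression searched anywhere in the keyword; the hot words and keywords are sample data.

```python
import re


def match_matrix(keywords, hot_words):
    """One row per keyword, one column per hot word: 'YES' or an empty string."""
    return [
        ["YES" if re.search(hot_word, keyword) else "" for hot_word in hot_words]
        for keyword in keywords
    ]


hot_words = ["health", "natural", "protein"]
keywords = [
    "best natural protein powder",
    "health benefits of matcha",
    "how to make natural deodorant",
]

matrix = match_matrix(keywords, hot_words)
for keyword, row in zip(keywords, matrix):
    print(keyword, row)
```

Writing an empty string instead of “NO” skips the paste-values and find-and-replace steps entirely.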

What you're left with is nothing short of a work of art. You now have the most powerful way to group your keywords. Let the grouping begin!

Part 5: Keyword grouping

At this point, you're now set up for keyword clustering success.

This part is half art, half science. No wait, I take that back. To do this part right, you need:

A deep understanding of who you're targeting, why they're important to the business, user intent, and relevance

Good judgment to make tradeoffs when breaking keywords apart into groups

Good intuition

This is one of the hardest parts for me to train anyone to do. It comes with experience.

At the top of the sheet, I use the COUNTA function to show me how many times each word stem has been found in our keyword set:

=COUNTA(C3:C10000)

This is important because as a general rule, it's best to start with the most niche topics that have the least overlap with other topics. If you start too broadly, your keywords will overlap with other keyword groups and you'll have a hard time segmenting them into meaningful groups. Start with the most narrow and specific groups first.
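
Here is a rough sketch of that counting step in code, ordering the stems from narrowest to broadest. The keywords and stems are sample data, and simple substring matching stands in for the YES cells in the sheet.

```python
def stem_counts(keywords, stems):
    """Count how many keywords contain each stem (substring match)."""
    return {stem: sum(1 for kw in keywords if stem in kw) for stem in stems}


keywords = [
    "matcha green tea powder",
    "best natural protein powder",
    "natural protein bars",
    "how to make natural deodorant",
    "natural anti inflammatory",
]

counts = stem_counts(keywords, ["matcha", "natural", "protein"])

# Narrowest stems first: these are the groups to break out first.
narrow_first = sorted(counts, key=counts.get)
print(narrow_first)  # ['matcha', 'protein', 'natural']
```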

To begin, you want to sort the sheet by word stem.

The word stems that occur only a handful of times won’t have a large amount of overlap. So I start by sorting the sheet by that column, and copying and pasting those keywords into their own new tab.

Now you have your first keyword group!

Here's a first group example: the “matcha” group. This can be its own project in its own right: for instance, if a website was all about matcha tea and there were other tangentially related keywords.

As we continue breaking apart one keyword group and then another, by the end we're left with many different keyword groups. If the groups you've arrived at are too broad, you can subdivide them even more into narrower keyword subgroups for more focused content pieces. You can follow the same process for this broad keyword group, and make it a microcosm of the same process of dividing the keywords into smaller groups based on word stems.
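
The peel-off loop can be sketched as follows. Note the big simplifying assumption: each keyword goes to the first (narrowest) stem it matches, whereas in practice this is a judgment call made keyword by keyword. The function name `group_keywords` and the sample data are mine.

```python
def group_keywords(keywords, stems):
    """Assign each keyword to the narrowest matching stem; return groups and leftovers."""
    counts = {stem: sum(1 for kw in keywords if stem in kw) for stem in stems}
    remaining = list(keywords)
    groups = {}
    for stem in sorted(stems, key=counts.get):  # rarest stem first
        groups[stem] = [kw for kw in remaining if stem in kw]
        remaining = [kw for kw in remaining if stem not in kw]
    return groups, remaining


keywords = [
    "matcha green tea powder",
    "best natural protein powder",
    "natural protein bars",
    "how to make natural deodorant",
    "natural anti inflammatory",
    "organic baby food",
]

groups, ungrouped = group_keywords(keywords, ["matcha", "natural", "protein"])
for stem, members in groups.items():
    print(stem, members)
print("ungrouped:", ungrouped)
```

Running the same function again on one broad group, with a narrower list of stems, gives you the subgroups.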

We can create an overview of the groups to see the volume and topical opportunities from a high level.

We want to not only consider search volume, but ideally also intent, competitiveness, and so forth.
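
A bare-bones version of that overview, covering only keyword count and total search volume per group, might look like this. The volumes are invented for illustration; intent and competitiveness would need their own columns and are not modeled here.

```python
def summarize(groups):
    """Return one row per group with its keyword count and total volume, largest first."""
    rows = [
        {"group": name, "keywords": len(items), "volume": sum(v for _, v in items)}
        for name, items in groups.items()
    ]
    return sorted(rows, key=lambda row: row["volume"], reverse=True)


# Hypothetical (keyword, monthly search volume) pairs per group.
groups = {
    "protein": [("best natural protein powder", 720)],
    "matcha": [("matcha green tea powder", 1900), ("matcha latte recipe", 880)],
}

overview = summarize(groups)
for row in overview:
    print(row)
```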

Voilà!

You've successfully taken a list of thousands of keywords and grouped them into relevant keyword groups.

Wait, why did we do all of this hard work again?

Now you can finally attain that “product/market fit” we talked about. It’s magical.

You can take each keyword group and create a piece of optimized content around it, targeting dozens of keywords, exponentially raising your potential to acquire more organic traffic. Boo yah!

All done. Now what?

Now the real fun begins. You can start planning out new content that you never knew you needed to create. Alternatively, you can map your keyword groups (and subgroups) to existing pages on your website and add in keywords and optimizations to the header tags, body text, and so forth for all those long-tail keywords you had ignored.

Keyword grouping is underrated, overlooked, and ignored at large. It creates a massive new opportunity to optimize for terms where none existed. Sometimes it's just adding one phrase or a few sentences targeting a long-tail keyword here and there that will bring in that incremental search traffic for your site. Do this dozens of times and you will keep getting incremental increases in your organic traffic.

What do you think?

Leave a comment below and let me know your take on keyword clustering.

Need a hand? Just give me a shout, I’m happy to help.

About tomcasano —

Tom Casano is the founder of Sure Oak, an SEO agency on a mission to help people live their most amazing and fulfilled lives. Tom is an SEO strategist with core competencies in link earning, ROI analysis, content planning, and keyword research. Sure Oak cares about helping purpose-driven organizations achieve remarkable, long-term growth to make the world a better place to live in with SEO. Tom also hosts The Sure Oak Podcast and is passionate about SaaS, FinTech, and growth marketing.

Solid topic here Tom, I am definitely a big fan of targeting multiple keywords on one page and totally agree that it’d be foolish to focus on only one keyword, as you’d lose out on 90%+ of the opportunity. Great article sir!

Thanks so much Cheryl! Yes, there are just so many long-tail keywords and modifiers that can go into one page, it is mind-blowing. You can discover a lot of these in Google Search Console when you filter for one page and look at the queries listed. You might be in position 50 for a keyword that has a modifier that you never even put on your page. So you can simply go back and add those words in! Easy to do, and more incremental traffic for your business. :)

Thanks Tom Casano for your advice. They are very useful and necessary to plan the creation of content on our website.

For example, we can create an entry on our website for each of the keywords found. In this way, we can then create an internal linkbuilding strategy to internally link different pages with different anchor text to enrich the keyword semantics of each url we index.

Also, we can strategically include some of those keyword groupings in the subtitles (h2, h3, and h4) of our different articles.

Thanks for those other tools for keyword collection, Sunil! It's amazing how many keywords you can find when collecting keywords to consider, it almost becomes endless (but lower search volume). Yes, Google Sheets is all I use for spreadsheets these days. You're very welcome, happy to help. :)

It is really up to you and how far you want to take it. I have done a few projects of 10k+ keywords -- but it can get to an extreme where the 2,000th+ keyword is fairly useless if you haven't created pages/content or anything for the first 500 or so. :) You get the point... It can be endless, totally up to you. I just like to know the full landscape of opportunities from the get-go! :)

Trust me Tom, this is exactly what I was looking for. When I looked at your excel sheet, I literally jumped out of my chair. I work on this kind of data on a daily basis. I mean, with so many KW research tools around (both paid and free), it is easy to find yourself sitting on a huge pile of potential keywords. I have a client from the trade show and display banner industry. Spending 30 minutes on KW tools for just one KW, “Roll-UP Banner,” for his website gave me several thousand KWs, but no practical solution to work with them. I was thinking of creating that many pages to target all the keywords, which is just not possible even with a decent budget.

This is definitely a better and more scalable approach to working with huge stacks of keywords.

Thanks for explaining everything in detail. Much appreciated, especially for the excel formulas.

Hey Salan, you're very welcome! I really appreciate you sharing how helpful this has been for you! To me, this is really just the start before we start mapping clusters and planning out content... There are so many possible ways to implement keyword clusters onto a site. For example, some clusters can be broken into sub-clusters that themselves should be mapped.

I'm happy to hear about how you jumped out of your seat! Haha. :)

We should also be thinking about what type of page will satisfy user intent (content vs something else), competitiveness, and buying intent. Thanks again for sharing your feedback, Salan. :)

Thanks Jean-Christophe! It's been fun talking with you about this in our direct messages. It seems like the depth and amount of work/research you've done in this area -- and the extent with tens of thousands of keywords and programming with R -- that you have become quite an expert in this niche of SEO as well! Looking forward to hearing more about your process and approach. I'm sure I will learn a lot from you. Cheers.

Thanks, Roman! I really appreciate that and it means a lot to me. It's wild how far the rabbit hole of something as simple as just "keyword research" can go. I often find that clients might initially have a list of maybe 50 or 100 keywords. They're only seeing the tip of the iceberg! :)

Yes, completely agree. It is one of the areas where I decided to improve (keyword research). I have been researching, studying, and testing all the articles and techniques available on the top SEO blogs. For sure your article will be on my bookmarks list.

I was planning to create an article with the results of this research and will let you know. You will see how others implement your technique, and maybe you can share some guidance with us.

Hi Nikola! Thanks for your thoughts. I'm happy to hear you found it interesting. There are definitely moments when it could be monotonous or boring (that is the time to put your favorite music on your headphones while working :) I have had to keep learning more about using formulas in Google Sheets over time as some things just have to be done automatically and in bulk. Yes, it's not the most exciting subject -- but when you can save 30 minutes of mindless or repetitive work, then it becomes highly interesting!

This article was indeed very very technical... and I believe that to the non-SEO, their eyes may glaze over. But I wanted to share my entire process and all of the nitty gritty details in case anyone wants to replicate it. So I even included the technical details too. Cheers. :)

This is one of the deepest articles on "Keyword Optimization," and the knowledge and information you showed is really phenomenal! Just one thing I would like to share: sometimes "GA site search" also helps you find the right keyword to target the right content to the right audience. What do you think about this?

Thanks, Ankit! I'm glad you really enjoyed it. I would replace "GA research" with "Google Search Console research" -- as that is where all of the keyword data is. GA hides most of the keyword data, whereas Search Console gives it to you. Hope that helps!

Oh I love this, Tuhin! If you have any screenshots or examples I'd love to see one of your dendrograms. I do agree that clusters can and certainly should be hierarchical with sub-clusters underneath them. Great point, looking forward to hearing more on this. Thanks.

Fantastic article on what really is the next stage in the "keyword research" area. It's simply not enough to target a single term, given Google's intelligence in determining meaning from a search beyond just whether the keywords and their order match up.

For a Guide, this seems short. I expected it to be long. But anyway, this was a perfect utilization of my time because after investing mere minutes, I was able to understand a lot more about Keyword Clustering and a simpler way of how to do it (step-by-step).

I'm actually learning SEO and I have realized that I've been doing it the wrong way. Keywords are one example. SEO looks like a lot, but I'm learning, and keyword clustering is just one more chapter in the newbie's knowledge book. But I always have issues with integrating a KW without messing my article up.

Hi Onome, yes any one sub-field of SEO can get more and more in-depth. Especially technical SEO, as an example.

I'm not sure if there is a definitive "right" and "wrong" way to do SEO. I do think that there are some approaches that can be more beneficial than others though. Adding in keywords must be done naturally and good for the reader and user experience instead of just for search engines. Best of luck!

It is so challenging for me to pick 1 or 2 points and say wow these are valuable as this entire article is a treasure of innumerable key elements which all SEOs should leverage on when doing Keyword researches for their clients!!

I see so many SEOs following an inaccurate Keyword Research Strategy for their clients. However, now when I see anyone do that, I will simply send them a link to this article to help them enhance their Keyword Research process and build a stronger strategy for their clients!!

Thanks so much, Namrata. Your feedback means the world to me. :) I agree... I thought it was strange that nobody is talking about this... And there isn't much information or details about this anywhere on the web. I had trouble finding this myself and had to create my own process and methodology around this. There are just so many opportunities and content planning and keyword mapping is more multi-dimensional than just target this one page for this one keyword. Thanks again for your enthusiasm and sharing this.

This method is so useful and on top of that the creator has no need to get stressed if s/he missed out on any search term or keyword. It was like collecting keywords all over the world and then picking up which suits our necessity. Such a great post.

Interesting article, Tom! Whenever I write a post for my blog, I focus on the keyword that potential customers would most logically search for, in order to capture the largest possible number of customers. Throughout the post I also put more text in h2 headings, so that it stands out more for SEO. I do what you say in the post, such as brainstorming, synonyms, etc.

Excellent post outlining how we can target multiple words in a single detailed post . The days when many webmasters used to create one post targeting a single variation of the keyword or keyphrase are long gone.

Now it is advantageous to write a detailed post covering not only the targeted words but also lots of closely related words. I have observed that increasingly a single web page is on the first page of search for lots of closely related search queries.

For example you can cover Pasta recipe, how to make pasta and Pasta with tomato sauce in a single post/ page and also add other related words.

And to your point, in the past SEOs would create a different page for each individual keyword, but yes, one page can target and rank for an unlimited number of related keywords. There are synonyms, phrase variations, long-tail keywords, and so forth. The question becomes -- when to break apart one keyword cluster into 2 or more separate pieces of content or pages to map to. Those are decisions that have to be dealt with on a case-by-case basis depending on user intent and so forth.

Haha, I know! That is one thing that still gets me about Yoast -- But I understand that for the average Wordpress content writer, perhaps even having 1 keyword to focus on is a big step in the right direction. But for people like us, that's not really helpful. Hehe, glad to hear your thoughts, thanks for sharing!

thanks a lot for your article! For Part 1 I would also add the search terms from Google AW as another source if the site is already existing and Google AW is being applied.

I believe I will use quite a lot of your input for future keyword work... Keyword grouping definitely seems underestimated to me. Thanks also for the idea with the "Write Words Tool". I did not know that one before.

Thank you for sharing this outstanding method for Keyword research for SEO and Content Marketing. I was a bit confused on the "right power to raise to". Seems like there should be a way to approach that determination. This would make a great Webinar to expand and walk people through the process.

Hey Mike! Thanks for the kind words. I'd love to do a webinar on this some time. :)

The reason that I use that formula and raise that variable to a power is to give more weighting to the idea that -- if there is a 4 word exact match phrase that is recurring a LOT, then I want to give more weighting to that when analyzing the terms/phrases of keywords, than a one-word keyword that recurs the same amount. You can play with the variables in a formula to analyze this depending on how you want to look at it... So I'm not sure if I should give the weighting to a 2-word or 3-word phrase a weighting of 2x or 3x or 4x, so I just play with the formula depending on the project, as it does vary from project to project.

You're very welcome Zainab! I really appreciate your compliments. Yes, it is amazing to me how invaluable these opportunities are -- yet I never hear anyone talking about them. I think this is certainly a great opportunity that many aren't thinking about. :)

A year ago I did some research to find some keyword clustering tools but none of them came 20% of the way to where this process is. I actually had designed a tool/app and still have the UX mockups for the tool to do all of this more seamlessly. It can handle word stems, and more importantly, search volume for terms and automatically do a lot of this for you. Since I felt that not enough SEOs currently follow and use this process, I thought it'd be very useful to me and a few others, but not enough to warrant the development, project management, and investment to build it.

In the meantime, creating scripts/macros in Google Sheets would be very helpful but we still do it all manually, unfortunately. :( Maybe someone will come along and collaborate with us at Sure Oak. :)

An awesome article. I just want to add a point: removing the preposition “to” can be an issue if we manage keywords for travel-related businesses, because “travel to US” is obviously completely different from “travel from US.” So I would be more careful with the prepositions “to” and “from.”

Ah yes, that is a very great point! There are other instances, for example, the word "it" is actually "IT" (information technology) for some niches, so you couldn't remove that one either.

But actually -- what you're referring to is phrases as well. So you can even analyze the occurrence of phrases like "to US" and "from US" and also "to the US" and "from the US", and then "from the United States"... and "from America", etc. Great point, thanks for the input! :)

Something I noticed around 10 years ago is that as links are built for a page, the number of phrases and keywords that page performs for increases. So I would recommend that content is built in conjunction with links, so that as the potential of a page is recognised, the most is extracted from it through targeted link building.

Hey Tom, really cool stuff! Just going through it now and thought I would add a little formula hack to remove the need of that "hardcoding" phase!

Instead of using "=IF(RegExMatch(A2:A,"Hotword"),"YES","NO")"

You could use;

=Arrayformula(IF(RegExMatch(A2:A,"hotword"),"YES",""))

This would remove the need to do that step!

Nice one man!

Using this for a little bit of a different process at the moment.

Basically getting all insights from the Google Ads search query report that have converted, grouping them using your method above, cross matching with ranking data and then devising a content upgrade strategy to hopefully increase traffic and leads for a client!