How Content Influences Rankings – What We Found Out After Analyzing 3.7 Million Pages

You’ve heard it a thousand times by now: write high-quality content and the rankings will follow. Or: content is king! But what does that even mean?

To shed some light, once and for all, on whether and how content influences rankings, we’ve performed an in-depth analysis of almost 40,000 keywords and around 4 million pieces of content. And the results are GOLD.

Content marketing, among other SEO strategies, is really a hot topic these days for any industry. But I bet you already knew that. What we think you might not know is the exact correlation between SEO and content.

Do you know how the content of your website actually impacts your search engine rankings? Or do you know the exact key elements your content needs to have in order to rank in Google’s top results?

We have good news and bad news. The good news is that after analyzing a whopping set of data, we’ve got it all sorted out: content’s correlation to Google rankings, metrics to measure content’s performance, key elements that boost content, plus plenty of other data that we’re sure is going to impact your website big time.

The bad news? Well, the study is quite large and to get the best out of it you will need to thoroughly read this article. And yes, it’s quite long but, for the sake of your business, totally worth it.

1. A High Content Performance Will (Almost) Guarantee You a Top Google Position

Do you know that saying: Save the best for last? This time we decided not to take it into account.

Admittedly, we are too excited about the findings so we couldn’t help but share the big news with you:

Content does influence rankings, and we also figured out why and how.

If you torture data long enough, it will confess.

As you can see in the screenshot below, after compiling and analyzing all the ranking content for the top positions in Google, for around 40,000 keywords, we realized that there is a very strong correlation between content performance and rankings (and therefore organic traffic we might say).

The higher the content performance score, the better the rankings.

The content performance metric shows you how well a page is optimized from a content point of view, on a scale from 0 to 100. We needed a way to scale content when correlating it with rankings and content performance was great for this. You can find out more about Content Performance here or later in this article.

At an informal level, we always knew that content strategies have a lot to do with rankings and quality content creation improves your rankings. Google itself admitted it and they even created an update dedicated to this issue and the importance of relevant content.

When you actually see real numbers that show to what extent content performance influences rankings, it’s totally different.

So let’s do a rundown of these interesting observations.

The first thing to notice when looking at this chart is that good content performance is within average values – almost within the middle two-thirds of a 0 to 100 scale. There are no extremely high or extremely low content performance scores. Not only that but along the represented interval, points are quite evenly distributed – there are no quick jumps or sudden falls, everything follows the same, smooth line.

This is another important observation: the trend line is a lot smoother (almost perfect, from corner to corner). There are only 4 instances in this chart where points don’t follow the “high content performance equals high position” rule and those are very subtle, almost unnoticeable. If the vertical axis went from 0 to 100 instead of zooming in at the current interval, we’d probably not notice those points at all.

Last but not least, there is another visual cue to be found in the graph: a sort of “cut-off” point at a content performance score of 50, which almost evenly splits the first 10 positions from the following 10. While not particularly significant in itself, it is consistent with the other observations and strongly suggests that:

There is a very strong correlation between high content performance scores and high ranking positions.

2. Commercial vs. Brand Keywords: How Do They Correlate With Rankings

There are 9 million bicycles in Beijing and over 1 billion websites in the world. And all these billion websites are interested in targeting all sorts of keywords.

At a holistic level, we realized that the keywords we analyzed in this research fall into one of two categories: commercial keywords and brand keywords.

Brand keywords refer to the keywords strongly related to the name of the brand. For instance, “cognitiveSEO” is a brand keyword we are interested in ranking for.

Commercial keywords are the ones that have a direct profit-making intent. For instance, in our case “best SEO blog” would be a commercial keyword.

The question now remains: does the content targeted for commercial keywords correlate differently with rankings than the content targeted for brand keywords?

Commercial Keywords – Ranking Correlation

Just by taking a look at the chart below, it’s crystal clear that when it comes to content targeting commercial keywords, the correlation between content performance and top rankings is extremely strong.

This means that when it comes to commercial keywords, content has a very high impact on rankings.

It’s almost impossible to keep up with the link amount a big brand is getting. Yet, content performance is something that is in your full control and you can take advantage of it.

Brand Keywords – Ranking Correlation

We realized that the same strong correlation that applies to overall content performance and rankings applies to brand keywords in particular as well.

It seems that the better the content performance score for the brand keywords’ content, the better the rankings.

Yet, we couldn’t help noticing a particularity of this correlation. Yes, it’s a strong one (we ran several regression tests on it to make sure of that), but it’s slightly different. As you can see in the chart above, the first position has almost the same content performance score as the 5th one (#1 scores 46, #5 scores 45). And this pattern repeats, to some extent, at other positions as well.

And, as we live by the idea “if it works, why? If it doesn’t, why?”, we took a closer look at this situation.

Content matters a lot when it comes to brand keywords, yet, not as much as it matters when it comes to other types of keywords. And this is because we believe there are other external factors that boost a brand name to the top of the rankings.

Let’s take, for instance, “Brand Mentions”. It could be a commercial keyword, yet it is also a brand name: brandmentions.com.

Even if the content performance score for brandmentions.com isn’t as high as for other pages targeting the same keyword, the brand itself will be the winner and it will rank first.

And this is because Google is really good at semantic search and, first of all, understands the intent of the user. Not to mention that the brand itself is most likely the topical authority for the brand keyword and it will automatically be ranked among the first.

Yet, content performance seems to matter a lot in the case of brand keywords as well. So just because you are targeting your own brand name doesn’t mean that you don’t have to put effort into creating high-performing content.

3. How Do Domain and Page Performance Influence Rankings

The bitter-sweet truth is that content alone does not boost rankings. There are other ranking factors that interfere when it comes to SERP position. And among those factors there is the domain and the page’s performance.

We can refer to domain/page performance or authority. We used the term “performance” for this research as it combines several factors (domain or page age, number of incoming links, etc.).

Domain Performance vs. Rankings

As we can deduce from the screenshot below, overall there is a strong correlation between a domain’s performance and its ranking position.

Yet, this relation is not always smooth and linear. We can see that the domain performance for the pages ranking number 1 is almost the same as for the ones ranking number 4.

Therefore, even if domain performance seems to matter a lot when it comes to rankings, there are cases when other factors, such as content performance, might matter more than the domain’s overall performance.

Page Performance vs. Rankings

When it comes to Page performance, there is clearly a strong relation between the authority of a page and its rankings. The higher the page performance, the better the rankings.

If, after looking at this chart you say something like: I have zero chances of ranking without a high page performance, think again.

When you get to think about it, how does a page become highly authoritative? Among other things, by being a topical authority on a subject. And how can you become a go-to resource on a subject, the topical authority on that matter? By having relevant and optimized content on that particular subject. Therefore, without planning to make everything about content, we cannot help but notice how connected the pieces of the ranking puzzle are.

4. How Long Should Your Content Be to Get High Rankings

As we crawled so much data, we couldn’t help taking a peek at the number of words used by the webpages ranking in Google’s top 20.

The very first thing you should notice about any chart is the variation range. In this case, it looks like having anywhere between 1400 and 2000 words is associated with being in the top 20 rankings.

It seems that the optimal number of words a page should use in order to rank high is somewhere around 1700. Yet, this is a longer subject of debate, which we’ve luckily covered in a previous study: Long versus short content, which one ranks better?

The next thing you should do is look for trends and correlations. And here is where things usually get complicated. Even this chart, while seemingly straightforward, points to a few interesting observations. The first is that, in general, it is better to have more rather than fewer words.

There are 11 points on the graph that go in direct decreasing order of both number of words and position. They are not consecutive, so that makes the trend a bit less obvious, but it is there. Even where there are consecutive points that don’t follow this “rule,” the rule is reinstated within at most 2 points.

The second equally interesting observation is that this rule is visible only from the 3rd position onwards, while the first three positions actually display the opposite trend – the fewer the number of words, the higher the rank. But here comparison between points is again useful.

The number of words associated with the first position, while lower than the number of words associated with the second or third position, is still higher than that of 15 of the 20 positions (75%).

I know; it sounds a bit complicated. But this actually means that while there is not a perfectly linear correlation, the placement of points is still consistent with the idea of a correlation.

What do we get when we put all of this together? Two conclusions:

Having fewer words, rather than more, within a certain interval, is likely to land you a lower ranking.

Having between 1600 and 2000 words (all other factors being equal) has the highest chance of placing you in the top 10 Google positions.

Yet, remember: it’s not the size of the dog in the fight, it’s the size of the fight in the dog.

5. Defining a New Metric: Content Performance

As we like this metric just as much as a kid likes his newly built Lego toy car, allow us to explain a bit more about it.

Why? Because it’s worth it. Not to mention that content performance is the core of the present research.

We needed to correlate content with Google positions. But we cannot correlate “content” as a whole. We needed a metric that would measure the content influence. And here is how we ended up with Content Performance.

How Did We Come Up with the Idea of Content Performance Metric?

It all started from a cliche. Everybody says that content is king, you need to have good content to rank and I bet you can continue this row of self-evident truths.

But lately, it’s been getting harder and harder to reach Google’s top rankings, no need to insist on that. We figured out that the solution for getting high ranks lies in content.

But our question was:

How can we actually measure the impact of content on rankings? We also believe that content impacts rankings but how exactly? We needed some clear evidence.

Call me the doubting Thomas but I believe that almost everything that exists can be measured at some point. So, can we “measure” content? A difficult, but not impossible task.

Inspired by semantic search and term frequency-inverse document frequency, or TF-IDF (I know it sounds like a Wheel of Fortune word, but you can find more info on it here), we came up with a metric that can measure the impact any piece of content has on rankings: the Content Performance metric.

What Is Content Performance?

The Content Performance score builds on TF-IDF, a measure that reflects how important a word is to a document in a collection or corpus.

Putting it simply, the content performance metric shows you how well a page is optimized from a content point of view, on a scale from 0 to 100.

The higher the score, the better optimized the content is. And not only this: the same metric gives you info on the reasons why a piece of content is performing well or not.

The Content Performance metric is an indicator entirely developed by us, everything from soup to nuts. A lot of Google reverse engineering was involved in this, combining algorithms and concepts such as semantic search, LSI (Latent Semantic Indexing), TF*IDF or topical authority, just to mention a few.
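To give a feel for one of the building blocks mentioned above, here is a minimal Python sketch of the textbook TF*IDF weighting. It is not the Content Performance algorithm, which combines several concepts; the function name `tf_idf` and the three sample documents are invented for illustration, and splitting text on whitespace is a simplifying assumption.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document (textbook TF*IDF)."""
    # Simplifying assumption: lowercase the text and split on whitespace
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            # term frequency * inverse document frequency
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Invented sample corpus, for illustration only
docs = [
    "vegan recipes for quick vegan dinners",
    "quick dinners for busy families",
    "recipes for busy weeknights",
]
weights = tf_idf(docs)
for term, weight in sorted(weights[0].items(), key=lambda item: -item[1]):
    print(f"{term}: {weight:.3f}")
```

In this toy corpus, a word such as “for” that shows up in every document gets a weight of zero, while “vegan”, which is frequent in the first document and absent from the others, gets the highest weight in that document.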

How Does Content Performance Work?

We start by analyzing the top-ranking results from Google, having content as a focus point.

After that, we apply some really advanced algorithms in order to identify the semantics, topics and keywords used on those pages. We do not take into consideration the influence of HTML tags (h1/h2, etc.). We only look at how well written and relevant the content is, trying to identify the exact factors that boosted those pieces of content to the top of the rankings.

Based on an in-depth analysis, we give a content performance score for each piece of ranking content, highlighting the focus keywords used by that piece of content.
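The three steps above can be illustrated with a deliberately simplified Python sketch. To be clear, this is not how the Content Performance score is computed: the heuristic (count in how many top-ranking pages a term appears, then measure how many of those terms a draft covers), the function names `focus_terms` and `coverage_score`, the stop-word list and the sample pages are all assumptions made up for this example.

```python
from collections import Counter

# Tiny illustrative stop-word list (an assumption, far from complete)
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def tokenize(text):
    """Lowercase, split on whitespace and drop stop words."""
    return [word for word in text.lower().split() if word not in STOP_WORDS]


def focus_terms(top_pages, limit=5):
    """Terms found in the most top-ranking pages (hypothetical heuristic)."""
    page_counts = Counter()
    for page in top_pages:
        page_counts.update(set(tokenize(page)))
    # Most widespread terms first; ties broken alphabetically
    ranked = sorted(page_counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:limit]]


def coverage_score(draft, terms):
    """Share of focus terms present in the draft, scaled to 0-100."""
    if not terms:
        return 0, []
    draft_words = set(tokenize(draft))
    missing = [term for term in terms if term not in draft_words]
    score = round(100 * (len(terms) - len(missing)) / len(terms))
    return score, missing


# Invented stand-ins for the text of top-ranking pages
top_pages = [
    "easy vegan recipes with tofu and lentils",
    "vegan recipes for dinner with lentils",
    "healthy vegan dinner recipes with tofu",
]
terms = focus_terms(top_pages, limit=4)
score, missing = coverage_score("my favourite vegan recipes", terms)
print(terms, score, missing)
```

The list of missing terms is the interesting part: it hints at topics the top-ranking pages cover that the draft does not.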

6. How We Did the Research

The short answer for the “how we did the research” question would be: with tons of patience and a truck of coffee.

The longer answer would be: 3,784,369 pieces of content.

What we actually did was look at the content, domain and page performance, and many other data points coming from approximately 3.7 million pages ranking for 40k keywords. We took the first 100 search results for each keyword. We analyzed them thoroughly, squeezed them as much as we could and drew some great conclusions.

Even though we had the data for the first 10 Google pages, we decided to present the conclusions for the top 20 positions in Google because we know that these are the “most hunted” positions and are of great interest to you. The data collection took place in April 2017.
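For readers who want to build this kind of position chart from their own data, a plausible aggregation is sketched below in Python: keep only the top 20 positions and average a metric per position. The row layout `(keyword, position, score)`, the function name `mean_score_by_position` and the sample values are hypothetical, not taken from the study.

```python
from collections import defaultdict


def mean_score_by_position(rows, max_position=20):
    """Average score per ranking position, keeping positions 1..max_position."""
    totals = defaultdict(lambda: [0.0, 0])  # position -> [sum of scores, count]
    for _keyword, position, score in rows:
        if 1 <= position <= max_position:
            totals[position][0] += score
            totals[position][1] += 1
    return {pos: total / count for pos, (total, count) in sorted(totals.items())}


# Hypothetical rows: (keyword, position in the results, content score)
rows = [
    ("vegan recipes", 1, 52.0),
    ("vegan recipes", 2, 48.0),
    ("vegan recipes", 35, 20.0),  # outside the top 20, so ignored
    ("best seo blog", 1, 56.0),
    ("best seo blog", 2, 50.0),
]
averages = mean_score_by_position(rows)
print(averages)
```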

I have yet to see any problem, however complicated, which, when you looked at it the right way, did not become still more complicated.

Poul Anderson

American science fiction author

Inspired by Google Semantic Search, Latent Semantic Indexing and Latent Semantic Analysis (LSI and LSA), and applying lots of advanced algorithms and Pearson correlations, we squeezed everything we could from the data set we had in order to make sure we have the most accurate and reliable results possible.

We also ran some regression analyses to make sure the correlation was right, and the results more than confirmed our theory. It turned out there was a 97% positive relationship between the variables and that the regression model accounts for 94% of the variation in ranking. With p falling way below 0.01, it is clear that this finding is statistically significant and unlikely to be the result of random chance.
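For the curious, the two figures fit together: in a simple linear regression with a single predictor, the share of variation explained (R squared) is the square of the Pearson correlation, so a correlation of about 0.97 corresponds to roughly 94%. The Python sketch below shows the calculation on made-up numbers; the `positions` and `scores` values are invented for illustration and are not the study’s data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented example: ranking positions 1..10 and an average score per position
positions = list(range(1, 11))
scores = [55, 54, 52, 53, 50, 49, 47, 48, 45, 44]

# r comes out negative here only because position 1 is the best rank;
# the strength of the relationship is its absolute value.
r = pearson_r(positions, scores)
r_squared = r ** 2  # share of variation explained by a one-predictor linear fit
print(f"r = {r:.2f}, R^2 = {r_squared:.2f}")
```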

This, of course, doesn’t mean that there aren’t other factors that contribute to ranking variation, just that the variation in content performance correlates, to a very large degree, with the variation in rankings, and that it is consistent with all the other factors.

We hope we haven’t lost you somewhere among the geeky data. Yet, we believe that solid research cannot be done without a proper methodological approach.

And yes, we do know that correlation does not imply causality. But, as the creator of the XKCD web comic states:

Correlation does not imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing ‘look over there.’

And, if we were to draw a conclusion to all the data, what you need to keep in mind is that a high Content Performance score is correlated with top rankings in Google.

7. How Can You Improve Your Content Performance Score

If you’ve made it so far, allow me to reward you with a short joke:

Statistics play an important role in genetics. For instance, statistics prove that the number of offspring is an inherited trait. If your parents didn’t have any kids, odds are you won’t either.

Also, since you’re here, I’m guessing that you’ve browsed through all the charts and figures presented above and that you are not interested in our sense of humor only: you want to see how you can make the most out of this research.

If the content performance boosts rankings, then, the content performance score is what you need to work on.

Seeing the high correlation between the content performance score and the rankings, wouldn’t it be great if there were a tool that helped you calculate and improve your content performance score?

As we’ve made this great breakthrough with the content performance score, we knew we needed to do something about it. So we’ve begun to dig deeper and decided to create a tool that will offer something that no other automated tool can: the exact recommendation that any piece of content should follow in order to improve its content performance score and therefore its rankings.

It wasn’t an easy ride, I can tell you. Lots of hard work, advanced algorithms and months of continuous research. Yet, the results are tremendous, as we’ve managed to create the tool that we, and any SEO pro, webmaster or content marketer, need.

But I’d better show you how it works.

Let’s say that I own a food blog and I am interested in tackling the “vegan recipes” niche. I already used the Keyword Tool to get an idea of what people are searching for in Google when it comes to this subject. And now I am interested in “beating” my competition’s content performance score.

Just by taking a look at the Ranking Analysis section I can get a full idea of who my competitors are, where their rankings are and how they got there.

Having the content performance score for each of my competitors makes it easy for me to understand where my content performance score should be. More than that, I also get the exact list of focus words those webpages used. The focus keywords are the exact keywords that boost the content performance score.

I am all covered when it comes to the competition. I know who ranks where and why. But what about my content score?

The Content Assistant is a feature also created by us; a sort of personal assistant that will tell you the exact pieces you need to use to improve your content performance score. It’s a learning machine, based on real search results, that helps you to optimize your content and your overall marketing strategy.

It’s like trying to solve a really complicated puzzle and having someone come along and give you the exact instructions and steps you need to follow. Things get much easier, right? Just take a look at the screenshot below.

I’ve entered my own piece of text, which, it seems, needs some content improvements. My content performance score is only 19 and my goal is to beat the #1 score, which is 52. So how do I get that score? By following the recommendations, just like in the screenshot above.

There are some important keywords that I never thought of using. And if I use them, not only will my content performance score increase, but so will the overall quality of my article, as it will tackle some subjects that are of high interest. And hopefully my ranks as well.

It’s a 100% beneficial situation: my blog articles will get great content optimization tips (due to which my target audience will have a better user experience) and my SEO rankings will hopefully improve.

Hopefully this research, along with the tool, will make you re-think your content marketing programs and the way you view search engine rankings.

Cornelia is a proud Digital Marketer @ cognitiveSEO. When she is not documenting for the next amazing case study, she is probably somewhere trying out a new extreme sport such as Hang Gliding. Also, she's an avid traveler, extreme sports enthusiast, and aspiring drum singer.
