Another January Mozscape Index Has Been Released!

The author's views are entirely his or her own (excluding the unlikely event of hypnosis) and may not always reflect the views of Moz.

Just 13 days ago, on January 11th, we released the first Mozscape index for 2013. And today, we're launching the latest January Mozscape index. That makes two indexes in one month again! Mozscape data has been refreshed across all our applications, so you can see the latest data in Open Site Explorer, the Mozbar, PRO campaigns, and the Mozscape API.

This index finished up in record time, running smoothly on the high-power cluster compute machines in AWS. Our Mozscape processing team (Doug, Martin, Brandon, and Stephen) has spent the past few months really cleaning up and optimizing the software that produces these indexes. Changes are slow going with this software - big data is big and changes are big! There is a lot of testing and optimizing that must be done before changes even make it into the production index, but these guys are dedicated to getting you a fresh index twice a month!

We're eagerly waiting for our first index to be released from our new colocation in Virginia - hopefully in the month of February. With some new configurations and master network tuning from our Tech Ops team, we currently have an index churning away, so far with promising performance!

Here are the metrics for this latest index:

70,278,347,012 (70 billion) URLs

1,516,212,211 (1.5 billion) Subdomains

145,518,352 (145 million) Root Domains

783,206,227,396 (783 billion) Links

Followed vs. Nofollowed

2.24% of all links found were nofollowed

56.43% of nofollowed links are internal

43.57% are external

Rel Canonical - 15.11% of all pages now employ a rel=canonical tag

The average page has 78 links on it

66.68 internal links on average

11.07 external links on average
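For a sense of scale, the percentages above can be turned into rough absolute counts. This is only a back-of-the-envelope sketch in Python: the inputs are the rounded figures quoted in this post, the function name is made up for illustration, and the results are approximations rather than exact index counts.

```python
# Figures copied from the index summary above (percentages are rounded).
TOTAL_LINKS = 783_206_227_396
NOFOLLOW_SHARE = 0.0224            # 2.24% of all links are nofollowed
NOFOLLOW_INTERNAL_SHARE = 0.5643   # 56.43% of nofollowed links are internal
AVG_INTERNAL_LINKS = 66.68
AVG_EXTERNAL_LINKS = 11.07


def nofollow_breakdown(total_links, nofollow_share, internal_share):
    """Return approximate (nofollowed, internal, external) link counts."""
    nofollowed = total_links * nofollow_share
    internal = nofollowed * internal_share
    external = nofollowed - internal
    return nofollowed, internal, external


nofollowed, internal, external = nofollow_breakdown(
    TOTAL_LINKS, NOFOLLOW_SHARE, NOFOLLOW_INTERNAL_SHARE
)
avg_links_per_page = AVG_INTERNAL_LINKS + AVG_EXTERNAL_LINKS

print(f"Nofollowed links:    ~{nofollowed / 1e9:.1f} billion")
print(f"  internal:          ~{internal / 1e9:.1f} billion")
print(f"  external:          ~{external / 1e9:.1f} billion")
print(f"Avg. links per page: {avg_links_per_page:.2f}")
```

The internal and external averages add up to 77.75, which is where the rounded "78 links" figure comes from.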

And the following correlations with Google's US search results:

Page Authority - 0.36

Domain Authority - 0.19

MozRank - 0.24

Linking Root Domains - 0.30

Total Links - 0.25

External Links - 0.29
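This post doesn't spell out how these correlations are computed. A common choice for comparing a metric against an ordered list of search results is Spearman's rank correlation, so here is a minimal Python sketch under that assumption. The function names and the sample numbers are invented for illustration and are not taken from the index.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank lists."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: positions 1-5 in one search result and a made-up
# authority score for the page at each position.
positions = [1, 2, 3, 4, 5]
authority = [62, 48, 55, 40, 31]

# Negate positions so that "closer to the top" counts as larger.
rho = spearman([-p for p in positions], authority)
print(f"Spearman correlation for this one result: {rho:.2f}")
```

A single headline number like the ones above would then presumably be an average of this kind of value over many search results, but that aggregation step is also an assumption. Note that the sketch divides by zero if either input is constant.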

Since this index was kicked off January 14th, the latest crawl data is really fresh! There are just over 30 days of crawl data in this index, the majority crawled in January, with some crawl data as old as mid-December. There was a significant increase in the number of subdomains crawled for this index compared to our previous index. Further investigation revealed a fairly small increase in root domains that had a substantial number of new subdomains associated with them. Because these subdomains have such low authority, the increase won't have any impact on our metrics, but it does significantly increase the number of subdomains in this index.

You guys could do a really cool post with these statistics. Personally, I'd love to see:

On average, how many external links are within the body tags on a page.

How many of those links are nofollowed vs dofollowed.

What the average percentage of external links to a site are nofollowed vs dofollowed across the web.

Understanding what percentage of sites fall within certain DA buckets. For example, I'm sure 90% of everything you guys crawl has a DA of under 30, while maybe .0001% of the internet is a DA 95 and above.

I think I saw something like that with PR a long, long time ago. I could probably think of more, but it's Friday at 7:00 in Utah. Have a good weekend everybody!

Yeah, that is a great idea! We're working right now to compile stats like that into an internal dashboard, but I'm sure when the project is complete we could export some of that data into some cool charts. Thanks for the idea!! Carin

Your software is really great! Especially because of you guys, Doug, Martin, Brandon, and Stephen, who are doing a great job cleaning up and optimizing the software. For me as an SEO master, the help that you guys give me is excellent. Thanks again!

Thanks so much to the team for their hard work on managing these updates, as it's clearly a major task. I'm personally finding it very helpful and exciting to have the new index updates twice a month, so thank you.