November 13, 2013

Hyperlink Graph from Web Data Commons

Note: this post has been marked as obsolete.
Common Crawl Foundation
Common Crawl builds and maintains an open repository of web crawl data that can be accessed and analyzed by anyone.

The talented team at Web Data Commons recently extracted and analyzed the hyperlink graph within the Common Crawl 2012 corpus.

Altogether, they found 128 billion hyperlinks connecting 3.5 billion pages.

They published the resulting graph today, together with some results from their analysis of it.

http://webdatacommons.org/hyperlinkgraph/
http://webdatacommons.org/hyperlinkgraph/topology.html

To the best of our knowledge, this is the largest hyperlink graph available to the public!
