article

Access Common Crawl Data That is Stored on S3
Crawling billions of pages online is a large undertaking. Fortunately, the nonprofit Common Crawl does the heavy lifting for you…
interview

Machine Scale Analysis of Digital Collections: An Interview with Lisa Green of Common Crawl
How do we make digital collections available at scale for today’s scholars and researchers? Lisa Green, director of Common Crawl, tackled this and related questions in her keynote address at Digital Preservation 2013…
article

A Free Database of the Entire Web May Spawn the Next Google
A nonprofit called Common Crawl is now using its own Web crawler and making a giant copy of the Web that it makes accessible to anyone…