Search results

Common Crawl - Blog - Introducing CloudFront as a new way to access Common Crawl data as part of Amazon Web Services’ registry of open data

Ten years ago(!) Common Crawl joined AWS’s Open Data Sponsorships program, hosted on S3, with free access to everyone.

Common Crawl - Blog - Oct/Nov 2023 Performance Issues

These graphs are a bit hard to understand, so here are some additional details: requests made over HTTPS are filtered by Amazon CloudFront first; those that pass the filter are then sent to S3 and also appear in the S3 graphs.

Common Crawl - Blog - March/April 2024 Newsletter

Figures: CloudFront Performance this Week; S3 Performance this Week. We see that 503 Slow Down responses have been reduced dramatically, meaning egress of our data is now happening much more smoothly than it was around November 2023.