June 10, 2026

Common Crawl Foundation at IIPC-WAC 2026

Common Crawl was well represented with contributions at the 2026 IIPC Web Archiving Conference and General Assembly.

Members of the Common Crawl Foundation team (Laurie Burchell, Sebastian Nagel, and Pedro Ortiz Suarez) attended the 2026 IIPC Web Archiving Conference (WAC) and General Assembly (GA) held at KBR, the Royal Library of Belgium in Brussels.

Royal Library of Belgium, picture by EmDee CC BY-SA 4.0

This year's conference themes were Access & Research Use, Tools & Infrastructure, Collection Development, Legal & Ethical Issues, Policies & Standards, and Environmental Impact. In keeping with that last theme, participants were asked to take the stairs instead of the elevators.

Common Crawl Contributions

Common Crawl was well represented across the event, with both presentations and references in the work of other participants.

Sebastian Nagel, Laurie Burchell, and Pedro Ortiz Suarez.

Laurie presented work on improved language identification for crawl data. Pedro presented statistics on the data-access methods used to download Common Crawl data and showcased cc-downloader adoption over the last year. At the General Assembly, Sebastian presented research on web crawling policies and opt-outs done in collaboration with CCF. Presentations are expected to be posted to the IIPC YouTube channel in the near future.

Sebastian Nagel presenting CCBot

Common Crawl’s data products were often referenced, both during talks and behind the scenes. Notable mentions include the "Responsible Strategies" session by Abbie Grotke, "End of Term Web Archive: Harmonizing WARC contributions from multiple crawling partner" presented by Mark Phillips, and "Crawl, cloud, carbon: measuring and reducing emissions for web archivists" by Simon Ponsford.

We look forward to more discussions with our friends (new and old) from the IIPC in the near future.

Laurie Burchell presenting CommonLID at Howest
Closing panel: Web Archiving for Accountability, shown from left to right: Emily Tripp, Marvin Milatz, Friedhelm Weinberg, Basile Simon

Erratum: Content is truncated

Some archived content is truncated due to fetch size limits imposed during crawling. This is necessary to handle infinite or exceptionally large data streams (e.g., radio streams). Prior to March 2025 (CC-MAIN-2025-13), the truncation threshold was 1 MiB. From the March 2025 crawl onwards, this limit has been increased to 5 MiB.
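
As a rough illustration of the thresholds described above, the following Python sketch maps a crawl identifier to the truncation limit that would have applied. The function name `truncation_limit_bytes` is hypothetical and not part of any Common Crawl tool, and the sketch assumes identifiers of the form `CC-MAIN-YYYY-WW`; anything else is rejected rather than guessed at.

```python
import re

MIB = 1024 * 1024

# Assumed identifier shape: CC-MAIN-<4-digit year>-<2-digit week>.
_CRAWL_ID = re.compile(r"^CC-MAIN-(\d{4})-(\d{2})$")


def truncation_limit_bytes(crawl_id: str) -> int:
    """Return the fetch size limit in bytes for a crawl (hypothetical helper).

    Thresholds follow the erratum text: 1 MiB before CC-MAIN-2025-13,
    5 MiB from CC-MAIN-2025-13 onwards.
    """
    match = _CRAWL_ID.match(crawl_id)
    if match is None:
        raise ValueError(f"unrecognised crawl identifier: {crawl_id!r}")
    year, week = int(match.group(1)), int(match.group(2))
    # Tuple comparison orders crawls by year first, then by week.
    return 5 * MIB if (year, week) >= (2025, 13) else 1 * MIB


if __name__ == "__main__":
    for cid in ("CC-MAIN-2024-51", "CC-MAIN-2025-13"):
        print(cid, truncation_limit_bytes(cid))
```

A payload whose length reaches this limit may have been cut off during fetching, so downstream consumers may want to treat such records with care.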