Common Crawl was invited to the AI Plumbers Unconference held at FOSDEM this year. The contrast between this fringe event and the main conference couldn’t be bigger: cozy talk sessions and small breakout rooms hosting about 100 people, compared to the 10,000 open source developers who visit the Université libre de Bruxelles (ULB) in Brussels over the two days of the event.
With only three rooms in 2006, 28 in 2016, and 37 lecture halls now in 2026, all densely packed with both attendees and scheduled talks, the growth of this conference cannot be overstated.

Admission is free-as-in-beer, with the event mostly financed through the sale of t-shirts and not-so-free beer. With the enormous buzz and the sheer number of people, the feeling that you are visiting a festival rather than a conference is palpable.

AI Plumbers Fringe Event
As mentioned, the contrast with the unconference directly following FOSDEM couldn’t be greater. It was held at a cozy meeting space hidden in the center of Brussels, where we were welcomed into the meeting room through a fairytale-like staircase leading up to a small lecture hall dotted with cushions and sitting nooks. It was a wondrous environment that proved extremely suitable for the free exchange of ideas.

A few attendees came up to us: “Oh, you work for Common Crawl? Thanks so much for the data!” As was the case at FOSDEM, language preservation and the archival of digital data before it succumbs to bit rot, or is drowned out by generated content, were hot topics.
We had a lovely chat with Ron Evans and William Kennedy, who were there to demonstrate the deployment and workings of Kronk and Yzma, a model server and llama.cpp wrapper built entirely in Go. Ron is pictured below introducing us to his very cute Gopherbot.

Erratum:
Content is truncated
Some archived content is truncated due to fetch size limits imposed during crawling. This is necessary to handle infinite or exceptionally large data streams (e.g., radio streams). Prior to March 2025 (CC-MAIN-2025-13), the truncation threshold was 1 MiB. From the March 2025 crawl onwards, this limit has been increased to 5 MiB.
For more details, see our truncation analysis notebook.
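
As a rough illustration of the thresholds described above, the following Python sketch maps a crawl ID to its fetch size limit and flags payloads that reach it. The function names are hypothetical and not part of any Common Crawl tooling, and comparing payload length against the limit is only a heuristic assumed here for illustration.

```python
import re

MIB = 1024 * 1024

# First crawl with the raised limit, per the erratum: CC-MAIN-2025-13.
RAISED_LIMIT_FROM = (2025, 13)

# Crawl IDs of the form CC-MAIN-YYYY-WW (four-digit year, two-digit week).
CRAWL_ID_PATTERN = re.compile(r"^CC-MAIN-(\d{4})-(\d{2})$")


def truncation_limit_bytes(crawl_id: str) -> int:
    """Return the fetch size limit in bytes for a crawl (hypothetical helper)."""
    match = CRAWL_ID_PATTERN.match(crawl_id)
    if match is None:
        raise ValueError(f"unrecognized crawl ID: {crawl_id!r}")
    year, week = int(match.group(1)), int(match.group(2))
    # Tuples compare element by element, so this orders crawls by year, then week.
    return 5 * MIB if (year, week) >= RAISED_LIMIT_FROM else 1 * MIB


def may_be_truncated(payload_length: int, crawl_id: str) -> bool:
    """Heuristic (an assumption): a payload at or above the limit was likely cut off."""
    return payload_length >= truncation_limit_bytes(crawl_id)


if __name__ == "__main__":
    for crawl in ("CC-MAIN-2024-51", "CC-MAIN-2025-13"):
        print(crawl, truncation_limit_bytes(crawl) // MIB, "MiB")
```

Where a record carries its own WARC-Truncated header, that is the more direct signal; the length check above is only a fallback for cases where headers are not at hand.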

