As a sign of many more good things to come in 2012, Founder Gil Elbaz and Board Member Nova Spivack appeared on this week’s episode of This Week in Startups. Nova and Gil, in discussion with host Jason Calacanis, explore in depth what Common Crawl is all about and how it fits into the larger picture of online search and indexing. Underlying their conversation is an exploration of how Common Crawl’s open crawl of the web is a powerful asset for educators, researchers, and entrepreneurs.
Some of my favorite moments from the show include:
- In a great sound bite from Jason at the beginning of the show, he observes that Common Crawl is in many ways the “Wikipedia of the search engine.” (8:50)
- When asked whether Common Crawl might eventually charge a fee for our data and tools, Nova responds that Common Crawl is “better if it’s free… [We] want this to be like the public library system,” which captures the spirit of Common Crawl’s mission and our commitment to the open web. (32:00)
- When asked about projects and applications that would benefit from Common Crawl, Gil makes a compelling case for organizations that can use Common Crawl as a teaching tool. If someone wants to teach Hadoop at scale, for example, it’s essential for them to have a realistic corpus to work with, and Common Crawl can provide that. (46:18)
Those are just a few of the highlights, but I highly recommend watching the episode in its entirety for even more insights from Gil and Nova as we gear up for big things ahead for Common Crawl!