We’re very excited to announce the winners of the First Ever Common Crawl Code Contest! We were thrilled by the response to the contest and the many great entries. Several people let us know that they were not able to complete their project in time to submit to the contest. We’re currently working with them to finish the projects outside of the contest and we’ll be showcasing some of those projects in the near future! All entries creatively showcased the immense potential of the Common Crawl data.
Huge thanks to everyone who participated, and to our prize and swag sponsors! A special thank you goes out to our panel of incredible judges, who had a very difficult time choosing the winner from among such great entries. All of the entries were in the Social Impact category, so the third grand prize (originally intended for the Job Trends category) goes to the runner-up in Peoples’ Choice. And the grand prize winners are…
People’s Choice: Linking Entities to Wikipedia
It’s not surprising that this entry was so popular in the community! It seeks to determine the meaning of a word based on the probability that it can be mapped to each of the possible Wikipedia concepts. It creates very useful building blocks for a large range of uses and it’s also exhilarating to see how it can be tweaked and tuned towards specific questions.
Project description
Code on GitHub
People’s Choice: French Open Data
Another very popular entry, this work maps the ecosphere of French open data in order to identify the players, their importance, and their relationship. The resulting graph grants insight into the world of French open data and the excellent code could easily be adapted to explore terms other than “Open Data” and/or could create subsets based on language.
Project description
Code on GitHub
Social Impact : Online Sentiment Towards Congressional Bills
This entry correlated Common Crawl data and congressional data to look at the online conversation surrounding individual pieces of legislation. Contest judge Pete Warden’s comments about this work do a great job of summing up all the excitement about this project:
“There are millions of people talking online about laws that may change their lives, but their voices are usually lost in the noise of the crowd. This work can highlight how ordinary web users think about bills in congress, and give their views influence on decision-making. By moving beyond polls into detailed analysis of people’s opinions on new laws, it shows how open data can ‘democratize’ democracy itself!”
Honorable Mentions
The four following projects – listed in alphabetical order – all came very close to winning.
Facebook Infection
Project description
Is Money the Root of All Evil?
Project description
Code on GitHub
Reverse Link!
Web app
Code on GitHub
Web Data Commons
Project description
Code on Assembla
We encourage you to check out the code created in the contest and see how you can use it to extract insight from the Common Crawl data! To learn how to get started, check out our wiki, which also includes some inspiration and ideas to get your creative juices flowing.
Thank you to all our wonderful sponsors!
Prize Sponsors:
Swag Sponsors: