Search results

Common Crawl - Blog - blekko donates search data to Common Crawl

December 17, 2012. We are very excited to announce that blekko is donating search data to Common Crawl!

Common Crawl - Team - Rich Skrenta

He was founder and CEO of Blekko, a web search engine; the Open Directory Project, an innovative community-edited search platform; Topix, a news aggregator combined with a social forum; and Tobiko, a restaurant recommendation platform.

Common Crawl - Team - Jen English

She has made significant contributions during her tenure at Blekko, focusing on Content and Search Quality. She also served as an Engineering Manager at IBM Watson.

Common Crawl - Team - Greg Lindahl

He was previously the Founder and CTO of Blekko, an internet search engine. Before joining Common Crawl full-time in 2023, Greg was a member of the Event Horizon Telescope Collaboration, working at the Center for Astrophysics - Harvard & Smithsonian.

Common Crawl - Blog - September 2014 Crawl Archive Available

Thanks again to blekko for their ongoing donation of URLs for our crawl!

Common Crawl - Blog - October 2014 Crawl Archive Available

Thanks again to blekko for their ongoing donation of URLs for our crawl!

Common Crawl - Team - Stephen Burns

Stephen Burns is an accomplished marketing leader with a comprehensive background in digital and event marketing, honed through significant roles at companies such as OpenSpace, Accela, Charles Schwab, Singularity University, and Blekko.

Common Crawl - Blog - July 2014 Crawl Data Available

Thanks again to blekko for their ongoing donation of URLs for our crawl! Note: the original estimate for this crawl was 4 billion, but after full analytics were run, this estimate was revised.

Common Crawl - Blog - August 2014 Crawl Data Available

Thanks again to blekko for their ongoing donation of URLs for our crawl!

Common Crawl - Team - Mike Markson

Later, at Blekko, he helped develop a web search engine focused on curating high-quality content.

Common Crawl - Blog - March 2014 Crawl Data Now Available

Blekko for their ongoing donation of URLs for our crawl.

Common Crawl - Blog - Winter 2013 Crawl Data Now Available

The new crawling method relies heavily on the generous data donations from blekko and we are extremely grateful for ongoing support! In 2014 we plan to crawl much more frequently and publish fresh datasets at least once a month.

Common Crawl - Blog - November 2014 Crawl Archive Available

Thanks again to blekko for their ongoing donation of URLs for our crawl!

Common Crawl - Blog - December 2014 Crawl Archive Available

Thanks again to blekko for their ongoing donation of URLs for our crawl!

Common Crawl - Blog - April 2014 Crawl Data Available

Thanks again to blekko for their ongoing donation of URLs for our crawl!

Common Crawl - Blog - January 2015 Crawl Archive Available

Thanks again to blekko for their ongoing donation of URLs for our crawl! Please donate to Common Crawl if you appreciate our free datasets! We're seeking corporate sponsors to partner with Common Crawl for our non-profit work in big open data!

Common Crawl - Blog - June 2018 Crawl Archive Now Available

This huge "collection of bookmarks" dates back multiple years, even back to 2012 when we first received seed donations from Blekko. This month we started to remove old "bookmarks" from our URL database.

Common Crawl - Blog - Learn Hadoop and get a paper published

Cluster and visualize their networks of links (You could use Blekko's /conservative /liberal tag lists as a starting point). So, again -- if you think this might be fun, leave a comment now to mark your interest.