Articles

  • CommonCrawl Tutorial
    University of the Pacific - ECPE 293A Course

    This new tutorial uses a command-line interface to Elastic MapReduce (link 1, link 2) that is compatible with IAM accounts. While it’s a little more work to set up, it’s a better method for dealing with larger, more complicated projects.

  • Sample Wordcount Streaming Job Using PHP on CommonCrawl Dataset
    Fights with Bytes

    A blog tutorial about running PHP on Elastic MapReduce to analyze Common Crawl data.

Slide Presentations

Videos