Ville Tuulos is a researcher with Nokia Research in Palo Alto. He has been working with large data sets since 1999, building solutions for statistical information retrieval. After several misguided attempts to orchestrate highly distributed systems in C and Python, he found Erlang in 2006. He is also co-author of the book "Mobile Python - Rapid application development on the mobile platform". In 2007 he started to build Disco, an Erlang / Python implementation of the Map/Reduce framework for distributed computing. Disco is now used by Nokia and others for quick prototyping of data-intensive software, using hundreds of gigabytes of real-world data.

Disco is an open-source implementation of the Map/Reduce framework. The Disco core is written in Erlang, which makes the most critical components of the system, namely job scheduling, distribution and fault detection, remarkably compact and robust. Users can specify jobs in Python, with all batteries included, without having to worry about any issues related to parallelism. As a result, Disco makes it fun to process massive amounts of data with hundreds of CPUs in parallel. This talk will introduce Map/Reduce programming with Disco and show how to perform simple and some more complicated data-processing tasks with a minimal amount of code. For Erlang hackers, and for future Disco committers, the talk will include a walk-through of the central components of Disco, showing how Erlang can be used to handle large amounts of data efficiently.
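
To give a flavour of the programming model the talk introduces, here is a minimal word-count sketch in plain Python. It does not use the Disco API: the names map_words, reduce_counts and run_job are hypothetical, chosen for illustration only, and everything runs in a single process. In a real Map/Reduce system, the framework would distribute the map calls across nodes and handle scheduling and failures.

```python
from collections import defaultdict
from itertools import chain


def map_words(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_counts(pairs):
    """Reduce phase: sum the values emitted for each distinct key."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def run_job(lines):
    """Hypothetical single-process driver (an assumption, not Disco's API).

    A real framework would run map_words on many nodes in parallel and
    group the intermediate pairs by key before reducing.
    """
    mapped = chain.from_iterable(map_words(line) for line in lines)
    return reduce_counts(mapped)


lines = ["disco makes data fun", "erlang makes disco robust"]
result = run_job(lines)
print(result)
```

The point of the model is that the user writes only the two small functions; parallelism stays out of the job definition.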