Processing Large Data with Hadoop and EC2

This talk will cover processing large quantities of data using Hadoop running on top of Amazon’s EC2 machines. It will cover the theory of the MapReduce/Hadoop model and its applicability to different kinds of problems. Gottfrid will provide a brief overview of AWS EC2 and S3 and look in detail at some of the work he has done using these pieces; some of it is described in the "Self Service Prorated Super Computing Fun" blog post.

Derek Gottfrid

The New York Times

Derek Gottfrid is a Senior Software Architect at The New York Times. He has been involved in building many key parts of the nytimes.com infrastructure, including search, web serving, e-mail distribution, and platform development. Derek has led efforts to improve the use of open source software within the Times and is responsible for the open source project dbslayer — a database connection pooling server. He also blogs regularly about his open source work at open.nytimes.com.