Strata 2012 Schedule

Below are the confirmed and scheduled talks at Strata 2012 (schedule subject to change).

This tutorial provides a solid foundation for those seeking to understand large scale data processing with MapReduce and Hadoop, plus its associated ecosystem. This session is intended for those who are new to Hadoop and are seeking to understand where Hadoop is appropriate and how it fits with existing systems. No programming experience is required.

1:30pm-5:00pm (3h 30m)
Data Science

The Two Most Important Algorithms in Predictive Modeling Today

Jeremy Howard (Enlitic), Mike Bowles (Biomatica)

Wouldn't it be great if there were just two algorithms that could handle most of your predictive modeling needs? It turns out that this is actually the case. Noted machine learning instructor Dr Mike Bowles and champion data miner Jeremy Howard will teach you everything you need to know to apply them successfully.

9:00am-12:30pm (3h 30m)
Visualization & Interface

Designing Data Visualizations Workshop

Noah Iliinsky (Amazon Web Services)

This workshop is a jumpstart lesson on how to get from a blank page and a pile of data to a useful data visualization. We'll focus on the design process, not specific tools. Bring your sample data and paper or a laptop; leave with new visualization ideas.

1:30pm-5:00pm (3h 30m)
Data Science

Developing applications for Apache Hadoop

Sarah Sproehnle (Cloudera, Inc.)

Learn how to use a Hadoop cluster for data analysis using Java MapReduce, Apache Hive and Apache Pig, and get an overview of using the HBase Hadoop database. Some programming experience is strongly recommended for this session.

9:00am-12:30pm (3h 30m)
Data Science

Large scale web mining

Ken Krugler (Scale Unlimited)

Want to extract and process Big Data from the web? This tutorial will show you how to use key open source technologies such as Hadoop, Cascading, Bixo, Tika, Mahout and Solr to create scalable, reliable web mining solutions.

Learn first hand from award-winning Guardian journalists how they mix data, journalism and visualization to break and tell compelling stories: all at newsroom speeds.

9:00am-12:30pm (3h 30m)
Data Science

Introduction to R for Data Mining

Joseph Rickert (Revolution Analytics)

This tutorial will enable anyone with some programming experience to begin analyzing data with the R programming language.

1:30pm-5:00pm (3h 30m)
Data Science

Building Applications with Apache Cassandra

Nate McCall (Apache Cassandra)

This presentation goes beyond the hype, buzzwords, and rehashed slides and actually presents the attendees with a hands-on, step-by-step tutorial on how to write a Java application on top of Apache Cassandra. It focuses on concepts such as idempotence, tunable consistency, and shared-nothing clusters to help attendees get started with Apache Cassandra quickly while avoiding common pitfalls.

9:00am-12:30pm (3h 30m)
Data Science

Hadoop Data Warehousing with Hive

Dean Wampler (Lightbend), Jason Rutherglen (Datastax)

This hands-on tutorial teaches you how to set up and use Hive, a high-level data warehouse tool for Hadoop. Hive provides a SQL-like query language, HiveQL, that is easy to learn for people with prior SQL experience, making Hive attractive for data warehousing teams. Hive leverages the power of Hadoop for working with massive data sets without requiring expertise in MapReduce programming.

1:30pm-5:00pm (3h 30m)
Data Science

Hands-on Visualization with Tableau

Jock Mackinlay (Tableau Software), Ross Perez (Tableau Software)

In this hands-on class, learn how to turn data into effective, interactive visualizations. You do not require a Tableau license to participate, but must bring a Windows laptop or virtual machine.

9:00am-12:30pm (3h 30m)
Sponsored Session

Big Data Without the Heavy Lifting

James Dixon (Pentaho), Chris Deptula (OpenBI)

The big data world is extremely chaotic, built on technology still in its infancy. Learn how to tame this chaos, integrate it within your existing data environments (RDBMS, analytic databases, applications), manage the workflow, orchestrate jobs, improve productivity and make big data technologies accessible to a much wider spectrum of developers, analysts and data scientists.

1:30pm-5:00pm (3h 30m)
Sponsored Session

Big Data Entity Extraction With Less Work and Less Code

Richard Taylor (HPCC Systems from LexisNexis Risk Solutions)

While extracting entities from massive amounts of text is a major problem, a proven solution exists. This tutorial will demonstrate a natural language parsing technology to extract entities from all kinds of text using massively parallel clusters.

9:00am-5:00pm (8h)
Deep Data

Deep Data

Deep Data is a no-holds-barred program for data scientists. The advanced technical content will keep you up to speed with the latest techniques, and give you the opportunity to debate and network with the most skilled data scientists in our industry.

9:00am-5:00pm (8h)
JumpStart

Strata Jumpstart

Jumpstart looks at how building and running businesses changes in a data-driven world. It's the missing MBA for Big Data.

12:30pm-1:30pm (1h)

Break: Lunch Sponsored by HPCC Systems

7:00pm-9:00pm (2h)

Strata Mini Maker Faire & Data Crush

Two events happening at the same time and place: *Mini Maker Faire*, a showcase of innovative data-related hardware, apps, and projects, and *Data Crush*, an experiment combining wine-tasting with the gathering, analysis, and application of data to track behavioral trends and influencing factors.

5:00pm-7:00pm (2h)

Break

Sponsorship Opportunities

For information on exhibition and sponsorship opportunities at the conference, contact Susan Stewart at sstewart@oreilly.com.