http://traffic.libsyn.com/sedaily/rocana_esammer.mp3
Rocana applies big data, advanced analytics, and visualizations to DevOps in order to guide users to the root causes of problems. Eric Sammer is the co-founder and CTO of Rocana. At Cloudera, he served as an Engineering Manager responsible for tools and partner integrations. In that role, he developed many of Cloudera’s best practices for building large, distributed data processing infrastructure. Questions include: Does

Sean Owen, Director, Data Science @ Cloudera, via Quora: Although people use the word in different ways, Hadoop refers to an ecosystem of projects, most of which are not processing systems at all. It contains MapReduce, which is a very batch-oriented data processing paradigm. Spark is also part of the Hadoop ecosystem, I’d say, although it can be used separately from things we would call Hadoop. Spark is a batch

http://traffic.libsyn.com/sedaily/matei_spark.mp3
Apache Spark is a fast and general engine for big data processing. Matei Zaharia created Spark, and is the co-founder of Databricks, a company using Spark to power data science. Questions: What was the motivation behind creating Spark? How much faster is a Spark job than a Hadoop job? What is the relationship between streaming and batch processing? Is Spark’s core advantage over Storm

http://traffic.libsyn.com/sedaily/eli_cloudera.mp3
Cloudera allows enterprises to leverage their data through its Hadoop platform. Eli Collins is the Chief Technologist at Cloudera. Topics include: changes to Hadoop since Cloudera’s founding; Cloudera’s usage of Spark, Docker, and other open-source technologies; how enterprises use batch and streaming together; Cloudera’s open-source policy; whether Frito Lay should open source its chip-making abilities; how collaboration occurs between big, competing companies; the growth of increasingly