In this fourth edition of the O’Reilly Data Science Salary Survey, the authors analyzed input from 983 respondents working in the data space across a variety of industries, representing 45 countries and 45 US states.

Through the results of their 64-question survey, they’ve explored which tools data scientists, analysts, and engineers use, which tasks they engage in, and, of course, how much they make. Read More

In this video, Tyler Akidau presents a whirlwind tour of the evolution of massive-scale data processing at Google, from the original MapReduce paradigm to the high-level pipelines of Flume to the streaming approach of MillWheel to the portable, unified streaming/batch model of Google Cloud Dataflow and Apache Beam (incubating).

Tyler also highlights similarities and differences with related open source systems such as Flink, Spark, Storm, and Gearpump, calling out ways in which they’re converging on and diverging from the Beam model and what that means when running Beam pipelines on their respective runners. Watch Video

At the Machine Learning Meetup in NYC, Dan Melamed gave a talk titled “How To Learn From What Your Users Might Not See”. The talk focuses on contextual bandits and their applications.

In this tutorial, Dan shows how to learn from such data in a principled, efficient, and unbiased manner. The techniques he describes were largely responsible for a click-through rate gain of over 25% on MSN.com. Watch Video

In this episode of the O’Reilly Data Show, O’Reilly’s online managing editor Jenn Webb speaks with Natalino Busa on the topic of predictive analytics, the challenges of feature engineering, and a new class of techniques that is enabling features to emerge from patterns within the data.

They also discuss the relationship between predictive techniques and high-quality microservices, and how machine learning is being used to improve financial services. Listen to Podcast

Machine learning is at the core of data science, and we now see its applications everywhere (e.g., recommender engines). As Pedro Domingos, Professor of Computer Science at the University of Washington, writes in the piece, “In reality, the main purpose of machine learning is to predict the future.” It’s important to be aware of the myths associated with machine learning. Read More

Renee (Teate) just got back from PyData DC, where she gave this presentation on “Becoming A Data Scientist”, which summarizes and shares what she has learned from her podcast series of 13 interviews with data scientists in the field. Read More

In 2013, Airbnb had a small, centralized team of five data scientists serving the data needs of the company. Since then, they have grown to become one of the largest, most innovative startup teams with over 70 data scientists now serving separate business units. In addition to setting a consistently high bar on new hires and focusing on technical mentorship from peers, the structure of the organization has been key to successful growth. Read More