Hadoop Matters Blog

A new survey of data science tools shows that Python usage is quickly gaining steam among advanced analytics professionals, at the expense of both R and SAS. According to the results of the 2016 survey, conducted by Burtch Works, R is the preferred tool for 42% of analytics professionals, followed by SAS at 39% and Python at 20%. While Python’s placing may at first appear to relegate the language to bronze-medal status, it’s the delta here that really matters.

What is Spark?

Apache Spark is an open-source big data processing framework built around speed, ease of use, and sophisticated analytics. It was originally developed in 2009 at UC Berkeley’s AMPLab and open sourced in 2010; it later became an Apache project.