We see a more pronounced story compared to Google search volume. The percentage of job ads on indeed.com shows that demand for Python machine learning skills has outstripped demand for R machine learning skills since at least 2012, with the gap only widening in recent years.

KDnuggets Survey Results: More People Using Python for Machine Learning

We can get some insight into the tools used by machine learning practitioners by reviewing the results of the KDnuggets Software Poll.

Here’s a quote from the 2016 results:

R remains the leading tool, with 49% share, but Python grows faster and almost catches up to R.

— Gregory Piatetsky

The poll tracks the tools used by machine learning and data science professionals, where a participant can select more than one tool (which is what I would expect to be the norm).

Here is the growth of Python for machine learning over the last 4 years:

2016 45.8%
2015 30.3%
2014 19.5%
2013 13.3%

Below is a plot of this growth.

KDNuggets Poll Results – Percentage of Professionals Using Python

We can see a near linear growth trend where Python is used by just under 50% of professionals in 2016.
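To check the "near linear" reading, here is a minimal sketch that fits an ordinary least-squares line to the four poll percentages listed above. The variable and function names are my own, and four points are far too few for a serious trend estimate, so treat the output as a rough sanity check rather than a forecast.

```python
# KDnuggets poll: share of respondents using Python, as listed above.
kdnuggets = {2013: 13.3, 2014: 19.5, 2015: 30.3, 2016: 45.8}


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    xs = list(points.keys())
    ys = list(points.values())
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-variable fit, R^2 is the squared correlation coefficient.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = fit_line(kdnuggets)
print(f"Growth: {slope:.2f} percentage points per year")
print(f"R^2 of the linear fit: {r_squared:.3f}")
```

An R^2 close to 1 supports the linear reading. Note, though, that the year-over-year steps (6.2, 10.8 and 15.5 points) get larger each year, so the curve bends upward slightly rather than following a perfectly straight line.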

It is important to note that the number of participants in the poll has also grown from many hundreds to thousands in recent years and participants are self-selected.

What is interesting is that scikit-learn also appears separately on the poll, accounting for 17.2%.

The O’Reilly Data Science Salary Survey also tracks the tool usage of practitioners, as the KDnuggets poll does.

Quoting from the key findings from the 2016 report, we can see that Python plays an important role in data science salary.

Python and Spark are among the tools that contribute most to salary.

— Page 1, 2016 Data Science Salary Survey report.

Reviewing the survey results, we can see a similar growth trend in the use of the Python ecosystem for machine learning over the last 4 years.

2016 54%
2015 51%
2014 42% (interpreted from graph)
2013 40%

Again, we can plot this growth.

O’Reilly Poll Results – Percentage of Professionals Using Python

It’s interesting that the 2016 results are very similar to those from the KDNuggets poll.
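One way to see how close the two polls are is to line up their percentages by year and subtract. The sketch below uses only the figures quoted in this post (the 2014 O’Reilly value was read off a graph, so it is approximate); the dictionary and function names are illustrative.

```python
# Percentage of respondents using Python, by poll and year (from this post).
kdnuggets = {2013: 13.3, 2014: 19.5, 2015: 30.3, 2016: 45.8}
oreilly = {2013: 40.0, 2014: 42.0, 2015: 51.0, 2016: 54.0}  # 2014 read from a graph


def poll_gaps(a, b):
    """Return {year: b - a} for years present in both polls, rounded to 1 decimal."""
    shared_years = sorted(a.keys() & b.keys())
    return {year: round(b[year] - a[year], 1) for year in shared_years}


gaps = poll_gaps(kdnuggets, oreilly)
for year, gap in gaps.items():
    print(
        f"{year}: KDnuggets {kdnuggets[year]:.1f}%  "
        f"O'Reilly {oreilly[year]:.1f}%  gap {gap:.1f} points"
    )
```

The gap shrinks every year, from 26.7 points in 2013 to 8.2 points in 2016. The two polls survey different audiences, so the comparison is indicative only.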

Quotes

You can find quotes to support any position on the Internet.

Take quotes with a grain of salt. Nevertheless, quotes can be insightful, raising and supporting points.

Let’s first take a look at some cherry-picked quotes from news sites and blogs about the growth of Python for machine learning.

News Quotes

Python has emerged over the past few years as a leader in data science programming. While there are still plenty of folks using R, SPSS, Julia or several other popular languages, Python’s growing popularity in the field is evident in the growth of its data science libraries.

… the last few years have seen a proliferation of cutting-edge, commercially usable machine learning frameworks, including the highly successful scikit-learn Python library and well-publicized releases of libraries like Tensorflow by Google and CNTK by Microsoft Research.

Note that scikit-learn, TensorFlow and CNTK can all be used from Python for machine learning.

Python is versatile, simple, easier to learn, and powerful because of its usefulness in a variety of contexts, some of which have nothing to do with data science. R is a specialized environment that looks to optimize for data analysis, but which is harder to learn. You’ll get paid more if you stick it out with R rather than working with Python

Quora Quotes

Below are some cherry-picked quotes regarding the use of Python for machine learning taken from Quora questions.

Python is a popular scientific language and a rising star for machine learning. I’d be surprised if it can take the data analysis mantle from R, but matrix handling in NumPy may challenge MATLAB and communication tools like IPython are very attractive and a step into the future of reproducibility. I think the SciPy stack for machine learning and data analysis can be used for one-off projects (like papers), and frameworks like scikit-learn may be mature enough to be used in production systems.

I’d also recommend Python as it is a fantastic all-round programming language that is incredibly useful for drafting code fragments and exploring data (with the IPython shell), great for documenting steps and results in the analytical process chain (IPython Notebook), has a huge selection of libraries for almost any machine learning objective and can even be optimized for production system implementation. In my opinions there are languages that are superior to Python in any of these categories – but none of them offers this versatility.

[…] It is because the language can make a productive environment for people that just want to get something done quickly. It is fairly easy to wrap C libraries, and C++ is doable. This gives Python access to a wide range of existing code. Also the language doesn’t get in the way when it comes time to implement things. In many ways it makes coding “fun again” for a wide range of tasks.

10 Responses to Python is the Growing Platform for Applied Machine Learning

I agree that for the moment Python and R are the 2 platforms that you really need for machine learning. However, there is fierce competition growing in the Julia corner. I recently had to code from scratch a customised k-means++ algo for a warehouse network location problem in the US. Using the flexclust package in R would have led to severe performance problems, ditto for coding it in Python. The JIT compiler in Julia however proved to be nearly as fast as C: about 2 orders of magnitude faster than R (probably also than Python) and Julia is a lot easier (higher level) and compact to code in. Sure, Julia is still in beta (0.5) but already there are more than 1000 high quality packages available for it. As it stands, Julia is not ready yet for commercial apps, but certainly ready for internal projects. Tensorflow has already been ported and Keras will follow I hope. Mxnet is available too. Julia also was designed with parallel computing as a standard feature and will port from a laptop to a super computer cluster. This is the future of scientific computing. I intend to have Julia replacing C, Python and R asap in my job.

Hi Jason, I would like to ask what your personal take is on Python vs R. For instance, instead of showing me the trends and all the stats, I would like to ask, if I may, how do you use them in your daily job, which one do you prefer and why? For instance, are different tasks accomplished better with one than the other? Which are those in your experience and why?

In one of your posts I saw that you wrote that Python is for intermediate tasks and R for advanced. Would you mind elaborating on this a little bit?