The Latest News from Burtch Works

2015 SAS vs. R Survey Results

The topic of SAS vs. R is hotly debated everywhere, from Reddit forums to LinkedIn groups to around the water cooler with your data analytics team, with no clear consensus. So, last year, Burtch Works decided to conduct a flash survey of our network of quantitative professionals to determine their preference. We had over 1,000 responses in less than 24 hours, with many expressing firm opinions one way or the other.

This year we decided to run the flash survey again, and received even more responses than last year! To keep the results simple, we asked one question – Which do you prefer: SAS or R?

Our results showed that now, support for each tool is almost equal. This year 48% of respondents chose R (an increase of 13 points over the 35% of quantitative professionals who chose R last year), while the other 52% of professionals chose SAS.

There were also several respondents (not counted in the totals above) who responded with neither/both, a few write-ins for Python, and one for SPSS.

And on a lighter note…

Although it’s not exactly scientific, many respondents were very passionate in their support for their chosen tool, so we kept a tally of how many in each group used exclamation points (one or multiple) and smiley faces along with their message. We were curious if one camp was more enthusiastic or emotive about their vote than the other. We’re not sure what this means exactly, but it was interesting to note that R users were more emotive in every category despite their slightly smaller sample size.

Colorful Commentary

Here are some of the more humorous comments that we received from people declaring their allegiance one way or the other:

SAS. More Powerful… if R could scale, then R

R! But don’t tell my friends at SAS 🙂

SAS, but I’m taking an online R class now to stay current. I’m sure I’ll still like SAS better though.

I’m under 40 and work in tech, so I haven’t used SAS since a professor made me back in grad school.

SAS due to inertia, learning R as I respond.

R hands down. SAS is for boring dinosaurs. Note: I have used both extensively. One of the main requirements of me switching jobs last year was to not use SAS anymore. SAS will die a slow, painful death. No tech company/startup in their right mind uses SAS and that is where the future is going (not to mention where the more exciting work is).

SAS but R might take over this year.

I will never take a job that uses SAS ever again… I would rather do anything else than program SAS!

SAS had its glory. Sadly it is becoming legacy system.

JMP! Just kidding, SAS. If R wasn’t case sensitive and had the helpful color coding of SAS though, R could be a real contender.

I only use SAS when they make me.

Over the next few weeks we’ll be digging deeper into the data to see how these results vary by industry, years’ experience, region, and education level. To see our “deeper dive” analysis from last year’s sample, click here, and keep your eyes on the blog for this year’s extended analysis!

Follow Burtch Works on Twitter or LinkedIn to get the best quantitative career news and blog updates delivered right to your news feed, and check out our YouTube channel for access to all our latest salary information and webinars!

IMO, SAS’ main advantage over R lies in its macro language processing facility. With SAS you can write programs that write other programs in a very straightforward way making it a good choice for big, dynamic reporting tasks. SAS’ SQL interpreter is also superior to R’s, with better enhancements over the ANSI standard and easier scripting (superior to MySQL, even).

Yes, SAS scales better than R, up to a point, but that scale comes at an enormous $ cost. In my own experience, SAS is great for simple jobs on ~100M rows, okay for jobs between 100M-500M, and then its greatness quickly tails off. But I regularly need to work on >1B rows, where neither R nor SAS is very good. So, there is Hadoop, or sometimes Python. If you want modeling at scale, there is Revolution R. Less expensive than SAS, but still. Scale is always a problem, and there is no panacea.

SAS is an idiosyncratic black box that requires a lot of memorization. Which option goes where in which PROC? Which PROC is most efficient for a given task? You need to know, because SAS will drown you in output you don’t want (and gobble up your time producing it). SAS graphics hurt the eyes and cannot be improved without considerable learning effort. I find R better than SAS for EDA, but that is my opinion.

From a career perspective, I feel like SAS jobs are not great for skills development. When I had a SAS job, SAS was all I ever used (maybe some Excel). Without SAS, I use Python, R, MySQL, Excel, a little bit of C, a little bit of Java, and I am often exposed to new things. Sometimes that feels a little kludgy, and sometimes it feels good. If I could have a job where I used both, I think I would like that very much, and it is hard to say which tool I’d reach for most often.

Roger Fried

It would be nice to see a breakdown of respondents by department: IT, “Analytics”, Bioinformatics, Marketing, Operations, Corporate Finance, Securities Analysis, Accounting, “Interdisciplinary”, Consulting, Sales, Other.

Richard Spotswood

Passion and hype tend to produce more smoke than fire. I use both; I think it makes better sense to emphasize that both handily do the job they set out to do. You can teach statistics with both SAS and R, and historically that’s been the case: social sciences tended to use SAS, economists Stata, engineers MATLAB. More recently, statistics departments use R a lot more than they have in the past, and machine learning is done with MATLAB, Python and R.

Talking about different platforms for doing machine learning and statistics brings order to this discussion, because some code libraries really are better for a particular area. Python can be used as both an analytic tool and a programming language, so it is good in environments where analysis is tightly bound with a production system like a web site. MATLAB has historically been used for image and sound processing, and is frequently used for analysis with the results then coded in C++. SAS is a proprietary platform with decades of use and can analyze large datasets without using lots of memory. R tends to load everything in memory, is frequently faster, and is open source. Depending upon the individual’s resources *and* the organization’s resources, one is a better or worse fit. It really does depend on context.
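The memory contrast above can be illustrated with a minimal Python sketch. It is not how SAS or R work internally; it only shows the general difference between loading a whole table before computing and reading it one row at a time. The function names and the toy `SAMPLE` data are hypothetical, and the in-memory string stands in for a large file on disk.

```python
import csv
import io


def mean_in_memory(csv_text, column):
    """Load every row into a list first, then compute the mean."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))  # whole table held in memory
    values = [float(row[column]) for row in rows]
    return sum(values) / len(values)


def mean_streaming(csv_file, column):
    """Read one row at a time, keeping only a running total and a count."""
    total = 0.0
    count = 0
    for row in csv.DictReader(csv_file):
        total += float(row[column])
        count += 1
    return total / count


# Hypothetical toy data standing in for a large file on disk.
SAMPLE = "id,amount\n1,10.0\n2,20.0\n3,60.0\n"

in_memory_result = mean_in_memory(SAMPLE, "amount")
streaming_result = mean_streaming(io.StringIO(SAMPLE), "amount")
print(in_memory_result, streaming_result)
```

Both functions give the same answer. The streaming version's memory use stays roughly constant as the file grows, while the in-memory version's grows with the number of rows.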

Christian Cantos

How can you include R code directly in your SAS programs?

WPS is a commercial alternative for running SAS language scripts.
It comes with a Proc R that lets you run R code within the SAS language.

So the next related question is: when is it better to use R, and when SAS instructions?
Being a SAS guy, I would do all data manipulation in SAS, at least for large files, and use R only for graphics or statistical functions applied to reduced or aggregated sets of observations.
I would like to find out more about how and why to split work between SAS and R.

I think that the memory limitations of R, because it does not page data to disk, could be the number one criterion.
Any ideas?

Don Vicendese

I like using Stata for data manipulation and straightforward analyses. It can be very powerful in functionality and ease of use. There are some analyses which are superior in R – for example mgcv GAM or spatial analysis. I also use QGIS, which is also free, for spatial analysis. Stata has the capacity to scale up to very large data sets. Programming is straightforward in both Stata and R. I used SAS in my undergraduate years but my (now biased) viewpoint is that Stata and R are superior. I have also used SPSS and, in my opinion, I would put it behind SAS. With regard to graphics, Stata is adequate but R is generally superior, especially if 3D is required. R has many more types of data visualizations than Stata and R does them very well.

Kule

Tricia

Are you guarding against response bias? R users may be more enthusiastic than SAS users, but I’m not convinced the numbers are representative. I also agree with Roger that a breakdown by department/industry would be illuminating.
– a SAS user who swears she’s not in denial

Harry

Linux vs Windows vs Mac. Haven’t we been through this before? If you’re over 30 you remember the time when open source was going to take over from Windows and Apple. But people don’t want to surf 30 hours of YouTube to figure out how to change their screen saver. Trust me, SAS isn’t worried about this “takeover.” It will change their business, and it has already forced them to lower prices and offer free versions to unis and students. I know of one uni already that is returning to SAS because the learning curve with R is just too steep, and SAS is uni-friendly again. Meh.

Colton

My vote goes to R.

The only thing I would choose SAS for is reading in messy data (text files, etc.), as opposed to using regular expressions in R. Other than that, I find the R syntax to be far more straightforward and easier to process. Not to mention R is free, which makes practicing and learning it a much more viable option for recent graduates!

Cletus Flynn

Our Blue company was absorbed by Orange, and Blue used SAS extensively. It is so ingrained in the culture that everyone thinks nothing could be better than SAS. As a long-time SAS user, I find R a refreshing change and actually quite powerful, not to mention it’s free. That’s where Orange is going, and SAS will die on our Blue systems, which cost an amount that makes me hurl just to think about it. There are memory issues with R and I get that, but there are workarounds and we’re smart enough to figure those out in time. I hate to say this, but as a long-time SAS user, R is going to be the go-to software for statistics. Who knows, maybe Julia could step up in the next 5 – 10 years.