Thursday, January 10, 2013

Hatin' on Forbes

While the article by Ray Rivera is mostly stupid, it actually has a point that I agree with - that data science shouldn't be considered the oracle at Delphi that has all the answers. Granted, that's true of any technology or set of buzzwords.

Heck, if you want to be sold snake oil - try asking a consultant what sort of "analytics" he would recommend. (Interesting note: Ray appears to work for SAP. SAP does analytics consulting. Correlation!)

Magic voodoo snake oil algorithms are WHY more people should be doing data science. If everyone could examine data and competently discuss the results, nobody would be able to sell that kind of crap. The more people who can think logically, the fewer shady consultants we have.

I firmly believe that not enough people are doing data-sciencey type things. Everybody should be able to sit down with some data and a nice high-level language like Ruby or Python (or R if you've got serious chops) and look for correlations. And be able to show those correlations, or lack thereof.
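For what it's worth, here's roughly what I mean by "look for correlations". It's a minimal sketch in plain Python with no libraries. The function name `pearson` and every number in it are made up for illustration, and a real analysis would want more data and a plot to go with it.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance, up to a constant factor)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variances, up to the same constant factor)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return cov / sqrt(var_x * var_y)

# Hypothetical data, invented purely for this example
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 58, 61, 70, 74]
shoe_size = [9, 7, 11, 8, 10]

print("hours vs. score:    ", round(pearson(hours_studied, exam_score), 2))
print("shoe size vs. score:", round(pearson(shoe_size, exam_score), 2))
```

A value near 1 or -1 suggests a strong linear relationship, and a value near 0 suggests there isn't one. Either way, being able to show the number (and explain that it isn't causation) is the whole point.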

So lay off, Forbes. Feel free to tell more people to do data science, but don't try to STOP data science. That's stupid. Don't be stupid.

Data science gives us amazing things like the census dotmap, or the xkcd congressional history graph, or even this awesome personal annual report just built at random. There are amazing uses of data science all around, and they've been around for years, and they keep getting better. And the more people doing data science, the better it's going to be!

I'm excited to see the next amazing use of data science. And I hope to continue improving my own data science for many years.

...

(Interesting thought: Fantasy football players are probably doing a lot of data science, even if they wouldn't call it that)

...

Question: why is data science considered "new"? I've basically done data science type things my entire working life (15 years now, FYI) - just never called it that, or approached it quite as rigorously as I do these days. Edward Tufte has been doing work here since 1970. According to Wikipedia, R was written in 1993, and SAS was introduced in 1968. Those are programs to do statistics on data. Nobody called it data science before. There's a name for it now. Makes it easier for a Google search, so I appreciate it.