Conversations with a Resident Analyst: What is a Data Scientist–Really?

Kiran works as a Senior Analyst at LatentView and holds a Dual Degree from IIT Madras. He enjoys watching and playing football.

Like many others, a year ago I was mostly a spectator to the media hype surrounding the new buzzword “data scientist”. Being a data-spying professional myself, I wanted to dig deeper and find out more about the origins of this newly evolved role. With some groundwork, I learned that over the years many technology companies like Facebook, LinkedIn and Google have had data-centric teams focused entirely on developing products or services that create more value for their customers. The immense success of these teams is in fact what gave rise to the term “data scientist”, and it has gone mainstream since then. Some aspects of a data scientist's work have a lot in common with traditional roles such as Statistician, Predictive Analyst, Business Analyst and Business Intelligence Analyst.

According to D.J. Patil, one of the pioneers in the data science field, a data scientist is a new kind of computer scientist: a gig that’s one part mathematician, one part product-development guru, one part detective and, dare I forget, one part sexy! Patil claims that by 2020 some 50 billion devices, from cars to appliances, will be talking to one another, and that companies will soon need teams of data scientists to sort through everything from internal inventory metrics to customer tweets. Anjul Bhambhri, vice president of big data products at IBM, describes the data scientist role as “part analyst, part artist”. She says, “A data scientist is somebody who is inquisitive, who can stare at data and spot trends. It’s almost like a Renaissance individual who really wants to learn and bring change to an organization.”

After chewing on much more information than I could swallow, I would say that a data scientist possesses strong domain knowledge, identifies the right business problems and gathers data from multiple, diverse sources rather than relying on a single source like a CRM system. The data scientist then writes programs to scrape the data, parse it and convert it into an analysis-ready format. Once the data handling is done, he or she does what I like to call a “what if” analysis: questioning existing assumptions, mining for hidden patterns, performing statistical hypothesis tests and finding innovative ways to solve the pre-defined business problem. Better equipped with data and analytical insights, the data scientist then communicates the findings and recommendations to business end users through a nicely crafted story.
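To make that workflow a little more concrete, here is a minimal sketch in Python of the "gather, parse, test" steps. Everything in it is an assumption made up for illustration: the two inline data sources (a CRM-style CSV export and a web log in JSON lines), the column names, the helper names `load_and_merge` and `permutation_test`, and the numbers themselves. It uses a permutation test on the difference in means as one simple example of a statistical hypothesis test; a real project would choose its own sources and tests.

```python
import csv
import io
import json
import random
import statistics

# Hypothetical raw inputs standing in for two different sources.
CRM_CSV = """customer_id,segment
1,email
2,email
3,email
4,social
5,social
6,social
"""

WEB_LOG_JSONL = """{"customer_id": 1, "spend": "12.5"}
{"customer_id": 2, "spend": "15.0"}
{"customer_id": 3, "spend": "11.0"}
{"customer_id": 4, "spend": "21.0"}
{"customer_id": 5, "spend": "19.5"}
{"customer_id": 6, "spend": "23.0"}
"""


def load_and_merge(crm_csv, web_log_jsonl):
    """Parse both sources and join them on customer_id."""
    segments = {
        int(row["customer_id"]): row["segment"]
        for row in csv.DictReader(io.StringIO(crm_csv))
    }
    merged = []
    for line in web_log_jsonl.splitlines():
        if not line.strip():
            continue  # skip blank lines in the log
        record = json.loads(line)
        cid = int(record["customer_id"])
        if cid in segments:
            merged.append({
                "customer_id": cid,
                "segment": segments[cid],
                "spend": float(record["spend"]),  # spend arrives as text
            })
    return merged


def permutation_test(a, b, n_resamples=5000, seed=0):
    """Two-sided Monte Carlo permutation test on the difference in means."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:len(a)])
                   - statistics.mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    return hits / n_resamples


rows = load_and_merge(CRM_CSV, WEB_LOG_JSONL)
email = [r["spend"] for r in rows if r["segment"] == "email"]
social = [r["spend"] for r in rows if r["segment"] == "social"]
p_value = permutation_test(email, social)
print(f"mean email={statistics.mean(email):.2f}, "
      f"mean social={statistics.mean(social):.2f}, p={p_value:.3f}")
```

The "what if" question here is whether average spend really differs between the two made-up segments, or whether a gap that size could plausibly appear by chance. With only three customers per segment the test cannot say much, which is itself a useful finding to carry into the story told to business users.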

In short, data scientists are not limited to one concrete definition or description; they do it all.

We are living in an age where information is the new oil, and the demand for data scientists is increasing rapidly. I would say that those who refuse to accept this will soon become part of history.