Who are you, Data Scientist? Answer with a survey

“I keep saying that the sexy job in the next 10 years will be statisticians, and I’m not kidding”.

So said Hal Varian, Chief Economist at Google and emeritus professor at the University of California, Berkeley, on 5 August 2009.

Today, almost seven years later, Varian's prediction has been confirmed: the following graph, taken from Google Trends, gives a good idea of the attention the Data Scientist role currently attracts.

The Observatory for Big Data Analytics & BI of Politecnico di Milano has been studying Data Scientists for a few years and has now prepared a survey for them. The answers will be used to trace a profile of the Data Scientist, both within the company and in the wider context in which they operate.

If you work with data in your company, please support us in our research and take this totally anonymous survey. Thank you from the Observatory for Big Data Analytics & BI.

Graph 1: How often the term "Data Scientist" has been searched on Google. The numbers in the graph express search volume relative to the highest point in the graph: the value 100 is assigned to the point with the maximum number of searches, and the other values are proportional to it.
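
The scaling described in the caption can be sketched in a few lines of Python. This is only an illustration of the "peak equals 100" rule, not Google's actual implementation: the function name `normalize_to_100` and the sample counts are hypothetical, and real Google Trends data involves sampling and other adjustments that are not modelled here.

```python
def normalize_to_100(counts):
    """Scale counts so the largest becomes 100 and the rest stay proportional."""
    peak = max(counts)
    if peak == 0:
        # No searches at all: every point stays at 0.
        return [0 for _ in counts]
    return [round(100 * c / peak) for c in counts]


# Made-up search counts, for illustration only (not real Google data).
raw_counts = [120, 300, 450, 600]
print(normalize_to_100(raw_counts))  # prints [20, 50, 75, 100]
```

A value of 50 therefore means that the term was searched half as often as at its peak, not that it was searched 50 times.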

"Data scientists are involved with gathering data, massaging it into a tractable form, making it tell its story, and presenting that story to others."

We are in the era of Big Data, in which 2.5 quintillion (2.5 × 10^18) bytes are generated every day. Private and public sector organizations everywhere are adapting to leverage the potential of Big Data by bringing in people who are capable of extracting information from data.

Getting information out of data is of increasing importance because of the huge amount of data available. As Daniel Keys Moran, programmer and science fiction writer, said: “You can have data without information, but you cannot have information without data”.

Companies today have positions such as CDO (Chief Data Officer) and Data Scientist more often than they used to.

The CDO is a business leader, typically a member of the organization’s executive management team, who defines and executes analytic strategies. This is the person responsible for defining and developing the strategies that will direct the company’s data acquisition, data management, data analysis and data governance processes. In other words, many organizations have introduced new governance roles and new professional figures to exploit the opportunities that Big Data offers them.

According to the report “Big Success with Big Data” (Accenture, 2014), 89% of companies believe that, without a strong data analytics strategy, in 2015 they risk losing market share and will no longer be competitive.

Collecting data is not simply retrieving information: the Data Scientist’s role is to translate data into information, and there is currently a dearth of people with this set of skills.

It may seem contradictory, but both companies and Data Scientists know very little about what skills are needed. They are operating in a turbulent environment where frequent monitoring is needed to know who actually uses which tools, which tools are considered old and becoming obsolete, and which are used by the highest and lowest earners. According to a study by RJMetrics (2015), the Top 20 Skills of a Data Scientist are those in the following graph.

The graph clearly shows the importance of tools and programming languages such as R and Python. Machine Learning, Data Mining and Statistics are also high on the list of most requested skills. Skills relating to Big Data rank at about 15th position.

The most recent research on Data Scientists shows that these professionals are more likely to be found in ICT companies, internet companies and software vendors such as Microsoft and IBM than in social network companies (Facebook, LinkedIn, Twitter) or in companies such as Airbnb and Netflix. The following graph, provided like the previous one by RJMetrics, gives the proportion of Data Scientists by industry.

It is important to keep monitoring how Data Scientists are distributed across industrial sectors and what their main characteristics are. In today’s unsettled business world, we can expect many changes as companies become aware, at different times and in different ways, of the importance of Data Scientists.