Big data is eating the world – but it’s not eating the data scientist

The role of data scientists is changing as companies become more data-driven. But how can organisations embed data into their DNA?
Ben Rossi

Venture capitalist Marc Andreessen famously remarked that ‘software is eating the world’. Today the same can be said about big data.
Without tapping into proprietary data and open data, businesses will fail to differentiate in a world of savvy start-ups where there is an Uber for X, Y and everything else.

Developing good software will no longer be enough on its own: businesses will now need applications that create a holistic view of customers and their contexts.
Rich streams of data (social media, connected devices, calendars, clickstreams) will need to be fed into algorithms for predictive analytics and personalisation. Being ‘data-driven’ will no longer be a differentiator but a basic necessity.

For some, that sets alarm bells ringing. Old fears about machines taking human jobs resurface. However, the best companies will combine human talent and tech. If you’re a data scientist, big data may eat the world – but it won’t eat you any time soon.

A rare and expensive breed
Data scientists are at the centre of the big data conversation. They are accomplished technical specialists capable of using an array of tools to interrogate data. They answer the questions businesses ask of their data, and the ones they didn’t even know they should be asking.
Yet there are not enough of them, and the statistics show it: CrowdFlower’s 2016 data science report found that 83% of respondents said there weren’t enough data scientists to go around. Demand for data scientists is sky high.

Why?

Some of the greatest data storage and processing technologies of recent years have been the product of a small coterie of the best engineering brains.
For example, Hadoop’s seeds were sown by a group of engineers at Google but grown and open sourced at Yahoo. Spark was developed at UC Berkeley’s AMPLab.
Although these innovations spread like wildfire in the open source tech community, there is a shortage of talent with the analytical experience to understand and deploy these complex technologies effectively.

As the law of supply and demand dictates, this makes data scientists expensive. Their median salary is $119,000, nearly double the average developer salary of $65,000, as reported by Glassdoor. With demand for tech talent outstripping supply, the recruiting environment is fiercely competitive.

Of course, these figures depend on factors such as location and seniority, but it’s clear that good talent comes at a premium. Pay alone will not be enough either: a great place to work, with stretching professional challenges, will be crucial to hiring and keeping people.

Decentralising and democratising data

If an organisation is spending these amounts on technical talent, and even more on technical infrastructure, then it’s important to get the most out of those considerable investments.
In practice, the question of how to do that is very similar to another. Namely: how do we embed data into the DNA of an organisation? Relying on a small group of brains to re-orient a company is unlikely to end in success.