Profile type

What is this career path?

Data scientists analyse data (often very large datasets - see big data) using advanced techniques drawn from computer science and statistics to deliver actionable insights for organisations. Data science jobs are related to data analyst jobs; the main difference is that data analysts don't use advanced techniques from computer science and statistics in their analyses. Data scientists tend to hold PhDs in quantitative subjects.

Drew Conway's Venn Diagram of data science

For most problems, roughly 90-95% of the time, you can use existing algorithms.

One of our users said, "a data scientist is a statistician who knows a bit more software engineering than normal; or a software engineer who knows a bit more statistics than normal."

What are the people like?

This varies widely by company, city and team. We've been told that on highly skilled teams in San Francisco the people are highly talented and hold PhDs in quantitative subjects from elite universities, and that the environment closely resembles academia, with very cerebral discussions.

"There are in fact edge cases of data scientists getting paid over $250,000 in unique situations – e.g. hedge funds, or special cases of advanced algorithm development – but this well above the norm."[1]

Advocacy potential

Career capital

Common exits

The two main routes for progression within data science are:

Management roles - data scientists can become valuable product managers because they know the limits of data science and software engineering. Well-rounded individuals can become senior leaders.

Individual contributor roles - here you become an expert in a specific area.

Some fields of academia are easier to go back into after working in industry than others. Computer Science is easier to re-enter. Astrophysics is very difficult to re-enter.

Culture

Exploration value

Personal fit

We've been told that the most important skill for a data scientist is the ability to think like a scientist, which is also the slowest and hardest skill to train. The second most important is ability in probability and statistics, and the third is programming.

"If you don't love data for its own sake, then you will find it hard to compete with such candidates. Burtch, however, says everyone should learn to love data, if only for the sake of their career. "Within 10 years, if you're not a data geek, you can forget about being in the C-suite," Burtch says." [2]

Entry requirements

What does it take to progress?

The two most important qualities for progression suggested to us in an interview with a senior data scientist were:

Being an excellent communicator

Strategic ability - having insight into what will most efficiently get an organisation to meet its high-level goals

Barriers

Job satisfaction

The only downside of the job, Greenberg says, is the time spent "cleaning" data — pruning it to remove irrelevant findings. "That part's not that exciting and you spend a lot of time doing it," he says. (http://mashable.com/2014/12/25/data-scientist/)

Vacation policies tend to be good because the skillset is in such high demand.

Alternatives

Academia

We've been told that in data science what you work on has a direct impact on something tangible. Some companies have data scientists working at the edge of human knowledge and making original contributions, though in other companies and roles data scientists just apply current domain knowledge to achieve organisational goals.

Past experience

Take action

Learn more

Next steps

To get a job, the most important things to gain are a really strong knowledge of statistics and probability, a statistical programming language like R or Python, and the basics of SQL. R is used more for high-level research, whereas Python has broader uses and is a larger investment, as there is more to learn.

Making a portfolio of projects is important for demonstrating your ability. You should start doing your own projects and write them up on a blog to show off your visualisation and communication skills, and put your code in a GitHub repository.