The Data Science Venn Diagram by Drew Conway

Data science is a surprisingly hard term to define, especially given how ubiquitous it has become.

Vocal critics have dismissed the term as a superfluous label (after all, what science doesn’t involve data?). But these critiques miss something important.

Data science is perhaps the best label we have for the cross-disciplinary set of skills that are becoming increasingly important in many applications across industry and academia. This cross-disciplinary piece is the key.

In VanderPlas’s opinion, the best existing definition of data science is illustrated by Drew Conway’s Data Science Venn Diagram (see the figure below), first published on his blog in September 2010.

The Data Science Venn Diagram captures the essence of what people mean when they say “data science”: it is fundamentally an interdisciplinary subject. Data science comprises three distinct and overlapping areas:

the skills of a statistician who knows how to model and summarize (big) datasets;

the skills of a computer scientist who can design and use algorithms to efficiently store, process, and visualize this data; and

the domain expertise — what we might think of as “classical” training in a subject — necessary both to formulate the right questions and to put their answers in context.

With this in mind, it would be better to think of data science not as a new domain of knowledge to learn, but as a new set of skills that you can apply within your current area of expertise.

(If you want to get started on your data science journey and apply it in your area of expertise, check out this page for some useful resources that I have collected for you.)