Saturday, July 7, 2018

A Suggested Curriculum for Undergrad Data Science

I take a lot of meetings with young people who want to be data scientists.

The meetings generally have this feel:

Hi, I'm interested in becoming a data scientist and I'd like to get your perspective on what that might take. Let's meet for coffee.

The interns who set up these meetings come from a wide range of backgrounds and skill levels. Many times they are interested in which classes they should take to be competitive for data science roles after they graduate. I thought it would be helpful to put the advice I give them into a blog post so more people can read it.

I reviewed the current course offerings at a few schools, and settled on a list of 14 basic classes (and some electives at the end) that I think should be part of every data science curriculum. Here is my list:

MATH

Calculus I, II, III

Differential Equations

Linear Algebra

Some data scientists say you can skimp on these requirements, but that's flatly false. If you don't understand matrix algebra or differentials, you simply can't understand the algorithms we implement. The data scientists I see who struggle with higher-level math fail to understand how complex algorithms operate, which leads to failures in implementation.

STATISTICS

Intro to Stats

Calculus-Based Stats

Generalized Research Method Class

Econometrics

Although many modern data science algorithms are not based purely in statistics, the concepts of risk, certainty, and confidence taught in these classes are key to understanding predictive modeling in general. Having worked with data scientists whose background is only in computer science, I now see a background in stats as key to the fundamentals of making predictions.

Econometrics may seem like an outlier, but some predictive modeling concepts, such as time-series analysis and dealing with collinearity, endogeneity, and autocorrelation, are best taught in the context of econometrics.
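To give a feel for two of those terms, here is a minimal sketch in plain Python. It is only an illustration, not how an econometrics course or a stats library would do it: the data are made up, the function names `pearson` and `lag1_autocorrelation` are my own, and the lag-1 autocorrelation is computed the simple way, as the correlation between a series and a copy of itself shifted by one step. Library estimators usually use a slightly different formula.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def lag1_autocorrelation(series):
    """Correlation of a series with itself shifted by one step."""
    return pearson(series[:-1], series[1:])


# Hypothetical data, made up for illustration only.
trend = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

# Two hypothetical predictors that move almost in lockstep (collinear).
ad_spend = [10.0, 12.0, 15.0, 19.0, 24.0]
impressions = [20.5, 24.0, 30.5, 38.0, 48.5]

print("lag-1 autocorrelation, trend:      ", lag1_autocorrelation(trend))
print("lag-1 autocorrelation, alternating:", lag1_autocorrelation(alternating))
print("correlation between predictors:    ", pearson(ad_spend, impressions))
```

A steadily trending series is strongly positively autocorrelated, a series that flips sign every step is strongly negatively autocorrelated, and two predictors with a correlation near 1 carry nearly the same information, which is the collinearity problem.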

COMPUTER SCIENCE

Programming I, II

Data Structures

Fundamentals of Computer Algorithms

Introduction to Database Systems

Sometimes I question the value of formal education in coding; some of the best programmers I know have degrees in non-computer fields. That said, computer science is still a core skill set for data scientists, and is required knowledge to be hired by someone like me. (If you have the skills from another source, that's great; just figure out a way to demonstrate them in an application or an interview.)

ELECTIVES

As for electives in the data science space, these should be tailored to what specifically you want to do.

If you want to go into business, take classes in economics, business operations, and accounting.

If you want to go into algorithm development, spend more time on advanced computer science classes.

If you want to go into academic research, focus on whichever academic discipline you are most interested in.

CONCLUSION

This is intended to be a reasonable list of classes for young people interested in data science. It serves two purposes really:

Provide a framework for undergrads looking to become data scientists.

Prevent me from saying things I later regret when confronted by students who want to be data scientists without taking math.
