Learn how to use IPython and Pandas [know basics of both, more to learn!]

Publish/share a project on GitHub

Complete a machine learning project that helps the University I work for better understand a subset of its data. Partially complete with my Machine Learning class final project. Update 1/7/2015: My recently approved “capstone” project for my master’s degree will fulfill this goal!

Conduct a thorough study of the data at the University I work for, using skills gained in my Data Science coursework, to complete an analysis that can help the University raise more money during a fundraising campaign. Update 10/15/15: I have received some funding to help my department learn more about how we can apply predictive analytics and data mining as part of what we do for the university!

Become known for a University Advancement (Development/Fundraising) data science-related achievement (present my work at a CASE conference, for instance)

Compete in a Kaggle competition or other challenge where my work is compared to others

NEW GOAL Oct 2015: Build an awesome data science learning directory that uses data science techniques to recommend content to users to help others learn data science! Working on this at DataSciGuide.com

Long Term:

Get to the point where I have enough skills and projects under my belt to call myself a Data Scientist (I’m aware this is quite subjective), probably with a specialty in University Advancement data

If it seems necessary/valuable, get a Data Science certification such as the one at Cloudera

8 Comments

Hello Renee!
Could you share your thoughts on the math involved in Data Science?
As far as I know, some Calculus, Linear Algebra, and Statistics are involved.
How good should one be at math to be good at data science?
Where should one start with math to build skills for data science?
And how are you going about it?

Renee

Sep 24, 2014

Hi Gobinath!

Yes, I think Linear Algebra is pretty necessary (an understanding of transforming matrices), as well as a solid footing in Calculus, and a strong Statistics background.

The answer to your question depends on what area of Data Science you plan to focus on. The most advanced math I have done was in my Machine Learning and Optimization classes, where we were learning the calculus behind statistical distributions and setting up matrices full of partial derivatives. However, I don’t plan to be advancing the state of the art of ML or Optimization algorithms, so I’m not too worried that I struggled with both of those.

I have done well in my Statistics classes, and have bought some books to solidify my understanding of Bayesian statistics. Most of the advanced math I’ve learned has been in graduate classes, but if you already have the calculus and statistics foundation, you could learn more online. If you are like me and want to understand enough to know what already-built tools and algorithms are doing, but aren’t planning to develop new algorithms, you probably don’t need graduate-level math courses to succeed in the field.

I’m not an expert on this, and I’m just explaining my basic understanding of what is needed, but again, it will vary based on your focus. The book “Doing Data Science” gives an overview of the types of skills you should have, and points to lots of references on where to learn more. Check it out if you haven’t already!

Hey, I’m really interested in your website and the path you’ve taken so far. I’m actually an undergrad at JMU myself (Econ), considering a career path in data analytics/data science. UVA’s Master of Data Science program is something I’ve been considering. I would like to hear your thoughts about why you pursued your particular grad program. I’m also curious about the specific course load you had before starting grad school. My math course load has been pretty lax (intro to calc and intro to stats, with some econometrics courses), so I’m worried about whether a grad program would take me without further math courses. Thanks.