The University of Virginia is in the closing stages of creating a Master of Science in Data Science (MSDS) and the eventual goal is to have an undergraduate minor and a Ph.D. program in Data Science. The curriculum for the MSDS contains a nice mix of math, computer science and statistics courses. It even includes coursework in visualization. Also, the program appears to be an entirely new program and not just the renaming of an existing program.

The University of Virginia is definitely taking the correct steps to become a recognized leader in data science education.

Many of the top data scientists you will read about or hear speak have PhD degrees. Therefore, many people think a PhD is a requirement for becoming a data scientist. That is not true. Much of the work in the data science field does not require a PhD. In fact, very little data science work does require one.

What is a PhD and why would a person get one? A PhD degree is a research degree that usually takes between two and five years of study beyond a master’s degree. The majority of the program will be focused on researching and expanding upon a very specific topic. A PhD student will push the edge of known human knowledge.

In daily tasks, most data scientists do not go that far and do not need a PhD. Most of the necessary skills can be obtained at the bachelor’s or master’s level. Combine that education with the amazing tools available and some experience, and becoming a data scientist is definitely achievable.

Many data scientists have PhD degrees because of their curiosity and love of learning. Those are essential traits of both data scientists and PhD students. However, you can be curious and love learning without attending enough school to obtain a PhD.

All of this is not to say that earning a PhD is bad. If you really love learning, thrive in the academic environment, and have the desire, then definitely go for the PhD. However, do not let the lack of a PhD stop you from doing data science.

Due to the large list of Colleges with Data Science Degrees, I receive a number of email inquiries with questions about choosing a program. I have not attended any of the programs, and I am not sure how qualified I am to provide guidance. Anyhow, I will do my best to share what information I do have.

Originally, the list started out with 5 schools. Now the list is well over 100 schools, so I have not been able to keep up with all the intricate details of every program. There are not very many undergraduate options, and the list only contains a few PhD programs, so the information here will be focused on pursuing a master’s degree.

Start by asking 2 questions:

What are my current data science skills?

What are my future data science goals?

Those 2 questions can provide a lot of guidance. Understand that data science consists of a number of different topic areas:

Mathematical Foundation (Calculus/Matrix Operations)

Computing (DB, programming, machine learning, NoSQL)

Communication (visualization, presentation, writing)

Statistics (regression, trees, classification, diagnostics)

Business (domain specific knowledge)

With the list above in mind, this is where things get cloudy. Everyone brings a different set of existing skills, and everyone has different future goals. Here are a few scenarios that might clear things up.

Data Scientist

The most common approach is to attempt to build knowledge in all 5 topic areas. If this is your goal, find the topic areas where you are weakest and target a graduate program to help you bolster those weak skills. In the end, you will come out with a broad range of highly desirable skills.

Specialist

A different approach is to select one topic area and get really, really good at it. For example, maybe you want to be an expert on machine learning. If that is your goal, then maybe a traditional computer science graduate program is the best choice. In the end, you will be well-suited to be an effective member of a data science team or to pursue a PhD.

Data Manager

A third and also common approach is taken by people who want to help fill the expected void of 1.5 million data-savvy managers. These people do not necessarily want to know the deep details of the algorithms, but they would like an understanding of what the algorithms can do and when to use which algorithm. In this case, a graduate program from a business school (MBA) might be a good choice. Just make sure the program also covers the non-business topics of data science.

Example

I think NYU is the best example of a school that can help a person achieve just about any data science goal. The NYU program is a university-wide initiative, so the program is integrated with many departments (math, CS, Stats, Business, and others). Therefore, a student could possibly tailor a program to reach a variety of future goals. Plus, New York has a lot of companies solving interesting data science problems.

Conclusion

There you have it. It does not narrow the choices down, but it should help to provide some guidance. Other factors to consider are length of a program and/or location.

Good luck with your decision, and feel free to leave a comment if you have any good/bad experiences with any of the particular graduate programs.

… to establish the country’s leading data science training and research facilities at NYU.

Part of the announcement is an M.S. in Data Science. Applications for the initial class, starting Fall 2013, are now being accepted. The Center for Data Science also plans to offer Ph.D. degrees via the Mathematics, Statistics, and Computer Science departments. I am not sure if an official Ph.D. degree in Data Science is being planned.

Step 2 seems obvious. Math, stats, and computer science are some of the key areas for data science. I would add communication and presentation skills to the list because people with just math, stats, and CS skills are not known to be naturally good communicators. I agree with step 3. More research needs to be done, but most of the research will need to be interdisciplinary. Universities need to put more effort into interdisciplinary research.

Step 1 confused me a bit. The argument was that data science requires too many skills and an applied focus area. Of course a person cannot learn everything about data science in an undergraduate degree. Earning a computer science degree does not mean you will know everything about computer science. It just means you know the fundamentals of algorithms, architecture, and operating systems. You know enough about computer science to understand the field and learn more as you go. I think 4 years should be enough time to do the same for data science.

Based upon the popularity of a previous post about a certificate program from the University of Washington, it appears that many people are interested in learning the skills necessary to become a data scientist. Thus, I decided to compile a list of some of the possible learning strategies.

Traditional College Education

The most obvious path would be to study at a traditional college or university. Colleges and universities are starting to notice the demand for data science skills, and many colleges are currently offering programs to prepare someone as a data scientist. This path is safe and predictable. Do the homework, complete the courses, and get the degree or certificate. Most people are familiar with the process, and it offers few surprises. The problems here are the costs, lack of flexibility, and time involved.

Starting in the fall of 2012, the University of Washington will be offering a certificate in Data Science. The program has two sections: one located in Seattle and the other online. The certificate consists of three separate courses each lasting approximately 3 months. Thus the program can be completed in 9 months, and the cost is around $3000.

The program content looks quite good. Some of the topics to be covered include Hadoop, NoSQL, machine learning, statistics, graph algorithms, and more. If you are looking to become a data scientist, this just might be the program for you.

With some of the top tech entrepreneurs in the U.S. either dropping out of college or not attending, there is some debate about whether college is the right choice or not. This post will focus on college for data science. However, for college in general, if you know what you want to study, then college or graduate school is a great option. If you are going to college because you do not know what else to do, I would say college is too expensive for that.

College?

Most would agree that an undergraduate degree in some highly analytical field (math, CS, economics, physics) is definitely beneficial. Plus college has a strict set of guidelines and a specific order for the learning. A formal degree program often provides the necessary motivation for a person to continue learning. The U.S. college education system is not perfect, but if it keeps a person from quitting, it will help to reach the goal of becoming a data scientist.

All this leads to a second point. Only a few colleges offer undergraduate degree programs for data science. Thus, graduate school or more learning will still be required. College should provide the necessary prerequisites and many employers will pay for the continued learning.

No College?

A highly motivated person could probably learn most, if not all, of the data science skills on the internet for free or at very low cost. The key is being a highly motivated person. That person must have the drive to not quit when the learning becomes difficult. Also, there are no classmates or professors to help with difficult concepts. Sure, the internet can help there, but it requires a bit more work to find the help. Plus, knowing what topics to learn and in what order can be challenging. Already, this blog has much helpful content, but it is not organized based upon a sequence of learning. Not attending college presents some obstacles that only the most highly motivated students will overcome. As more and more learning resources appear online, the no-college option may become more popular.

What is the Answer?

Strictly speaking, I would say the answer is NO. However, many people will not succeed without the rigor of school, and some companies will not hire a person without a degree. So, college is not 100% essential to being a data scientist, but for many it is probably the best option.

Colleges and universities are slowly starting to notice the demand for employees with data science skills. Most of the programs are not named data science, but they all focus on producing data people. Below are a couple of the programs I have noticed so far.