New Undergraduate Data Science Programs

1 August 2015

The number of undergraduate statistics degrees has nearly doubled in the last four years—making it the fastest-growing STEM degree—and master’s degrees are also growing quickly. Further, the number of universities granting undergraduate statistics degrees has increased from 74 in 2003 to more than 110 last year.

Last month in Amstat News, we profiled five new undergraduate data science programs. Here are a few more.

University of Michigan

Kerby Shedden is professor of statistics at the University of Michigan, where he has been on the faculty since 1999. He has served as adviser and program director for the undergraduate programs in statistics, informatics, and data science. His research interests include genomics and statistical computing. He has also served as the director of the university’s statistical consulting center since 2011.

Atul Prakash is a professor in the department of electrical engineering and computer science at the University of Michigan. He received a B.Tech. from IIT Delhi and a PhD from the University of California, Berkeley. He was instrumental in the design of the new data science undergraduate program and currently serves as co-chair of its program committee.

How do you view the relationship between statistics and data science?

Our view is that, in the near future, most practicing statisticians will also be data scientists, and a large fraction of data scientists will also be considered to be statisticians.

There are fundamental ideas from statistics that every data scientist should master, including thinking rigorously about variation and uncertainty, representativeness and generalization, efficient data collection and analysis, and meaningful reduction and summarization of data. There are also certain parts of data science that are not very statistical in the traditional sense, such as the challenges of engineering computing systems to efficiently store, transmit, and support access to data. We feel that statisticians who do not understand how modern computing systems work will become increasingly disadvantaged as advanced computational and data management tools become more intertwined with statistical theory and practice.

Most importantly, the ultimate goal of data science is to use data to gain insight and make decisions relating to phenomena that occur in the “real world.” Thus every data scientist should understand how data can be effectively analyzed and used to rigorously support insight and decision-making, which is traditionally the core emphasis of statistics.

Please describe the basic elements of your data science curriculum and how it was developed.

The foundation courses taken by our data science students in their first two years essentially encompass all the foundation courses that would be taken by either computer science or statistics majors. The courses include three semesters of calculus, linear algebra, programming (including data structures), and discrete mathematics. All data science students at U-M take upper-division courses in machine learning, databases, probability and statistics, and regression modeling. In addition, students take technical electives (drawn mainly from advanced courses in computer science, mathematics, and statistics), electives in an application domain where data science techniques are used, and a capstone course.

The data science program was developed as an evolution of our slightly less technical major in informatics and data mining, which will ultimately be discontinued. The computer science department and the department of statistics have collaborated closely in these endeavors over a number of years.

What was your primary motivation(s) for developing an undergraduate data science program? What’s been the reaction from students so far?

We think the rapid emergence of data science provides a unique opportunity for our students to be part of what could become a major transformation in the worlds of academic research and business over the next decade and beyond. We also hope to use the new data science major to innovate the courses we offer through the statistics department, so our “traditional” statistics majors will have more opportunities to take courses in emerging areas. Finally, many of the faculty hired into our statistics department in the last 5–10 years have computer science PhDs or major research interests in areas related to data science. The data science program thus opens up new opportunities for undergraduates to engage in research with the statistics and computer science faculty and to gain skills that will open doors to research taking place throughout the university.

Students are quite excited and curious about the new program. They know that many of the tools they use in their personal lives are driven by data, and they are hearing a lot about the career opportunities for people with knowledge and skills in this area. A number of students have commented that they value the opportunity to combine several interests, rather than focusing exclusively on either statistics or computer science.

Describe the reception you received from the partnering departments, other departments, and those at the university who had to approve the program.

We have not had difficulty convincing our colleagues at the college and university levels that this new program is intellectually deep and that the field of data science has the staying power to constitute an undergraduate major in a college of arts and sciences.

Within both the statistics and computer science departments, there was discussion about the wisdom of opening another academic program given the growing popularity of our existing undergraduate and graduate programs. In the end, the faculties of both departments came to see that this is an important new direction for both disciplines and that the time to act is now.

What advice do you have for students considering a data science degree versus a computer science degree, a statistics degree, another degree, or some combination of the above (e.g., a double major of statistics and computer science)?

Many students are relieved to learn that the decision to major in statistics versus data science does not necessarily close doors to them. Both programs are rigorous and provide an opportunity to develop foundational knowledge and skills that can lead to a variety of career paths and graduate programs. In addition, the way our program is structured allows a student to wait to make a final decision until after taking a few courses that will count toward either program.

If a student in data science decides to develop their “traditional” statistics knowledge to a level similar to that of a statistics major, the main thing they are missing is a course in statistical theory, which can be taken as an elective. A double major in computer science and statistics is also an option, and might make sense for someone with interests in areas of computer science that are unrelated to data science.

Miami University

Number of students currently enrolled: 65 (53 in business analytics track + 12 in predictive analytics track)

First students expected to graduate: 2015 (11 graduated May 2015)

Partnering departments: The analytics co-major is jointly offered by the department of statistics in the college of arts and sciences and the department of information systems and analytics in the Farmer School of Business at Miami University.

A. John Bailer is university distinguished professor and chair of the department of statistics at Miami University. He is one of the developers of the analytics co-major and the data visualization class that is a core class in the co-major.

L. Allison Jones-Farmer is the Van Andel Professor of Analytics in the Farmer School of Business at Miami University. She is the director of the newly formed Center for Analytics and Data Science and co-developed the analytics practicum.

How do you view the relationship between statistics, data science, and analytics?

Statistics, data science, and analytics are all problem-solving methods used to turn data into information. The three areas use many of the same tools and techniques, but have emerged in different application domains with different emphases.

Statistics, the original science of data, encompasses data analysis with theoretical underpinnings such as study design, model development, and model evaluation. Data science involves solving complex, multifaceted problems by combining tools from mathematics, statistics, computer science, and information technology. The term “analytics” precedes the term “data science” as a buzzword and was originally coined to describe the use of data-driven decision-making in business and sports. Like data science problems, analytics problems are often multifaceted, relying on tools from mathematics, statistics, computer science, and information technology.

To be successful, applications of statistics, data science, and analytics all require the decision-maker(s) to have a deep understanding of the context or problem domain.

Please describe the basic elements of your data science/analytics curriculum and how it was developed.

The conception, design, and implementation of the analytics co-major were a partnership between the department of statistics in the college of arts and sciences and the department of information systems and analytics in the Farmer School of Business. The co-major was developed in recognition of the strengths of the two departments in the areas of data management, programming, statistics, and statistical learning that, when combined, could generate a ground-breaking opportunity for our students.

To attain a co-major, a student must complete as many hours in analytics as required for most full majors, but the student must also have another supporting major. This allows our students to develop domain knowledge in many areas (e.g., statistics, mathematics, marketing, computer science, finance, information technology, geography, journalism, etc.) while gaining critical data management and analysis skills.

The basic elements of our program are a set of core classes that address the following topics:

Data description and summarization

Data management – structured and unstructured

Regression models

Visualizing data and digital dashboards

Multiple classes address each of these topics. Although many of these courses already existed, most have been completely revamped since the program’s inception to meet the changing technology and skill requirements in industry. The data visualization class was a new course added to the curriculum. It is worth noting that this class was developed by three departments (statistics, interactive media studies, and journalism) and has been team-taught by a statistician and a graphic designer each time it has been offered.

All of our major courses emphasize the use of real data provided by our university, community, or corporate partners. In their senior year, students have the option of taking a semester-long analytics practicum course in which they work with an external client to solve an analytics problem.

In addition to the core classes, students sign up for one of two tracks: business analytics (developed by the department of information systems and analytics) or predictive analytics (developed by the department of statistics). We are planning to expand the program to include new tracks. In particular, the department of geography is developing a geospatial analytics track that will include geographic information systems and related coursework. The department of computer science and software engineering is also developing a data science track that will emphasize more technical data-related programming and data architecture skills.

What was your primary motivation(s) for developing an undergraduate data science/analytics program? What’s been the reaction from students so far?

Our motivation for developing the program was to provide students with skills to satisfy the needs of business, industry, and government. Student reaction has been very positive. We have gone from zero to 65 co-majors in less than two years and have already had 11 students graduate with the co-major. In addition, our analytics co-majors are highly recruited, with nearly all having multiple job offers by the fall of their senior year.

Describe the reception you received from the partnering departments, other departments, and those at the university who had to approve the program.

We were careful to build a unique program that complements, rather than cannibalizes, other majors. Thus, the reception from the sponsoring departments, their respective divisions, and the campus community at large has been overwhelmingly positive. To further our collaborative initiatives, we have received seed funding from the provost’s office, the Farmer School of Business, the college of arts and sciences, and the college of engineering and computing to launch a new center for analytics and data science. In addition, we enjoy partnerships with several corporations.

What advice do you have for students considering a data science/analytics degree versus a computer science degree, a statistics degree, another degree, or some combination of the above (e.g., a double major of statistics and computer science)?

We believe a combination of studies is ideal for preparing for a career in data science and analytics. Our program provides a unique opportunity to blend the data science/analytics studies with either more technical or managerial aspects of data-related careers. Statistics and data mining provide the foundation for evidence-based decision-making, designing studies, and building and validating models for prediction. Information systems and computer science provide the managerial and technical foundations for understanding the structure, processing, storage, and extraction of data.

The ability to clearly, correctly, and concisely display, write about, and talk about complex data-related problems is a key differentiator between an analyst who exclusively functions in a technical support role and an analytics or data science professional.

In addition, domain knowledge in areas such as biology, bioinformatics, marketing, finance, sports management, or other application areas will complement the student’s data management, analysis, and communication skills.

Finally, hands-on experience in the form of an internship, practicum, data analysis competition, or project is very important.

The Ohio State University

Partnering departments: The major is co-directed by the department of statistics (college of arts and sciences) and the department of computer science and engineering (college of engineering). The major has curricular partnerships with departments from the college of arts and sciences, the college of engineering, the college of medicine, and the Fisher College of Business.

Christopher Hans is an associate professor in the department of statistics at The Ohio State University and co-directs the university’s undergraduate major in data analytics. His research focuses on Bayesian methodological development with an emphasis on statistical computing.

Srinivasan Parthasarathy is a full professor in computer science and engineering at The Ohio State University. He directs the data mining research lab and co-directs the undergraduate major in data analytics. He is well known for his work in data analytics, database systems, and bioinformatics and chairs the SIAM data mining steering committee.

How do you view the relationship between statistics, data science, and analytics?

We view statistics and computer science as forming part of the foundation of data science and analytics. The practice of data analytics requires a large and complex skill set that draws on many disciplines, yet its core is built upon fundamental principles of statistics and computer science. Being a successful data scientist requires being able to solve problems in a variety of areas, and understanding the principles of statistics and computer science allows one to apply knowledge gained in one area of application to another.

Please describe the basic elements of your data science/analytics curriculum and how it was developed.

The major in data analytics leads to the BS degree in the college of arts and sciences and is structured to have four major components. Students obtaining this degree must satisfy the college’s general education requirements, which help develop many of the “soft skills” employers of data scientists find attractive (e.g., communication and critical thinking skills and an appreciation of diverse and foreign cultures).

Students then choose a specialization in the major where they learn how to apply concepts learned in the core to problems in specific areas. Current options include business analytics, computational analytics, and biomedical informatics; a fourth specialization in social science analytics is under development.

Finally, in the experiential component, students integrate components of the general education curriculum with the core and specialization by working on problems supplied through partnerships with business and industry as part of a capstone project. This component may also be further enriched by targeted internships with industry. During these semester- or year-long courses, students work with faculty and specialists from external business and industry partners to formulate and implement approaches to solving contemporary data analytics challenges.

What was your primary motivation(s) for developing an undergraduate data science/analytics program? What’s been the reaction from students so far?

Our primary motivation in developing the program was to construct a coherent curriculum that would prepare students to work in this exciting and growing area. While previously students could only piece together courses from various departments that would address particular aspects of analytics (with no guarantee of consistency of curricular structure), development of the major has provided an integrated approach to data analytics education that provides a natural and cohesive curricular path from start to finish. In fact, many of the courses in the major have been developed from scratch specifically for this program.

The student response has been overwhelming: More than 80 students had selected data analytics as their major plan by the end of the first year of the program. The cohort is also particularly strong when measured by high-school and college performance and scores on standardized tests. Our most popular specialization is the business analytics specialization.

Describe the reception you received from the partnering departments, other departments, and those at the university who had to approve the program.

Our major is truly interdisciplinary and was developed from day one through partnerships between the college of arts and sciences, college of engineering, college of medicine, and Fisher College of Business. Many units across campus were excited to contribute to the major, and collaboration between faculty from these units was a key to its successful development. The university was extremely supportive of the development of the major and provided early and frequent feedback that helped smooth the approval process.

What advice do you have for students considering a data science/analytics degree versus a computer science degree, a statistics degree, another degree, or some combination of the above (e.g., a double major of statistics and computer science)?

As the details of programs vary from university to university, it is difficult to give general recommendations about which program might be right for you. We would recommend students choose programs that match most closely with their interests and career goals. At The Ohio State University, the data analytics major is ideal for students who are passionate about all aspects of working with and learning from data. One major difference between our data analytics major and other related majors at the university is the way in which the core of the major is integrated with a specialization and capstone experience to provide a comprehensive curriculum that prepares students to address real-world data challenges.

