The University of California at Berkeley is offering what it calls the country's first fully online master's program for data scientists.

Starting Wednesday, Berkeley will be taking registrations for its Master of Information and Data Science program, which will begin in January 2014. The program is being offered by the School of Information, which emphasizes management of information over information technology. That means students will be expected to have a working competence with software such as R for statistical analysis and will be introduced to technologies such as Hadoop, but won't get quite as deep into the design of systems or advanced algorithms as they would in a computer science graduate program. Instead, they will develop the skills to put big data to work for all the businesses angling to make more productive use of more data, from a wider variety of sources.

This is a completely new graduate program, offered only online, although it has some overlap with the school's on-campus Master of Information Management and Systems program. The I School, as it's known, also offers a Ph.D. program. Dean AnnaLee Saxenian said it made sense to offer the data science program online because it is the Internet era that has given rise to a huge demand for people who can make sense of all the data produced by the online world. Although she tries to stay away from the term "big data," saying "it's become so amorphous it means everything or nothing," there is a reason "data scientist" has emerged as a new job title at so many companies, she said.

"We're inundated with all this data thrown off by the Web, by sensor networks, and by the mobile devices we carry," Saxenian said. New tools have emerged to analyze masses of heterogeneous data, but the process is different than the methods that worked with structured data and it tends to require a more cross-disciplinary approach, she said. "You need not simply be a programmer or computer scientist, you also need the tools of a statistician. You need to understand research design and how to communicate what comes out of the data to decision makers."

I School faculty will teach their curriculum alongside experienced data science professionals. Classes will range from an introduction to machine learning, through data storage and retrieval, to the privacy, security and ethics of data. Machine learning is the intersection of computer science and statistics that focuses on finding patterns in data.

"Data science and big data are a very important place for job growth," said Chip Paucek, CEO and co-founder of 2U, which is providing the technical platform and support services for the program. "I know, as a CEO, I will end up hiring several people from the program" because the skills are so scarce and valuable, he said.

[Image: A mockup of the Berkeley online program in 2U.]

Saxenian said she expects the program to attract computer scientists but also students from other majors such as philosophy and the social sciences. In industry, data scientists typically work in partnership with hard-core programmers, she said. "They need to understand the pitfalls of diving into data and be able to come up with good research questions. They also need to understand our cognitive biases, such as confirmation bias. They need to understand both the opportunities presented by data and the ways you can go wrong with it." These are the people who decide what data to collect, how to collect it, how to analyze it, and particularly how to visualize it, she said.

"We are awash with data, but the expertise to analyze and exploit that data is in short supply. The mission of the MIDS degree is to provide that expertise," said Hal Varian, a professor emeritus at the I School and chief economist with Google, in a statement.

Although Berkeley is also a participant in Coursera, the purveyor of massive open online courses offered to thousands of students, the data science program will not be a MOOC. Quite the opposite. Classes will be small, with no more than 15 to 20 students.

Students will participate in live, face-to-face classes with fellow students and professors via the Web. Additional coursework will include lectures, interactive case studies and collaborative assignments. Classes will run on 2U's online platform and include self-paced content developed by I School faculty, along with video chat for online discussions.

"This will be our first major effort in online education, so we wanted a partner to help us do a good job," Saxenian said. Engaging with 2U as a cloud service made sense even though Berkeley has its own technology in place for offering online courses, she said.

"Online education is really less about the technology than you might think. It's really a service business as much as anything," Saxenian said. "So much of the education process is about getting students engaged, making them feel comfortable on the platform, and building the social world they feel comfortable in. What we know about education is that students need to be engaged and motivated. A lot of hand-holding goes on with the students at every stage, and the high-touch piece of this is very important."

2U is also "rightly proud of its completion rates, which are very high -- 80% or more -- which is not true of the MOOCs," Saxenian said.

2U's Paucek said he is also enthusiastic about Saxenian and the rest of the I School's leadership. "They're ready to do something that's not just a toe in the water," he said. Students will also be required to participate in a one-week immersion program, on campus, so that they will meet their professors and the other students in person.

Some of the technical aspects of the program are still under development. For example, Saxenian is investigating providing every student with an Amazon Web Services account for data-crunching projects as part of the package. In partnership with 2U, faculty members are hard at work developing videos and other course content that will be delivered asynchronously.

Because 2U's contracts are structured on a revenue-sharing basis, the company is itself betting that the program is a good one to get involved in, Paucek said. "Unlike some of the other companies in the space, we're not trying to power as many programs as possible. We'll put in $10 million before we really start to see a return, so we have to be very careful about what we work on. It's not a small decision, and it's definitely a two-way decision, because we're not really a vendor. It's really a partnership."