Data, Data Everywhere

Massive open online course providers are collecting troves of data about their students, but what good is it if researchers can't use the information?

The MOOC Research Initiative formally released its results on Monday, six months after researchers met in Arlington, Texas, to brief one another on initial findings. The body of research -- 22 projects examining everything from how social networks form in MOOCs to how the courses can be used for remedial education -- can perhaps best be described as the first chapter of MOOC research, confirming some widely held beliefs about the medium while casting doubt on others.

Common to many of the projects, however, were the difficulties of working with MOOC data.

“It’s a huge issue,” said John Whitmer, program manager of academic technology and analytics in the Office of the Chancellor at the California State University System. “We spent about 80 to 90 percent of our time on fundamental data transformation.”

MOOC providers, Whitmer said, have focused on supporting internal research or creating analytics tools to benefit MOOC instructors. “They have not paid attention to this kind of nebulous interested research community -- and I’m not saying that’s a bad thing,” he said. “They have to focus their resources how they want.”

Whitmer gave the MOOC provider Coursera credit for making data available as exportable tables, but said his team still needed to spend more grant money than initially planned to convert the data into a useful format. “It’s like saying you can build a house from iron ore and trees,” he said.

Coursera isn’t the only MOOC provider to leave researchers longing for better data collection procedures. When Harvard University and the Massachusetts Institute of Technology last week released student data collected by edX, some higher education consultants remarked that the data provided “no insight into learner patterns of behavior over time.”

“It’s not as simple as them providing better data,” Whitmer said. “They should have some skin in it, because this is their job. They should be helping us with this.”

The initiative was funded by an $840,000 grant from the Bill & Melinda Gates Foundation, and some of the projects will be featured in a special issue of the International Review of Research in Open and Distance Learning. Research from the University of Pennsylvania won’t be featured, said project leader Laura W. Perna, as data issues prevented the team from submitting a manuscript. Although they could see when students accessed a lecture or took a quiz, she said, the researchers still had to line up that data with what was actually occurring in the course.

“There’s just so much data,” Perna, a professor in Penn’s Graduate School of Education, said. “Just because you have a lot of data doesn’t mean you have good data.”

Perna’s project studied participation among the roughly 1 million students who enrolled in 16 Coursera courses offered at Penn during the 2012-13 academic year, and found that only about one in every 10 students made it to the final week. Those findings echo a familiar refrain about MOOCs -- particularly heard in the last year, as hype about "saving" higher education has given way to more criticism.

But it is also why Perna decided to get involved with the initiative. The research, she said, needed to catch up with the rhetoric.

“To some extent, we need to have this foundational research,” Perna said. “This is something that has moved ahead pretty quickly without necessarily what we know from a research perspective. People have made decisions based on what they think is happening.”

While projects such as Perna's confirm some of the assumptions about MOOCs, others challenge them.

Whitmer’s project also examined student participation -- this time in a remedial English writing MOOC -- finding that 23 percent of students engaged meaningfully with the content even though, by traditional metrics, only 8 percent of the 48,174 enrolled actually passed the course.

“I would agree that there is value in validating,” said Whitmer. “We can confirm something we knew,” he added, “but what if those are the wrong questions?”

In that report, the researchers also argue that students’ own assumptions -- for example about their motivations for taking the MOOC -- don’t always hold up. None of the answers to an entry survey about demographics or persistence ended up being statistically significant when it came to predicting engagement, Whitmer said.

“Everybody says that those two are related, but I don’t know if there are other studies that confirm or validate that,” Whitmer said. “When you have a sample size like we do, it’s really easy to get significance.... But we didn’t even get that.”

To build on the results from the MOOC Research Initiative, Perna suggested more research should be conducted into how MOOCs can be tailored to suit different student demographics.

"I came at this research because a year ago there was a lot of rhetoric about how MOOCs were going to solve the college access problem, solve the college finance problem," Perna said. "Clearly what we have right now isn’t solving these two issues, but we do need to have innovation if we are to make progress on these important issues facing higher education in the U.S."