As Big-Data Companies Come to Teaching, a Pioneer Issues a Warning (CHE)

She has founded the Open Learning Initiative at Carnegie Mellon University, won millions of dollars in grants, and been a fixture on the lecture circuit about the power of so-called adaptive learning, where data-powered algorithms serve up content keyed to what a student is ready to learn next. Publishers, venture-capital investors, and foundations have followed her lead. They’ve poured hundreds of millions into new companies and new products vying to score big contracts with colleges, sometimes promising to be the “robot tutor” for struggling students.

It seems like a classic business success story.

These days, though, Ms. Thille has begun to have darker thoughts about an industry she helped spark.

She still believes that adaptive learning will become an increasingly important tool in teaching. But she fears that rapid commercialization is exactly the wrong way to foster innovation at this early stage. What’s more, she thinks professors and higher-education leaders are making a dangerous mistake by letting companies take the lead in shaping the learning-analytics market.

Ms. Thille, who moved to Stanford University in the fall of 2013, has only recently begun to go public with that critique, voicing it to a few small audiences. But as she shared during extensive interviews with The Chronicle over the past few weeks, it’s a message she hopes college leaders and professors will heed, if only because she’s a messenger who understands, despite all the hype, both how “crude” and simplistic many of the products are today, and how educationally valuable they could one day become.

Her concerns boil down to these:

Colleges should have more control over this field. Like it or not, she argues, using data to predict student needs and deliver the right material at the right time will become essential. “And a core tenet of any business is that you don’t outsource your core business process,” she notes.

Companies aren’t as well equipped to develop and test new teaching algorithms as colleges are. She argues that colleges are the ideal living laboratories for any teaching system because they are home to both the research on learning and the actual teaching. As she puts it, “You have a very quick feedback loop, where the research informs the practice and the practice informs the research.”

When companies lead the development of learning software, the decisions those systems make are hidden from professors and colleges. Ms. Thille says companies that won’t share their processes are essentially saying, “Just trust the black box.” To most academics, she says, “That’s alchemy, that’s not science.”

The proprietary “black boxes” are the algorithms that might automatically serve up, say, an extra lesson on quadratic equations when a student’s responses to a quiz indicate that she didn’t quite grasp the concept. Every algorithm, and every decision about what data it will weigh, is also ultimately a pedagogical judgment call. For the companies selling adaptive software, “that’s where the gold is,” says Ms. Thille.

She contends that the demands of the market — with venture capitalists expecting returns 30 times what they’ve invested and companies facing pressure to deliver products priced so they don’t scare away customers — will inhibit innovation rather than foster it, even if the companies have the best intentions.

Ms. Thille (it’s pronounced “Till”), who is in her late 50s and spent 18 years in the private sector before joining academe, is a person who chooses her words carefully. She recognizes that her critique of the burgeoning learning-analytics marketplace is also fundamentally a critique of the commercialization model of the investor-fueled ed-tech sector, and in some sense, of her own Stanford community.

But she is also convinced that colleges need to find ways to raise the money for research and development of learning software so that companies don’t end up owning the classroom delivery system of the future. Colleges could come together to build such systems, or perhaps the federal government could step in, as it did with the Darpa research that led to the Internet. Society has stepped up before for matters of importance, notes Ms. Thille. “I would claim that this is one of them.”

Big Claims

Ms. Thille’s argument could easily be dismissed as naïve, or even self-interested — she does, after all, head up a research lab that lives by grants. But it meshes with a broader national conversation now surfacing among academics and other experts over the growing role that data and algorithms play in higher education.

Data-driven technologies already touch many corners of the student experience. “Student success” tools send text messages reminding students to see an adviser. Course-suggestion engines at some colleges can recommend that a student switch majors, from pre-med to history, if he did poorly in Biology 101. But educators are especially focused on the ways data engines are used in teaching.

Companies such as Knewton, whose chief executive boasted to an NPR reporter that the software was like “a robot tutor in the sky that can semi-read your mind,” epitomize the problem, says George Siemens, an internationally known theorist on digital technology and the professor who co-taught the very first MOOC, in 2008.

“They make very bold claims, but they aren’t involved in the research community at all. That means we can’t validate their algorithms. We can’t validate the results that they say they’re getting,” he says. “That’s a system that doesn’t serve the future of education well at all.”

He compares the state of learning platforms to the earliest days of networked computing, when it wasn’t clear whether the open Internet or a paid service like America Online would become dominant. If the learning-analytics model goes the way AOL hoped to go, says Mr. Siemens, “we don’t have the ability to create a rich ecosystem on top of that platform.”

The chief executive of Knewton, Jose Ferreira, declined to comment for this article.

A small community of academic researchers and others is beginning to press on this issue. Mr. Siemens hopes others will too. “Students should care an awful lot because they’re the ones being sorted algorithmically,” he adds. Among those concerned is Ms. Thille’s successor at Carnegie Mellon, Norman Bier. Without greater transparency for the way tools direct students and dashboards signal their progress to professors, “we face a real danger of hurting our students,” he says. The systems need to be more open, he says, so professors can understand and trust them without having to get an advanced degree in computer science.

But the trend is going in the other direction. More and more colleges are turning to the commercial market for their adaptive-learning products, often in response to vigorous marketing by companies. That includes Acrobatiq, a company that was spun off by Carnegie Mellon after Ms. Thille left. The university still owns a stake in it.

Acrobatiq advertises that its products are “powered by Carnegie Mellon” and “based on cognitive science and educational theory” of the Open Learning Initiative, even though none of the company’s products use OLI-developed software.

Eric Frank, Acrobatiq’s CEO, counters that companies can innovate, and notes that his company focuses on the important role of making sure its products work for a broad range of professors. “We spend less time refining our predictive models and algorithms and more time trying to help faculty to use them,” he says.

As more and more colleges work with more and more companies, he acknowledges, researchers will lose out on opportunities to collect large sets of data on how students learn and use them to advance the field because the data will be “bifurcated into these little fiefdoms.” Acrobatiq’s contracts, he notes, do ensure that colleges own the data. But he notes that the learning-analytics industry still lacks the kinds of standards and protocols that would make it easy to extract, organize, and share such data among institutions for research purposes.

Even some foundations and education associations can operate in ways that undermine the momentum for open learning analytics. For example, in inviting universities to take part in a new $4.6-million Bill & Melinda Gates Foundation grant for the members of a new Personalized Learning Consortium, the Association of Public and Land-Grant Universities specified that applicants could use only 19 specific products approved by the foundation. All of them are owned by companies. One of them is Acrobatiq; the Gates foundation is also an investor in that company.

The foundation specified those “approved suppliers” for the quality of their products and because they “appear likely to be robust and sustainable for the future” as businesses, according to the request for proposals for the grant.

Travis Reindl, a spokesman for the Gates foundation, said last week that the list was not “the be-all and end-all of the possible providers” for the grantees and that other products could be added even after the winning institutions are selected at the end of May. Asked specifically about whether open courses from OLI or those Ms. Thille is now creating at Stanford would be allowed, he said, “APLU will provide the updated provider list to the selected institutions. The list is still being developed.”

As for Acrobatiq, he said the foundation was “not in any circumstances” trying to favor that company. “There is no thumb on the scale” for Acrobatiq, he said.