Educational software may not boost student learning, says a new study. It may not be what companies want to hear, but the math scores of U.S. students using educational software are no better than those of their counterparts who follow a traditional curriculum. That conclusion, however, comes loaded with caveats about what it means for students and educators.

The $14.5 million study was designed to find out whether students are benefiting from the growing use of educational software. Preliminary results released 2 years ago suggested that the answer, for first- and fourth-grade students in reading and for sixth- and ninth-grade students in math, was no. But those results were widely criticized by educators and software makers for lumping together the outcomes from many different products and for testing their impact on student achievement in the 1st year the teacher had used the material. The 2nd-year results, posted last week, address those concerns by reporting results for individual software packages, and by testing a new cohort of students whose teachers had a year of using the software under their belts.

What it found is that none of the four math products tested produced a statistically significant difference in achievement between control and treatment groups over the 2-year study, which began in the fall of 2004. (One of the six reading packages registered a meaningful jump in test scores for fourth graders.) Students whose teachers were in their second year of using Cognitive Tutor, a computer-based curriculum for Algebra I students developed by Carnegie Mellon University researchers, did show a meaningful improvement in test scores. But the software had no overall impact on student achievement when data from both student cohorts were combined.

Education research is rarely definitive, however, and this study, funded by the U.S. Department of Education and conducted by Mathematica Policy Research Inc. in Princeton, New Jersey, is no exception. For one thing, the students and teachers were not randomly assigned to the products, explains Mathematica's Mark Dynarski. That rules out comparing one software package against another. It also means that the results shouldn't be generalized to different student populations. "Everyone began at different starting points. So we can only say how this product works at a particular school," he says.

Another confounding factor is that researchers didn't have enough money to continue observing teachers in their classrooms or surveying them about how they used the product, as they had done with the first cohort. That makes it difficult, for example, to interpret large fluctuations from one year to the next in the amount of time spent using particular software.

Does an increase for Algebra I students, for example, mean that teachers found the product more helpful once they became comfortable with it? That seems a logical explanation to Steve Ritter of Carnegie Learning, the Pittsburgh, Pennsylvania, company behind Cognitive Tutor, who points to his company's emphasis on supporting teachers. "For us, usage went up, and second-year teachers achieved statistically significant improvements," Ritter says. "That's worth talking about."

In contrast, does a decline in usage mean that teachers found that it wasn't right for their students, who are disproportionately from large, poor, and urban schools? Or was it simply due to a dearth of computer facilities, inadequate tech support, or scheduling problems? And what effect does an extremely mobile teacher corps—only one in four taught the same grade at the same school for both years—have on student outcomes?

Dynarski cautions against overinterpretation of the data. He notes that Congress wanted to know if education software improves student learning, not whether individual products can help certain populations taking certain subjects. "There might be a lot of positive things going on in the classroom, such as greater fluency with computers," he says. "But it's not leading to the kind of results that people want, which are better test scores."