One of the great intellectual pleasures is to hear an idea that not only seems right, but that strikes you as so terribly obvious (now that you've heard it) you're in disbelief that no one has ever made the point before.

The paper concerned the adequacy of control groups in intervention studies--interventions like (but not limited to) "brain games" meant to improve cognition, and the playing of video games, thought to improve certain aspects of perception and attention.

Control group

To appreciate the point made in this paper, consider what a control group is supposed to be and do. It is supposed to be a group of subjects as similar to the experimental group as possible, except for the critical variable under study.

The performance of the control group is to be compared to the performance of the experimental group, which should allow an assessment of the impact of the critical variable on the outcome measure.

Now consider video gaming or brain training. Subjects in an experiment might very well guess the suspected relationship between the critical variable and the outcome. They have an expectation as to what is likely to happen. If they do, then there might be a placebo effect--people perform better on the outcome test simply because they expect that the training will help, just as some people feel less pain when given a placebo that they believe is an analgesic.

Active control group

The standard way to deal with that problem is to use an "active control." That means that the control group doesn't do nothing--they do something, but it's something that the experimenter does not believe will affect the outcome variable. So in some experiments testing the impact of action video games on attention and perception, the active control plays slow-paced video games like Tetris or Sims.

The purpose of the active control is that it is supposed to make expectations equivalent in the two groups. Boot et al.'s simple and valid point is that it probably doesn't do that. People don't believe playing Sims will improve attention.

The experimenters gathered some data on this point. They had subjects watch a brief video demonstrating what an action video game was like or what the active control game was like. Then they showed them videos of the measures of attention and perception that are often used in these experiments. And they asked subjects "if you played the video game a lot, do you think it would influence how well you would do on those other tasks?"

Out of control group

And sure enough, people think that action video games will help on measures of attention and perception. Importantly, they don't think that they would have an impact on a measure like story recall. And subjects who saw the game Tetris were less likely to think it would help the perception measures, but were more likely to say it would help with mental rotation.

In other words, subjects see the underlying similarities between games and the outcome measures, and they figure that higher similarity between them means a greater likelihood of transfer.

As the authors note, this problem is not limited to the video gaming literature; the need for an active control that deals with subject expectations also applies to the brain training literature.

More broadly, it applies to studies of classroom interventions. Many of these studies don't use active controls at all. The control is business-as-usual.

In that case, I suspect you have double the problem. You not only have the placebo effect affecting students, you also have one set of teachers asked to do something new, and another set teaching as they typically do. It seems at least plausible that the former will be extra reflective on their practice--they would almost have to be--and that alone might lead to improved student performance.

It's hard to say how big these placebo effects might be, but this is something to watch for when you read research in the future.

I'm so glad you brought this up, Dan. Most of the studies I've read about certain practices or programs are "TEST GROUP" v. "BUSINESS-AS-USUAL". So Scholastic will say that Read180 works wonders when compared against... people doing essentially nothing, or doing something that Scholastic knows already doesn't work.

I had a recent experience in educational software development where I was completely rejected for suggesting that the software be tested in A/B comparison with other similar software. "Why would we want to do that?" people asked. "It would just cost more money and time."

I often get the feeling—especially when companies test their own stuff—that everyone is looking for effects but not necessarily solutions. Most head-to-head studies of different programs show very little difference between them. What's more, the experimental group usually receives tremendous support from the publisher, as the teacher did in the case where I was peripherally involved, while the "control" group isn't really a control at all—it's just a bunch of teachers who are left alone, are teaching alone, and using whatever random approaches they use.

I think there's the general challenge, too, in field research around random assignment (kids aren't randomly assigned to teachers nor are teachers randomly assigned to schools) and double-blindness (it's pretty hard for a teacher in the experimental group not to know he or she is trying something that is being measured).

A final problem, one that came up a lot in relation to Reading Recovery, is whether a method helps students or whether a method simply attracts more effective teachers. Reading Recovery seems to do pretty well in some studies. But it's one of the hardest methods to learn, one of the most costly certifications to attain, and anybody who does it really has to be intensely curious about the teaching of reading to begin with.

I've been in hundreds of schools and mostly I've seen things not work. But when I look a little more closely, I realize most of the time that what doesn't work is a bunch of people not implementing something. By contrast, consistency of implementation is almost always successful regardless of the program being implemented. I see this play out in Core Knowledge schools. There's high fidelity to the curriculum, and the curriculum seems to be fairly easy to deliver because the content is so well structured and also because much of it, especially in the early grades, is so familiar to most college-educated Americans who are likely to be teaching it. As I often advise schools that are considering it: "It's a very teachable approach; you'll likely be successful in implementing it because it's well organized and easier to deliver than most other curricula."

To me, this idea of "teachability" doesn't diminish the value of a method; it actually seems to enhance it. The message I take from this is that we should be trying to kill two birds with one stone: giving people good things to use that are also fairly easy to implement across a range of teachers with varying ability.

But it does bring up the issue for me of how difficult it is to bring anything truly innovative into teaching. "Innovators" and "Early Adopters," as Everett Rogers' work shows, are the most likely to come on board with something new. They're likely to share the value of "newness" = "goodness". And they probably have a lot of experience with, and passion for, trying new things. This alone creates a bond of sorts and can embolden a group to work harder and take more risks that lead to greater rewards. So it's hard to tease out what I've always thought of as a "teacher effect" in these situations.

There may also just be a "group effect" if teachers in a group using a particular approach all know they're sharing a particular approach. Teaching can feel very isolating. Many teachers seem to really get a boost simply by working together on anything.

All this leaves me with two ideas: (1) Teaching together might be more important than what we teach; and (2) The more thinking that is required to use a method, the more very thoughtful teachers will be drawn to it. Both of these things make field research very difficult, it seems to me.

I wonder, then, if the most constructive approach is to find things that are incremental improvements on traditional practice and work hard to get large numbers of people working together within them. Then perhaps to gradually introduce slightly more innovation to the group each time it coalesces successfully around a particular set of practices.

This isn't a research agenda, but I wonder if it isn't a viable school improvement agenda.

"One of the great intellectual pleasures is to hear an idea that not only seems right, but that strikes you as so terribly obvious (now that you've heard it) you're in disbelief that no one has ever made the point before."

This is precisely how I felt when I learned of the early literacy method of introducing letter sounds with lowercase letters before letter names with upper case letters.

ryan

7/15/2013 09:32:01 am

An alternative strategy would be to include a 3rd group that receives some treatment that is known to have an effect on the DV. Then you could look at equivalency simultaneously.

A second alternative would be simply to use individuals as their own controls. The design would just need to make sure the treatments are given in a balanced order to account for sequencing and carry-over effects.
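The balanced-order idea above can be sketched in a few lines. This is a minimal illustration of full counterbalancing (every possible treatment order used equally often across subjects), not any particular study's procedure; the condition names and function name are hypothetical.

```python
from itertools import permutations

def counterbalanced_orders(treatments, n_subjects):
    """Cycle through all possible treatment orders so that each order
    is assigned to (roughly) the same number of subjects.
    Full counterbalancing is only feasible for a small number of
    treatments, since the number of orders grows factorially."""
    orders = list(permutations(treatments))
    return [list(orders[i % len(orders)]) for i in range(n_subjects)]

# Hypothetical two-condition design: each subject serves as their own
# control, and half the subjects get each condition first.
assignments = counterbalanced_orders(["action_game", "control_game"], 6)
for subject, order in enumerate(assignments, start=1):
    print(f"Subject {subject}: {order}")
```

With two conditions and six subjects, three subjects play the action game first and three play the control game first, which balances simple sequencing effects; asymmetric carry-over effects would still need a washout period or a between-subjects check.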