I used to think that scientific data and the conclusions reported by scientists were rock solid. They had to be, because they were published in scientific journals and had been through rigorous peer review. Now I know better. There are major flaws in the peer review process, and one reads many articles that should not have been published or should have been published with big caveats on them. All papers that report new findings should have DO NOT TREAT AS GOSPEL; WAIT FOR INDEPENDENT REPLICATION in big red letters on page 1.

The prestigious journal PNAS recently published a paper claiming that wearing an FM listening device for a school year resulted in improved reading ability in children who had dyslexia. The claim has been picked up by the media and has been reported on many a parent forum. That would be fine if the data supported the excited conclusions. They don’t.

Hornickel et al. selected 38 children who they claimed had dyslexia. They divided them into two groups. The FM group wore an FM device for a year while in most classes (~4 hours/day). The control group did not.

Problem 1 arises in their classification method and therefore the claim of dyslexia. The study included children in the dyslexia groups if they had an IQ score >80 (fine) AND a score below 100 on either of two tests of word-reading OR a score 15 standard score points below their IQ. So what’s the problem?

First, a standard score of 100 is average. It is smack bang in the middle of normal; better than 50% of your peers. “Below” 100 can mean 99, which is no different from 100; AVERAGE. Second, using a reading score 15 points below your IQ score as a criterion is a nonsense. Suppose you have an IQ of 115 and a reading score of 100. You’d be classified as being in the dyslexic group, but you’re an AVERAGE reader.
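To make the point concrete, here is a minimal sketch of the inclusion rule as I have described it above. The function name and the exact handling of the cut-offs (for example, treating "15 points below IQ" as "at least 15 points below") are my assumptions for illustration, not the authors' actual procedure.

```python
def meets_inclusion_rule(iq, reading_scores):
    """Hypothetical restatement of the inclusion rule described in this post.

    iq: IQ standard score.
    reading_scores: standard scores on the two word-reading tests.
    Assumption: "15 points below IQ" means a gap of at least 15 points.
    """
    if iq <= 80:
        return False
    # Criterion A: below 100 on either word-reading test.
    below_100 = any(score < 100 for score in reading_scores)
    # Criterion B: a reading score at least 15 points below IQ.
    discrepant = any(iq - score >= 15 for score in reading_scores)
    return below_100 or discrepant


# A child scoring 99 on one test (i.e. average) qualifies.
print(meets_inclusion_rule(iq=100, reading_scores=[99, 105]))
# A child with IQ 115 and average reading scores of 100 also qualifies.
print(meets_inclusion_rule(iq=115, reading_scores=[100, 100]))
```

Under these assumptions both calls return True, even though both hypothetical children read at an average level.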

So, bottom line: there’s a fair chance that at least some of the children in the study didn’t have dyslexia. Look at Table S1 that I’ve copied below. Psychologists generally consider a score above 85 to be within normal limits. If you look at the Woodcock-Johnson basic reading cluster score, both dyslexic groups are within the average range (standard scores of 94.84 and 97 respectively). That’s AVERAGE. There’s considerable variability within groups (look at the standard deviations) and I have little doubt some of the children had dyslexia. However, how can you sustain the use of the classification dyslexia when your group means are AVERAGE?

Problem 2. The authors claimed that the FM system worked because they found a statistically significant interaction effect for the phonological awareness measures and the Woodcock-Johnson (WJ) basic reading cluster score. That is, the FM wearing group improved more than the control group of children with dyslexia who didn’t wear the FM system. Two issues with this. First, no interaction was found for the two reading fluency measures. Why not? They were used as part of the classification process so presumably the authors believed them relevant to ‘dyslexia’. They measure the word-level reading skills that characterise dyslexia. Surely we’d expect improvements on these measures if FM systems fix dyslexia. The authors failed to mention anything about this issue.

Second, the statistical effects were very small. The WJ scores in the control dyslexic (non-FM) group improved by 1.22 standard score points. The improvement in the FM group was 4.56. That’s a mere 3.34 difference. That’s not meaningful.
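For a rough sense of scale, the arithmetic can be sketched as below. The gains are the figures quoted above; the standard deviation of 15 is the usual scaling for standard scores (mean 100, SD 15) and is my assumption here, not a number taken from the paper. This is a back-of-the-envelope comparison, not the effect size the authors reported.

```python
# Gains in Woodcock-Johnson standard score points, as quoted in the post.
control_gain = 1.22
fm_gain = 4.56

# Assumption: standard scores are scaled to mean 100, SD 15.
NORMATIVE_SD = 15

# Difference between the two groups' gains.
difference = round(fm_gain - control_gain, 2)

# The same difference expressed as a fraction of one normative SD.
difference_in_sd = difference / NORMATIVE_SD

print(f"Difference in gains: {difference} points")
print(f"As a fraction of one SD: {difference_in_sd:.2f}")
```

On that scaling the extra gain works out to well under a quarter of a standard deviation.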

Problem 3. All children went to the same school. All except 7. Of those 7, five were in the control (non-FM) group. Given the tiny effect and small sample size, it’s possible that having three children attend a school in which the educational opportunities were not as good affected the results.

Claims were made about phonological awareness and scores on tests of speech processing. I’m not bothered about those scores. If I have a child who has dyslexia and I subject him/her to 12 months of wearing ear pieces in each ear, I couldn’t care less about responses on a test of phonological awareness or an experimental measure of whether one can tell the difference between /ba/ and /da/ syllables. I care about improvements in reading ability.

Did the FM treatment deliver that? Yes, statistically, but not meaningfully. +3.34 standard score points isn’t enough for me to start suggesting FM systems as treatment. Sadly, media attention like this, this, and this will probably see that occurring.