Saturday, 27 October 2012

A remarkable
schism is developing between audiologists in the UK and the USA on the topic of
auditory processing disorder (APD) in children. In 2010, the American Academy
of Audiology published clinical practice guidelines for auditory processing disorder. In 2011, the British Society
of Audiology published a position statement on the same topic, which came to
rather different conclusions. This month a White Paper by the British Society
of Audiology appeared reaffirming their position alongside invited
commentaries.

So what is all
the fuss about? The argument centres on how to diagnose APD in children. Most
of the tests used in the USA to identify APD involve responding to speech. One
of the most widely-used assessments is the SCAN-C battery which has four
subtests:

Filtered words:
Repeat words that have been low-pass filtered, so they sound muffled

Auditory figure-ground:
Repeat words presented against a background of multi-talker babble

Competing words:
Repeat words that are presented simultaneously, one to each ear (dichotically)

Competing
sentences: Repeat sentences presented to one ear while ignoring those presented
simultaneously to the other ear

In 2006, David
Moore, Director of the Medical Research Council’s Institute of Hearing Research
in Nottingham, created a stir when he published a paper arguing that APD
diagnosis should be based on performance on non-linguistic tests of auditory
perception. Moore’s concern was that tests such as SCAN-C, which use speech
stimuli, can’t distinguish an auditory problem from a language problem. I made
similar arguments in a blog post written last year. Consider the task of doing a speech perception test in a
foreign language: if you don’t know the language very well, then you may fail
the test because you are poor at distinguishing unfamiliar speech sounds or
recognising specific words. This wouldn’t mean you had an auditory disorder.

A recent paper by
Loo et al (2012) provided concrete evidence for this concern. They compared
multilingual and monolingual children on performance on an APD battery. All
children were schooled in English, but a high proportion spoke another language
at home. The child’s language background
did not affect performance on non-linguistic APD tests, but had a significant
effect on most of the speech-based tests.
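The logic of the confound can be illustrated with a toy simulation (my own sketch, not Loo et al's analysis: the group means, weights and test names are invented for illustration). Two groups of children have identical auditory skill, but one group is less familiar with the test language; a speech-based test then shows a spurious group difference that a non-linguistic test does not.

```python
import random

random.seed(1)

# Invented weighting: a speech-based test score loads on BOTH auditory
# skill and familiarity with the test language.
def speech_test(auditory, language_familiarity):
    return 0.5 * auditory + 0.5 * language_familiarity

# A non-linguistic auditory test loads on auditory skill alone.
def nonlinguistic_test(auditory):
    return auditory

# Two groups with identical auditory ability (mean 100), differing only
# in familiarity with the test language (100 vs 85; invented values).
monolingual = [(random.gauss(100, 10), random.gauss(100, 5)) for _ in range(500)]
multilingual = [(random.gauss(100, 10), random.gauss(85, 5)) for _ in range(500)]

def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)

speech_gap = (mean(speech_test(a, l) for a, l in monolingual)
              - mean(speech_test(a, l) for a, l in multilingual))
nonling_gap = (mean(nonlinguistic_test(a) for a, _ in monolingual)
               - mean(nonlinguistic_test(a) for a, _ in multilingual))

print(f"group gap on speech-based test:   {speech_gap:.1f}")
print(f"group gap on non-linguistic test: {nonling_gap:.1f}")
```

The particular numbers don't matter; the point is that any test whose score partly reflects language familiarity will penalise children for their language background, which is just the pattern Loo et al observed on the speech-based subtests.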

Results from a large population study by Moore et al, reported in 2010, presented a further challenge for the concept of
APD. Specifically, Moore et al concluded
that, when effect of task demands had been subtracted out, non-linguistic measures of auditory processing
“bore little relationship to measures of speech perception or to cognitive,
communication, and listening skills that are considered the hallmarks of APD in
children. This finding provides little support for the hypothesis that APD
involves impaired processing of basic sounds by the brain, as currently
embodied in definitions of APD.”

Overall, Moore et
al found that if we use auditory measures that are carefully controlled to
minimise effects of task demands and language ability, we find that they don’t
identify children about whom there is clinical concern. Nevertheless, children exist for whom there
is a clinical concern, insofar as the child reports difficulty in perceiving
speech in noise. So how on earth are we to proceed?

In the White
Paper, the BSA special interest group suggest that the focus should be on
developing standardized methods for identifying clinical characteristics of
APD, particularly through the use of parental questionnaires.

The experts who
responded to Moore and colleagues took a very different line. The specific points they raised varied, but
they were not happy with the idea of reliance on parental report as the basis
for APD diagnosis. In general, they
argued for more refined measures of auditory function. Jerger and Martin (USA)
expressed substantial agreement with Moore et al about the nature of the
problem confronting the APD concept. “There can be no doubt that attention,
memory, and language disorder are the elephants in the room. One can view them
either as confounds in traditional behavioral tests of an assumed sensory
disorder or, indeed, as key factors underlying the very nature of a ‘more
general neurodevelopmental delay’”. They rejected, however, the idea of
questionnaires for diagnosis, and suggested that methods such as
electroencephalography and brain imaging could be used to give more reliable
and valid measures of APD.

Dillon and
Cameron (Australia) queried the usefulness of a general term such as APD, when
the reality was that there may be many different types of auditory difficulty,
each requiring its own specific test. They described their own work on ‘spatial
listening disorder’, arguing that this did relate to clinical presentation.

The most critical
response to Moore et al’s arguments came from Bellis and colleagues (USA). They implied that
a good clinician can get around the confound between language and auditory
assessments: “Additional controls in cases in which the possible presence of a
linguistic or memory confound exists may include assessing performance in the
non-manipulated condition (e.g. monaural versus dichotic, nonfiltered versus
filtered, etc.) to ensure that performance deficits seen on CAPD tests are due
to the acoustic manipulations rather than to lack of familiarity with the
language and/or significantly reduced memory skills.” Furthermore, according to
Bellis et al, the fact that speech tasks don’t correlate with non-speech tasks
is all the more reason for using speech tasks in an assessment, because “in
some cases central auditory processing deficits may only be revealed using
speech tasks”.

Moore et al were
not swayed by these arguments. They argued first, that neurobiological
measures, such as electroencephalography, are no easier to interpret than
behavioural measures. I’d agree that it would be a mistake to assume such
measures are immune from top-down influences (cf. Bishop et al, 2012) and
reliability of measurement can be a serious problem (Bishop & Hardiman, 2010). Moore et al were also critical of the idea that language factors can be
controlled for by within-task manipulations when speech tasks are used. This is
because the use of top-down information (e.g. using knowledge of vocabulary to
guess what a word is) becomes more important as a task gets harder, so a child
whose poor language has little impact on performance in an easy condition (e.g.
listening in quiet) may be much more affected when conditions get hard (e.g.
listening in noise). In addition, I would argue that the account by Bellis et
al implies that they know just how much allowance to make for a child’s
language level when giving a clinical interpretation of test findings. That is
a dangerous assumption in the absence of hard evidence from empirical studies.

So are we stuck
with the idea of diagnosing APD from parental questionnaires? Moore et al argue this is
preferable to other methods because it would at least reflect the child’s
symptoms, in a way that auditory tests don’t. I share the reservations of the
commentators about this, but for different reasons. To my mind this approach
would be justified only if we also changed the label that was used to refer to
these children. The research to date
suggests that children who report listening difficulties typically have
deficits in language, literacy, attention and/or social cognition (Dawes & Bishop, 2010; Ferguson et al, 2011). There’s not much evidence that these
problems are usually caused by low-level auditory disorder. It is therefore
misleading to diagnose children with APD on the basis of parental report alone,
as this label implies a primary auditory deficit.

In my view, we
should reserve APD as a term for
low-level auditory perceptual problems in children with normal hearing,
which are not secondary consequences of language or attentional deficits. The
problem is that we can’t make this diagnosis without more information about the
ways in which top-down influences impact on auditory measures, be they
behavioural or neurobiological. The population study by Moore et al (2010) made
a start on assessing how far non-linguistic auditory deficits related (or
failed to relate) to cognitive deficits and clinical symptoms in the general
population. The study by Loo et al (2012) adopts a novel approach to
understanding how language limitations can affect auditory test results, when
those limitations are due to the child’s language background, rather than any inherent
language disorder. The onus is now on those who advocate diagnosing APD on the
basis of existing tests to demonstrate that they are not only reliable but also
valid according to these kinds of criteria. Until they do so, the diagnosis of
APD will remain questionable.

P.S. 12th November 2012: Brief video by me on "Auditory processing disorder and language impairment" available here: http://tinyurl.com/c2adbsy (with links to supporting slideshow and references)

Monday, 1 October 2012

The new phonics screening test for children
has been highly controversial. I’ve been
surprised at the amount of hostility engendered by the idea of testing
children’s knowledge of how letters and sounds go together. There’s plenty of
evidence that this is a foundational skill for reading, and poor ability to do
phonics is a good predictor of later reading problems. So while I can see there
are aspects of the implementation of the phonics screen that could be
improved, I don’t buy arguments that it
will ‘confuse’ children, or prevent them reading for meaning.

I discovered today that some early data on
the phonics screen had recently been published by the Department for Education,
and my inner nerd was immediately stimulated to visit the website and
download the tables. What I found was
both surprising and disturbing.

Most of the results are presented in terms
of proportions of children ‘passing’ the screen, i.e. scoring 32 or more. There
are tables showing how this proportion varies with gender, ethnic background,
language background, and provision of free school meals. But I was more
interested in raw scores: after all, a cutoff of 32 is pretty arbitrary. I
wanted to see the range and distribution of scores. I found just one table showing the relevant
data, subdivided by gender, and I have plotted the results here.

Those of you who are also statistics nerds
will immediately see something very odd, but other readers may need a bit more
explanation. When you have a test like
the phonics test, where each item is scored right or wrong, and the number of
correct items is totalled up, you’d normally expect to get a continuous
distribution of scores. That is to say, the numbers of children obtaining a
given score should increase gradually up to some point corresponding to the
most typical score (the mode), and then gradually decline again. If the test is
pretty easy, you may get a ceiling effect, i.e. the mode may be at or close to
the maximum score, so you will see a peak at the right hand side of the plot,
with a long straggly tail of lower scores. There may also be a ‘bump’ at the left hand edge of the distribution,
corresponding to those children who can’t read at all – a so-called ‘floor’
effect. That's evident in the scores for boys. But there's also something else. There’s a sudden upswing in the distribution, just at
the ‘pass’ mark. Okay, you might think, that’s because the clever people at the
DfE have devised the phonics test that way, so that 31 of the items are really
easy, and most children can read them, but then they suddenly get much
harder. Well, that seems unlikely, and
it would be a rather odd way to develop a test, but it’s not impossible. The
really unbelievable bit is the distribution of scores just above and below the
cutoff. What you can see is that for both boys and girls, fewer children score
31 than 30, in contrast to the general upward trend that was seen for lower
scores. Then there’s a sudden leap, so that about five times as many children
score 32 as score 31. But then there’s another dip: fewer children score 33 than
32. Overall, there’s a kind of ‘scalloped’ pattern to the distribution of
scores above 32, which is exactly the kind of distribution you’d expect if a
score of 32 was giving a kind of ‘floor effect’. But, of course, 32 is not the test floor.
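The ‘nudging’ explanation can be illustrated with a toy simulation (my own sketch with invented parameters, not the DfE data). Start from a smooth, ceiling-skewed distribution of raw scores out of 40, then move some of the scores of 30 and 31 up to 32, as would happen if scorers gave borderline children the benefit of the doubt:

```python
import random

random.seed(42)

PASS_MARK = 32
N_ITEMS = 40

def true_score():
    # Each child has an item-level success probability p drawn from a
    # ceiling-skewed Beta distribution (invented parameters); the raw
    # score is the number of the 40 items answered correctly.
    p = random.betavariate(6, 2)
    return sum(random.random() < p for _ in range(N_ITEMS))

scores = [true_score() for _ in range(20000)]

def nudge(s):
    # Scorers give borderline children the benefit of the doubt:
    # most 31s and half the 30s are bumped up to the pass mark.
    if s == 31 and random.random() < 0.8:
        return PASS_MARK
    if s == 30 and random.random() < 0.5:
        return PASS_MARK
    return s

nudged = [nudge(s) for s in scores]

def counts(xs):
    c = [0] * (N_ITEMS + 1)
    for x in xs:
        c[x] += 1
    return c

before, after = counts(scores), counts(nudged)

print("score   30    31    32    33")
print("before", before[30], before[31], before[32], before[33])
print("after ", after[30], after[31], after[32], after[33])
```

The result is a dip just below the cutoff and a spike at 32: qualitatively the same scalloped pattern that appears in the published figures.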

This is so striking, and so abnormal, that
I fear it provides clear-cut evidence that the data have been manipulated, so
that children whose scores would put them just one or two points below the
magic cutoff of 32 have been given the benefit of the doubt, and had their
scores nudged up above cutoff.

This is most unlikely to indicate a problem
inherent in the test itself. It looks like human bias that arises when people
know there is a cutoff and, for whatever reason, are reluctant to have children
score below that cutoff. As one who is basically in favour of phonics testing, I’m sorry to put another cat among the
educational pigeons, but on the basis of this evidence, I do query whether
these data can be trusted.