A framework for the use and interpretation of statistics in reading instruction

Abstract

There are few instructional tasks more important than teaching children to
read. The consequences of low achievement in reading are costly both to
individuals and society. Low achievement in literacy correlates with high rates
of school drop-out, poverty, and underemployment. The far-reaching effects of
literacy achievement have heightened the interest of educators and non-educators
alike in the teaching of reading. Successful efforts to improve
reading achievement emphasise identification and implementation of
evidence-based practices that promote high rates of achievement when used
in classrooms by teachers with diverse instructional styles with children who
have diverse instructional needs and interests.
Being able to recognise what characterises rigorous evidence-based reading
instruction is essential to choosing the right reading curriculum (i.e., which
method or approach to use). It will be necessary to ensure that general classroom
reading instruction is of universally high quality and that practitioners are
prepared to effectively implement validated reading interventions. When
educators are not familiar with research methodologies and findings, national
and provincial departments of education may find themselves implementing
fads or incomplete findings.
The choice of method of instruction is very often based on empirical research
studies. The selection of statistical procedures is an integral part of the
research process. Statistical significance testing is a prominent feature of data
analysis in language learning studies and, more specifically, in reading
instruction studies.
For many years, methodologists have debated what statistical significance
testing means and how it should be used in the interpretation of substantive
results. Researchers have long placed a premium on the use of statistical
significance testing. However, criticisms of the statistical significance testing
procedure are prevalent and occur across many scientific disciplines.
Critics of statistical significance tests have made several suggestions, with the
underlying theme being for researchers to examine and interpret their data
carefully and thoroughly, rather than relying solely upon p values in
determining which results are important enough to examine further and report
in journals. Specific suggestions include the use of effect sizes, confidence
intervals, and power.
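
As a minimal sketch of what reporting an effect size with a confidence interval can look like, the Python example below computes Cohen's d for two groups and an approximate 95% confidence interval around it. The scores are hypothetical illustrative values, not data from this study; the function names are assumptions introduced here, and the interval uses a common large-sample normal approximation for the standard error of d, which is only rough for groups this small.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def cohens_d(treatment, control):
    """Standardised mean difference using the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled_sd


def d_confidence_interval(d, n1, n2, level=0.95):
    """Approximate interval for d (large-sample normal approximation)."""
    se = sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return d - z * se, d + z * se


# Hypothetical reading scores, for illustration only.
treatment = [12, 14, 15, 16, 18]
control = [10, 11, 13, 14, 12]

d = cohens_d(treatment, control)
low, high = d_confidence_interval(d, len(treatment), len(control))
print(f"Cohen's d = {d:.2f}, approx. 95% CI [{low:.2f}, {high:.2f}]")
```

Reported this way, a result conveys the magnitude of a difference and the precision of its estimate, which a p value alone does not.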
The purpose of this study was to:
determine what the state of affairs is with regard to statistical significance
testing in reading instruction research, with specific reference to post-1999
literature (post-1999 literature was selected because of the specific
request, made by Wilkinson and the Task Force on Statistical Inference in
1999, to include the reporting of effect sizes in empirical research studies);
determine what the criticisms as well as the defences are that have been
offered for statistical significance testing;
determine what the alternatives or supplements are to statistical
significance testing in reading instruction research; and
provide a framework for the most effective and appropriate selection,
use and representation of statistical significance testing in the reading
instruction research field.
A comprehensive survey of the use of statistical significance testing, as
manifested in randomly selected journals, was undertaken. Six journals that
regularly include articles related to reading instruction research and publish
articles reporting statistical analyses were selected: System, Language
Learning and Technology, The Reading Matrix, Scientific Studies of Reading,
Teaching English as a Second or Foreign Language (TESL-EJ), and South African
Journal for Language Teaching. All articles in these journals from 2000 to 2005
that employed statistical analyses were reviewed and analysed. The data were
analysed by means of descriptive statistics (i.e., frequency counts and
percentages). Qualitative reporting was also utilised.
A review of six readily accessible (online) journals publishing research on
reading instruction indicated that researchers/authors rely very heavily on
statistical significance testing and very seldom, if ever, report effect size/effect
magnitude or confidence interval measures when documenting their results.
A review of the literature indicates that null hypothesis significance testing has
been and is a controversial method of extracting information from
experimental data and of guiding the formation of scientific conclusions.
Several alternatives or complements to null hypothesis significance testing,
namely effect sizes, confidence intervals and power analysis, have been
suggested.
The following central theoretical statement was formulated for this study:
Statistical significance tests should be supplemented with accurate
reports of effect size, power analyses and confidence intervals in
reading research studies. In addition, quantitative studies, utilising
statistics as stated in the previous sentence, should be supplemented
with qualitative studies in order to obtain a more comprehensive picture
of reading instruction research.
Research indicates that no single study ever establishes a programme or
practice as effective; rather, it is the convergence of evidence from a
variety of study designs that is ultimately scientifically convincing. When
evaluating studies and claims of evidence, educators should ask not
whether the study is quantitative or qualitative in nature, but whether it
meets the standards of scientific research.
The proposed framework presented in this study consists of three main parts:
part one focuses on the study's description of the intervention and the
random assignment process, part two focuses on the study's collection of data,
and part three focuses on the study's reporting of results, specifically the
statistical reporting of the results.