Test data misuse reaches absurd levels

Disclosure statement

Flynn Ross does not work for, consult, own shares in or receive funding from any company or organization that would benefit from this article, and has disclosed no relevant affiliations beyond the academic appointment above.

The promise of the bipartisan No Child Left Behind (NCLB) 2001 legislation was, as the name says, that no child would be left behind. A key piece of this legislation is the annual testing of every child from third through eighth grade and then once in high school.

The data from these tests were intended to provide policymakers and educators with evidence to improve educational outcomes for the most disadvantaged students. But instead of promoting equity and social justice, the data are being used, in some cases, to further punish and disenfranchise the most vulnerable students.

As an educational researcher, teacher and mom, I understand the potential as well as the unintended impacts of the annual testing regime. I also know that it doesn’t have to remain this way. We, as a nation, can do better.

NCLB’s use of standardized testing has been widely criticized for failing to improve learning outcomes, especially for the most vulnerable students. It is not just excessive testing but also the inappropriate use of the results that now threatens the quality of public education.

A statement from the American Educational Research Association (AERA) lists a set of conditions that testing programs need to meet: alignment of the curriculum with the test items, adequate resources and opportunity to learn, validation of the passing scores, and means to address the needs of students with language and learning differences.

In addition, AERA has said that the use of test scores in evaluations should follow a strict ethical code. Much of this is currently missing.

A range of tests

Let’s take stock of just how many tests are currently “out there” and what their different purposes are.

US Secretary of Education Arne Duncan has pushed for teacher evaluation to be based in part on students’ standardized test scores despite the experiences of Tennessee, Houston and Florida, where misuse of test data has been seen and challenged in court.

In these places, art and physical education teachers were evaluated on students’ English and math test scores. This practice has already led to lawsuits in Tennessee and Florida.

Luke Flynt, a teacher in Indian River County, Florida, described in public testimony to the school committee how absurdly unreasonable these evaluation models are. Flynt received unsatisfactory ratings because the computer model predicted that his students would score higher than a perfect score.

Similarly, last year, Sheri Lederman, a fourth grade teacher in New York’s Great Neck Public School district, challenged her teacher evaluation rating as inappropriate. The case will be heard by the New York Supreme Court.

Teachers are already frustrated, and testing has only added to that frustration. Between 40% and 50% of new teachers leave the profession within five years. This is causing a huge loss of social capital and institutional capacity in the highest-need schools, where the rate of teacher exodus is highest. The annual cost of teacher dissatisfaction, expressed in this high turnover, is estimated to be $2.2 billion.

This misuse of data is also one of the reasons behind the national opt-out movement, as parents and teachers say no to testing.

This misuse of data is also driving states to opt out of the Common Core State Standards (CCSS).

At least 10 states have already dropped the CCSS, and similar legislation is pending in most other states. Several states are “rebranding” the standards by having more local input and revising elements of the standards.

Testing has not worked

The National Academies, the premier source of expert advice on pressing societal challenges, have documented that the current test-based accountability models of incentives and sanctions have not been effective in improving learning or achievement.

They have also called for reformed models of accountability that would consider broader-based measures of progress.

As is evident in these details, the true failure of education, as stated by the American Statistical Association (ASA), has been in preparing our legislators and educational policymakers in the ethical use of statistics.

In particular, the Value Added Model (VAM), a complex statistical tool, is being inappropriately used to assess teachers’ performance.

The ASA has cautioned that these data are not an accurate measure, as standardized test scores are not “causational.” In other words, test results are affected by many factors – not just the teacher. Results need to be interpreted with caution.

For this reason, no high-stakes decisions, such as job termination, should be made on the basis of test results alone.

The basic scientific premise of quality assessment and evaluation is taking multiple measures, using multiple methods, and making use of multiple opportunities for a more accurate representation of anything being studied, particularly something as complex as teaching.

The aspirations of “No Child Left Behind” are essential for our nation’s success. However, the current models based on limited standardized test scores significantly underrepresent the complexity of learning.