L.A. Times testing series raises more questions

Few education stories have excited me as much as the series on teacher assessment being done by reporters Jason Song, Jason Felch and Doug Smith of the Los Angeles Times. They have dug up a goldmine of data on the student test score gains of 6,000 individual elementary school teachers in the Los Angeles Unified School District, information that the district has refused to show to parents despite pleas from its staff to do so.

The latest story in the series, "L.A.'s leaders in learning," does many things that I think are crucial to improving American education, and it fits what I have been trying to do for the last 12 years in calculating the level of challenge in high schools, nationally and in the Washington area.

The latest Times story focuses on how schools as a whole, not individual teachers, are doing in raising achievement. That emphasis encourages schools to create team-like cultures in which everyone works to make everyone else better. The story buttresses the central point of the series--that schools that seem similar to parents trying to choose where to send their children look very different when unreported data like relative test score gains are revealed. It also shows in a dramatic way the uselessness of our usual means of rating schools. Those that have the highest test scores are considered the best, even though achievement measured that way reflects the average incomes of the parents far more than it does the quality of the teaching.

But I found reading the series, particularly this latest part, frustrating because it often fails to answer questions raised by the deep digging its reporters have done. Also, the stories seem to me to mischaracterize, in some spots, the data they present.

A prime example is their reporting on the focus of the latest story, Wilbur Avenue Elementary School in an affluent part of the San Fernando Valley. Wilbur is highly sought after even by families outside the neighborhood, who line up days in advance to register for a few empty places. The third paragraph of the story prepares the reader for surprisingly awful results from the Times value-added analysis, compared to the school's high test score averages as reported by the state each year. "Wilbur's record was among the worst in Los Angeles for boosting student performance in math and English," the story said.

But a reader eager to see the details backing up this statement has to go 76 more paragraphs before finding it, in this sentence: "On average, students started the third grade in the 77th percentile in math, but by the end of the fifth grade were in the 67th. In English, they slid from 79th to 76th."

You see my problem? Not only did I have to read almost to the end of the story to get this information, but the numbers do not seem that bad to me. These are, after all, percentiles, not actual scores. They depend on how the underlying scores are spread across the scale from the 1st to the 99th percentile--something we are not told in the story. A 10-point percentile drop in math is likely significant, but this is an average for the entire school and leaves Wilbur still considerably above average. I doubt that the three-point percentile drop in English is statistically significant at all. Again, we are not told. To say that the English achievement level "slid" seems wrong to me, since, like candidates within three points of each other in a poll, this seems the equivalent of no difference at all.
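To make the point concrete: a percentile is a rank against the other test-takers, not a raw score. A minimal sketch (with made-up numbers, not the Times' actual data or method) shows how the same raw score can land at very different percentiles depending on the cohort it is ranked against:

```python
# Illustrative only: hypothetical raw scores, not the Times' data.
# A percentile rank says how a score compares with a cohort, so the
# same raw score maps to different percentiles in different cohorts.

def percentile_rank(score, cohort):
    """Percent of cohort scores strictly below `score`."""
    below = sum(1 for s in cohort if s < score)
    return 100 * below / len(cohort)

cohort_a = list(range(40, 90))   # hypothetical cohort, raw scores 40-89
cohort_b = list(range(60, 110))  # a stronger hypothetical cohort, 60-109

print(percentile_rank(85, cohort_a))  # 90.0 -- near the top of cohort A
print(percentile_rank(85, cohort_b))  # 50.0 -- middling in cohort B
```

This is why a small percentile slide, by itself, does not tell us whether students learned less; it only tells us how their ranking moved relative to everyone else.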

The reporters probably sensed that some readers would react that way, so four paragraphs later they say that other elementary schools with similarly affluent families do much better. Wonderland Avenue Elementary in the Hollywood Hills, for instance, made "some of the biggest gains in the district, particularly in math," the story says. But, sadly, it does not tell us what those gains were so we can decide for ourselves, as they let us do with the Wilbur statistics.

This story also fails to address adequately the statistical peculiarities that may arise when judging schools and teachers whose students are performing far above average. The Times is analyzing data based on how much students improve compared with their past performances. But if the students being examined are near the top of the scale in preparation because of their affluent and well-educated parents, might this mean their past performances were unusually high, and that a drop in test scores in a subsequent year would mean less than it seems?

The story says only that "research has shown no significant 'ceiling effect.'" I think we need more than that. It says other high-income schools "made great strides" compared with Wilbur, but it doesn't tell us which schools those were or what their results were. In discussing another school in an affluent area, Topeka Elementary, it says high achievers at the school were "essentially flat in English and steadily falling behind in math," but again doesn't give the actual scores. That is a problem because I think the reporters mislabeled the slide in English scores at Wilbur. I would prefer to make my own assessment of the changes in scores at other places.

The story suggests that the growth in Wilbur's average test scores as reported by the state may be the result of better-educated families moving into the area. But it does not say whether anyone determined that the education levels of the parents of the students taking the tests were higher in the most recent year tested than before, since a large portion of students at Wilbur did not take the tests.

I can hear the mutters of the terrific reporters who are doing this series. "Is Mathews serious? Give a reporter a blog or a column and he is ruined. Has he forgotten what it is like to do a long news story with space limits and stubborn editors?" I confess that if I had $10 for every time, during my 35 years as a reporter, an editor told me to take out some details I thought were essential to a story, I would be writing this blog on the Riviera, not in an upstairs bedroom with our cats' litter box a few feet away.

Newspaper editors fear that readers will drown in too much detail. Most of the time they are right. But the Times editors should realize that this is the rare series that is being read not only by Times subscribers in southern California, but by people all over the country interested in the issue of how to assess teachers. The portion of readers in this case who know something about such assessments, and have questions about what the Times is finding, is much higher than usual.

So please give us a break and provide a few more facts. I know that the data will all be there when you release the full results, but you haven't told us exactly when that will be. We don't want to wait to know more about the schools you are writing about now.

I will send this post to the Times reporters and ask them to fill in the blanks for me, if they have the time. Whatever they send me I will put here as a special addendum to this blog post.

Even with great reporters at the Post, NPR, Edweek, and the N.Y. Times, the story will conform to standard tropes. Next we'll be seeing the 80 parents who gave a thumbs-up to Wilbur on GreatSchools transferring their children from its robotics courses to Esperanza, which has a science proficiency rate of 16.

I thought "the ceiling effect" on high test scores was pretty well established, so how can the Times dismiss it with one sentence? When Wilbur's English students drop from the 79th percentile in 3rd grade to the 76th percentile in 5th grade, and that knocks them down to the bottom 11% in Los Angeles, common sense says that there must be a statistical penalty when using tests "designed to give more credit to low achievers" to improve scores that are already high. Besides, why is the Times' favored statistic more important than Wilbur's 2010 record of 81% proficiency in 5th grade in contrast to 62% proficiency in 3rd grade, or raising 5th grade proficiency rates by 3% since 2008?

The Times contrasts value-added data with the discredited Academic Performance Index that was perverted by NCLB, saying that value-added data can improve instruction. But would it not be better to invest in intensive diagnostic testing to inform instruction, rather than rely on complex manipulation of data derived from simplistic tests? And above all, what happens if school systems use data the way the Times did? Will all instruction be narrowed to test prep?

I just read the Times article and my questions were exactly the ones you raised. In your column last week, I had concerns about the *meaning* of value-added. Let's all agree that what we would want and expect to see are significant results. The Wilbur example was just nit-picking over a few percentage points.

Posted by: edlharris
...............................
Not a teacher in a Title 1 poverty public school.

One tires of this insanity.

Provide a safe school, which is the normal standard for the non-Title 1 poverty public school.

Provide a public school that makes sure a teacher can teach and children can learn, instead of a classroom with disruptive and/or violence-prone students. Classrooms without disruptive or violence-prone students are the normal standard for the non-Title 1 poverty public school.

As long as your child's public school is not a Title 1 poverty public school and provides average teachers, your child will have an opportunity to learn.

For years, white children in the public school system have had the HIGHEST scores on national tests in the nation, while black students in the same public school system have had the LOWEST scores on national tests in the nation.

This is the same public school system, and yet for years Mr. Mathews has supported Ms. Rhee in her claims that the problem is the teachers, while ignoring the fact that the teachers who teach white children in this school system apparently have no problem getting children to learn.

Instead of blaming teachers, Mr. Mathews should have been calling for new ideas to deal with the problem of large numbers of black children who have great difficulty learning because of multiple generations of poverty.

Bob Herbert in the New York Times has no difficulty in telling us the problems of multiple generations of poverty.
"More than 70 percent of black children are born to unwed mothers."

How do the reporters know that the scores they saw are accurate? Did they just accept them as the truth? Did they administer the test themselves? Did they see the test being given? Do they know that the tests are not professionally administered and proctored? Have they read the articles about Atlanta, New York, DC and other cities with "improved" test scores?

I think the reporters forgot to check the validity of the tests regarding their administration, so let me explain what it is like:

The CA tests are the same or nearly the same from year to year. Most teachers know exactly what is on them. When they are delivered to the school, they are kept in the office. The principal is not supposed to check them out until the day of administration but many encourage teachers to "check them out and familiarize yourself with them" days before administration. Some teachers drill the kids on the exact items from September to May while others walk around the room during administration pointing to correct answers or whispering, "I know you can do that one" or "Look at that one again." Of course this type of outright cheating is (hopefully) rare but do the reporters know who the guilty parties are? Do they know that only one kind of testing "irregularity" can be detected and that is the kind that involves erasures?
After administration, the booklets go to the school office where they sometimes stay for days. Do the reporters know what happens during this time? Were they there?

Let's look at "John Smith" and Karen Caruso, the two teachers whose students received low scores on the test. Are these teachers weak or did they give the test as directed (no peeking)? Is it possible that the most honest teachers are being depicted as the weakest? Do the reporters know for certain? Do we?

I am not against using test scores to evaluate teachers, but if these scores are going to be "high stakes" their integrity must be strictly protected. They must be different each year and the handling and administration must be done by an independent testing agency. And of course the test must be designed to measure teacher effectiveness. Do the tests in question meet that criterion?

It is my belief that the reporters have committed a grave injustice but I don't think they realize it. I'm hoping the courts will intervene before more teachers are unfairly branded with their scarlet letters. Frankly, I find it very difficult to believe that this situation is even legal. Is the real story about the recession and how scarce jobs and goods make people do terrible things to each other?

The public DOES have a right to know how students score, but if they want this information, they'll have to pay for outside agencies to handle and administer these tests. Teachers and students have a right to fair and accurate testing.

Jay Mathews has consistently not reported on the D.C. schools, which are at the bottom in the nation, and has pretended that Ms. Rhee has the answers.

The D.C. public school system is simply a separate-but-equal school system that is a throwback. D.C. is a small enough city to have fully integrated public schools, but Mr. Mathews has remained silent on this for years.

Black students have the lowest scores in the nation, while the white students in the segregated areas of the city have the highest scores in the nation.

Couple this with the make-believe that the problem is the teachers of the public school system, when the same school system staffs the teachers in the white segregated schools of the city.

I would not be surprised if the old textbooks of the white segregated schools of the city are given to the black schools when the white schools get new textbooks.

One more point to consider: Any teacher who has taught in both high and low-income schools knows that the high income schools almost always have high test scores; therefore the teachers usually do not teach to the test or drill the students on exact test items. However, in the poor schools the teachers often drill the children from September to May in a desperate attempt to bring up scores just a little bit.

So if High Income Elementary's test scores are flat but Low Income Elementary's show significant "growth," which school would be "better?" Which students might have learned the most?

For an excellent insight into this topic, read "Tested" by Linda Perlstein. This book clearly shows that high test scores don't always represent learning.

@bsallamack,
You might be interested in the fact that last spring I filed a civil rights complaint against the school district in Albany, New York. The complaint process is mainly geared to individual complaints of civil rights violations, but they do take on class-action-type complaints if the situation warrants it. The situation in Albany is exactly as you describe in Washington, D.C., only worse, and I bet it's worse than you think in Washington too. The discrimination isn't really typical overt or even intentional discrimination, but economic, cultural, and behavioral discrimination that has a disparate impact on blacks and Hispanics. What I was told off the record by school district employees is that the school district and some of the local politicians have been so desperate not to lose the last of their middle-class families (mainly white and Asian but also a fair number of blacks and Hispanics) to the suburbs that they absolutely roll out the red carpet with the best of everything for those kids. Those kids get AP and IB courses starting in 9th grade, the best and most expensive new textbooks, the best classrooms, and the best teachers, some with PhDs--particularly at the high school, to a somewhat lesser extent in the middle schools, and not that much in the grade schools. By the time they are in high school, the "worst" of the poor black and Hispanic kids don't get ****. They get stuck in interior classrooms with no windows and malfunctioning environmental control systems (violent swings between 50 degrees and 95 degrees), no control over the out-of-control kids, uncertified "permanent substitutes," the teachers with substance abuse and mental health problems that the district hasn't been able to get rid of yet (not that they really tried), and NO TEXTBOOKS!!!
If you live in DC and want to file a complaint here's the web site
http://www2.ed.gov/about/offices/list/ocr/docs/howto.html?src=rt
If you do file a complaint, you'll need some sort of documentation. They turned down my complaints that were on my word alone but are pursuing the ones where I had documentation. It may be harder for you than it was for me, since the Albany City School District has fewer than 20 schools and I was able to use the NYS Dept. of Ed. school report cards that list the ethnic breakdown of each school, class sizes, school size, and teacher qualifications. :) The district also actually responded to my e-mail complaints and, three years running, ended up admitting that there was a problem (after first denying it) and that it would be fixed. :) Of course they only fixed the specific complaints for specific teachers so I would go away, and never fixed the underlying problem.

I thought the second story was a bit absurd and only understandable if you figure there was an ideological bent.

I mean, does anyone really want to argue that well-informed parents would be standing in line to get into the Title I school? Let's all take bets on that likelihood, shall we? Next up: the implication that parents really don't care about their kids' achievement; they just want their kids going to school with other well-off kids. Ha. Non-starter.

And from what I can tell, the Times ignores the interesting story, which is whether or not low ability kids do better at the rich school or the poor school. Now *that* would be good information.

This morning I woke up to a new thought about the Los Angeles Times series on test scores. In California, and probably most other states, the legal evaluator of the teacher is a school administrator, usually the principal. In the first Times article on test scores, the writers profiled a teacher, Karen Caruso, who was considered highly effective by her school principal. And yet the newspaper basically negated this evaluation on the basis of the test scores of Ms. Caruso's students. And of course it was done for the whole world to read about. Was this legal? Ethical? Can Ms. Caruso sue for libel? I hope she finds out.

"Those that have the highest test scores are considered the best, even though achievement measured that way reflects the average incomes of the parents far more than it does the quality of the teaching."

You realize, of course, that this sentence contradicts everything promoted by Arne Duncan and RTTT, Rhee, TFA, Joel Klein, Fenty, Bloomberg, Wendy Kopp, KIPP and the rest of the phony reformers you promote. On the other hand, it is exactly what real education researchers like Linda Darling-Hammond, Ira Socol and, most recently, Diane Ravitch have tried to explain: America's education problem is a social one. Our society misdirects educational funding from those who need it most to those who need it least.

A little historical research is helpful. The achievement of poor and minority students improved dramatically during the 1950s, '60s and '70s. During that time we had forced busing for integration, a relatively even distribution of wealth, marginal income tax rates that topped out at over 70 percent, and support for single mothers such that they could concentrate on raising and educating their children. Since then we have lowered top tax rates, increased concentrations of wealth and re-segregated our schools. Why? Was it because those poor children were finally beginning to compete in education and in the economy? Is that why Rhee, TFA and the rest of the new "reformers" are so hostile to teachers and students in poor communities?

Like most people, I don't understand much about statistics, but I think I remember that a "percentile" ranks a student relative to the others. "67th percentile" merely means that the student scored higher than 66% of the students taking the test. In other words, if 1% of the fifth-grade students read at a fourth-grade level and all of the others tested read at a second- or third-grade level, wouldn't the top students be in the 99th percentile, even though they were nowhere near where they should be academically? And if a class happened to score in the 99th percentile on both tests, would they be considered as failing to improve?

If I have misunderstood percentile, or if Mr. Mathews has quoted the scores wrong, could someone explain it to this statistic ignoramus in plain, simple English?

I don't understand statistics to any great degree, but I thought "percentiles" compared a student's score to those of the other students taking the test. A student in the 99th percentile was better (or at least answered more questions right) than 99 percent of all the students tested. If this is right, then the top students, in the 99th percentile, might still be one or two years behind their grade level if the rest of the students were even further behind. If a student's percentile score drops, did he fail to make progress, or did he make normal progress and the others caught up? If it stays the same, is that because the teacher is teaching everything he or she is supposed to and the student is learning at the desired rate, or is the teacher penalized for lack of progress? And if a student is in the 99th percentile on both tests used to measure "improvement," is the teacher penalized because the student didn't make progress?

If my understanding of statistics is wrong, can someone explain percentiles to me in plain, simple English?

Yes, my second post was pretty much the same as the first--I have spent most of the evening on the phone with technical support trying to get the computer to open screens properly and refresh the screen when I tell it to. But I'd still like an explanation of percentiles.