State's largest urban districts post gains on national assessment

December 18, 2013

Students in Los Angeles Unified improved in reading and math on a national assessment of urban districts. Credit: LAUSD, Selma Avenue Elementary School

Three of California’s largest school districts showed gains on a national assessment of urban districts that also singled out Los Angeles and Fresno Unified for special recognition from U.S. Secretary of Education Arne Duncan.

Los Angeles had been steadily but slowly progressing since it joined the voluntary urban district testing program in 2003, but scores rose significantly this year over 2011, when the test was last administered. Only Washington, D.C., showed greater improvement.

Gains beat nation

Across the three districts, students made larger gains than the national average and bested the state in nearly every category.

In a statement, Secretary Duncan cited Los Angeles and Fresno Unified among three TUDA districts – D.C. Public Schools is the other – “that pressed ahead with ambitious reforms” and “made notable progress since 2011.”

In Los Angeles, fourth and eighth grade students boosted their average reading scores by 4 points each. In math, while eighth graders showed no statistically significant increases, fourth graders gained five points overall in improvements that crossed racial, ethnic and income levels. Hispanic fourth grade students went up by 4 points, while African-American students showed an 8-point jump in two years.

The district has implemented programs targeted at improving algebra instruction in middle and high schools and providing extra support for African-American, Latino and low-income students. The district has also stepped up professional development for teachers and staff.

Other successful changes in Los Angeles focus on student health and well-being. Fourth grade teacher Shannon Garrison credited the district’s breakfast in the classroom program with giving all students, but especially those living in poverty, the nourishment they need to focus in class.

“I can see a huge difference for some of my students who maybe don’t eat well in the morning, or maybe only have a couple of meals a day, just as far as their attention, their ability to engage in work,” Garrison said. “They’re not worried about being hungry.”

She is one of the three classroom teachers on the governing board that sets policy for the National Assessment of Educational Progress, or NAEP, which is given to a representative sample of fourth, eighth and twelfth grade students. In the most recent NAEP results, released last month, California was well below the national average for all grades and areas except for eighth grade reading, which rose by 8 points, the largest gain in the country.

‘Good day’ in Fresno

TUDA results are taken from the NAEP scores; no additional testing is involved. Participation is voluntary, but districts do have to meet certain eligibility criteria. At least half of district students must be ethnic or racial minorities and half must be eligible for free and reduced-price lunch. Districts must also be located in cities with a population of 250,000 or more and have a minimum of about 1,500 students in fourth and eighth grade math and reading classes.

The purpose of pulling out NAEP data for large urban school districts is to demonstrate that these districts are “committed to the highest academic standards,” said Mike Casserly, executive director of the Council of the Great City Schools, which partners with the U.S. Department of Education on TUDA. Casserly, who pressed for the program, also said there’s greater benefit for districts to assess their progress and reform efforts through comparisons with districts that share their demographics and challenges.

“It’s for me, the first time that I’ve been really glad I put us in TUDA,” said Fresno Unified Superintendent Mike Hanson, whose district started participating in the 11-year-old program in 2009. “I would call it a good day,” he added.

Fresno’s eighth grade students gained 7 points in reading, the second highest among TUDA districts, and four points in eighth grade math, which Hanson attributes, in part, to the pre-advanced placement work under way in the district to increase the number of students taking the higher level course when they get to high school.

He also feels somewhat vindicated about Fresno being on the right track following its two-point drop on the Academic Performance Index this year, the first drop in years, mirroring the statewide decline on the index that tracks student scores on standardized tests. If the district had fallen on TUDA as well, that would have signaled a big problem, said Hanson, who views TUDA as a better indicator than the API of how well the district is doing.

Unlike the statewide data from the regular NAEP exam, with TUDA Hanson can tell whether district reforms are working and can identify and contact other urban districts around the country to exchange ideas.

Racial gaps remain

TUDA also shines a spotlight on areas that need improvement and this year, as with most others, despite the increases in scores among racial and ethnic minority students and poor children, the gap in test scores along racial lines shows nearly no movement.

In Fresno, for example, white eighth graders scored more than 20 points higher on average than black and Hispanic students in math. The comparable difference in L.A. Unified was 40 points. The gap with San Diego’s low-income students widened in some areas, although going into the assessment, San Diego’s scores were already higher than those in Los Angeles and Fresno.

“Certainly there is disappointment that we had not done more to close those achievement gaps,” said Ron Rode, San Diego Unified’s executive director of accountability.

Rode said the new superintendent, Cindy Marten, is doubling down on efforts to improve teaching and learning through half-day site visits to schools with an instructional team.

San Diego students overall improved their scores between one and four points in all grades and subjects with the exception of eighth grade math, where results dropped by a point. However, the district is still above Fresno, Los Angeles and the state in proficiency rates.

TUDA has an interactive website that allows people to plug in different grades, subjects and student groups and to show comparisons between districts and states, other TUDA districts and the country.

Overall, California’s participating districts did well, Casserly said. California would not have shown the increase in reading on the regular NAEP exam without the districts’ improvements, he said.

“The results give us confidence that urban education can and is being improved across the country,” Casserly said.

Contact senior reporter Kathryn Baron and follow her @Tcherspet.

Comments Policy

Manuel, 6 years ago

BTW, do Governor Brown’s recent remarks about standardized testing (he thinks that national education standards are “just a form of national control”) have any role on this conversation about the validity of this “notable progress?”

Doug McRae, 6 years ago

Manuel: The governor’s views on standards and testing [provided your description of his view that national standards are just a form of national control is an accurate description of his view] are indeed a bit independent of most Dem governors across the country, and rather more in concert with many Rep governors. The criticism of the common core from the tea party right focuses mostly on the fear that common core standards are or can be a form of national control, particularly over curriculum and instruction at the local level, potentially via high stakes tests influencing curriculum and instruction at the local level. As a testing guy, I share the governor’s concerns along these lines: I think it’s a bad use of statewide tests to directly influence local district curriculum and instruction in a particular direction, bad for good curriculum and instruction and bad for large scale testing. Yet this seems to be exactly the direction that California is headed via acceptance of the Smarter Balanced consortium tests.

FYI, the Gov was Atty Gen when common core was adopted, but he was Gov when CA joined SBAC. If he had known that Smarter Balanced assessments would be a violation of his principle of subsidiarity (or local control as much as possible, particularly for curriculum and instruction choices), my guess is he would not have signed the letter joining the Smarter Balanced consortium in June 2011.

Manuel, 6 years ago

Doug, "All I know is what I read in the papers."
Governor Brown's remarks were carried by the LA Times and by the Sacramento Bee.

The remarks were part of “an on-stage interview Monday with the Atlantic magazine’s James Bennet at the Computer History Museum” in Mountain View.

As for his relative silence until now, he could have made his opinion known when he was the AG. My guess is that when he signed the SBAC letter he knew that the horse had left the barn and just bided his time. Now there are no CSTs and the SBAC is a couple of years down the line. Maybe he wants to see if the schools will collapse without standardized testing. When that doesn’t happen, he will probably say I told you there’s no need for testing. But that’s just a guess, a judgment call, if you will 😉

Gary Ravani, 6 years ago

“Not everything that can be counted counts, and not everything that counts can be counted.”

Albert Einstein

“Not everything that is ‘statistically significant’ is actually significant in real world terms.”

(I made that up.)

Since the nation has embarked on standards and test based accountability in 2002, and CA nearly a decade before that, there has been little in the way of “significant” progress particularly for the poor and minority population even within the limited scope of what can be measured by various tests.

As Yogi once said, “When you come to a fork in the road, take it.” Well, we did and it was demonstrably the wrong policy fork. It could be said that it is time to “re-think” educational policy choices if it were true that there was a lot of thinking that went into the policy choices the first time. Of course, there was not. Just a lot of finger pointing and attempting to hold people “accountable” who have little control over the real conditions that create low achievement, coupled with campaigns to divert public attention from the policies that could result in improved learning. These include closing the school funding gap, the preschool gap, the affordable housing gap, the healthcare gap, the living wage gap and all the other gaps that, cumulatively, create the “achievement gap.” Well, it’s a new year. There is always hope and the chance for a new campaign for quality public education. Cheers!

Kathy Baron, 6 years ago

Manuel,
I spoke with a psychometrician at the National Center for Education Statistics and asked how they define statistically significant results. He said it’s not a single measure. It includes the sample size and the percentage of low income, English learner, special ed, and racial/ethnic minority students; the size of the district; how high the gains were overall and within each subgroup; and how the gains compare to the state and national results.

navigio, 6 years ago

IMHO a psychometrician should be able to tell you off the top of his head what statistically significant is for each of these districts and subgroups. When I become one, I’ll let you know. 😉

Doug McRae, 6 years ago

Navigio: You overestimate what a psychometrician can do “off the top of his head.” A good psychometrician may be able to provide a “back of a napkin” estimate for statistical significance based only on sample size [also called a “poor man’s statistical estimate” by a really high powered pure statistician I got to know in graduate school], but the kind of precise statistical significance calculation done by NCES / NAEP folks for TUDA data cannot be done on an off-the-top-of-my-head basis. When you become one, you’ll understand why [grin].

navigio, 6 years ago

But Doug, I do them all the time and I’m not even a psychometrician yet. 😉

Kidding aside, she spoke with a psychometrician from the National Center for Education Statistics, not just some random statistician. It seems surprising that one of them would have no feel for the variables needed in conjunction with sample size for something like NAEP. Anyway, I expect that information is in the study itself. I’ve only been on my phone today so I’ll go read that later.

Manuel, 6 years ago

Kathy, thank you for the reference.

Here’s my problem: “notable progress” is a very relative term because no context is given until one takes a look at the provided graphs. But the average person out there would not classify a less than 2% change on anything as “notable progress.” Anyone getting a 2% pay raise, for example, will never call that “notable progress.” 5% maybe, 10%, definitely.

Then we get into statistics. A psychometrician’s job is to design tests that can be used to compare populations both in time and across cohorts. To do that, the most common tool is designing a test whose results will match the Bell Curve as much as possible. If that’s indeed the case, then the average must be stable because if it isn’t, then you can’t compare populations. Yes, Doug may come by and tell us that “growth” is allowed but after 10 years of CSTs the number of kids not proficient always hovered around 50%, which is not surprising since the average was where the proficient cutoff point was set at.

What this means is that the NAEP test, which ought to also be Bell-Curved, not “criterion-referenced,” probably has the same issues: the average value will fluctuate but will never show large swings. (If it did, it would not be a well designed test.) Hence, Duncan, Deasy, et al., will talk about “notable progress” and “be elated” and declare victory when 2% positive changes to the average happen. (Please note that the change in the national average is even smaller, but you don’t hear Duncan bemoaning that there is no growth nationally!)

Unfortunately, no psychometrician who wants to keep his/her job is going to raise these issues. That is why they like to “take the fifth” and say it is not a single measure. But if that is the case, why call these almost-meaningless fluctuations “notable progress?”

Anyway, I did look at the pdf and I’d be willing to bet that the sampling for LAUSD is mostly Latino kids who are poor. Why do I state this? Because the latest enrollment figures show that roughly 10% are white, 10% African American, and less than 10% Asian Americans. Plus roughly 76% of students qualify for a free lunch. With those demographics, it is not surprising that LAUSD has a relatively lower average than the nation. But what else is new?

BTW, I hope you note that my disappointment is with how officials can turn even the most minute “positive” change into a major cause for celebrations. Unfortunately, you have to report it and, unless you can find a Deep Throat in the testing industry, you won’t be able to include contrarian opinions. Unless, of course, you listen to non-experts such as me (or navigio!) 🙂

Doug McRae, 6 years ago

Yup, Manuel, I’m coming by to repeat the information that the Bell Curve or normal distribution characteristic of test scores is a reflection of human behavior (in this case, spread of academic achievement) rather than type of test. NAEP actually has a real history of being a criterion-referenced test: from the late 60’s when NAEP was first introduced to the mid-80’s, all interpretation was based on item or small clusters of items data, no total scores at all, which is one of the characteristics of true criterion-referenced testing. But, since the mid- or late-80s, we have had some sort of total scores from NAEP, for the last 20 years we’ve had state-by-state NAEP results, and for the last 10 years we’ve had NAEP data for self-selected large urban districts. In all of these cases, the score cutoffs remain constant over time, and thus allow for gain or growth measurement over time. And folks can do the necessary computations to determine whether the gains are statistically significant or not. But as Gary notes below, not everything that is statistically significant is educationally meaningful, and it is a judgment call (not a statistical call) whether or not 2 or 4 or 6 or whatever point gains are “notable” or not.

I’d also take issue with your comment that, for CA’s STAR CSTs that “after 10 years the number of kids proficient always hovered around 50 percent.” That’s not true, CA growth data over time shows average percent proficient mostly in the 30’s back in 03 with gains over time for most CSTs to greater than 50 with many over 60 percent proficient by 2013. The real question for growth measurement is not arguing over one or two points being meaningful from year-to-year, but rather whether trend measurement over multiple years shows meaningful gains. For guidelines for good or meaningful gains for statewide standards-based tests, Bob Linn commented about 10 years ago that consistent gains in the 3 to 4 percent range over time should be interpreted as good meaningful gains; I agree with Bob’s observation. For darn near 40 years, I’ve suggested to schools [not districts or states with much larger sample sizes] that gains of 10 percent should be interpreted as meaningful and that frequently it takes several years for a school to establish that kind of growth across grades and content areas. With the national NAEP, given the size of the population it is designed to track, gains in the 4-point range (reasonable target for statewide results) or in the 10-point range (reasonable target for school level results) are too large, in my judgment. TUDA NAEP gains in the 3-4 point range for the large urban districts from year-to-year qualify as “notable” gains, again in my judgment. Statistically significant gains would be less than these “educationally meaningful” guidelines.

navigio, 6 years ago

I don’t think it’s merely a judgement call whether a given point gain is ‘notable’. If the gain is not statistically significant in the first place, then it’s equivalent to no gain whatsoever. Treating zero gain as notable makes no sense. It’s true that once something is considered statistically significant, it is a judgement call as to what you do with that information, but I think that’s something other than what is being discussed.

FWIW, according to the study, 10-year changes of less than about 10 points in math and about 8 points in reading were not considered statistically significant. For LAUSD, all-student changes were considered statistically significant in all tests except 8th grade math (mentioned in the story).

It’s probably also important to mention that very few of the district-level subgroup results were considered statistically significant when compared to 2011.

And regarding the notion of ‘notable’: I did find it notable that the Fresno super indicated he put more stock into these results than those we get from our state accountability metrics. It would be very interesting to hear why he thinks that and what he thinks we should change.

Doug McRae, 6 years ago

Navigio: OK, if a gain isn’t stat significant, I’d concede it isn’t a judgment call. But for educational testing data, the stat significance calculations assume that kids are randomly assigned to states or schools or subgroups or whatever, and we know that ain’t the case. So, the assumption of random assignment translates to unrealistically low gain or growth numbers being stat significant, by and large, and results in softer judgmental higher numbers before folks should treat gains as noteworthy or notable. It is for this reason that interpretation of test result gains is for the most part a judgmental rather than scientific or statistical thingie.

Manuel, 6 years ago

Bottom line: any “pronouncements” about test results are “judgement calls” and based on the experience and/or bias of the source.

So how can policy (such as “we want better outcomes with LCFF money”) be based on such ethereal conceits? If there is no science or, the Goddess forbid, statistics involved, how can there be any “authority” to base policy on something that not even Deasy can put his finger on other than to claim credit for?

Even the LA Times editorial board is starting to wonder about this if one can believe their editorial on the TUDA results. Here are the last three paragraphs from this editorial:

“Researchers say it’s impossible to ferret out the reasons because the implementation of school reforms tends to be haphazard, overly broad and seldom assessed. The higher scores seem to indicate, as reformers have claimed, that smaller class sizes don’t necessarily matter much; class sizes increased during the last few years because of the state’s budget crisis even as the test scores went up. At the same time, scores rose without the change sought by Supt. John Deasy and other reformers that would tie teachers’ performance ratings to their students’ test scores. Apparently, teachers are successfully improving scores without that kind of pressure.

The higher test scores might reflect policies from years ago that are only now starting to show results. Or some factors might not even be related to changes at schools at all, said UC Berkeley education professor Bruce Fuller. Education levels among Latina mothers have been rising, and maternal education has long been considered an important factor in early literacy.

With hundreds of millions of dollars coming to L.A. Unified from an improved state budget and a new school funding formula, it’s more important than ever for the district to use the money in targeted ways that can be measured and then copied if they’re successful. Future progress depends on knowing what works.”

Notice that while they call for finding out what works they don’t demand that this answer be found before spending all the “new” money. Judgement call, indeed.

Manuel, 6 years ago

An increase of 4 points from a baseline of 201/246 at LAUSD is considered “notable progress” by Duncan? Really?

That’s a 2/1.6% increase.

“Notable progress?”

Maybe in an alternate universe.

(Please forgive my inability to see clothes on this very naked emperor.)
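
As a quick check of the arithmetic in the comment above, here is a minimal sketch that divides the 4-point gain by each of the two baselines the commenter cites (201 and 246). The function name `percent_change` is made up for illustration, the inputs are only the commenter's figures rather than official NAEP data, and the calculation says nothing about statistical significance.

```python
def percent_change(gain, baseline):
    """Return the gain expressed as a percentage of the baseline score."""
    return 100.0 * gain / baseline


# Figures taken from the comment: a 4-point gain on baselines of 201 and 246.
gain = 4
baselines = (201, 246)

# Map each baseline to its relative change, rounded to one decimal place.
changes = {b: round(percent_change(gain, b), 1) for b in baselines}

for baseline, pct in changes.items():
    print(f"{gain} points on a baseline of {baseline}: {pct}%")
```

Rounded to one decimal place, this gives 2.0% and 1.6%, matching the "2/1.6%" figure in the comment.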

navigio, 6 years ago

John Fensterwald, 6 years ago

NAEP scores weren’t one of the metrics that are tied to the waiver, navigio, but I imagine that the encouraging results won’t hurt when Arne Duncan decides whether to grant the waiver for another year next summer.

navigio, 6 years ago

Nice redirect John. 😉

To the extent the results reflect anything, they do not reflect any of the changes that the waiver allowed. On the contrary, they indicate that whatever was happening prior to the waiver was actually working (again, subject to interpretation). Some of the proposed changes, including cutting SES intervention, will result in a real and significant change for students (though based on earlier comments at least Fresno seems to recognize the danger in that).

If these results are not matched in future assessments, the waiver will be interpreted as having caused that failure, given that its freedom only happened after these tests were taken.

It is quite noteworthy that Deasy was so surprised by this given how he characterized last year’s performance gains. I’m sorry to say that that heightens my suspicion that he actually does not believe assessment results can be tied in a causal way to district policies.

Kathy Baron, 6 years ago

Navigio, Please let me clarify John Deasy’s remarks. He was more elated than surprised. He expected the district to do well, but the gains exceeded those expectations.