School Accountability

In the 2013-14 school year, twenty-four states required students to be proficient on standardized tests in order to graduate from high school. But starting next year, and in the years to come, states will launch more rigorous, college- and career-ready assessments aligned to the Common Core. As they do so, they should revisit the stakes on these tests for students and consider eliminating, or modifying, their exit exam policies.

This is the fourth in a series of posts reflecting on terminology pervading today’s polarizing debates about American education. In each post, we ask how various buzzwords—“professionalism,” “accountability,” “equity,” and the like—influence the conversations we have. What are the strengths, weaknesses, and blind spots that come with framing our arguments in each of these terms? The hope is that assessing the implications of the way we talk will prompt more productive discussions about improving PreK-12 education.

In the last “The Way We Talk” post, I argued that equity is the closest thing that American public education has to a sacred purpose. We expect our schools to be equal opportunity catalysts; once students complete their PreK–12 (or sometimes PreK–College) education, we generally act as though society has provided them an adequate platform for determining the course of their lives. Put another way, public schools are our community’s most tangible, most democratic commitment to sustaining the American Dream.

But democratic equity only covers part of the story. The United States is a liberal democracy. Its attitude towards education (and politics more generally) also stems from individualist liberals like Thomas Jefferson, Thomas Paine, and John Locke. If we care about equity for all, we also care about choice. We care about freedom.

2013-14: the school year all American students, in all public schools, were expected to be proficient in reading and math. It’s finally here. But don’t kid yourself – nobody expects American schools to meet that goal this spring. And thanks to the U.S. Department of Education’s ESEA waivers (and waivers of waivers), most won’t have to.

Instead, 41 states, Washington, D.C., and eight California school districts have different goals for student performance: goals that cut achievement gaps, delay the universal proficiency deadline, lead to college and career readiness, or something else. In some states, failing to meet these goals – like failing to make AYP – triggers interventions as a priority or focus school. But in others, there isn’t any meaningful accountability attached to the new targets. Moreover, many states won’t name any new priority or focus schools this year.

While states don’t have a clean record when it comes to gaming accountability systems, that isn’t necessarily what’s happening here. Holding priority and focus lists constant from 2013 to 2014 is a pragmatic decision on states’ parts, because school performance goals aren’t the only thing changing. Across the country, in waiver and non-waiver states alike, students will also be field testing the new Common Core-aligned tests developed by Smarter Balanced and PARCC. And the Department is letting all states apply for additional flexibility so that some students, in some schools, will take the field test and won’t take state standardized tests in at least one subject. In other words, states won’t have to “double test.”

This creates a problem for the continuity of school accountability systems. How can you judge school performance fairly and accurately when the data aren’t comparable between schools? That’s like determining the faster runner when one competitor has hurdles in their lane and the other doesn’t. Further, the field tests aren’t designed to be used for making school accountability determinations, or frankly, measuring student performance. As Tom Kane notes in his incredibly smart take on the issue, the field test is meant to test the validity of individual test items, not produce a valid score for an individual student. Making a trade-off between high-stakes consequences for school accountability and a valid field test seems like a relatively easy choice. Hold accountability determinations steady and continue all current school improvement efforts, but make sure the field test is conducted to the highest possible standard so that the tests are ready for primetime in 2015.

But it isn’t quite that simple. After all, accountability is about more than being labeled a “failing” school or a priority one. Accountability relies first and foremost on transparent, accurate reporting of student achievement data. And this is where the field test creates a much more harmful trade-off. The U.S. Department of Education will still require all students to be assessed in both reading and math, but states will not be required to publicly announce the results for students taking a field test. For the first time in the NCLB era, there will not be achievement data available for a significant number of students and public schools.

This is a big deal. These data (should) inform nearly every decision made in education – for families, for educators, and for policymakers. Should we send our child to the neighborhood school, or try to enroll her in a charter school nearby? How effective was our new 7th grade math curriculum? Did our new professional development program improve teaching quality? Are the interventions in our focus schools working? All of these questions will be much more difficult to answer without student assessment data. Further, as Bellwether’s Chad Aldeman and Andy Smarick write, this compromises efforts to measure student achievement and growth as required in Race to the Top, ESEA waivers, and a host of other Obama education reforms. Yes, states could continue to administer their current tests during the field test, as they have during previous assessment revisions. But many will not. While the Department was clearly trying to appease (or even subtly encourage) states to participate in the field test, would states have really balked at field testing if every student was also given the state assessment?

Giving up a year of meaningful school accountability is a high price for getting better, more rigorous assessments that reflect what truly matters: whether students are ready for college and career. But the Department didn’t just give up meaningful accountability. They’re also giving up public reporting of test results at the same time. Does that make the price too high?

While it’s too late to reverse the Department’s decision, states participating in the field test should take a prudent and limited approach to it. Let field tests be field tests. They weren’t designed to be used as measures of individual student achievement or school performance. And the more students and schools participating in field testing, the larger the effect on transparency and accountability will be. Unfortunately, a few states – most notably California – are already gearing up to scrap their state assessments entirely this year and only administer the field test.

2014 was never going to be the year we saw universal proficiency. Unfortunately, it could shape up to be the year we see universal missing data. Let’s hope other states don’t follow California’s lead.

plausible: appearing worthy of belief <the argument was both powerful and plausible>

Last week, a report commissioned by the Indiana state legislature provided more detail on exactly what happened a year ago as then-Superintendent of Education Tony Bennett prepared to release the state’s first A-F school grades – one of many high-profile reforms Bennett and his staff championed. The emails uncovered a last-minute scramble to change grades for certain schools and, ultimately, led to Bennett’s resignation as schools’ chief in Florida. But many unanswered questions remained. Did last week’s report answer them?

Here’s what we knew before:

Emails obtained by the Associated Press showed that the final grades for Christel House Academy and a dozen other schools with nontraditional configurations (e.g., grades 5-9, K-10) were changed internally prior to the public release of the grades. The emails showed Bennett and his staff were particularly concerned about an initial ‘C’ grade for Christel House, founded by a prominent political donor and regarded as a top-performing charter school. Christel House had recently expanded to offer high school, in addition to elementary and middle grades.

After reading the emails and analyzing the data, I reported that the grading change was only possible once performance data from these schools’ high school grades were eliminated from the formula. And I had a problem with that. The changes were made thanks to a “loophole,” in secret and without a public explanation. This made it nearly impossible for parents and families to know that Christel House’s K-8 grades earned the ‘A’ grade, but its high school did not.

Further reporting found that another 165 schools, including Christel House, benefitted from a second tweak to the formula: removing a cap on the number of bonus points schools could receive for high rates of student growth. This allowed elementary and middle schools with high growth in one subject to compensate for low performance in another subject area.

And here’s what the new report says: “In the end, Authors found that the two adjustments administered to determine Christel House Academy’s final grade were plausible and the treatment afforded to the school was consistently applied to other schools with similar circumstances.”

This isn’t news. That’s because the question wasn’t whether the changes were applied consistently – dozens of schools benefitted from the two changes, and there was never any indication that this wasn’t the case. The question wasn’t even if the changes were plausible. Of course officials could make some reasonable-sounding explanation for increasing the emphasis on student growth in the grading formula, or for removing the high school data for some schools… just as officials could also make a reasonable-sounding explanation for limiting the emphasis on growth, or for using all available high school data for all schools. In fact, that’s exactly what officials did with schools that only serve grades 9 and 10 (they just didn’t apply the same plausible logic to schools serving grades 5-10).

In other words, just because Bennett’s decisions were plausible – “superficially fair and reasonable” – it doesn’t mean they were right or in the best interests of students and families. The decision to ignore these schools’ high school data in the A-F system is like telling parents their child is an honor roll student, but only after tossing out a failing grade in Spanish because it’s their first year taking the language. Sure, it’s a plausible argument, but is it the right call? That's debatable, and it's a debate that should have happened in public.

The larger takeaway here isn’t just about the plausibility of these changes. It's also the process by which they were made. Would we be having this conversation if Bennett and his staff had been open about altering the formula and removing the high school data or the growth caps from the beginning? Or at least once the emails were released?

School accountability systems cannot function without public trust. Anyone – from parents, to policy analysts, to reporters – should be able to determine how a school’s grade was calculated. Students can determine why they earned a B+ on a math test by looking at which questions they missed and how many points they lost for each, just as the public should be able to look at a school’s ‘B’ grade, understand how it was calculated, and note the school’s strengths and weaknesses. And if changes are made to the grading rubric or the weighting of components, they must be announced and explained publicly – not buried in Excel files or internal emails.

Predictably, friends of Bennett have been quick to forgive, just as his political foes were once quick to judge. But these black-and-white pronouncements overlook many of the valuable lessons that can be learned from the report – as the authors note, their work neither condemns nor vindicates Bennett.

The report does confirm how the grading changes occurred. Moreover, it lays out several useful recommendations for Indiana’s A-F school accountability system, notably increasing the transparency of the decision-making process, improving capacity within the state education agency to handle the technical aspects of A-F development, and piloting school grades before full implementation. And as Politics K-12 reported, these lessons extend beyond Indiana to every state updating its accountability system under ESEA flexibility. Instead of judgment and vindication, let’s also talk about how these accountability systems can be improved. It’s time to focus on the process, not just the politics.

The Federal Education Budget Project (FEBP) today announced new K-12 achievement data available on its website for the 2010-2011 academic year. The data are available at the state level, as well as for each of the 13,776 traditional public school districts throughout the country. Specifically, we added in the 2011 percentage of students who scored at least proficient in mathematics and reading on state standardized tests in fourth grade, eighth grade, and high school.

To check out your local school district, visit the FEBP database and begin typing your school district name into the PreK-12 search box. Once you’ve selected your district, check out the Achievement section of the page at the bottom. There, you’ll find the new student achievement data, with the state average listed below for comparison. Mouse over the title – “4th Grade Reading District NCLB” – for a definition and source. You can also compare each district to others that are within 10 percent of the same proficiency level on a number of variables – just click the “compare” button when you mouse over the title.


We have a five-year snapshot of state and school district-level performance in FEBP (stretching back to the 2006 school year in the downloadable files), illustrating progress toward the benchmark set by No Child Left Behind (NCLB) of 100 percent proficiency by the 2013-2014 school year.

Beware, though: Because every state has set its own benchmark for student proficiency, data are not comparable from state to state, at least until some states start reporting accountability under common exams. Further, states periodically update their definitions for proficiency – often through adjusting the “cut scores” on their state standardized tests – which creates additional barriers for comparing the data longitudinally, even within a state.

Data for the state of Michigan for the 2010-2011 academic year illustrate this challenge. Michigan tests students each fall for the previous year’s learning (for example, students were tested in fall 2011 to demonstrate their 2010-11 school year proficiency). In the 2010-11 school year, students’ proficiency levels plummeted. That’s because the state adopted “more rigorous ‘cut scores’” for the state exam, the Michigan Education Assessment Program (MEAP), that reflect a more rigorous college- and career-ready standard (Michigan.gov). They also raised the cut scores for the Michigan Merit Exam, the state’s high-stakes high school assessment, so high school scores also declined in the most recent FEBP data. This is all in preparation for the state’s anticipated switch to the Common Core-aligned assessments in the 2014-2015 school year.

Those challenges will confront other states as well. As states adopt the Common Core-aligned exams, they’ll be starting over on these accountability metrics. Scores are expected to drop for many students under the new, more-rigorous standards.

And many more changes are coming to the data for states in the 2011-12 academic year, and even more for the 2012-2013 academic year. That’s because over the past two years, states began receiving waivers of many of No Child Left Behind’s accountability provisions from the Department of Education. So far, 39 states and a group of school districts (called the “CORE Districts”) in California have received waivers. In many cases, this will affect the way these states report their student achievement data by altering the definition of proficiency and redefining the groups of students on which states report.

Tennessee is one of the many states that has made changes to the way it reports student achievement data. Under NCLB, states count only those students enrolled on or before the twentieth day of the school year; under Tennessee’s NCLB waiver, starting in the 2011-2012 school year the state will include all students enrolled at any point during the school year for federal reporting purposes. We have this complication, and more, to look forward to starting with next year’s data.

In the meantime, find out how your district has measured up over the years. The data are also downloadable in an open data file here.

Like many in D.C.’s family-heavy Ward 4, Sam Chaltain sends his children to charter schools. His older son attends Latin American Montessori Bilingual, and his younger son will follow in a few years. This is just one of the area’s charters; it also boasts E.L. Haynes, Capital City, and several others that rank among the District’s very best, according to D.C.’s Public Charter School Board’s accountability rating system.

On Tuesday, the National Student Clearinghouse Research Center released the report Baccalaureate Attainment: A National View of the Postsecondary Outcomes of Students Who Transfer From Two-Year to Four-Year Institutions. The research tracks students who obtain a certificate or associate’s degree at a community college and then transfer to a four-year institution. The report also indicated that completion rates vary by institution type: students attending public institutions had a 65 percent completion rate, non-profit institutions 60 percent, and for-profit institutions 30 percent. And according to a study conducted in North Carolina, the economic benefit of obtaining an associate’s degree before transferring to a four-year institution could be $50,000.

The House Education and Workforce Committee recently asked for ideas and advice as it moves to reauthorize the Higher Education Act (HEA). Many higher-education membership organizations responded with a wish list of proposals, including more spending for key programs (Pell Grants, work study, funds for minority institutions, etc.). The total number of groups that submitted reauthorization proposals is unclear, but they included the American Council on Education (on behalf of 40 colleges and accrediting groups) and the Association of Private Sector Colleges and Universities (APSCU). According to APSCU’s comments, “we all agree that higher education faces critical challenges. These range from cost, access, technology and the skills gap to quality, productivity, accountability, and globalization.” Even though the Education and Workforce Committee appears to be gearing up for reauthorization, most experts agree that, given Congressional gridlock, HEA probably won’t be reauthorized until the next president is in office.

College graduates can now add various assessment results that indicate what they learned in college to their job applications. Three non-profit testing organizations – the Collegiate Learning Assessment, Educational Testing Service (ETS), and ACT Inc. – are offering new assessments created to help students and institutions track learning outcomes. Not only can the assessments test basic competencies such as mathematics, but the testing agencies claim they can also measure mastery of critical thinking, reading, and writing, as well as soft skills. Additionally, some testing firms give students the ability to earn an “electronic certificate which can be shared with an unlimited number of recipients in academia and beyond” to prove their various competencies. These certificates are affordable, costing only around $20 per certificate.

This is the second in a series of posts reflecting on terminology pervading today’s polarizing debates about American education. In each post, we ask how various buzzwords—“professionalism,” “accountability,” and the like—influence the conversations we have. What are the strengths, weaknesses, and blind spots that come with framing our arguments in each of these terms? The hope is that assessing the implications of the way we talk will prompt more productive discussions about improving PreK-12 education.

I. Holding ourselves to account

Last week, I wrote about the advantages and disadvantages of approaching education policy in terms of professionalism. This week, we’ll take a look at accountability, the regnant ideal guiding most education reformers today. Indeed, the last two presidents have made it the cornerstone of their education agendas.

Mel Horowitz: You mean to tell me that you argued your way from a C+ to an A-?
Cher: Totally based on my powers of persuasion. You proud?
Mel Horowitz: Honey, I couldn't be happier than if they were based on real grades.

Turns out we’ve all been Clueless when it comes to Indiana’s A-F school grades. Former Indiana (and current Florida) schools’ chief Tony Bennett has been under fire for released emails showing that he and officials at the Indiana Department of Education altered the grades for certain schools prior to the very public release of the new accountability measures last fall. What’s particularly worrisome is that the change to the grading methodology wasn’t so public. In fact, it was never announced. And from the emails obtained by AP reporter Tom LoBianco, it’s clear that Christel House’s initial grade set off a firestorm of panic at the Indiana DOE.

In a press call and separate interview with AEI’s Rick Hess, Bennett explained the matter by saying that Christel House Academy and a dozen other schools were unfairly penalized due to their unconventional grade configurations. Because they didn’t serve students in grades 11 or 12, these schools were missing key data elements for the high school calculation – namely, graduation rates and college readiness indicators, which typically count for 40 percent of the high school model. In Bennett’s words:

“The backstory is simple here, Rick. In our first run of the new school calculations in Indiana, we turned up an anomaly in the results. As we were looking at the grades we were giving our schools, we realized that state law created an unfair penalty for schools that didn't have 11th and 12th grades. Statewide, there were 13 schools in question that had unusual grade configurations. The data for grades 11 and 12 came in as zero. When we caught it, we fixed it. That's what this is all about…. Because Christel House was a K-10 school, the systems essentially counted the other two grades as zeroes. That brought the school's score down from an "A" to a "C".”

Turns out it’s not quite that simple. The state has several variations of its grading rubric to apply to different school situations and set-ups. The basic models are 1) elementary and/or middle school grades and 2) high school grades. Then, there is a combined model for schools that have students in grades preK-8 and grades 9-12 – like Christel House, which served students through 10th grade in 2011-12. The grade point averages for the 3-8 portion of the school and the 9-12 portion of the school are weighted according to the percentage of enrolled students in each grade span to arrive at one final, combined grade. (The final scale: 3.51 – 4.00 points = A; 3.00 – 3.50 points = B; 2.00 – 2.99 points = C; 1.00 – 1.99 points = D; 0.00 – 0.99 points = F)
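The combined-model arithmetic above can be sketched in a few lines of Python. The letter cutoffs come from the published scale quoted above; the point totals and enrollment counts below are invented purely for illustration and are not any real school's data.

```python
# Sketch of Indiana's combined-school model as described above: each grade
# span's point total is weighted by its share of enrollment, and the result
# is mapped to a letter using the published A-F scale. All school figures
# here are hypothetical.

def letter_grade(points):
    """Map a point total to a letter using the published scale."""
    if points >= 3.51:
        return "A"
    if points >= 3.00:
        return "B"
    if points >= 2.00:
        return "C"
    if points >= 1.00:
        return "D"
    return "F"

def combined_grade(esms_points, hs_points, esms_enrollment, hs_enrollment):
    """Enrollment-weighted average of the ES/MS and HS grade spans."""
    total = esms_enrollment + hs_enrollment
    points = (esms_points * esms_enrollment + hs_points * hs_enrollment) / total
    return points, letter_grade(points)

# A hypothetical K-10 school: strong elementary/middle span, weak 9-10 span.
points, grade = combined_grade(esms_points=3.8, hs_points=1.5,
                               esms_enrollment=450, hs_enrollment=90)
print(round(points, 2), grade)  # the weak high school span pulls the overall grade down
```

Note how the weighting works: because the high school span enrolls far fewer students, it counts for less, but a sufficiently low high school score can still drag an 'A'-level K-8 program below the 3.51 cutoff.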

Within the two basic models (ES/MS and HS), there are also deviations for special circumstances. Typically, high school grades are calculated with a 60% weight on proficiency in end-of-course exams in Algebra I and English 10 (with potential bonus points for increases in proficiency rates from grades 8-10 and grades 10-12), a 30% weight on graduation rates, and a 10% weight on college readiness indicators. But some high schools are given special consideration: small schools, HS feeder schools (grade 9 only), 9-10 schools, and 11-12 schools. In the 9-10 model, proficiency rates make up the entire school grade, split evenly between Algebra I and English 10, and the bonus points do not apply.

Confused yet? Bear with me. Christel House should have been evaluated using a mixture of two of the models: the 9-10 model and the combined ES/MS + HS model. Except they weren’t. Because Christel House wouldn’t have gotten an ‘A’ that way. In fact, one of the released emails walks through the calculation (using preliminary, rather than final, achievement data). Under this method, Christel House earned a ‘C’ grade, “a HUGE problem for us” according to officials. And it set off the panic within the Indiana Department of Education – at 2:30 in the morning on September 13.

However, state officials soon – that same day, in fact – came upon a solution. Or, in their words, a “loophole” in the combined model calculation. Here’s the original definition (as written in one of the emails):

“(j) A school’s… grade shall be determined by:
(i) Multiplying the average of the ELA and Math points for the EMS grades by the percentage of all students
(ii) Multiplying the sum of the four weighted scores for the high school by the percentage of students.”

Those three words – “four weighted scores” – contain the loophole Will Krebs, then Director of Policy and Research, found later that day – dubbed “option one.” Because Christel House didn’t have four weighted scores for its high school, the argument was that the combined school methodology was invalid. Without graduation rates and college readiness indicators, the school only had two of the four weighted components. Jon Gubera, Chief Accountability Officer, signed off on this option the following morning, writing: “Option one works…. This would eliminate the HS points and ensure Christel House receives at least a B.”

So what does that mean, exactly? In truth, Christel House was never evaluated on its poor high school performance. Instead, all of the high school data were thrown out – a little detail Bennett failed to mention. Christel House’s ‘A’ is based on the ES/MS model only. As you can see below, Christel House’s grade was clearly inflated. The initial data run showed the school with a ‘C’ grade. Using the combined methodology sans “loophole” with its final performance data, however, the school would have actually earned a ‘B.’ Yet the school still received an ‘A’ from the state and was treated as only having elementary and middle school grades. Further, there is no indication anywhere on the state’s school report card that Christel House’s grade fails to reflect the school’s poor high school math performance.
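The effect of “option one” is easy to see with made-up numbers (these are not Christel House’s actual figures): under the combined model, a weak high school span drags a strong K-8 score below the 3.51 ‘A’ cutoff, while throwing out the high school data restores the ‘A.’

```python
# Hypothetical illustration of "option one": dropping the high school
# component from the combined calculation. All figures are invented and
# are not Christel House's actual data.

def letter_grade(points):
    if points >= 3.51:
        return "A"
    if points >= 3.00:
        return "B"
    if points >= 2.00:
        return "C"
    if points >= 1.00:
        return "D"
    return "F"

esms_points, hs_points = 3.9, 1.0   # strong K-8 span, weak 9-10 span (hypothetical)
esms_share, hs_share = 0.85, 0.15   # enrollment shares (hypothetical)

# Combined model: both spans count, weighted by enrollment share.
combined = esms_points * esms_share + hs_points * hs_share

# "Option one": the high school data are thrown out entirely, so the
# school is graded on its elementary/middle span alone.
esms_only = esms_points

print(letter_grade(combined), letter_grade(esms_only))  # prints: B A
```

Same school, same data, two letters apart – which is exactly why which model gets applied, and to whom, matters so much.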

According to the Indianapolis Star, Bennett refused to allow two regular public schools facing state takeover to use a similar "loophole" a year earlier. In both cases, poor middle school performance (where the school had recently expanded) penalized the high school. If their grades could not be separated, why was Bennett so eager to make an exception for Christel House?

These kinds of shenanigans are unacceptable and have chipped away at public faith in the legitimacy of school accountability systems over the last 10+ years of No Child Left Behind. Christel House’s grade is simply more false advertising from states and local districts that have a long history of finding loopholes in accountability systems and exploiting them. In fact, Indiana officials questioned whether using the loophole in this case would encourage other schools to adopt a grade 6-10 model to avoid accountability. Gubera replied: “Not in the immediate if we don’t advertise this everywhere.”

This just illustrates the problem. Christel House is an ‘A’ school… but only for its elementary and middle school program. Yet that isn’t the story Bennett and his staff are telling. This grade inflation is particularly unfortunate in Indiana, where parents and families have a greater degree of school choice than in most states and rely on information like A-F grades to determine where to enroll their children.

“This kind of system has to make sense for the end user, in this case, the family… Back in Indiana, we were trying to build a new system. It's an interesting parallel. My recommendation to the Florida board was, "If your system doesn't fully make sense, then how do you defend it?" If the results come out suspect, then, in the end, you can really question the integrity of the system.”

Commissioner Bennett, Christel House’s inflated grade is suspect, and I’m questioning the integrity of the system. Accountability systems – even those required from the U.S. Department of Education – can be done right, but Tony Bennett unfortunately just made it that much harder to make the case for them.

Note: To see option 1 in action for yourself, check out the attached spreadsheet from Indiana's Office of Accountability. Christel House Academy appears on the Elementary/Middle School tab, but not on the High School or Combined School tabs.

Yesterday, US News & World Report asked five experts in its Debate Club whether the Senate should pass the House’s No Child Left Behind rewrite – the Student Success Act. With last week's House action, the Student Success Act is the first piece of legislation to make it to a floor vote in the six years since NCLB came due for reauthorization. Sounds like progress, right?

Well, I don’t agree. Here’s what I had to say about the Student Success Act: “Unfortunately, the Student Success Act isn't going to fix either policy [NCLB or NCLB waivers]. Because the Student Success Act doesn't want to fix the federal role in education – it wants to eliminate it.”

What does that mean? While the bill would reduce the scope of the federal role in education by freezing funding at sequester levels and eliminating programs and U.S. Department of Education staff, funding isn’t my biggest issue with the Student Success Act. The larger problem is that the bill guts federal accountability for schools and educators at the same time. There are no requirements for states to adopt college- and career-ready standards, no requirements for states to implement rigorous school accountability systems or teacher evaluations, and no requirements for states to meaningfully support school improvement. (You can see a detailed comparison of all the various NCLB reauthorization proposals here.)

Yes, NCLB was too prescriptive for states in certain areas. But that shouldn’t be an excuse for no federal role whatsoever. As I explain:

"Skeptics say that the federal government can make states do things, but can't make them do things well. But that's the point: without a strong federal role, states may not do anything at all. Instead of giving states slack in the right places (e.g. how to improve schools, how to produce effective teachers), the Student Success Act gives up entirely – no standards, no accountability, no improvement.”

You can read (and vote for) my full response in the Debate Club here, along with commentary from Rep. George Miller (D-CA); Randi Weingarten, president of the American Federation of Teachers; the Center for American Progress; and the American Enterprise Institute.