Or who should be held accountable for the accountability design work being done by MMSD? These aren’t easy questions. Accountability is confusing, maybe not as confusing as the Abbott and Costello routine, but confusing (who should or should not be held accountable for the results of accountability measures is even more confusing; add teachers, families, the economy, inequality, and more to the list below). The chain of accountability goes from the voters who elect Board Members, to the Superintendent whom the Board hires, fires and evaluates, to the administrators the Superintendent hires (with the consent of the Board, though for better or worse this has been a rubber-stamp consent), supervises and evaluates. It also loops back to the Board, because they are responsible for making sure administrators have the resources they need to do good work, and then continues back to the Superintendent and the administrators who prepare draft budgets and should communicate their needs and capacities to the Board. The Superintendent is the bottleneck in this chain each time it loops around, because the MMSD Board has almost entirely limited its action in evaluation, hiring and firing to the Superintendent. Right now MMSD has an Interim Superintendent, so evaluation, hiring and firing are moot and the key link in the chain is broken. Like I said, confusing.

What is clear is that the only lever of accountability community members hold is their vote in school elections. Three seats are up in April (Board President James Howard has announced his intent to run for re-election; Maya Cole and Beth Moss have not publicly stated their plans).

The impetus for creating the “Accountability Requirements” was a budget amendment from Board Member Mary Burke; I believe it passed unanimously. For the purposes here, I’m saying “The Administration” is at bat and the Board of Education is the manager sending signals from the dugout. I didn’t count, but there are at least a half dozen administrator names listed on the “Accountability Requirements for Achievement Gap Plan”; if you want to get more personal with who should be accountable, feel free.

Swinging for the Fence or “Small Ball”?

The public loves power hitters; the long ball is a crowd pleaser. Baseball insiders and aficionados understand that swinging for the fence increases the likelihood of striking out and that often the situation calls for “small ball”: trying to draw a walk, attempting a sacrifice bunt, hitting behind the runner, or lining a single into the gap. The key to small ball is that you do many little things and they combine to produce runs.

With educational “accountability” I would argue that setting “goals” (any goals at all, but especially unrealistic ones like the NCLB 100% proficiency, or the “goals” listed in the draft MMSD “Accountability Requirements”; more on the latter below) is the equivalent of swinging for the fence. This is part of the “data driven” mentality. I think the situation calls for an educational version of small ball, something not as crowd-pleasing, demanding a higher level of engagement by all involved, and more likely to produce a productive understanding. What I have in mind is monitoring multiple measures, or “data guided” decision making.

Although the reporting has not been good, MMSD tried something like this with the Strategic Plan “Core Performance Measures.” Unfortunately there seemed to be collective agreement among Board Members and administrators at a recent meeting that these measures would be set aside in favor of the “Accountability Requirements” now under consideration and by implication that all the Strategic Plan work would be left to gather dust. There were targets associated with “Core Measures” but the main idea was that the Board and the Administration pay regular attention to multiple measures and their movement, individually and collectively. This is far different than stating as a goal that 90% of students will score in the proficient range by year 3. The first thing policy makers need to know is whether things are getting better or worse and at what pace. The use of standardized test score goals (and goals for many other measures) in “accountability” doesn’t help with that and creates difficulties.

What is the “accountable” action if some measures go up and some go down? What if demographics or the tests themselves change along the way? And then there are the uncomfortable questions of who will be held accountable and how if none of the goals are met. We should have learned from NCLB that this approach is not what the situation calls for, but apparently MMSD administrators did not.

At a previous meeting on the “Accountability Requirements” Board Member Ed Hughes moved closer to the small ball position by suggesting that instead of absolute goals, the goals be presented in terms of change or growth. Better, but the problems identified remain. The whole goal oriented approach could be called “Strike One,” but I’m not going to do that.

Strike One

The first draft of the “Accountability Requirements” was presented to the Student Achievement and Performance Monitoring Committee on September 30th and appeared essentially unchanged on the full Board October 29th agenda as part of a Committee Report. In baseball parlance it was an unbalanced, badly mistimed swing for the fences at a ball well outside the strike zone. It isn’t pretty. Strike one.

Some managers would have been tempted to pull the batter and send up a pinch hitter, but instead Board Members sent some signals from the dugout, pointing out some of the mistakes and offering tips for improvement.

Mary Burke noted that the left hand and the right hand didn’t appear to be coordinating. To be more specific, she pointed out that on page 15 (of the pdf) there is a chart with the stated goal “95% of all 11th graders will take the ACT in 2012-13,” but the chart itself shows annual incremental increases, culminating at 95% for all groups in 2016-17. It was long ago decided that all students would take the ACT in 2012-13; whoever prepared the left part of the chart knew this, but whoever did the increments on the right did not (and apparently didn’t read the left part). Here it is:

Other problems with the swing are more subtle. There is another section where ACT goals are expressed in terms of average scale scores. This appears to be another case of lack of coordination between the two hands. As discussed below, the sections related to students reaching the ACT “College Readiness” benchmarks are left mostly blank in recognition of the fact that increased participation due to the test-taking mandate will almost certainly lower the starting point. The people doing the average scale score section don’t seem to have understood that. Their chart shows steady and unrealistic growth (except a 0.1 drop for white students in the final year), with all groups reaching 24 after five years. Here it is:

You may think this is nitpicking, but these are highly paid professionals who didn’t do their homework to arrive at realistic goals and have made the kind of stupid errors that would cost students serious points on the standardized tests that these same highly paid professionals are employing in the name of “accountability.” Shouldn’t they be accountable?

Despite some coaching from the Board that resulted in fixing the issues above, related problems remain in the second version. Those are covered in the “Strike Two” section.

Strike Two

The second swing — the version of the “Accountability Requirements for Achievement Gap Plan” on the 11/5/12 agenda — is much expanded (61 pages in comparison to 31), but not much improved. Another wild, unbalanced and mistimed lunge at an almost unhittable pitch. Like the first (and so many of the things produced by MMSD administration), much space is devoted to documenting that staff are very busy (of course repeatedly documenting this helps keep people busy) and very little to what is going on with students (I’m not sure why this is “accountability”). Like the first, the actual “accountability” focus is on “goals.” Like the first, many of these goals (and many of the benchmark starting points) are left blank or labeled “TBD.” Like the first, where there are numbers attached to the goals, they are wildly unrealistic.

As the play-by-play announcer, I’m going to limit detailing how this swing misses to two places where numbers are attached to standardized test based goals. The first involves the ACT; the second the state achievement tests (now WKCE, soon to be “SMARTER Balanced Assessments”).

As explained above, I don’t like “goals” in standardized test based “accountability systems” (I’m not very fond of standardized test based “accountability systems” in general, but no room for all that here), but if you are going to have goals, they should be realistic, they should be based on in-depth knowledge of the tests, the performance of comparable students on these tests, and the improvements achieved elsewhere using similar programs. As one Board Member pointed out at a recent meeting, this is exactly the kind of expertise that the Board expects from their highly paid professional administrators. They ain’t getting what they paid for (in baseball terms, we are approaching Alex Rodriguez in the last post-season).

The error Mary Burke pointed out with ACT participation has been corrected. At the previous meeting there was a discussion of how expanded ACT participation will yield new baseline starting scores, and this was (in the first version) and is (in the second version) reflected by leaving blank most of those portions covering percents of students scoring at or above the college ready benchmarks set by the ACT. For the same reasons, the ACT “Average Composite Score” section discussed above is now blank. All this is good, but in the left hand column of the benchmark charts in both versions, for each subject area there is a 40% goal (page 32). I’m going to leave aside important criticisms of the ACT Benchmarks to address why the 40% goal is problematic. Nationally last year, only 25% of the mostly self-selected test-takers met the benchmark in all four subjects. The percents varied greatly, from 67% in English to 31% in science. At Hersey High (with their test friendly demographics and over ten years of emphasizing the ACT) only 39.2% of test-takers made all four benchmarks. The goals for MMSD should reflect this reality, and similar evidence on subgroups. It should be noted that you can reach the 40% goal in each individual subject and still not have 40% meeting all four benchmarks, but my point is that the data we have shows that 40% is easier or harder for different subjects, and that 40% may be out of reach in some subjects.
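The per-subject versus all-four distinction is easy to illustrate. Here is a minimal sketch with five invented students (not MMSD or ACT data) showing how every individual subject can meet a 40% rate while far fewer students clear all four benchmarks at once:

```python
# Hypothetical illustration: per-subject benchmark rates can each reach 40%
# while far fewer students meet all four benchmarks simultaneously.
# Each row: (english, math, reading, science) -- True = benchmark met.
# These five students are invented for illustration.
students = [
    (True,  True,  False, False),
    (True,  False, True,  False),
    (False, True,  True,  False),
    (False, False, False, True),
    (True,  True,  True,  True),
]

subjects = ["English", "Math", "Reading", "Science"]
n = len(students)

# Per-subject rate: fraction of students meeting that one benchmark.
for i, name in enumerate(subjects):
    rate = sum(s[i] for s in students) / n
    print(f"{name}: {rate:.0%}")

# All-four rate: fraction meeting every benchmark at once.
all_four = sum(all(s) for s in students) / n
print(f"All four: {all_four:.0%}")
```

Here three of five students (60%) clear each of English, Math and Reading, and two of five (40%) clear Science, yet only one of five (20%) clears all four, which is why a uniform 40% per-subject goal says little about the all-four figure.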

There are similar, but more pronounced and complex problems with the section that sets goals of 90% “proficiency” on state tests in Mathematics and Literacy at the end of five years (page 17). Here is the chart for literacy (sorry for the bad reproduction):

Although the WKCE is referred to, the numbers in the far right column reflect the very problematic “WKCE as mapped to NAEP cut scores” (see “The news from Lake Gonetowoe” for some of the problems with these cut scores), and the WKCE is on the way out, to be replaced by “SMARTER Balanced Assessments.” There is some confusion here that I’m going to avoid by simply saying “state tests.” Since the NAEP derived cut scores are the order of the day, I guess MMSD has to use them, but they have a choice about which levels to concentrate on, and “Proficient” is the wrong level.

My preference would be to do the multiple measures, small ball thing and track movement among scale scores, or failing that movement among the various cut score defined levels (which is what the “Growth” calculation in the new Report Cards does). If you are only going to use one level and are going to set goals, “Basic” is the level you want. It is where you will see the most movement and get the most useful information.
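A rough sketch of what that “movement among levels” monitoring could look like; the records are invented, and this is not MMSD’s actual Growth calculation, just the general idea of counting per-student movement rather than chasing a single proficiency target:

```python
# "Small ball" monitoring sketch: rather than one proficiency-percentage goal,
# track each student's movement among the cut-score-defined levels year to year.
# The level names follow NAEP; the student records are made up.
LEVELS = ["Below Basic", "Basic", "Proficient", "Advanced"]

def level_change(last_year, this_year):
    """Return how many levels a student moved (+ up, - down, 0 held)."""
    return LEVELS.index(this_year) - LEVELS.index(last_year)

# Hypothetical student records: (last year's level, this year's level).
records = [
    ("Below Basic", "Basic"),
    ("Basic", "Basic"),
    ("Proficient", "Basic"),
    ("Basic", "Proficient"),
]

moved_up = sum(1 for a, b in records if level_change(a, b) > 0)
moved_down = sum(1 for a, b in records if level_change(a, b) < 0)
held = len(records) - moved_up - moved_down
print(f"Moved up: {moved_up}, moved down: {moved_down}, held: {held}")
```

The point is that the up/down/held counts tell you whether things are getting better or worse and at what pace, which a single end-state goal does not.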

Although standards-based reporting offers much of potential value, there are also possible negative consequences as well. The public may be misled if they infer a different meaning from the achievement-level descriptions than is intended. (For example, for performance at the advanced level, the public and policy makers could infer a meaning based on other uses of the label “advanced,” such as advanced placement, that implies a different standard. That is, reporting that 10 percent of grade 12 students are performing at an “advanced” level on NAEP does not bear any relation to the percentage of students performing successfully in advanced placement courses, although we have noted instances in which this inference has been drawn.) In addition, the public may misread the degree of consensus that actually exists about the performance standards and thus have undue confidence in the meaning of the results. Similarly, audiences for NAEP reports may not understand the judgmental basis underlying the standards. All of these false impressions could lead the public and policy makers to erroneous conclusions about the status and progress of education in this country. (Emphasis added)

Basic

This level denotes partial mastery of prerequisite knowledge and skills that are fundamental for proficient work at each grade.

Proficient

This level represents solid academic performance for each grade assessed. Students reaching this level have demonstrated competency over challenging subject matter, including subject-matter knowledge, application of such knowledge to real world situations, and analytical skills appropriate to the subject matter.

Advanced

This level signifies superior performance.

I think that at this time the “Achievement Gaps” work in MMSD should concentrate on getting students to the “Basic” level, as defined by NAEP.

This belief is reinforced by national data on student NAEP performance. This first chart shows the 8th grade NAEP level distribution for all students (NAEP tests a sample of students and adjusts reporting to reflect the entire population, charts from here):

In 2011 42% were in the “Basic” level. This is where the median and mean are. If we are most concerned with the students who aren’t reading and can’t do simple math, that means moving them from “Below Basic” to “Basic.” I have no problem with also monitoring “Proficient” and “Advanced,” but the heart of this is in the basic category.

Two more graphs to show a little more of this and transition to the goals being set. This one shows the distribution of scores for those students not eligible for Free and Reduced Lunch:

The next is for Free Lunch students (NAEP reporting here does not combine Free and Reduced):

I’m not going to deny that the 28-point “proficiency” gap between these two groups is worthy of attention, but I will argue that the 26-point gap in “Basic” or above and the gap of 22% in those reaching “Basic” are more important and more likely to be narrowed by the programs in the Achievement Gaps Plan. This is where the action should be and what we should be watching (if we are only going to watch one level).
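To make the cutoff arithmetic concrete, here is a small sketch of computing gaps between two groups at different levels; the distributions below are made up for illustration, not the actual NAEP figures:

```python
# Illustrative sketch (invented distributions, not real NAEP data):
# the gap between two groups depends on which cutoff you watch.
# Percent of each group at each level, ordered low to high.
levels = ["Below Basic", "Basic", "Proficient", "Advanced"]
not_frl = [15, 40, 35, 10]   # hypothetical non-Free/Reduced Lunch group
frl     = [40, 42, 16, 2]    # hypothetical Free Lunch group

def at_or_above(dist, level):
    """Percent of the group scoring at or above the named level."""
    i = levels.index(level)
    return sum(dist[i:])

for cutoff in ["Basic", "Proficient"]:
    gap = at_or_above(not_frl, cutoff) - at_or_above(frl, cutoff)
    print(f"Gap at '{cutoff}' or above: {gap} points")
```

Computing the gap at each cutoff, rather than only at “Proficient,” is what lets you see where the movement (and the most reachable improvement) actually is.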

If it isn’t already obvious from these charts, the 90% “Proficiency” in five years set as a goal in versions 1 and 2 is a pipe dream, like the Chicago Cubs winning the World Series. No competent education professional familiar with NAEP cut scores and performance levels and MMSD would put this before the Board of Education for consideration, yet some combination of MMSD administrators signed off on it, twice. Strike two.

The Next Pitch

I wanted to get this finished and posted before the 11/5 meeting, but I didn’t. I also wanted to attend the meeting, but it is/was my son’s birthday. I hope that some of these issues and some others were raised at the meeting (I’ll watch the video and find out).

There are many other issues, like the fact that the AVID section doesn’t appear to recognize that if the other “goals” are reached, the comparison group will be an upwardly moving target; that “Stakeholders” is most often defined as district staff and not students, parents or community members; that the Cultural Responsiveness work has no academic results attached to it; that in Madison — a Union Town — the Career Academy section has no role for organized labor in planning or implementation, but business interests have the best seats at the table (and some will be paid for being there; this is what you expect from Scott Walker, not MMSD); and, to repeat what was said above, that much of this is documenting staff being busy, and in many key places where measurement of one sort or another is called for the lines are blank or say “TBD.” On this last (with the exception of the ACT, where the mandated participation warrants holding off), the idea of attaching a requirement to have an accountability plan was to have a plan, not a promise to come up with one at some future date. I could go on (and on), but I think I’ve made the point that the quality of thought and work that has gone into this by the administration thus far has been lacking in many areas.

It looks like another draft (the third pitch) will be coming back to the Board on November 26th. I very much hope that draft is much better than the work we have seen to this point. I hope it isn’t strike three. The administrators have demonstrated that they can make corrections when problems are pointed out to them (like the inexcusable errors with ACT participation in the first draft), when they get good coaching from the Board. That is a good thing, but expectations should be higher. It isn’t the Board’s job to know the distribution of NAEP scores, and it certainly isn’t their job to educate the administration on this (it goes without saying that there is something very wrong when it falls to me — an interested community member — to point out their apparent ignorance in the very areas they are being paid to be experts in). There needs to be some accountability here; the Board and the community have a right to expect better work. If we aren’t getting it from those now responsible, we need to find people who can provide it. The Board is not going to make good decisions without good information. The improvements our students and community need and deserve are not going to happen without competent people at the top. The Board needs to hold the administration accountable, and we need to hold the Board accountable for doing that.

Three Board seats on the ballot in April 2013. Could be a whole new ballgame.

One response to “How Many Strikes? or Whither Accountability? #2”

I just saw this from Matt DiCarlo and I think it applies here, and to the new Wisconsin “Accountability” system as a whole, especially the NAEP derived performance levels. Read the entire post. Here is an excerpt:

One big idea for setting higher expectations is to incentivize improvement (e.g., innovation, effort) among teachers and administrators. Yet, particularly in lower-scoring schools and districts, it’s difficult to believe that the pressure to boost test scores can increase too far beyond current, high levels (and, if it can increase, it probably shouldn’t). Let’s remember that schools have for ten years been at the business end of the test-based accountability gun. The pressure is there; teachers and administrators feel it, every day. It may not be particularly compelling to argue that a school with a current rate of 40 percent is going to respond much differently to a 95 percent target compared with 85 percent.