Education Reform in Context: Research, Politics, and Civil Rights

Christopher Edley, Jr.

THE CONTEXT

The modern civil rights movement made popular the aspiration that we improve educational outcomes for children from communities to whom America historically denied equal rights and equal opportunities to advance. Political tides notwithstanding, the moral claim has grown stronger with time, not weaker. And now the structural changes in the economy combine with inexorable, almost breathtaking demographic changes to add a material urgency making that moral claim an imperative for all.

Disparities as Reflection of History and Portent for the Future

The conference papers and presentations highlighted matters of context that, in a reasonable world, would lead to a redoubling of efforts to promote equal opportunity. First, the dramatic racial disparities, summarized in Chapter 2, speak to our past, present, and future. They are the evidence of the lingering effects of historical sins, and of the legacy of racial caste. The disparities also signal painful imperfections in the machinery of opportunity today. But for the future, and especially in light of the demographics, the disparities measure a challenge to the nation’s future greatness: deepening, persistent divisions threaten our collective economic prosperity, social stability, and capacity for democratic self-governance. Moreover, this is a challenge to our national character. If we accept that racial and ethnic disparities are impervious to intergenerational mobility, then we confess that the American myth is a lie.

A dimension of this future threat is our growing separateness by color and class in our schools. The consequences are evident in learning outcomes, but also in such broader societal outcomes as shared community and intercultural competence in the workplace, the political arena, and the civic sphere generally. Nonwhite students already constitute majorities in California, Texas, Mississippi, Louisiana, Hawaii, and New Mexico and make up 67 percent of all students in the nation’s 100 largest school districts.1 Schools with large majorities of minority children are far more likely to have high concentrations of poverty, which in turn makes those schools far less likely to be successful.2

We know that the workforce will be increasingly Hispanic and black, but will these workers have the skills to be competitive and to keep America competitive? The wage advantage of young adult men with bachelor’s degrees over young men who did not complete high school increased from 40 percent in 1973 to 124 percent in 1998.3 Moreover, the data indicate that minority dropout rates exceed college completion rates (Table 1). Without more effective public policies and private practices, our divisions will widen as the growing market premium on education makes poor schooling a socioeconomic death sentence.

TABLE 1 Percent of High School and College Graduates, Ages 18-29 by Age, Race and Hispanic Origin

Political Context

A second salient aspect of the context is the politics of school reform. In the 2000 national election and in the opening months of the Bush presidency there was partisan competition to be passionate and “bold” on the subject of school improvement.4 Such competition, while a good place to start, does not necessarily translate into thoughtful proposals.

Although through much of the 1980s and 1990s there were partisan battles over whether to eliminate the federal Department of Education, President George W. Bush abandoned that oft-stated GOP position and instead proposed greater percentage increases in education funding than for any other domestic program in his first budget.5 Congressional Democrats successfully sought still more, but this merely confirmed a recent pattern of bipartisan congressional interest in an expanded federal financial role in K-12 education, even while the form for federal activity remains hotly debated. The prototype Republican plan tends toward block grants with few federal requirements apart from intensive state-defined testing programs for public disclosure and accountability purposes, and perhaps augmented by encouragement for private school vouchers. The prototype Democratic plan tends toward substantial additional funding for more specific needs widely thought to be critical ingredients for school improvement, including more and better-trained teachers, capital investments in facilities and technology, and smaller class size in the early grades. The legislative compromise lies between these positions, and includes more resources, substantial emphasis on testing, and flexibility short of block grants.6 The general nature of this national legislative consensus seems likely to remain stable for several years, and much of the programmatic and structural change will continue to be driven at the state level, with some significant but not revolutionary expansions in federal support for those efforts.

A key unresolved question, however, is whether the equity and disparity issues beginning to emerge in the national discussions, and in a few states, including Texas under former Governor Bush, will become a powerful force shaping state and local policies. At this writing, new federal legislation seems likely to include a requirement that state accountability systems report the results of their frequent student tests disaggregated by race, disability, English language proficiency, and class.7 Civil rights and other advocates unsuccessfully urged Congress to go a step farther by requiring that published evidence of disparity be more than a hoped-for prod for popular political accountability. In addition, some of these advocates and observers argued that the change in achievement disparities should be an ingredient of the statutory requirement that states make “adequate yearly progress” in school improvement or face administrative and fiscal sanctions from the federal Department of Education. Traditional conservatives have been opposed to such prescriptiveness, and the traditional liberals have been opposed to fiscal sanctions which, they believe, ultimately hurt needy children and school districts.

All of this points to the need for an ambitious research agenda along the lines of the work discussed in this volume in order to continue to refine the newly ambitious federal role and the increasingly activist state reform role over the coming decade.

The Civil Rights Connection

A third area of concern, even for a convocation primarily of social scientists, is the civil rights context. The foundation of the modern civil rights movement was the attack on school segregation, not because black leaders believed that black children could only learn if seated next to a white child, but because they believed that apartheid in education would mean apartheid in opportunity; that separate could never be equal; and that unequal education would perpetuate the entire structure of injustice for generations to come. Contemporary racial justice advocates, following decades of attack on barriers in voting, employment, housing, entrepreneurship, criminal justice, and so forth, are now revisiting education issues with renewed vigor. There is a growing consensus within that community that equal education opportunity and the elimination of disparities in achievement and attainment must be the number one agenda item for the civil rights movement in the decade ahead.8 As some have put it, algebra is a civil right.9 While liberals stress the mantra that “every child can learn,”10 conservatives argue that poor and minority families deserve private school vouchers so that they will supposedly have choices like other families to escape failing schools, and people across the spectrum proclaim that we must “leave no child behind.”11

Another aspect of the civil rights context, however, is less about the rekindled aspirations for educational successes than about insistence that the antidiscrimination and equality norms familiar to civil rights law be given their appropriate, contemporary interpretation and aggressively enforced. One prominent example concerns testing.

When President Clinton proposed a voluntary national test (VNT) in his 1997 State of the Union Address,12 he viewed it as an important device to promote comparability and accountability, and a needed spur to the standards-based school reform movement. Several members of the Congressional Black Caucus, among other leaders in minority communities, opposed the VNT. Among their reasons was the risk that such tests would be used not only for diagnostic and intervention purposes, but for high stakes (denial of diplomas, tracking into dead-end curricula, and retention in grade) imposed on students who may not have had the opportunity to learn the material included on the tests. Thus, went the critique, the tests would almost surely be used to penalize the very students who were being ill-served by failing schools, rather than used to identify underperformance by teachers, administrators, and officials at all levels. President Clinton and Secretary Riley reacted to such civil rights concerns rather dismissively, suggesting privately that perhaps these leaders were not committed to excellence or high standards.13

This charge was, of course, utterly false. The civil rights claim has three central components. First, conventional civil rights antidiscrimination law suggests that when a policy, although race-neutral on its face, is applied and produces racially disparate results, there is a prima facie case of discrimination under regulations implementing the Civil Rights Act of 1964.14 The burden then shifts to the policy maker—in this case school authorities—to demonstrate that the policy is “educationally necessary” to the legitimate purposes of the government. If officials meet this burden, then the civil rights plaintiff would have the burden of showing that, even if educationally necessary, there are alternative means of pursuing the legitimate goals without so serious a disparate impact. There are, of course, “antitesting” advocates who oppose so-called standardized testing in most forms and contexts. The civil rights complaint, however, is not against the test, but against the high-stakes use of the test for retention in grade or denial of diplomas, rather than for the wide range of other accountability and intervention measures that would not punish the ill-taught or poorly performing student. Relatedly, the civil rights claim is that a high-stakes regime cannot be “educationally necessary” if the assessments fail to satisfy the generally accepted professional norms of the psychometric and testing community—see the principles in the “Joint Standards” and in various NRC publications.15

The important civil rights thesis, underlying all antidiscrimination law, is this: When a policy or practice is favored by powerful interests but noxious to a “discrete and insular minority,”16 we cannot be confident that the ordinary rules of majority politics and democratic policy making will produce just outcomes, even over an extended period of time. Put bluntly, if the victims of a policy are largely minority and poor, the self-correcting mechanisms of deliberation and reform may not work so well. Antidiscrimination laws, whether rooted in the Constitution or in statute, are intended to be antidotes to the antiminority tilt of democratic rule (in, for example, a subordinate jurisdiction, or at some future moment). In that special sense, antidiscrimination laws are antidemocratic and at certain times and in certain places contrary to popular wisdom or a majority’s preferences. That’s their purpose.

The structure of this legal argument has become clear over the past few years. The relationship between scientifically sound testing practices and civil rights law was examined in an important 1999 publication by the National Research Council (NRC), High Stakes: Testing for Tracking, Promotion and Graduation, edited by Robert Hauser and Jay Heubert. That same analysis was largely adopted in a formally published guidance on test use produced by the Department of Education’s Office for Civil Rights in December 2000, since “archived” by the new Bush administration pending detailed review.17 It has met with little success in the courts, however, because judges so naturally tend to defer to the expertise of state and local school officials, and the judges themselves are, like politicians and much of the public, seemingly in the thrall of testing.18

To be sure, there is a largely unexamined empirical assertion underlying the arguments of high-stakes proponents: attaching high-stakes consequences for the students provides an indispensable, otherwise unobtainable incentive for students, parents, and teachers to pay careful attention to learning tasks. For the countless parents, policy makers, and observers who approach these debates as instrumentalists, the accuracy of this assertion is a central mystery as we struggle to close the education gap.

High-stakes testing is also problematic from a civil rights perspective if curriculum is not aligned with the test, or if instruction is not aligned with the curriculum.19 The simple insight, reflected in both case law and professional testing standards, is that it is a denial of due process to punish a student when he or she has not even had a chance to prepare for the exam. This is the most pointed form of a general concern about providing adequate and equitable opportunity to students before imposing on them a potentially devastating decision about tracking, retention in grade (with, many believe, resulting increases in the risk of dropping out),20 or diploma denial. While liberal education reformers tried during the first Clinton administration to include general “opportunity to learn” provisions as a condition of federal financial assistance to the states and a necessary complement to standards-based accountability, this linkage was soundly rejected in Congress and has not generally been made in state policies. The narrower legal claim of civil rights and other advocates is that, in some circumstances, opportunities may be so inadequate in relation to the high-stakes test as to amount to fundamental unfairness in a constitutional sense. Court decisions and state policy makers have often responded by building a lag into the schedule between announcement of a high-stakes test and its implementation, presumably to permit alignment of curriculum and instruction so that everyone has a fair chance to get ready.21 The deeper question, requiring case-specific research, is whether the alignment and preparation really take place for the neediest and least powerful before the accountability axe falls.

This issue of adequate opportunity has civil rights resonance outside of the testing arena. For example, Michael Rebell’s contribution in Part III of this volume describes a thus far successful effort in New York state courts to demand greater equality in the provision of the minimum adequate education guaranteed by that state’s constitution. Failure to do so is a denial of rights. I would add that, given this right under state law, it therefore becomes a denial of federal constitutional due process rights to deprive a child of that right, and a violation of federal civil rights statutes as well.22 Indeed, there are at least two major strands of civil rights claims being pursued under various state constitutional law theories: failure to provide disadvantaged students with a minimally adequate basic education, and failure to assure some rough comparability in education finances or services across school districts. These interdistrict equity claims, while impossible under the U.S. Supreme Court’s interpretation of federal equal protection doctrine,23 have met with significant success in the state courts, as Rebell details.

It is important to bear in mind, however, that attention to these fancy, still evolving civil rights claims should not cause us to ignore the myriad garden-variety discrimination claims based on intradistrict inequalities (e.g., minority schools without textbooks or certified teachers),24 or discrimination in the administration of ability grouping, special education, school discipline, and so forth. Beneath much of the subtle discrimination, which advocates believe is all too common among educators and officials, is a form of racial stereotyping or “academic racial profiling” in which expectations are lower for students of color.25 Against this backdrop, thoughtful focus on racial disparities, as represented in this volume, is a vital antidote.

The gravamen of all this is that the successes or failures of minority children in our schools must be understood to be a matter of civil rights urgency, and the concerns are far broader than the historical attention to racial isolation and state-sponsored segregation. The agenda in this new century encompasses a whole vision of opportunity and achievement.

The Urgency of School Improvement

A fourth and final aspect of the context is the broad sense that there is a crisis in public education. Polling evidence suggests that many parents feel that, while my child’s school is fine, public schools in general are in serious trouble.26 Another piece of evidence is the continuing interest in private school vouchers, public school choice, charter schools, and other strategies that, in one way or another, amount to a rejection of business as usual in the public school system and in particular a skepticism that the customary strategies for bureaucratic innovation and reform will suffice. At present, the bulk of leadership in minority communities, both nationally and regionally, support public schools, oppose private school vouchers, and voice at least cautious commitment to the ordinary processes of incremental progressive reform. It seems likely, however, that the erosion of this commitment will accelerate unless leaders and their constituents see substantial gains in minority achievement and reductions in disparities within the next few years. There has been too little attention in policy and political debates to the rate of school improvement, as though truly modest movement in the right direction is cause for celebration and self-satisfied media events by officials from the White House to the school house.27 The linchpin of federal accountability imposed on the states, in fact, has been the requirement that states adopt some kind of assessment system and demonstrate “adequate yearly progress.” To any dispassionate observer of such policy outputs, this is all but laughable: “progress” has only the thinnest of statutory definitions, and “adequate” has no definition whatsoever.28 Surely, the findings surveyed in this volume suggest that the dismaying disparities along lines of color and class are too dangerous for half measures or slow cures. Yet, curiously, there is little public debate and little research about the rate of change we should require of school reform efforts in order to win the continuing support of voters and taxpayers. Part of the context for this examination, I suggest, is that patience is wearing thin, and is not inexhaustible. In short, improvements must be pursued and indeed accomplished with a sense of urgency, lest the consensus for supporting public education vanish over the course of the next generation, or sooner.

Our task in light of this context is to take a set of normative propositions—about the opportunity, achievement, and justice we want—and recast them so that they are more than mere statements of aspiration, hortatory in character. Instead, they must be scientifically descriptive statements about closing achievement gaps that are then married to an enforceable regulatory regime. Surely the facts presented in this volume and at the conference suggest no less.

HOW STRONG IS THE RESEARCH FOUNDATION FOR CHANGE?

From the perspective of the National Research Council, however, this raises the question of whether we have a research predicate for the dramatic if not revolutionary K-12 change I believe the context demands. We might consider research in three dimensions: it is a foundation for policy choice, a critical guide for implementation engineering, and a foundation for enforcement.

There is more to this than an academic’s standard plea for more research. Return, for example, to the issue of a minimally adequate education under state constitutional and federal due process theories. Unless there is a research predicate to help define and measure the vague “adequacy” concept derived from legal doctrine (not to mention education policy), it will be impossible to create a judicially manageable standard or a useful set of objectives for policy makers to attend to. Or, to use another example, understanding scientific principles regarding the predicate for appropriate use of tests (construct validity, reliability, alignment, inferential validity, etc.) is necessary. But it is obviously not a sufficient predicate for enforcing fidelity to those norms in the political, bureaucratic, or legal processes that shape school change. Is the research predicate adequate? The conference and this volume suggest that it is actually pretty good. This requires some caveats. Notwithstanding daunting uncertainties, the findings are good enough for policy making: good enough for government work, as the expression goes. This is because if politics presses, politicians will act; when the research base is nonexistent or inconveniently inaccessible, then the dispositive “research” is provided by pollsters who ferret out hot-button phrases and symbolic gimmicks, not research-based policy proposals. Pollsters drive the policy choices, rather than research evidence. My favorite example is the early Clinton administration, strapped for cash, touting school uniforms as though they were a central component of bold federal leadership on school improvement. Why? It polled well, and fit with the desired political message.29 Anecdotal evidence sufficed.

There is a further, crucial caveat. Certainly much research remains to be done—conceptualized, even—in the continuing effort to give educators and parents the insights needed to promote learning. The exploding diversity in school districts and classrooms makes some dimensions of the research urgent.

Research on Achievement and Learning

This volume, building on the conference, does much to illuminate the gap, its dynamic over time, and to some extent its determinants. This kind of research is critical in order (a) to target treatments; (b) to some extent to actually design the treatments; and (c) importantly, to help build political will for needed changes by demonstrating that the problems are frightening but the possibilities for success are real. Many policy interventions do not depend upon a detailed understanding of how the achievement gap comes to be. Instead, there are some treatments likely to be helpful no matter what the origin of the disease, so to speak. Moreover, even if we are not using the evidence about the etiology of disparities to target or design our treatments, research that goes only to the magnitudes helps build the moral consensus needed if we are to find and apply resources in a sustainable way. Certainly, we must continue with an even more ambitious research agenda. But meanwhile, leaders must be prepared to act.

Following discussion of the achievement gap, the conference turned to the subject of learning: the research on how we learn, on early childhood learning and appropriate interventions, and on reading specifically as the indispensable foundation (see Chapter 3). Of course there are, again, continuing disagreements about what the research demonstrates, but a substantial body of work, including important reports by the NRC (see Box 1-1 in Chapter 1, Part I, of this volume), offers important findings that do deserve wide acceptance. In particular, Lauren Resnick made a critical observation: we now have a conceptual and an empirical foundation to substantiate the claim that virtually all students can learn at high levels (see Chapter 6, Part I). This conclusion is of singular importance for policy makers and politicians. The principle is more than an eloquent turn of phrase.

Tools for Policy Change

Turning to particular programmatic strategies to address adequacy and equity, the conference discussion covered the five most salient strands of the broader policy debate—choice, teaching, assessment, accountability, and integration.

One of these topics, choice in its various forms, sparked little discussion, perhaps because from a research perspective it is speculative. Indeed, much of the school choice debate has long struck me as an ideological matter in a central sense, in particular those species of “choice” embodied in private school vouchers and in large-scale public school choice. The commanding question for reformers is whether quasi-market incentive and signaling schemes based on family decision makers will be more effective at driving change than the alternative reform schemes. Those alternatives promise school improvement driven by politico-professional and bureaucratic methods, including, of course, assorted incentive elements. This question of comparative efficacy (the market or not the market) simply has not been answered by research, leaving the strategic choice even more open than most to ideological battle and policy prejudice.

For many serious policy analysts, the choice issue is uninteresting because there is so little good science to digest, the methodological challenges seem all but imponderable, and purists insist that there should be large-scale randomized experiments, which seem impossible on practical grounds. The few studies to date have fueled a firestorm of controversy out of proportion to the available evidence.30 This is unfortunate because coarse political decision making will flourish in such science-starved environments, like a staph infection with no disinfectants in sight. So the politico-policy system will muddle through, perhaps making some dangerous choices along the way. And we should not count on bold new research and evaluation efforts to detect and correct promptly the errors of our ways, especially with poor and powerless victims. Here is where the enormous decentralization and diversity in the public school system may be a blessing indeed.

On the question of teaching, the most important insight is that basic “research” result: In order to improve student achievement, pick better students; failing that, do better and more teaching of the students you are stuck with. The former strategy is illustrated by retention, over-referrals to special education, “push-out” strategies, and choice schemes that involve overt or subtle screening on family, motivational, or academic variables. The latter strategy is illustrated by reducing class size, investments in greater teacher professionalism and development, extended school day or school year, research-proven instructional strategies, curriculum that is aligned with the achievement goals, and so forth. It is not difficult to inventory the list of “do’s” and even many of the “don’ts.” The question is largely one of will (resources, leadership) and implementation—which is not to gainsay the difficulties there.

That brings us to assessment and accountability. The conference discussion included substantial attention to the critical distinction between using tests for diagnostic or assessment purposes on the one hand, and attaching high-stakes consequences to those test results on the other. High stakes for students raise concerns among those in the civil rights community, as discussed earlier. High stakes for teachers raise concerns among many teachers and unions, and not simply for job security reasons. There are daunting methodological questions31 of how to measure “value added,” ranging from assessment validity to fluid student enrollments, and those problems of method are considered by many to be unacceptable if the purpose of the measurement has high stakes for some powerful constituency. Finally, in any high-stakes context, there are serious questions of testing reliability (the random and other variability one might observe between hypothetical administrations of a test) that political policy makers seem never to confront.

Children, of course, are less powerful, so doubts about student-edged high stakes have far less political potency. Nevertheless, there is growing discussion of evidence concerning the misuse of such tests, as judged by reference to the Joint Standards,32 and especially the question of how such tests may drive up retention rates and special education referral rates, while driving down diploma completion rates.33 I refer to diploma completion, because most official data on dropouts is seriously incomplete and misleading,34 and because the GED is a far less valuable credential in the labor market.35

The concerns over assessment and student-edged accountability are only heightened by the intriguing work presented by Claude Steele concerning stereotype threat and disidentification, described in Chapter 4, Part I, of this volume. There should be little doubt that test-driven standards-based reforms taken as a whole are spurring important school improvement in a great many places. There is, however, collateral damage. Steele’s work raises questions both about a particular form of collateral damage among traumatized test-takers, and even more fundamental questions about the validity of the underlying assessments and inferences from them. If, as he suggests, the test and its context produce psychological responses that depress the performance of the test-taker, then the resulting measurement has a systematic error that biases the results downward, generally to an unknown degree. These are warning lights, hazard signals, and sirens going off continuously. And they have to be louder and brighter because of the imperatives for revolutionary change, coupled with the fairness demands of a civil rights sensibility.

Integration

With respect to school integration by class and race, the most important point to be gleaned from the conference is that there is far too little attention in political and policy debates to the importance of integration as a tool for improving learning outcomes and, ultimately as important if not more so, as a tool for improving societal outcomes. Without an integration strategy responsive to our exploding diversity, one must worry about civic virtues and about our personal and collective capacity to thrive.

SPECULATIONS AND FURTHER WORK

Finally, we turn to a few speculations, focusing on several matters for further investigation and consideration.

English Language Learners (ELLs)

The political and policy conflict over how best to educate students who are not proficient in English continues,36 while the number of ELLs enrolled in public schools increases. Between 1980 and 1995, students speaking a language other than English at home increased from 8.8 percent of the total student population to 13.3 percent.37 Meanwhile, to date, research shows that the differences in academic learning acquired through bilingual education programs that use native language support and English immersion programs are not that significant.38 However, the knowledge gap between ELLs and their non-ELL peers is great. One leading expert, Kenji Hakuta, has noted several findings he believes are well supported and widely accepted in the research community (if not among politicians and policy makers), including:

There is significant variation in the definition and implementation details of ELL programs, creating enormous difficulties for research and evaluation.39

77 percent of ELLs come from low-income backgrounds and are generally concentrated in linguistically segregated schools in which most of the school population comes from low-income backgrounds.40 Among ELL programs, students receiving transitional bilingual education are more socioeconomically disadvantaged and attend higher-poverty schools than students in ESL. As between the two dominant models, transitional bilingual education and ESL, the former appears to be modestly better, but neither makes a substantial dent in the achievement gap between poor ELLs and middle-class English speakers. In other words, the furious political debate between bilingual strategies is, from the perspective of student achievement, almost entirely beside the point.41

The research evidence is that no-support, sink-or-swim “immersion” strategies are distinctly inferior for the typical student; indeed, this was the basis for the Supreme Court’s 1974 decision in Lau v. Nichols.

How long does the language transition take? The evidence is that the time needed to achieve English proficiency depends on many factors, including age of the child, level and quality of prior schooling of the child, education level obtained by the parents, type of language instruction provided, the child’s exposure to English in his or her community, quality of the teachers, and quality of the instruction, including the bilingual education instruction, that a child receives.42 Given all these variables, researchers generally agree that the time it takes to become proficient in English ranges from two to eight years.43 There is no substantial research support for a one- or two-year time limit on bilingual services applicable to all students.

The legal principles are simple to state, if not to apply: students with limited English proficiency may not be denied access to an education due to failure of the schools to make reasonable accommodations through some form of language or translation assistance. The leading case, Castaneda v. Pickard, established a three-part test for determining whether a school district “has taken appropriate action to overcome language barriers” (648 F.2d 989 [5th Cir. 1981]). It requires that the school district’s program (1) be based on sound educational theories, (2) effectively implement those theories, and (3) produce results showing that language barriers are being overcome. Given the state of social science research, these legal principles suggest that no one approach to bilingual education should be mandated. Implementing strict one-year English immersion programs or mandating three-year time limits on bilingual education instruction would likely violate the rights of many children granted under the Equal Educational Opportunities Act.44

So, interestingly, the antidiscrimination legal framework puts the minimal adequacy of policy research directly at issue, at least in principle. (Ultimately, judges tend to defer to government policy makers, rather than make a more independent judgment, based on expert testimony, of which choices the research supports.) The political framework, however, is far less attentive to research evidence. And when social scientists for good and principled reasons dither with definitiveness, they invite irrelevance in policy debates, and there is more space for error and even demagoguery, as in the sometimes xenophobic demands for English-only laws.

Looking to the future, this situation must not stand. Language barriers are an increasingly important component of the racial and ethnic gap in achievement, the sharp wedge that widens economic and social divisions. We must have research of sufficient quantity and quality to match the growing challenge that this represents in so many communities.

High Stakes and Accountability for Others Besides Students

While there has been much attention to high-stakes testing for students, and an enormous scientific enterprise of psychometric and other disciplines focused on student assessments in that context, there is far less intellectual capital concerning high stakes for teachers, schools, districts, and states. For example, researchers have raised important questions about “value added” models that attempt to make valid inferences about achievement gains over time.45 Despite the scientific difficulties, the very structure of federal legislation now demands that states demonstrate “adequate yearly progress” in student achievement.46 Many states—among them Kentucky, Texas, New York, Florida, and California—purport to attach financial and administrative rewards and sanctions to measured changes in school and district performance on tests.47 The standards-based reform movement finds its motive force in accountability, which requires that the targeted actors above students demonstrate improvement over time.

Why is the emphasis on high stakes for students—diploma denials, retention in grade, tracking, even alternative schools—rather than high stakes for other actors? In part it is because students are the least politically powerful in the system, especially if they are poor and minority.48 An additional explanation, however, is that the problems of measurement are supposedly even more daunting when we contemplate high-stakes judgments at higher organizational levels: the number of exogenous variables seems to mount exponentially as one moves up the chain of responsibility; the data problems multiply (flux in student population, for example); authority is often diffuse; and so forth. All of this makes establishing causation, attribution, and culpability arguably more difficult—or so teachers, administrators and elected officials say when deflecting calls for high stakes directed at them rather than the students.

I am not persuaded that these defenses are true, that accountability is from a scientific perspective dramatically more difficult for teachers or districts than for students. Indeed, from a purely analytical perspective, some of the “noise” and randomness of individual test results and micro-level data becomes less of a problem when you aggregate, yielding inferences more supportable than those we make at the student level. Analytics aside, however, anyone on the receiving end of a sanction can offer explanations and excuses, be they student or state commissioner or anyone in between. The scientific question is how to gauge the truth of the excuses. The policy and political question is how much weight to accord them in light of the science.
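The aggregation point in the paragraph above can be illustrated with a minimal sketch. It assumes, hypothetically, that each student's score carries independent random measurement noise with a standard deviation of 15 points; that figure and the group sizes are illustrative assumptions, not numbers drawn from the conference papers. Under those assumptions, the random noise in a group average shrinks with the square root of the group size.

```python
import math


def standard_error_of_mean(noise_sd: float, n_students: int) -> float:
    """Standard error of an average of n independent scores with equal noise."""
    if n_students < 1:
        raise ValueError("n_students must be at least 1")
    return noise_sd / math.sqrt(n_students)


# Hypothetical noise level for a single student's score (an assumption).
NOISE_SD = 15.0

# Hypothetical aggregation levels (assumed sizes, for illustration only).
LEVELS = [("student", 1), ("classroom", 25), ("school", 400), ("district", 10000)]

for label, n in LEVELS:
    se = standard_error_of_mean(NOISE_SD, n)
    print(f"{label:>9}: n={n:>5}  standard error = {se:.2f}")
```

The sketch addresses only random noise. Systematic influences shared by a whole classroom or school, such as the exogenous variables and population flux mentioned earlier, do not average out this way, which is why the defenses still have to be weighed on the evidence.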

The science is too thin. We are in the midst of dramatic increases in K-12 expenditures in an effort to spur reform, but support for these welcome investments will soon evaporate unless the public sees effective accountability and meaningful improvements. Perhaps it is a good gamble that states and districts will drive change forward by focusing the high stakes principally on powerless children, with far less attention to carrots and sticks for other actors. (I am doubtful, and in any case it seems a cruel gamble.) Surely, however, our investment will be more secure if research provides more guidance in constructing higher-level accountability methods. This is an urgent matter.

Reconsidering Radical Decentralization

A more radical suggestion, perhaps, is that we make a less romantic and more scientific assessment of the decentralization in our 15,000-district education sector. The choice by national and state governments to decentralize should be considered one of several possible “treatments” or engineering strategies in school reform, just as a multinational conglomerate might adopt a strategy concerning centralization versus site-based autonomy. Is the strategy we’ve had the one we should choose in this new century?

Imagine the perspective of a passionate, concerned parent, hearing a claim that school improvement will come from devolving more discretion to principals and teachers. “Why?” asks the parent. “I’m not all that interested in giving principals or teachers the freedom to be stupid at the expense of my kid. I’m just not. It’s too important. Indeed, I’m not all that interested in giving my local school board the autonomous discretion to continue its history of bad administration, because the people in my community and I don’t have the practical political power to force our school board to do better.”

Here is an analogy. I am not interested in giving my local oncologist the freedom to experiment and innovate. I would prefer that the National Institutes of Health (NIH) be giving some guidance, that the oncologist feel considerable pressure to follow that guidance, and that the Food and Drug Administration mark some treatments clearly out of bounds because they are ineffective or dangerous. Ideally, I want the local oncologist to be aware of all the treatment options, and fully skilled at selecting among them. Absent the ideal clinician, however, I want a quality safety net. (I also want to be able to sue the doctor if she’s negligent.) And I want all of this, thank you very much, because it matters to me what choices are made, intensely. I feel only slightly less frantic about the wisdom of the choices shaping my child’s education.

This could be put another way. Starting with an acknowledgement of education problems in the decentralized system we have, where is the research evidence that just letting 15,000 flowers bloom is the better strategy for bringing about the tremendous changes needed to close the racial gaps in achievement, or the broader change the public demands?

Toward a Science of Diffusion

Finally, retreating from radicalism to accept the more realistic assumption of a high degree of decentralization, do we know enough about how change occurs? About the processes for the diffusion of reform strategies, especially the diffusion of research about successful practices under a variety of different circumstances? There is an enormous education policy literature, of course, but far less rigorous attention to the question of how insight about success in district A can be analyzed, transmitted, and applied to inform practice in district Z.

Between promising research and program evaluation on one end, and successful implementation on the other, a diffusion and refinement of knowledge takes place through a variety of processes varying in their formality and quality-assuring characteristics. These processes deserve far more study and self-conscious design effort than we have seen, including consideration of the need for more powerful intermediary institutions.49 Leaving it to schools of education and a meager jumble of in-service training investments will not do. Again, the magnitude of the challenges, combined with the coming of major new investments, makes this an important avenue for work.

Consider once more a medical analogy. How does clinical research about the latest strategies for combating a particular type of cancer in a particular type of patient find its way to the practice group in your local hospital, and to the desktop and the mind of the physician who is going to treat you? Well, it is a complicated process, with elaborate mechanisms involving a combination of institutions. Sometimes it works well, sometimes it doesn’t. But it is far less ad hoc than the diffusion of new practices to schools and teachers.

In medicine, NIH and other agencies are thinking hard about how to harness technology to shrink the length of time that it takes for the effective dissemination of new clinical strategies. There is no assumption that every patient ought to be treated the same and, in the case of cancer, there is a recognition that it is not a single disease, but a constellation of diseases. Some of the mechanisms of disease are shared, but some of them are different. And the treatments vary enormously, from the high end modern genetic interventions of the sort that we are going to be seeing increasingly over the next few years, to the common sense we-need-more-prevention. In this incredibly complex system, progress is not left to decentralized, unanalyzed processes of diffusion. There is focused attention to the problem of getting news out and into practice.

Now, we stand at the threshold of many tens of billions of dollars of new investments in school improvement, in the teaching profession, and in experimentation and research. A key question, therefore, is whether we are smart enough to make the best possible use of those new investments by devising better strategies and mediating institutions to take the best ideas and implement them. That problem, that puzzle, I think, is a set of research questions. The diffusion delays we see in education would be unacceptable for promising new treatments of cancer, heart disease, or even acne.

CONCLUSION

“Millennium Conference” is an awfully ambitious title, but for good reason. The conference organizers hoped we would recognize this as an occasion for making new commitments, and for rededicating ourselves to some things that are fundamental. The ideas of opportunity, achievement, and justice certainly do qualify. Americans have learned the hard way that when we are missing those things, this isn’t the kind of nation we want and we don’t have the kinds of communities our children deserve to grow up in.

The sponsorship by the Department of Education was a welcome opportunity to focus the National Academies on the importance of closing the opportunity gap. One can find in the work of the National Research Council much reason to be encouraged about the possible contributions of research science to that undertaking. Any and all possible undertakings in this regard must be encouraged, because it is difficult—I would say impossible—to imagine a more important set of challenges for the opening decades of this millennium.

REFERENCES

1. Digest of Education Statistics, 2000. NCES 2001-034. Washington, DC: U.S. Department of Education, 2001. Characteristics of the 100 Largest Public Elementary and Secondary School Districts in the United States: 1998-1999. NCES 2000-345, by Beth Aronstamm-Young. Washington, DC: U.S. Department of Education, 2000.

3. The Use of Tests as Part of High-Stakes Decision-Making for Students: A Resource Guide for Educators and Policy-Makers, Office for Civil Rights, Washington, DC, 2000. Available at www.ed.gov/offices/OCR/testing/index1.html Accessed June 14, 2001.

6. Lizette Alvarez, “Testing Requirement to Stay in House Bill,” New York Times, May 23, 2001, A22; Lizette Alvarez, “On Way to Passage, Bush’s Education Plan Gets a Makeover,” New York Times, May 4, 2001, A16.

7. H.R. 1, 107th Cong., § 111 (2001). The Texas accountability system, while in other respects criticized by some civil rights commentators, does have achievement data disaggregated by race and poverty, and does tie rewards and sanctions to performance of law-defined achieving students.

All, regardless of race or class or economic status, are entitled to a fair chance and to the tools for developing their individual powers of mind and spirit to the utmost. This promise means that all children by virtue of their own efforts, competently guided, can hope to attain the mature and informed judgement needed to secure gainful employment, and to manage their own lives, thereby serving not only their own interests but also the progress of society itself.

13. I had several conversations with President Clinton and Secretary Riley on this subject during 1997 and 1998, and each of them offered the same characterization to me of the civil rights concerns. My rebuttals were ineffective.

14. 42 U.S.C. §§ 2000d to 2000d-1. Administrative regulations to enforce Title VI contain standards for disparate impact cases. For example, the Department of Education’s regulations state that programs which have “the effect of subjecting individuals to discrimination because of their race, color, or national origin” can violate Title VI. 34 C.F.R. § 100.3. The U.S. Supreme Court recently limited the availability of private lawsuits to enforce disparate impact regulations, but the Court did not limit government enforcement of the regulations nor address the legality of the regulations themselves. Alexander v. Sandoval, 121 S. Ct. 1511 (2001). Title VI’s protections are limited to race, color, or national origin. Title IX of the Education Amendments of 1972 protects individuals based on sex. 20 U.S.C. § 1681. Persons with disabilities are protected in various ways by the Americans with Disabilities Act of 1990, 42 U.S.C. §§ 12101-12213, section 504 of the Rehabilitation Act of 1973, 29 U.S.C. § 794, and the Individuals with Disabilities Education Act, 20 U.S.C. §§ 1401-1420.

15. American Educational Research Association, American Psychological Association, National Council on Measurement in Education, Standards for Educational and Psychological Testing. (Washington, D.C.: American Psychological Association, 1999); National Research Council, Committee on Appropriate Test Use, High Stakes: Testing for Tracking, Promotion, and Graduation, Jay P. Heubert and Robert M. Hauser, eds. (Washington, D.C.: National Academy Press, 1999).

16. United States v. Carolene Products Co., 304 U.S. 144, 152 n.4 (1938). For an analysis of the Carolene Products case and the role of judicial review in addressing problems with the majoritarian process, see John Hart Ely, Democracy and Distrust (Cambridge, MA: Harvard University Press, 1981).

19. The related concern of education policy, as distinct from civil rights policy, is that would-be reformers often treat the test as the statement of learning goals and then insist that the curriculum in some sense be “aligned” with the test. This is nonsensical to testing experts, who recognize that any test instrument is just a sample over some learning domain. In practice, this inverted perspective is driven by high stakes use of a test and can produce a narrowing of the curriculum and teaching to the test.

21. Debra P. v. Turlington, 644 F.2d 397 (5th Cir.1981). In Massachusetts, students graduating in 2003 will be the first students required to have passed the high stakes examination. In California and New York, the testing requirement first applies to students graduating in 2004. Similar delays in implementation can be found in proposed federal legislation, which does not require states to adopt content standards in history or science until the beginning of the 2005-2006 school year. H.R. 1, 107th Cong. § 111 (2001).

24. Historically, of course, it is well established that before Brown, expenditures for minority students attending segregated schools were grossly unequal. See, e.g., Gary Orfield, Dismantling Desegregation, 36-37; Michael Middleton, Brown v. Board: Revisited, 20 S. Ill. U. L. J. 19, 32 (1995) (describing how black children received inferior education under segregated systems because of severe underfunding). Today, by far the stronger relationship is between poverty and underfunding. More important, there is a strong interaction effect produced by the disproportionate concentration of poverty in heavily minority schools. See, e.g., Gary Orfield, Schools More Separate: Consequences of a Decade of Resegregation at 39-40 (July 2001, The Civil Rights Project at Harvard) (www.law.harvard.edu/civilrights/publications/pressseg.html). The percent of poor children in the school of the average African American student is twice that for the average white student, and the disparity is slightly greater for Latino children. Among highly racially isolated schools (90 percent or more white, or 90 percent black and Latino), only 17 percent of those white schools have half or more poor children, compared with 88 percent of minority schools. Id., at 40 (using 1998-99 NCES Common Core of Data). Contemporary court decisions support the observation that race is correlated with resource disparities. See, e.g., Campaign for Fiscal Equity, Inc. v. State of New York, (2001 N.Y. Misc. Lexis 1); Robinson v. Kansas, 117 F. Supp. 2d 1124 (D. Kan. 2000) (Title VI claim alleging disproportionate resources). Indeed the relationship is accepted knowledge in the civil rights enforcement community. According to the U.S. Department of Education, the problem of unequal resources affects minority and low-income students the hardest. See U.S. Department of Education, Office for Civil Rights, Intradistrict Resource Comparability Investigative Resources at 3 (2000).

28. 20 U.S.C. § 6311. The statute only states that “adequate yearly progress” shall be defined in a manner: “(i) that is consistent with guidelines established by the Secretary that result in continuous and substantial yearly improvement of each local educational agency and school sufficient to achieve the goal of all children served under this part meeting the State’s proficient and advanced levels of performance, particularly economically disadvantaged and limited English proficient children; and (ii) that links progress primarily to performance on the assessments carried out under this section while permitting progress to be established in part through the use of other measures.”

29. William J. Clinton, Text of Presidential Memo to Secretary of Education on School Uniforms (Washington, DC: U.S. Newswire, 1996).

30. William G. Howell, Patrick J. Wolf, Paul E. Peterson and David E. Campbell, “Test-Score Effects of School Vouchers in Dayton, Ohio, New York City, and Washington D.C.: Evidence from Randomized Field Trials.” Paper Prepared for the Annual Meetings of the American Political Science Association, September 2000; Paul E. Peterson and Bryan Hassel, eds., Learning from School Choice (Washington, D.C.: Brookings Institution Press, 1998); Cecilia Rouse, “Private School Vouchers and Student Achievement: An Evaluation of the Milwaukee Parental Choice Program,” Quarterly Journal of Economics, v. 113, no. 1, February 1998; Kate Zernike, “New Doubt is Cast on Study that Backs Voucher Effects,” The New York Times, September 15, 2000.

32. National Research Council, Committee on Appropriate Test Use, High Stakes: Testing for Tracking, Promotion, and Graduation, Jay P. Heubert and Robert M. Hauser, eds. (Washington, DC: National Academy Press, 1999); American Educational Research Association, American Psychological Association, National Council on Measurement in Education, Standards for Educational and Psychological Testing. (Washington, DC: American Psychological Association, 1999).

33. John Bishop and Ferran Mane, “The Impacts of Minimum Competency Exam Graduation Requirements on College Attendance and Early Labor Market Success of Disadvantaged Students,” in Gary Orfield and Mindy L. Kornhaber (eds.), Raising Standards or Raising Barriers? Inequality and High Stakes Testing in Public Education (New York: Century Foundation Press, 2001); Hauser, op cit.; Gary Natriello and Aaron M. Pallas, “The Development and Impact of High Stakes Testing,” in Gary Orfield and Mindy L. Kornhaber (eds.), Raising Standards or Raising Barriers? Inequality and High Stakes Testing in Public Education (New York: Century Foundation Press, 2001); C. Thomas Holmers, op. cit. Note that the magnitude of these effects is disputed, especially as regards drop outs. Deciding this is an important empirical question for policy, but it is complicated by problems with drop out data, and by the problem of holding constant exogenous variables, especially the impact of a tight labor market on propensity to drop out.

35. Richard Murnane, John B. Willett and K. P. Boudett (1995). “Do High School Dropouts Benefit from Obtaining a GED?” Educational Evaluation and Policy Analysis, 17, 133-147; Richard J. Murnane, John B. Willett, and John H. Tyler (2000). “Who Benefits from Obtaining a GED? Evidence from High School and Beyond.” The Review of Economics and Statistics, 82, 23-37.

tured English immersion programs with the goal of moving limited-English-proficient students into mainstream classes after one year; Arizona Proposition 203 passed on November 7, 2000 codified at Title 15, chapter 7 Ariz. Rev. Stat. Ann. Section 15-751, et seq., (2001), Article 3.1 which is similar to California’s Proposition 227.

38. While bilingual education programs generally have produced better outcomes in academic achievement, it is not clear how much better these programs are. See Testimony of Kenji Hakuta to the United States Commission on Civil Rights, “The Education of Language Minority Students,” April 13, 2001, www.stanford.edu/~hakuta/Docs/CivilRightsCommission.htm. Also see, Jorge Amselle and Amy Allison, “Two Years of Success: An Analysis of California Test Scores After Proposition 227,” http://www.ceousa.org/html/227rep.html, August 2000; Californians Together, “Schools with Large Enrollments of English Learners and Substantial Bilingual Instruction are Effective in Teaching English,” August 21, 2000; and Orr, Butler, Bousquet, and Hakuta, “What Can We Learn About the Impact of Proposition 227 from SAT-9 Scores?” August, 2000 analyzing student achievement scores on the Stanford 9 after the implementation of Proposition 227.

39. Educating Language Minority Children. Committee on Developing a Research Agenda on the Education of Limited-English Proficient and Bilingual Students, D. August and K. Hakuta, eds. Commission on Behavioral and Social Sciences and Education. Washington, DC: National Academy Press, 1998.

42. Some may define proficiency as proficiency in conversational skills while others define proficiency as having appropriate oral, written, and reading skills for a native speaker of English at a particular grade or age level. Still more relevant in the context of achievement testing, however, is proficiency sufficient for academic learning in English.

43. Public Education: Meeting the Needs of Students with Limited English Proficiency. Washington, DC: General Accounting Office, pp. 5-6, 2001.

44. While the United States District Court for the Northern District of California ruled in Valeria G. v. Wilson, 12 F.Supp.2d 1007 (July 15, 1998), that Proposition 227 on its face did not violate the EEOA, the case has been appealed and it is unclear whether another court would make the same finding. The Court in Valeria G. found that Proposition 227 did not violate the EEOA because the defendants presented evidence that structured immersion is the “predominant method of teaching immigrant children in many countries in Western Europe, Canada and Israel.” Id. at 1018. It also found that because the initiative was flexible and allowed schools and school districts to make choices about the type of curriculum they would implement that “this court can not conclude that no possible choice could constitute ‘appropriate action’ under Section 1703(f).” Id. at 1019.

This volume summarizes a range of scientific perspectives on the important goal of achieving high educational standards for all students. Based on a conference held at the request of the U.S. Department of Education, it addresses three questions: What progress has been made in advancing the education of minority and disadvantaged students since the historic Brown v. Board of Education decision nearly 50 years ago? What does research say about the reasons of successes and failures? What are some of the strategies and practices that hold the promise of producing continued improvements? The volume draws on the conclusions of a number of important recent NRC reports, including How People Learn, Preventing Reading Difficulties in Young Children, Eager to Learn, and From Neurons to Neighborhoods, among others. It includes an overview of the conference presentations and discussions, the perspectives of the two co-moderators, and a set of background papers on more detailed issues.
