For most of the last century, baseball and education had a lot in common. Most folks thought that baseball was all about talent. Playing the game well was characterized as an art. Managers trusted their hunches and used their gut feelings to guide their decision making. The important thing was playing the “right way” and honoring tradition.

Then baseball changed: teams began using data and statistics to question conventional wisdom and figure out what actually wins games. Despite all the scientific and technological advances that have altered our world, education hasn’t changed in the same way. Instead of thinking about data and statistics as a way to analyze and solve systemic problems, we look at aggregate student results and extrapolate their sources and solutions based on our political leanings, gut feelings or individual experience.

Is it any wonder that we fight the same battles over and over again? Traditionalists vs. reformers. Districts vs. charters. Test haters vs. supporters. More money vs. money doesn’t matter. In-school vs. out-of-school factors. Etc. Etc. Despite all the energy and all the shouting, nothing actually changes except for the names of the latest initiatives and the people doing the fighting. Large numbers of kids never learn to read at grade level. They lose interest in science and math. They drop out of high school. They fail to complete college.

These results are not inevitable. But to change them, we need to stop throwing initiatives at a wall and praying something sticks. It doesn’t make sense to start with a solution before identifying the actual problem and its underlying sources. It makes even less sense to implement a “solution” at scale without assessing whether it has any impact on the underlying sources of the problem. This goes for both new initiatives like blended learning and the things we’ve been doing for years like class-size reduction. Regardless of an initiative’s actual merits, its proponents have tended to extrapolate its benefits to multiple problems – grade-level reading, dropout rates, college readiness, etc. – without clearly identifying their sources. As a result, our public investments are often based less on the merits of any individual reform than on the political power of its supporters and the strength of their public relations campaigns.

This isn’t all the fault of the proponents of these reforms. For much of the history of education, longitudinal assessments or evaluations of anything at scale cost a lot of money and took a lot of time. As a result, proponents of any current reform will often cite decades-old research or the evaluations of recent pilots to support their approach. But we are reaching a point where we can identify the sources of problems and promising solutions without spending bazillions of dollars on multi-year, mixed-methodology research studies. There is an incredible amount of longitudinal data emerging from multiple sources inside our education system. Instead of rejecting the potential power of this data in the name of local control, we simply have to commit at a state level to collecting that data, questioning our conventional wisdom, and seeing what the data says.

This change isn’t limited to baseball. In medicine, Kaiser Permanente created a massive “integrated electronic health records system” that incorporates information from multiple data systems. By synthesizing and analyzing this data, Kaiser can provide its physicians with better information on effective interventions for their patients. By looking at the data at scale, it can also identify the underlying sources of health problems. For example, when its data crunchers found that obesity rates in Oakland were correlated with patient access to parks, Kaiser started investing in parks and building relationships with schools and YMCAs.

Now, I know sports and medicine are not perfect analogies to K-12 education. But that doesn’t mean they don’t have something to teach us about using data to solve big problems. Think about how much we could learn about the real impact of areas where we spend billions, like academic remediation and special education. With the right data and analysis, we could figure out why, for example, some kids don’t learn to read by third grade, assess the relative benefits of our various early intervention strategies and consider unlikely alternatives.

Move beyond past mistakes

At this point, the biggest barrier to asking those questions isn’t the collection and analysis of data but the corrosive nature of education politics. If we could just step outside the warring houses of pro- and anti-reform, we could have a real conversation about the possibilities presented by data analytics. We could begin by acknowledging that both sides have made important points about the use and misuse of data in education accountability over the past decade.

Critics of education reform are right when they say that decision makers misused data over the past decade to curtail the ability of education leaders to use their professional judgment when making complex accountability decisions. But the simplistic and erroneous use of data in high-stakes decision-making (such as the Program Improvement model in NCLB) does not render the data itself useless or make poor educational results any less troubling. Nor should it undermine the desire to collect and analyze data to answer critical questions about beneficial strategies and investments to correct our most difficult problems.

Supporters of education reform are right to point out that 20 years ago data on educational outcomes was either nonexistent or highly variable and assessments of educational quality depended solely on individual judgment. But the poor educational results that emerged from the first systematic collections of education data, particularly for low-income students and students of color, were not by themselves a sufficient rationale for stripping away the ability of local education leaders to make complex accountability decisions. Nor should the trends and lessons that emerge from future analyses serve as the rationale for the imposition of a similarly rigid accountability structure.

There is no better time to do this than now. We have just engaged in a burst of unprecedented education policymaking and spending. Yet, from Common Core assessments to work-based learning to LCFF, our state has absolutely no plans to collect any data to assess the impact of these multibillion-dollar investments in public money. In this day and age, that makes no sense.

The first year of the implementation of the Smarter Balanced assessments could yield a treasure trove of information on the readiness of our state’s digital infrastructure, student preparedness for online assessment and so much more. The investments in work-based learning should be connected to post-secondary and work placement data systems to reveal whether these programs actually provide graduates with real college and career opportunities. The Local Control and Accountability Plans (LCAPs) from more than 1,000 districts and charter schools will provide an unprecedented amount of information on goals, outcomes, strategies and expenditures in eight state priority areas. However, the state currently has no plan to collect the information in these LCAPs or to evaluate what kind of progress districts are making toward their goals. Without collecting this information, our state leaders will know nothing about progress on the state’s own priorities, and local stakeholders will have no way to assess the comparative impact of their strategies or learn about better ones.

The mantra of local control and abdication of any state role may be good politics, but it is not good policy in the 21st century. By taking the time to collect and analyze the myriad data emerging from these multiple initiatives, we might actually learn something. In fact, a few years from now, we might be able to figure out which of these reforms were sacrifice bunts and which were home runs.

•••

Arun Ramanathan is executive director of The Education Trust–West, a statewide education advocacy organization. He has served as a district administrator, research director, teacher, paraprofessional and VISTA volunteer in California, New England and Appalachia. He has a doctorate in educational administration and policy from the Harvard Graduate School of Education. His wife is a teacher and reading specialist and they have a child in preschool and another in a Spanish immersion elementary school in Oakland Unified.


Don Krause, 3 years ago

STAR has to be scaled to equalize difficulty otherwise it is a worthless longitudinal measurement for year over year comparisons – the whole point of STAR testing and its extension -API.

LCFF uses LEP, foster and SES (free and reduced) as a proxy for low performing students. In regard to FRPL, why is it a proxy in the first place? – Because of the statistical correlation between poverty and low-performance. But a proxy is very inaccurate by definition. People commit fraud en masse with that program. You might as well say that if you’re poor you’re bound to be a dummy. LCFF institutionalizes this idea, which is less than enlightened. Governor Brown in effect has taken all the previous data and thrown it away as inaccurate. Instead he replaced it with a system steeped in error.

navigio, 3 years ago

but how is the scaling done? either you do it subjectively (the way difficulty is assessed during test design, but then it’s subjective) or you do it statistically (which makes the process circular). (my guess is behind the curtain it’s actually a bit of both). i think doug pointed out in a recent post that a less accurate approach was taken this past year in the interest of speed (the year that just so happened to inexplicably have all sorts of confusingly unexpected results and variation). I think it’s noteworthy that the state has avoided publishing enough data to sufficiently assess the process. with the exception of this last year, which also just happens to be the last year for that test, making pretty much everything moot.

Don, 3 years ago

Data isn’t the problem. Numbers don’t lie. If we didn’t use data and instead used anecdotal experiential information as in “ask the teachers”, would not the compilation of that information result in the same potential for misconstruction due to its subjectivity? Why conclude the data is wrong? Data is never wrong unless it is improperly obtained. Data is a tool by which to make assumptions. It isn’t a proof as in math. It is part of a larger equation that includes numerous human variables. To disagree with a data-driven analysis is to disagree with the analysis, not the numbers.

In recent years such analysis has resulted in some pretty disappointing conclusions. Are the conclusions wrong due to human error? Sometimes. Now we have the teachers vilifying data because it’s not painting the desired picture of results. Conversely, we have “reformists” using the data to drive home preconceived conclusions – cherry-picking. Arguments over the meaning of statistics are not the fault of the statistics. Should math instructors teach kids to beware of numbers? No. The problem resides with their misuse by the media, by politicians, by various reformist groups and by the public.

In an era when San Francisco schools have honor rolls for African Americans and honor rolls for everyone else, I’m glad to have some hard data by which to draw my own conclusions as to what constitutes achievement.

navigio, 3 years ago

It is likely that most data gathered in education studies is gathered inaccurately. On top of that, much of the analysis is improper, biased, or both. It seems natural that this would cause people to question data-based ‘studies’. This is why most of the discussion surrounding these studies becomes about accuracy, methodology and motive.

By the way, I don’t know whether that was intended to be a response to someone, but it’s much easier to follow the logic of the thread if you use the reply link under the comment you’re replying to instead of the one at the bottom of the page. Just a suggestion 🙂

Don, 3 years ago

Thank you for the suggestion, Navigio. I usually do as you suggest but occasionally forget.

What makes you conclude that the data obtained is usually wrong? Social science is part art part science, but most social scientists would not last long if they used widely-rejected methodology.

Paul Muench, 3 years ago

Don,

Navigio is not a lone voice on the nature of research. Here’s a link to an article for your consideration.

navigio, 3 years ago

Interesting article Paul. In many universities you must be published to remain relevant. And the higher profile the publication, the more you’re ‘worth’. this has led to all sorts of sketchy publishing techniques, imho. research has really lost its status recently. perhaps part of the problem is there is just so much of it now. that seems like it would obviously lower its quality. until we can teach our communities to be more cautious and scrutinizing (and bang on our media more–thank you caroline), i expect we’ll continue to get things we should not be trusting as much as we do…

navigio, 3 years ago

regarding data collection itself, this is a very tricky topic. probably the biggest problem with accuracy lies simply in the fact that data is virtually always at some level of abstraction from what is being measured. probably one of the easiest examples of this is measures of infant behavior. because infants can’t talk, we’ve devised all sorts of methods to divine what their intentions are (usually in neuro-psychological studies). in many fields, those methods meet consensus, but even there the agreement is not always universal. within education specifically, i think there are a lot of problems with how ‘data’ is presented. probably the easiest target is CSTs. note, we don’t actually get the results of the csts, rather we get something that is scaled to meet some other criteria (a statistician can probably argue that that scaling is valid, but as you’re probably aware, there is also nowhere near consensus on that). even then, those scores are broken down into some kind of classification (eg performance bands). that is an arbitrary distinction that is loaded with emotional baggage. there are all sorts of other examples: PEL measures at schools are quantifiable, but what is almost never published with those rates is the rate of response (not everyone actually reports their level in all schools, and even those that do may not do so accurately). API is provided by ethnic subgroup even though each ethnic subgroup may be made up of completely different ratios of at-risk subgroup overlays (eg see the overclassification of african americans in special education). i could go on and on.
in some ‘studies’ we only get these things at the superficial (incorrect) level. admittedly in some, we get closer to the core meaning of the data, but i expect when arun speaks of data, he wont be questioning what csts supposedly tell us.
those are a few points, but im running up against the comment limit, sorry… 🙂

Steve Rees, 3 years ago

I’m delighted to see Navigio’s thoughtful regard for the quality of the raw material that we call data. Indeed, all data has noise mixed in with high-fidelity, high-quality signals. In the sciences, discriminating between evidence that one regards as “more or less” truthful requires an appreciation for how data is collected, where noise enters the signal, and the degree to which it is noisy. Finding false positives and false negatives in testing evidence in medicine, and evaluating each test’s likelihood of rendering both false values, is a habit of mind.

What are the habits of mind in K-12? Sadly, as this comment section reveals, we are debating whether statistical evidence has merit at all. No surprise, then: if our state’s governor continues to combine his love of the humanistic core of learning with his love of local control, we will have less and less evidence altogether. California is a state where the “Know-Nothing Party” once flourished.

I strongly agree with Arun about the fundamental problem and affirm that it is one that can be changed. The problem is how we govern ourselves.

Don, 3 years ago

Second that, Steve. I believe the disregard for data and the decoupling of results from SACS codes is a thought-out political reaction to the steady stream of bad press on student achievement. They’re killing the messenger under the guise of humanism. It is a profoundly retrograde policy that thumbs its nose at social science. And it’s paradoxical for decision-makers in the world of public policy to take such a stance, much less the Governor himself.

When the Legislature passed SBX3 4 five years ago in 2009, it flexed the categoricals and reassigned the codes to 0000. As a result of this institutionalized lack of accountability, all the funding allocated under that flexibility is impossible to track. On a state level we have no idea whether those billions were used to any effect or not. In a continuation of this policy, LCFF untethers the statistical approach to accountability.

Manuel, 3 years ago

navigio, too bad you can’t post the graphs showing the raw and scaled CSTs of the 2013 administration. Most of the data worshippers here might have their eyes opened.

Alas, they “trust” that which they can’t verify.

Statistics have validity, but only if one knows what they represent. One wouldn’t argue about what “temperature” means if one knows what it stands for: the average kinetic energy of a collection of gas molecules. But here we are arguing over the meaning of 1% of a standard deviation of a “measurement” that is, for all intents and purposes, derived from a calculation that fits a matrix of coefficients to a set of made-up variables so that they match a particular result. This is “science?” Really?

“Know-nothing” my foot…

Gary Ravani, 3 years ago

"At this point, the biggest barrier to asking those questions isn’t the collection and analysis of data but the corrosive nature of education politics. If we could just step outside the warring houses of pro- and anti-reform, we could have a real conversation about the possibilities presented by data analytics."
The above statement seems to come from someone with the "dove of peace" in one hand, and the "litigious dagger" in the back to dedicated classroom … Read More

“At this point, the biggest barrier to asking those questions isn’t the collection and analysis of data but the corrosive nature of education politics. If we could just step outside the warring houses of pro- and anti-reform, we could have a real conversation about the possibilities presented by data analytics.”

The above statement seems to come from someone holding the “dove of peace” in one hand and, in the other, the “litigious dagger” of the Vergara suit, planted in the backs of dedicated classroom educators everywhere. Ed Trust West is a prominent supporter of that billionaire-instigated war on the “house” of public education.

To put the best possible light on this kind of pronouncement…let’s just say it’s…disingenuous to an extreme.

Jane Smith, 3 years ago

This is the first rational commentary I’ve seen lately. I am a school board member, former superintendent, and accountability consultant to hundreds of schools all over the US. The problems in public education are fundamental and structural: all kids are expected to achieve at the same rate, and to the same degree, in all the same subjects. This alone is a stunningly flawed assumption. There are NO consequences for failure, only for achievement (less funding). The worst teachers are in schools with the neediest kids, and teachers have NO incentives to work harder or take on kids who need more help. Just the opposite: one gets higher pay and more recognition for getting out of the classroom and doing other jobs like “coaching” (which is a joke) or being a specialist or being an administrator. It is no secret why we aren’t getting results for ALL students. We are structured to ensure that only the strongest succeed.

Gary Ravani, 3 years ago

Some smart guy once said: “Data is not information, information is not knowledge, knowledge is not understanding, understanding is not wisdom.”

Another very smart man, James K. Galbraith, a well-known economist in his own right and son of the iconic John Kenneth Galbraith (JKG), said about data in economics: “Turning to the relationship between inequality and development, we diagnose several common econometric problems in the literature, including measurement error, omitted variable bias, serial correlation in longitudinal data, and the possible persistence of lagged dependent variables.”

Just substitute “education problems” for “econometric problems” and the possible pitfalls of the worship of data begins to be clarified.

And then we get to the great man JKG himself, about whom Nobel laureate Joseph Stiglitz said: “JKG understood capitalism as lived, not as theorized.”

In his biography, JKG talked about the data-obsessed economists out of the Chicago School whom he met immediately at the end of America’s second great Era of Reform, the New Deal. These data-driven economists, JKG believed, were using data, spreadsheets, etc., to mask the “lived,” real-life impacts of economic policy choices by government. If you focused enough on the data, then you could ignore policy implementations that resulted in more poverty and human suffering.

And so it has been with the last wave of “data-driven decision making” in education. The use of data was abused and mutated into things like value-added evaluation (VAM). It should be noted here that the basic calculations for VAM were originally conceived by an agricultural economist who did not realize that learning doesn’t come in bushels. Pay attention to spreadsheets, not kids.

So all of the caveats about “metrics” detailed by Galbraith (above) certainly apply when considering the applications of “data” to educational issues – issues several orders of magnitude more complex than baseball. When is data usefully applicable to “lived” education? Ask a teacher.

Steve Rees, 3 years ago

Gary, your glib dismissal of Bill Sanders because he was applying statistical methods to agriculture is not a high-minded approach to this debate. If you disagree with value-added measurement models, then I would rather see you argue your case based on a critique of Bill Sanders’ ideas, rather than his profession.

By the way, Sanders’ and Rivers’ mixed model for estimating teacher impact on student learning has been a reference point for dozens of others who followed. There is now a smorgasbord of models for estimating teacher impact. If you find them all to be flawed, you better take stock of the literature.

Your own bias for the “lived” experience, and your antipathy to the analytic approach to knowledge, has echoes in many fields where this polarization has emerged. I recommend you read Paul Meehl’s work on this subject. The debate emerged in psychology in the 1950s.

Carl Cohn, 3 years ago

In order to be successful in professional sports, Philly needs even more kids from Long Beach Poly…Asking Chase Utley and DeSean Jackson to carry both the Phillies and the Eagles is too much even for these talented Jackrabbits…

Richard Moore, 3 years ago

If you draw any graph of educational achievement, you will find that we are educating more students to higher levels than ever before in our history. Problem areas do not justify a generalized assumption about a “problem” with education as a whole. Stabilize the home life. Feed the children. Give them access to books, in public and school libraries. Then there will be improvement in those areas where there are specific problems. But California continues to have the lowest level of school and public library service in the nation. We would have to hire 5,000 school librarians and build 1,000 public library branches to be average nationally.

el, 3 years ago

An important difference here is that measuring outcomes in baseball is relatively simple. What you measure and what you want are easily aligned, and easily collected. A game takes only a few hours. A season lasts less than a year.

By contrast, in education, our outcome is a functional 18-year-old human with 13 years of education. Thus, seeing the effect of our kindergarten changes on graduation rates is not a short-term exercise. Indeed, it is rare for any particular curriculum or reform strategy to even stay in place as long as 13 years – long enough to get your first clean data points.

And, while we use test scores as our proxy for student outcomes, it’s well understood that they are crude and don’t really indicate success or failure, especially if you look at too few students.

Paul Muench, 3 years ago

My condolences on being a Phillies fan 🙂

One of the other stories of the A’s is that improving their win percentage did not translate into winning a series, likely because they couldn’t invest in the talent that was obvious. In education I liken that to public schools not paying teachers enough to attract the same talent as TFA. Not that TFA has a monopoly on talent. Perhaps there are some silver bullets out there waiting to be found in the data. But they are not easy to find. And it’s not like we haven’t been through one failed statewide data system already.