Testing controversies didn’t start with No Child Left Behind or Race to the Top, writes William J. Reese, an education historian at the University of Wisconsin, Madison, in the New York Times. “Members of the Boston School Committee fired the first shots in the testing wars in the summer of 1845.”

Many Bostonians smugly assumed that their well-funded public schools were the nation’s best.

. . . Citizens were in for a shock. For the first time, examiners gave the highest grammar school classes a common written test, conceived by a few political activists who wanted precise measurements of school achievement. The examiners tested 530 pupils — the cream of the crop below high school. Most flunked. Critics immediately accused the examiners of injecting politics into the schools and demeaning both teachers and pupils.

In 1837, education reformer Horace Mann, the “father of the common school,” became secretary of the newly created Massachusetts Board of Education, which was “part of the Whig Party’s effort to centralize authority and make schools modern and accountable,” writes Reese. “After a fact-finding trip abroad, Mann claimed in 1844 in a nationally publicized report that Prussia’s schools were more child-friendly and superior to America’s.” (Prussia was the Finland of the mid-19th century!)

Mann’s friend, Samuel Gridley Howe, was elected to the School Committee. As a member of the examining committee, he insisted on written rather than oral tests.

His committee arrived at Boston’s grammar schools with preprinted questions, which angered the masters and terrified students. Pupils had one hour to write down their answers on each subject to questions drawn from assigned textbooks.

Only 30 percent passed. It turned out that students had “memorized material they often did not understand,” Reese writes.

The examiners believed that the teacher made the school, a guiding assumption in the emerging ethos of testing. Tests, they said, would identify the many teachers who emphasized rote instruction, not understanding. They named the worst ones and called for their removal.

. . . Anticipating an angry reaction from parents, Mann told Howe to deflect criticism from the examiners by blaming the masters for low scores. While the School Committee fired a few head teachers, parents nevertheless accused Howe of deliberately embarrassing the pupils and bounced him out of office in the next election.

Testing continued. Examiners caught one master leaking questions to students. They criticized a school for black students for low expectations and performance. They worried about how to evaluate school quality.

“Comparison of schools cannot be just,” the chairman of the examining committee wrote in 1850, “while the subjects of instruction are so differently situated as to fire-side influence, and subjected to the draw-backs inseparable from place of birth, of age, of residence, and many other adverse circumstances.”

(State Sen. Barry) Finegold, the bill’s sponsor and the son of public-school teachers, said his motivation sprung from conversations with parents in Lawrence, part of his district northwest of Boston, where the struggling school district was taken over by the state in 2011. The state has since brought in charter operators to run two low-performing schools, and parents told him, “we’d be out of here” had that not happened, Mr. Finegold said. “One thing I don’t think people realize—charter schools are keeping a lot of the middle class in cities,” he said.

More than 31,000 Massachusetts students attend charter schools, an increase of 20 percent in the past four years.

Massachusetts ranks its schools from Level One, the highest, to Level Five based on academic achievement, graduation and dropout rates. This year, 59% of charter schools in the state were Level One, compared with 31% of non-charter schools.

Eighty-three percent of Boston charter schools did significantly better than comparison schools; no Boston charter did worse. “The Boston charter schools offer students from historically underserved backgrounds a real and sustained chance to close the achievement gap,” said Margaret Raymond, who directs CREDO at Stanford University.

Statewide, the typical student in a Massachusetts charter school gains an extra one and a half months of learning per year in reading and two and a half in math.

The idea is that teachers know best and that standardized testing—or any kind of testing, really, other than the teacher-built kind—is a distracting nuisance that saps valuable instructional time, deflects instructors from what’s most essential, and yields very little useful information about student learning.

. . . research has consistently demonstrated that, absent independent checks, many teachers hold low-income and minority students to different standards than their affluent, white peers.

. . . Standardized tests not only help us unearth these biases but also put the spotlight on achievement gaps that need to be closed, students who need extra help, schools that are struggling, and so on. And by doing so, they drive critical conversations about the curriculum, pedagogy, and state and district policies that we need to catch kids up and get them back on the path to success.

Testing also is blamed for “drill-and-kill” instruction that existed long before the testing-and-accountability era, they write.

All else being equal, the students who typically fare better on state tests are those whose teachers focus not on empty test-taking tricks but rather on content-rich and intellectually engaging curriculum.

Standardized tests don’t measure “what really matters” in education, such as critical thinking or social and emotional skills, critics complain. No test can measure everything, concede Porter-Magee and Borgioli. But many skills can be evaluated.

Anti-testers argue that setting standards and aligning assessments to them doesn’t work because it’s not what the Finns do.

Our own history suggests that it is exactly the states that have set rigorous standards connected to strong accountability regimes—most notably, Massachusetts—that have seen the greatest gains for all students, not just our most disadvantaged.

Meaningful reform will “require the effective measurement of student achievement that tests make possible,” they conclude.

States’ teacher preparation policies earned an average grade of D+ in 2012, according to the National Council on Teacher Quality’s State Teacher Policy Yearbook. That’s up from a D in 2011.

The highest grade — B- — went to Alabama, Florida, Indiana and Tennessee. Alabama, Connecticut, Kentucky, New Hampshire, Rhode Island and Vermont made the most progress. Three states – Alaska, Montana and Wyoming – received failing grades.

Only a third of undergraduate teacher preparation programs are sufficiently selective, NCTQ finds. The majority “fail to ensure that candidates come from the top half of the college-going population.” Only 24 states require teacher preparation programs to use a basic skills test to screen applicants.

Standards are low for elementary teachers:

Teaching children to read is among an elementary teacher’s most important responsibilities, yet only 10 states appropriately assess teacher proficiency in effective reading instruction. And only 11 states adequately test new elementary teachers’ knowledge of mathematics.

Even though all but four states require some subject matter tests for elementary teacher licensing, the passing scores are extremely low. Every state (for which NCTQ has data) except Massachusetts sets the passing score for elementary teacher licensing tests below the average score for all test takers (50th percentile), and most states set passing rates at an exceedingly low level.

Only eight states – Colorado, Florida, Georgia, Louisiana, North Carolina, Ohio, Tennessee and Texas – use student achievement data to hold teacher preparation programs accountable for the effectiveness of the teachers they graduate.

Some states produce enough elementary teachers to fill anticipated openings, but others produce twice as many as needed—or more.

State           Supply   Demand   Supply as % of demand
Colorado         1,169    1,099   106
Connecticut        701      600   117
Delaware           373      122   306
Illinois         9,982    1,073   930
Kentucky         1,275      730   175
Louisiana        1,033      650   159
Maryland         1,011      723   140
Massachusetts    1,175    1,051   112
Michigan         2,903    1,227   236
Minnesota        1,179      709   166
Mississippi        751      660   114
New York         6,498    2,800   232
Pennsylvania     6,048    1,420   426
Tennessee        1,970    1,380   143

NOTES: Excepting Illinois and Maryland, supply figures come from Title II for 2009-10 and represent only new teachers. Maryland supply figures do not include alternatively prepared teachers. Supply and demand figures for Illinois and Maryland are based on their education departments’ analyses. They represent supply in 2009-10 for Illinois and 2010-11 in Maryland, and demand in 2010-11 for both states. Demand figures for other states are based on 10-year occupational estimates from 2010-20, except for Michigan, Mississippi and Tennessee (2008-18) and Colorado (2011-21).

SOURCES: U.S. Department of Education; state education departments; state labor bureaus.
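
For readers who want to check the arithmetic: the last column of the table looks like supply divided by demand, expressed as a percentage, rather than a true percent difference. Here is a minimal Python sketch of that reading, using four rows copied from the table. The function name and the choice of rows are mine, for illustration only, and the formula is an assumption about how the column was computed.

```python
# Supply and demand figures copied from four rows of the table above.
figures = {
    "Colorado": (1169, 1099),
    "Illinois": (9982, 1073),
    "New York": (6498, 2800),
    "Pennsylvania": (6048, 1420),
}


def supply_as_percent_of_demand(supply, demand):
    """Return supply as a whole-number percentage of demand.

    Assumption: this is how the table's percentage column was derived.
    """
    return round(100 * supply / demand)


for state, (supply, demand) in figures.items():
    pct = supply_as_percent_of_demand(supply, demand)
    ratio = supply / demand
    print(f"{state}: {pct}% of demand, about {ratio:.1f} new teachers per opening")
```

Under that reading, a value near 100 means a state is preparing roughly as many new teachers as it expects to hire, and a value of 426 means more than four new graduates for every opening.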

New York and Michigan prepared twice as many elementary teachers as needed in 2011-12. Pennsylvania turned out four new graduates for every job opening. Illinois issued nine new elementary-teacher certificates in 2009 for every one first-time teacher hired in 2010.

By contrast, Colorado and Massachusetts produce just enough new elementary teachers to meet demand. (That’s assuming nobody moves from Illinois and Pennsylvania.)

Colleges should be more selective about admitting teacher candidates and train them more intensively, argues the National Council on Teacher Quality.

“We could improve, enhance, and extend the quality of teacher preparation, and therefore produce better-qualified new teacher graduates, but probably fewer in number,” agrees Arthur E. Wise, former president of the National Council for Accreditation of Teacher Education.

The most striking contrast is in mathematics, where the performance of Finnish 8th graders was not statistically different from the U.S. average on the 2011 TIMSS, or Trends in International Mathematics and Science Study, released last month. Finland, which last participated in TIMSS in 1999, actually trailed four U.S. states that took part as “benchmarking education systems” on TIMSS this time: Massachusetts, Minnesota, North Carolina, and Indiana.

. . . “Finland’s exaggerated reputation is based on its performance on PISA, an assessment that matches up well with its way of teaching math,” said Loveless, which he described as “applying math to solve ‘real world’ problems.”

He added, “In contrast, TIMSS tries to assess how well students have learned the curriculum taught in schools.”

Finland’s score of 514 on TIMSS for 8th grade math was close to the U.S. average of 509 and well below Massachusetts’ score of 561. Finland was way, way below South Korea on TIMSS but nearly as high on PISA.

Finland beat the U.S. average on the TIMSS science section, but was well under Massachusetts.

In 4th grade reading, Finland beat the U.S. average on PIRLS (Progress in International Reading Literacy Study), but scored about as well as Florida, the only U.S. state to participate.

Finland’s seventh graders dropped from above average to below average on TIMSS math. Pasi Sahlberg of the Finnish Ministry of Education and Culture said this was “mostly due to a gradual shift of focus in teaching from content mastery towards problem-solving and use of mathematical knowledge.”