
Formative Assessment :: Building a Better Student

By testing academic performance at regular intervals, formative assessment strategies regard every child as a work in progress.

To start, a question: What’s the difference between formative assessment and summative assessment?

It’s not enough to say merely that formative assessment is a measure of student achievement administered several times during the school year, whereas summative assessment is a measure of student achievement usually administered at the end of the year. That’s only part of the answer; the true distinction between the two lies in their respective uses.

The goal of summative assessment is typically to provide an overall measure of student performance for someone outside the classroom: a report card for parents, an SAT score for college admissions officials. The consequences of this measure can be critical: being promoted or not, getting into college or not. By contrast, the goal of formative assessment is to provide feedback for someone inside the classroom: an indication of how well a student is doing in a given subject at a given level. The consequences are more immediate, such as individualized instruction. Broadly speaking, summative assessment answers the question “How did I do?”; formative assessment answers the question “How am I doing?” The benefits of formative assessment are drawn from the data that answers the latter question, and those benefits are why formative assessment is in such demand today.

In their seminal essay, “Inside the Black Box: Raising Standards through Classroom Assessment,” Paul Black and Dylan Wiliam cite D. Royce Sadler, a professor of higher education at Griffith University in Australia, writing: “When anyone is trying to learn, feedback about the effort has three elements: recognition of the desired goal, evidence about present position, and some understanding of a way to close the gap between the two.” So assessment in the classroom involves both student and teacher awareness of a goal—perhaps meeting a state standard, a baseline of knowledge, or a strategy of instruction.

So now that we’ve defined our terms, let’s move on to Winston-Salem, NC. That’s where Wes Leiphart, the assessment team manager for Winston-Salem/Forsyth County Schools, read the Black and Wiliam essay one day and thought, This is what we need. He brought the idea of standardized formative assessment to the superintendent and the principals in his district, and they told him to go for it.

What Leiphart went for was the Prosper assessment system, developed by Pearson Education. Prosper gave the district what it wanted: in Leiphart’s words, a “quick way to get a snapshot” of how students in grades 3-8 were faring in reading and math. Leiphart and the teachers in his district create their own items for the measures, which are based on North Carolina’s state standards. They administer these measures three times a year on precoded answer sheets. Then they use the system to “disaggregate the data,” meaning break it down into reports and statistical summaries: How well did this teacher’s students do on these items? How well did boys do on these items? How well did Hispanic students do on these items? Using the reports, teachers can target instruction appropriately: tutoring for this student, advanced work for that one.
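For readers who want to see what “disaggregating the data” amounts to, here is a minimal Python sketch. It is not Prosper’s data format or software; the record fields, names, and scores are invented for illustration. It simply groups the same set of results by one attribute at a time and reports percent correct for each group.

```python
from collections import defaultdict

# Hypothetical results; field names and values are illustrative only.
RESULTS = [
    {"student": "A", "teacher": "Smith", "gender": "F", "correct": 8, "total": 10},
    {"student": "B", "teacher": "Smith", "gender": "M", "correct": 5, "total": 10},
    {"student": "C", "teacher": "Jones", "gender": "F", "correct": 9, "total": 10},
    {"student": "D", "teacher": "Jones", "gender": "M", "correct": 6, "total": 10},
]


def disaggregate(records, key):
    """Return percent correct for each value of `key` (e.g. teacher or gender)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        correct[rec[key]] += rec["correct"]
        total[rec[key]] += rec["total"]
    return {group: 100.0 * correct[group] / total[group] for group in total}


by_teacher = disaggregate(RESULTS, "teacher")
by_gender = disaggregate(RESULTS, "gender")
print(by_teacher)
print(by_gender)
```

The same function answers each of the questions in the paragraph above by changing only the grouping key, which is the essence of a disaggregated report.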


At first, Leiphart couldn’t get teachers even to read the reports. Now, he says, teachers are demanding them. He recently received a call from a grades 4-5 school—one of 72 schools in the district—asking for 1,000 reports. Leiphart met with the school’s leadership team, trying to home in on what was needed most; he pared the request down to about 200. The school had been counseling each student, identifying strengths and weaknesses based on the feedback from the assessments. That, says Leiphart—despite the omnibus request—is how the strategy is supposed to work.

Pearson Measures Up

A second Pearson product is being used in the San Marcos Consolidated Independent School District in Texas. The Progress Assessment Series (PASeries) does even more than Prosper; it furnishes the actual measuring instruments—the tests. San Marcos administered its first PASeries assessments to fifth-grade math students in November 2005, a response to the concern over the district’s math performance on the state tests. Testing was then expanded to both reading and math at the four elementary, one intermediate, and two junior high schools in December, at the semester’s end.

The PASeries has predictive validity, which is what makes it so valuable: Because the system is aligned with various state standards in different subjects, it predicts how students will do on the state tests, in this case, the Texas Assessment of Knowledge and Skills (TAKS). Dale Wiley, San Marcos’ director of accountability and school improvement, took charge of the PASeries project. He learned everything he could about it online, and he took a random sampling of his own, using 100 students. With an initial screening, he found a predictive validity “in the 80s,” he says. “For a predictive tool, I think it’s the best thing I’ve seen out there.” According to Wiley, a principal in the district commented, “If we can predict with 85 percent accuracy on how our kids are going to do on the TAKS, using a single test, why would we not use it?”
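The article does not say how Wiley calculated his figure. One simple reading of “predictive accuracy” is the share of students whose predicted pass/fail outcome matched their actual state-test result. The Python sketch below works under that assumption, with invented data for ten students; it does not describe the PASeries methodology.

```python
def prediction_accuracy(predicted, actual):
    """Percent of students whose predicted pass/fail matched the actual result."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * matches / len(predicted)


# Hypothetical outcomes for ten students (True = pass); not real district data.
predicted = [True, True, False, True, False, True, True, False, True, True]
actual = [True, False, False, True, False, True, True, True, True, True]

accuracy = prediction_accuracy(predicted, actual)
print(f"{accuracy:.0f} percent of predictions matched")
```

A district could run the same comparison on its own sample, as Wiley did with 100 students, before relying on a benchmark test’s predictions.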

Naturally, Pearson Education’s president and CEO, Doug Kubach, is a big proponent of formative assessment. “If kids are clearly doing well or clearly not doing well,” he says, “teachers know that.” It’s the kids “in the middle,” Kubach points out, who are often missed, and teachers need a tool that will identify them. Prosper, which is generally classroom-based, was launched at the beginning of 2004; the PASeries, which is generally school- and district-based (students need access to computers in order to take the tests), was launched at the end of 2004.

The latest version of Prosper contains an Academic Standards Library CD that allows users to import their state learning standards for math and science into the system, as well as incorporate reports from which data can then be disaggregated into subsets, such as students with disabilities, English Language Learners, and economically disadvantaged students. Pearson’s SuccessMaker Enterprise (powered by the database management system SQL Anywhere Studio from Sybase) individualizes instruction in math and reading according to the needs of each student. Specifically, SuccessMaker helps teachers assess what their students know, identify areas where students need help, and tailor the digital courseware to each student so the right material is taught at the right time. Teachers can use the program to produce reports on student progress so they can make adjustments in their curriculum and focus on material that is giving students the most difficulty.

There are, of course, other players in the formative assessment market. In Orangeburg, SC, Greg Carson, the public information officer for the 14-school Orangeburg Consolidated School District Five, and Sterling Harris, the principal of North Middle/High School, have been using two products from Scantron Testing and Assessment: the Performance Series and the Achievement Series. The Performance Series is a diagnostic test given at the beginning of the school year; it’s norm-referenced against a national percentile. After nine weeks, the Achievement Series is administered; it measures how many students are meeting each state standard. Throughout the year, every nine weeks or so, more testing is done.

According to Carson and Harris, the results over the past year have been nothing short of phenomenal. After receiving training from Scantron, administrators and teachers immediately bought into the system, which Carson says was important. “You need district assessment of the software to make sure it’s going to work,” he says. The staff’s faith was justified when it began to get feedback from the tests. Harris says that in some cases, teachers had totally misjudged which standards students were meeting. “It’s helping,” he says. “It’s giving my teachers more information.”

DOES YOUR ASSESSMENT TOOL PASS THE TEST?

The US Department of Education says it does if it can perform these five key functions:

Get the right data. Assessments should be valid, reliable, and interpretable.

Get the data right. Data should be gathered and disaggregated accurately.

Get the data right away. Reports should be made available as quickly as possible.

Get the data the right way. Data should be assessed electronically to provide for easy viewing at the classroom, school, and district levels—by grade, by subgroup, by course, by class, by staff member, and by individual student.

Get the right data management. A single, centralized interface should be used for the most efficient data management.

You might have thought The Princeton Review was a sophisticated literary journal published in New Jersey. Nope. It’s a New York City-based provider of some of the largest formative assessment programs in the country, having generated more than 10 million assessments and 1.5 million reports since 2000. Sloane O’Neal, vice president of sales and marketing, says the company offers tests for school districts that use all kinds of platforms, not just its own, which is called the Homeroom Assessment Center. The Princeton Review designs tests, delivers them either online or offline, and provides reports and psychometric evaluations.

But back to Orlando and Lee Baldwin. Baldwin wanted a centralized, consistent assessment of the students in his district, primarily to adhere to No Child Left Behind and Reading First requirements. Orange County now administers formative assessments in math and reading, and it’s moving toward doing the same for science. The district prints out its own answer sheets, students mark them, the answer sheets are scanned optically, the images are uploaded to a website, and then the results are posted on the site. Teachers test in the morning and get their results by the afternoon. “It’s virtually instant feedback,” Baldwin says.

This is the first year that Orange County has used these assessments, but state results already show marked improvements in both reading and math. Still, Baldwin cautions aphoristically, “Change is a process, not an event.” Baldwin chose The Princeton Review after a laborious process of winnowing out programs that professed to be aligned to state standards but fell short. He’s very satisfied, but says he is open to more change in order to maintain quality assessments that are matched specifically to the standards that students will need to meet.

In discussions with administrators such as Baldwin and with assessment providers, a distinct emphasis arises: assessment not as an end in itself but as a means of instruction. Some of this no doubt is a result of defending formative assessment against complaints from both teachers and parents that testing takes time away from instruction. But the point is made over and over. The Princeton Review’s Sloane O’Neal says intervention, not tests, makes for student improvement. Winston-Salem’s Wes Leiphart says succinctly, “Assessment is part of instruction.”

“Acuity is a classroom-friendly suite of assessments with both online and paper-and-pencil administration options. Predictive benchmark assessments mirror state NCLB assessments in grades 3-8 and grade 10 in math and reading/language arts, and deliver immediate, actionable data on student progress.” Acuity was designed “for teachers and classroom use,” the release goes on to say. “The Acuity Diagnostic Benchmarks are tailored to district curriculum pacing and assess student retention and knowledge of core content areas. Diagnostic reports show specific mistakes students make so teachers can target instruction to improvement needs, a powerful way to accelerate student performance and help educators meet achievement goals.”

McGraw-Hill staff wrote the predictive assessments specifically to each state’s tests. They also wrote the diagnostic assessments for districts wanting to know how well their students were keeping up with the curricula, intent on making changes based on that knowledge. That means the company worked with each district to determine precisely which items would provide the most valid measures for the district’s curricula. The staff also had a bank of more than 40,000 items in language arts and math to provide to teachers who wanted to create their own measures.

One measure, the predictive one, needs to be as closely aligned as possible to each state’s tests. The other measure, the diagnostic one, needs to be as closely aligned as possible to each district’s curricula. If the curricula meet the state’s standards, then the two measures should be closely correlated.

Christine Leydon, systems training assistant for the Broome-Tioga Board of Cooperative Educational Services, which serves 15 school districts across upstate New York, is an Acuity user. She says she knew “we pretty much had to sell it to teachers. Who better to sell to teachers than other teachers?” So four teachers per grade level for language arts and mathematics created assessments using a question bank of more than 20,000 items. They then trained all their fellow teachers in how to use Acuity. Once the teachers realized that after developing the assessments, their work was virtually done (the tests would be scanned and scored automatically), and once they saw that the reports Acuity generates would allow them to easily identify students’ particular academic needs, they embraced the program.

Some assessment technologies work with questions asked orally. Consider InterWrite PRS by GTCO CalComp. PRS stands for personal response system: Each student holds a compact, infrared wireless transmitter. The teacher asks a question, and the students respond by pressing buttons on their transmitters. The signals are sent to a PRS receiver, which forwards them to a computer. At the end of the session, the teacher has a statistical summary of the responses in addition to a record of individual responses. The software also includes a function for creating quizzes consisting of several types of questions, including numeric with decimal points, fractions, and positive and negative numbers; multiple choice; true/false; rank order; multiple correct; and short answer.
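To make the idea of a session summary concrete, the following Python sketch tallies one multiple-choice question. It is an illustration only, not the InterWrite PRS software or its file format; the transmitter IDs, answers, and answer key are hypothetical.

```python
from collections import Counter

# Hypothetical responses for one question: transmitter ID -> button pressed.
responses = {"T01": "B", "T02": "B", "T03": "A", "T04": "C", "T05": "B"}
CORRECT = "B"  # assumed answer key for this question


def summarize(responses, correct):
    """Return a tally of each choice and the percent of students answering correctly."""
    tally = Counter(responses.values())
    n = len(responses)
    pct_correct = 100.0 * tally[correct] / n if n else 0.0
    return tally, pct_correct


tally, pct_correct = summarize(responses, CORRECT)
print(dict(tally), pct_correct)
```

The `responses` dictionary plays the role of the record of individual responses, while the tally and percentage correspond to the statistical summary a teacher would review after class.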

Some assessment technologies work with essay questions, too. LearningExpress, for example, partnered with Miami-Dade County Public Schools to improve student performance on the Florida Comprehensive Assessment Test (FCAT) Writing+ assessment. Students’ writing assessments were directly aligned with the FCAT Writing+. Each student essay was scored professionally and returned to the district within three days of its processing. More than 35,000 essays have been scored, and student responses, essay results, and diagnostic feedback are available online to administrators and teachers for analysis, targeted remediation, and tracking. Some of the programs already touched on, such as Acuity, will soon be able to record and score essays as well as multiple-choice items.

What all of these formative assessment technologies do, in one form or another, is provide useful information to both teacher and student. Although the information often arrives instantly, the benefits are seen over time. As Orange County’s Lee Baldwin says, change is indeed a process, one that’s producing ever more interesting tools for educators looking to make instruction as meaningful, as efficient, and as effective as possible.

:: web extra :: For more information on this topic, visit www.thejournal.com. In the Browse by Topic menu, click on Accountability/Assessment.

Neal Starkman is a freelance writer based in Seattle, WA.

This article originally appeared in the 09/01/2006 issue of THE Journal.