Data Teams

Getting people to care about data, engage in data and use data to drive instruction is difficult, but this exercise can serve the dual goal of data engagement and team building.

Too many teachers are stuck on current practices. We teach as we were taught. We hold biases. When data confirms what we know, it feels like a waste of time. When it contradicts our assumptions, we make excuses as to why that might be. The hustle and bustle of the profession provides an easy excuse to pass over important data about our students instead of engaging with it meaningfully.

Here’s an exercise–a kind of data jigsaw–to cut through that.

Data. You will need to break your data down into smaller pieces: say, Math scores for the 4th grade, Reading scores for the 7th, and so on. Take the data you already have and slice and dice it until you have as many pieces as there are faculty and staff present. You can use graphs, too.
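If your data lives in a spreadsheet export, the slicing can be as simple as dealing rows out into piles, one per participant. A minimal sketch, with an invented score table and made-up field names (swap in your own assessment export):

```python
# Sketch: deal one big score table out into as many "jigsaw pieces"
# as there are participants. Rows and field names are invented.
from itertools import cycle

scores = [
    {"grade": 4, "subject": "Math",    "school": "A", "mean": 512},
    {"grade": 4, "subject": "Math",    "school": "B", "mean": 498},
    {"grade": 7, "subject": "Reading", "school": "A", "mean": 541},
    {"grade": 7, "subject": "Reading", "school": "B", "mean": 530},
]

def jigsaw(rows, n_people):
    """Deal rows round-robin into n_people piles, one per participant."""
    piles = [[] for _ in range(n_people)]
    for pile, row in zip(cycle(piles), rows):
        pile.append(row)
    return piles

pieces = jigsaw(scores, 4)  # one piece per faculty member in the room
```

In practice you would slice along meaningful lines (grade, subject, subgroup) rather than round-robin, but the point stands: every person in the room ends up holding a different piece.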

Each person in the room has one part of a table or a graph, which you prepared in the last step.

Looking at their data, people should think about what next piece of data they need to create context (so, if I had 4th grade Math data, I might want to know how the supervisory union did. Or how these same students did in 3rd). You could have them write the question on a sticky note (this makes it concrete) or not (which allows mental flexibility). With the former, they can track their own thought pathway, too.

People then find whoever is holding the table or graph with the information they seek. They chat.

Together, they come to conclusions (and write them on their sticky notes). They then decide together what piece of data, or what question, comes next. Or they might decide that each needs different data or has different questions, and separate.

Repeat. The exercise ends when each individual has “next steps” for what students need based on a clear picture of where they are.

A lot of graphs and broken-up data tables are needed for that! I recommend any facilitator play this game with other admins, or even alone, to get an idea of a) whether it would actually work, and b) what pieces are wanted and needed–with data, what makes sense in YOUR head is not always clear to others.

Recently, we have moved away from homework. Some parents equate more homework with academic rigor and quality. Of course, the data says otherwise–but it also supports the idea that community perception of the school goes up when the homework load goes up. Now, the rallying call is for a research-based policy on homework.

I often start with John Hattie and his file reviews and meta-analyses because the guy covers everything. From his file reviews you can also find great original research worth pursuing.

Rather than rehash, let me guide folks to an insightful analysis of Hattie's conclusions in "Homework: What does the Hattie research actually say?" on Tom Sherrington's headguruteacher blog. Sherrington does a great job looking beyond the overall effect size of d = 0.29, which implies that homework is not effective (Hattie argues that a practice is "effective" when above d = 0.40, and even that statement is a simplification of his work). Note: Sherrington's analysis begins a bit technical, a reflection of Hattie's technical meta-analysis that he's analyzing. Wade through it–it's worth it.

In short, Sherrington notes that Hattie reports that homework is ineffective at the primary grade level (d = 0.15), but quite effective at the secondary level (d = 0.64). Of course, these results have caveats about type of work and the like. Read the article, as Sherrington goes in-depth with the details and their implications.
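For readers new to effect sizes: the d Hattie reports is a standardized mean difference, the gap between two group means divided by their pooled standard deviation. A minimal sketch with invented scores (the numbers below are illustrative, not from Hattie's studies):

```python
# Sketch: computing an effect size d as a standardized mean
# difference. All scores below are invented for illustration.
from statistics import mean, stdev
from math import sqrt

def cohens_d(treated, control):
    """(mean difference) / pooled standard deviation."""
    n1, n2 = len(treated), len(control)
    s1, s2 = stdev(treated), stdev(control)
    pooled = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(treated) - mean(control)) / pooled

with_hw    = [72, 75, 70, 78, 74]  # hypothetical test scores
without_hw = [70, 73, 69, 75, 71]
d = cohens_d(with_hw, without_hw)
```

The scale is what makes d = 0.15 versus d = 0.64 meaningful: both are fractions of a standard deviation of student achievement, comparable across very different studies.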

But Sherrington’s analysis is also instructive when we look at bias in analysis and reporting. Up front, he makes his pro-homework views clear and shares a link to another piece of his, ‘Homework Matters: Great Teachers set Great Homework’. The analysis that follows is, as I’ve said, pretty solid. Until the end.

All of this makes sense to me and none of it challenges my predisposition to be a massive advocate for homework. The key is to think about the micro- level issues, not to lose all of that in a ridiculous averaging process. Even at primary level, students are not all the same. Older, more able students in Year 5/6 may well benefit from homework where kids in Year 2 may not. Let’s not lose the trees for the wood! Also, what Hattie shows is that educational inputs, processes and outcomes are all highly subjective human interactions. Expecting these things to be reduced sensibly into scientifically absolute measured truths is absurd. Ultimately, education is about values and attitudes and we need to see all research in that context.

Let’s ignore the statement “All of this makes sense to me and none of it challenges my predisposition…”, which is the definition of denial (or perhaps cognitive dissonance). You would think, at least, Sherrington would concede that at the primary level homework offers little benefit (remember: d = 0.15, versus the d = 0.40 threshold for the effective range). No. Instead, he qualifies: “Older, more able students in Year 5/6 may well benefit…” That word: may. Ugh.

You could drive a bus through such phrases. After presenting, breaking down and analyzing the data, he a) ignores the data that refutes his viewpoint, and b) presents an alternative based on no data. You cannot do that! It’s not good science! If Sherrington wants to use his theory as a basis for more research, great. Instead, he simply argues that the data says one thing, he believes another, and so he’s going with his gut. At least he’s transparent about it.

But I have been in countless meetings and conversations like this: Someone concedes that a practice is not effective and then justifies its continuation. Parents and educators will often come with opinion pieces that seem like common sense, but with little data. With a little research these views are often refuted, and some turn out to be quite harmful practices. Yet the idea persists. As professionals and organizations we refuse to trust data–or even consider it.

The comments on Sherrington’s piece are a typical reaction to any presentation of data that challenges orthodoxy. Some accept it, but many qualify their views and use the data Hattie presents in an interpretive way. Again, a good study in identifying bias.

My suggestion in such situations is to turn it around. First of all, if some primary kids “may” benefit, might it also be said that some secondary kids “may not” benefit? Do they get penalized? What’s the plan for them? Remember–when there is a majority, there is also a minority. Why do we make policy when the majority agrees with us, but ignore the majority when they disagree?

Second, all choices have consequences: When you choose one thing, you are giving up another. To have homework, students and teachers are giving up something. That might be something simple like time–a student with thirty minutes of homework loses thirty minutes of play time, for example. Is the loss worth the gain? Hattie’s analysis seems to indicate that little is gained at the primary level, so any loss might not be worth it. Educators should focus on the trade-off. Is it worth it?

Sherrington, for his part, seems most interested in fitting in content and practice in the face of limited class time. He is trading off the student’s time for academic gain. That might be a fair trade-off, especially if the students are part of the decision-making process. But there might be other inefficiencies in Sherrington’s lesson planning (I have no idea) that a student might want addressed before they give up their after-school time.

Is it worth it? Really, that’s the essence of all of this data analysis. Sherrington needs to respect that data a bit more instead of dismissing it.

Just because a student is learning does not mean a teacher is teaching.

Can we take credit for success? Or failure? We need to know the effectiveness of our program if we hope to increase that effectiveness. What, for example, if the school year gains only make up for a huge summer regression? What if students, after a year of work, only gain slightly?

Knowing precisely where gains and losses occur is essential for change–or for standing by a program that works! Well-meaning initiatives are the norm, but they are often based on assumptions. Two things then happen: 1. The reality of how learning occurs does not match the assumptions, so the results show no change, or 2. The forces against change raise concerns that are as valid as (or more valid than) the evidence supporting the original initiative. In the end, the initiative fails, fades or simply disappears, and everyone feels just a tad more initiative fatigue. Precise knowledge of where programs succeed and fail stops such failures and creates real change.

For example, our SU had a push for something called “Calendar 2.0”, a re-imagining of the school year calendar that would break the year into six-week, unit-sized chunks with a week or two break between. It added no more days to the year, but spread out school days. The intent was good. During the breaks teachers could plan their next unit in reaction to the previous unit’s results, and students could get help mastering skills they had yet to gain. Schools might even be able to offer enrichment during that time! The shorter summer break was designed to prevent summer regression.

For families, it shrunk the summer break and created more week-long breaks. It cut into summer camps and made child care difficult and somewhat random. There was a vocal group, too, that argued for the unstructured time that summer afforded. In reality, many families were going to plan vacations during the new breaks–being told that their child was reading poorly and needed to stay would not trump the money already spent to go to Disney World.

Because Calendar 2.0 was not based on the clear need of students in our schools, it failed. It sounded good–summer regression! Planning time! Time for shoring up skills!–but there was no local data supporting it. Of all the areas teachers saw in need of shoring up, working in the summer did not rank highly.

Where in the Year is the Gain?

In my last post, we looked at year-to-year gains and regression. Of the 12 kids who regressed, half did so over the summer. But half did so over the school year. So, over 180 days of instruction, reading actually regressed for 6 students. Our school made them go backwards. Both summer and school year regression are results we should be concerned about, but the latter points to something we could control but are not.

Below, I identify students who had a dramatic difference between their summer and school year gains. Some students showed consistent gains, while others consistently stagnated. My two periods of examination–summer and school year–are based on discussions people at my school were having about programs we could institute (Calendar 2.0 being the most prominent). You should examine whatever time periods feel important to you. It could be the November-January holiday period (where family strife makes learning difficult), or May-June (end-of-year-itis), or days of the week (Mondays are about mentally coming back, Wednesday is the drag of “hump day” and Friday is checking out–so when DO kids focus?). The important part is to use data to examine it. All of my parenthetical asides are assumptions, many of which I have found untrue. You will be surprised how often assumptions and old saws are wrong.

Students Who Gain Over School Year

Instead of always looking at the negative, let’s try to determine where kids succeed. These students (scores highlighted in yellow) showed significantly more gains in reading over the school year than the summer. In fact, relatively speaking, this group lost ground over the summer.

Note how, because I gave the DRP in the spring, fall and again in the spring I was able to measure a) year-to-year growth, b) spring-to-fall growth (in short, gains or losses during the summer) and c) school year growth.
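In spreadsheet terms, those three windows are just subtractions between the three test dates. A minimal sketch, with invented scores and made-up field names:

```python
# Sketch: with spring, fall, and next-spring scores you can separate
# summer change from school-year change. All data here is invented.
students = [
    {"name": "A", "spring_6": 48, "fall_7": 45, "spring_7": 56},
    {"name": "B", "spring_6": 52, "fall_7": 53, "spring_7": 55},
]

def growth_windows(s):
    return {
        "summer":      s["fall_7"]   - s["spring_6"],  # spring-to-fall
        "school_year": s["spring_7"] - s["fall_7"],    # fall-to-spring
        "year":        s["spring_7"] - s["spring_6"],  # spring-to-spring
    }

report = {s["name"]: growth_windows(s) for s in students}
```

Student A in this toy data is exactly the pattern discussed here: a summer dip followed by strong school-year growth.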

The difference between the school year and summer demonstrates the importance of being in a literate environment for reading growth to occur. Being forced to spend time with text leads to reading success.

Note that growth occurred both in students who struggle with reading (expected) and those who are in the top group. Even good readers benefit from the literate school environment. If these students get more time in a literate environment–more reading time–these gains should continue and increase.

Summer Gains, School Year Loss

There is a population who sees a loss in reading progress over the school year (noted in that dark khaki color), yet sees gains over the summer. They are a diverse group, and this phenomenon could have many causes.

Still, one factor carries through many of those identified: Time. Many of these students lead busy lives, with responsibilities including sports, family, work and school. Reading for fun, and leisure time in general, is at a premium. Without practicing their reading, they show no growth–or a loss.

In the summer, these students enjoy a lighter schedule. They fill it with reading. In order to see year-round growth, they need time. The same results have been observed in other studies and are especially prevalent among middle-class students saddled with activities.

Will Successes Scale Up?

Okay, so we saw some success. Often, when we do nothing, someone gains. Was it me? The conclusion I reached–more time spent reading will improve reading skills–makes instinctual sense. And the research backs that up. Good, right?

That said, the data I have at hand is thin. I am relying on a certain knowledge of my students, and that invites bias. My sample sizes are small. As a data point, the DRP is more like a machete than a scalpel. (Read The DRP as an Indicator Of….) Will more Sustained Silent Reading (SSR) result in progress? It will take some time–and more data, and more precise data–for the results to prove me out or cause me to change course. But I am aware of all of this as I move forward. My program will react not to theory, but to what students experience in the classroom.

For example, of those 12 students who regressed, 3 are in the top stanines–they have nowhere to go. Similarly, the glut of students with little to no gains are also in the top stanines nationally–they had nowhere to go. But I am wary of making excuses, so I stick with the data.

*

Restatement: Introduction to These Next Few Blog Posts (Backstory for those coming to this post first).

We get a lot of data. It may come in the form of test scores or grades or assessments, but it is a lot. And we are asked to use it. Make sense of it. Plan using it.

Two quotes I stick to are:

Data Drives Instruction

No Data, No Meeting

They are great cards to play when a meeting gets out of hand. Either can stop an initiative in its tracks!

But all of the data can be overwhelming. There are those who dismiss data because they “feel” they know the kids. Some are afraid of it. Many use it, but stop short of doing anything beyond confirming what they know–current state or progress. And they can dismiss it when it does not confirm their beliefs. (“It’s an off year”) Understanding data takes a certain amount of creativity. At the same time, it must remain valid. Good data analysis is like a photograph, capturing a picture of something you might not have otherwise seen.

This series of blog posts will take readers through a series of steps I took in evaluating the effectiveness of my reading program. I used the DRP (Degree of Reading Power), a basic reading comprehension assessment, as my measure because it was available. I’m also a literacy teacher, so my discussion will be through that lens–but this all works for anything from math to behavior data.

* A stanine (STAtistic NINE) is a nine point scale with a mean of 5. Imagine a bell curve and along the x-axis you divide it into nine equal parts. The head and tail is very small area (4%) while the belly is huge (20%). Some good information can be found in this Wikipedia entry.
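In code, the stanine bands work out to percentile widths of 4-7-12-17-20-17-12-7-4. A minimal sketch of the percentile-to-stanine conversion (the cut points are the standard stanine definition, not anything DRP-specific):

```python
# Sketch: mapping a 0-100 national percentile onto the 1-9 stanine
# scale. Band widths are 4-7-12-17-20-17-12-7-4 percent; note that
# stanines 7-9 together cover the top 23%.
from bisect import bisect_right

# Upper percentile bound of stanines 1 through 8; stanine 9 takes the rest.
CUTS = [4, 11, 23, 40, 60, 77, 89, 96]

def stanine(percentile):
    """Return the stanine (1-9) for a national percentile."""
    return bisect_right(CUTS, percentile) + 1
```

So a student at the 50th percentile lands in stanine 5, the fat belly of the curve, while it takes the 97th percentile or better to reach stanine 9.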

Many schools measure a program in yearly gains. In one year, students should show a year of growth. What we mean, of course, is the school year; that from September to June students will gain a grade level. And we hope there is little regression over the summer. We will discuss the difference between yearly and school year gains in our next post. For now, let’s focus on yearly gains.

Are students learning? After identifying where your students are (the previous post), your next task is to measure growth over a year. To start, I suggest spring to spring, because that measures where they are after a year of instruction. Again, here is my spreadsheet from Grace Haven Elementary. Note Column E, which simply subtracts the 6th grade DRP (Degree of Reading Power) score from the 7th grade score to produce a single number I call “gain”:

It is really important to compare apples to apples. One of the strengths of the DRP is that scores are comparable from year to year. Whatever measure you use, please make sure the measures match so that you are able to capture a year’s worth of growth.

What is not always clear is what a year’s worth of growth IS. For example, the DRP offers an I90 score (the reading level a student is able to read independently with 90% understanding). I would expect that every student gains by just being in the building. By having a year of life under their belt. But what does a year’s worth of gain look like? At Grace Haven, they gained a bit over five (5) points on the I90. That does not seem like a lot.

Except that, as we learned in Step 1, over half of the students were in the top three stanines*, or the top 23% nationally. Where do they have to go? If half of our students cannot gain much, it dampens the possible growth of the group.

So take them out. When we look at those who are not in the top stanines, another picture emerges. In the case of Grace Haven Elementary, the growth is mixed. Some students gain a lot, some regress, and others stay put. If you see gains, pat yourself on the back before moving into the tough analysis. If you see stagnation, ask the same questions you would ask about regression, because your program is not where you want it to be.
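That “take them out” step can be sketched with an invented cohort: compute the average gain with and without the top-stanine students, and see how the ceiling drags the headline number down.

```python
# Sketch: recompute average gain after setting aside students already
# in the top stanines (7-9), who have little room to grow.
# The cohort below is invented for illustration.
from statistics import mean

cohort = [
    {"stanine": 8, "gain":  1},
    {"stanine": 9, "gain":  0},
    {"stanine": 4, "gain":  9},
    {"stanine": 3, "gain": -2},
    {"stanine": 5, "gain":  6},
]

everyone   = mean(r["gain"] for r in cohort)  # ceiling effect drags this down
room_to_go = mean(r["gain"] for r in cohort if r["stanine"] < 7)
```

In this toy data the whole-group average is 2.8 points, but the students with room to grow averaged over 4, a very different story about the program.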

Regardless, just because a student is learning does not mean a teacher is teaching. Can we take credit for success? Or failure? We need to know the effectiveness of our program if we hope to increase that effectiveness. What, for example, if the school year gains only make up for a huge summer regression? What if students, after a year of work, only gain slightly?

When we look at the 12 kids who regressed, half did so over the summer. But half did so over the school year. So, over 180 days of instruction, student reading actually regressed for 6 students. Our school made them go backwards. Both summer and school year regression are results we should be concerned about, but the latter points to something we could control but are not.

Of those 12 students, 3 are in the top stanines–they have nowhere to go. Similarly, the glut of students with little to no gains are also in the top stanines nationally–they had nowhere to go.

Still, it raises a basic question: Does our program help kids raise their game, or does it rely on previously done work and simply maintain it? If the latter, those who did not “get it” earlier are not getting what they need now.


Step 1: Sorting Proficiency

The first, most basic step in analyzing a program is to find out how many of your students can do the skill you are interested in. It seems basic, but many teachers assume they know the answer. Never assume. A number of our students read a lot, but don’t really think about their reading–their ability is on the surface and their memory of what they read is weak. Because we see them with their noses in books, though, we tag them as readers. Others can read, but don’t. They do, though, test well. In a previous post I discussed whether the DRP is a measure of reading or stamina (or the ability to focus). That may be an issue for some. It is certainly an excuse–they don’t test well, or they’re unable to focus. You can do that analysis later, but you first have to see where your class stands before you begin asking why and proposing solutions.

Choose an assessment and give it. You want one you would consider valid–that is, one with few variables. You can measure books and/or pages read, stamina in SSR, depth of reading using reading logs, or a good old standardized test. What “valid” means is up for debate, but I used the basic off-the-shelf DRP to measure reading comprehension. You also want an assessment you can administer multiple times. We give it in the fall and spring to allow for tracking progress (more on that later).

Here is a sample of a class from Grace Haven Elementary:

Students highlighted in lavender are in the top three stanines* of achievement nationally, or the top 23% of readers in the same grade. In this cohort there are 24 out of 47 in that group. So, half of our students are top readers. In addition, 3 other students met our local standard, but fell short of the top national stanines (highlighted in purple). Twenty-seven out of 47 students scoring well is great news, right?

It depends. Looking at your data, you have to decide where the line between proficient and below is. Our supervisory union does that by pegging the “local standard” to a certain national average point. You might disagree with your local designation–I used the stanines to raise my bar above ours–but since the results change depending on that line, your choice is important. What line will reveal the most about your program?
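To see how much the cut line matters, here is a sketch with an invented ten-student cohort counted against two different bars:

```python
# Sketch: the headline number changes with where you draw the
# proficiency line. Stanine values below are invented.
cohort = [7, 8, 9, 5, 6, 4, 3, 2, 5, 7]  # one stanine per student

def at_or_above(cut):
    """Count students at or above a stanine cut line."""
    return sum(1 for s in cohort if s >= cut)

top_readers  = at_or_above(7)  # strict bar: top three stanines only
at_least_avg = at_or_above(5)  # looser bar: national mean or better
```

Same ten kids, but the strict bar reports 4 proficient readers and the looser bar reports 7. Neither count is wrong; they answer different questions, which is why the choice of line deserves deliberate thought.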

For example, an additional 15 students were in the 5th or 6th stanine, putting them at or above the mean nationally. Not bad, when added to the 24 who were in the 7th, 8th and 9th stanines. I could comfortably go to our admins and the school board and talk about 39 out of 47 being average or above. If I wanted to, I could point out that a number of our struggling readers have IEPs or other plans. Everyone would agree that my program is solid.

Except, locally, that’s not good enough. When they get to high school they will struggle if they are merely average. Six of my students were in the lowest stanines, or about 1 out of 8. Not great numbers. And 20 students don’t meet our local standard of proficiency. They are leaving my classroom unprepared for what awaits them.

Rubrics are a start. What is “proficiency” for you? The NCLB data is a nice yardstick–what measures do you have that correlate with the data you are seeing there? Think about whether they do, in fact, correlate. Our old NCLB writing data (NECAP) seemed to inflate our ability, so we created a local assessment that gives us a little guidance on what to work on. I correlate that with what I see in my classroom assignments. If we still gave the old NECAP, I’d take the data with a grain of salt.

For half of our students, reading is a natural activity and they do it well. Twenty-seven students can claim to be proficient. But, even for them, I have no idea if I can claim their success as a result of my program.

That is the next question.

When I arrived at my current school a decade ago, there was no definitive measure of a student’s ability to read. That may be hard to conceive in this data-drives-instruction landscape (although in education you can find plenty of instances where a lot of data will not tell you the most basic things about students), but the feeling was that teachers knew the students and the word was spread as they moved from grade to grade.

In November of that first year, I found that I had a student who did not know alphabetical order. I had asked her to look something up in the dictionary; she faked it for a few minutes and then threw a fit that got us distracted from the whole enterprise. Fortunately, my aide noticed the faking and followed up. This student was a known non-reader, but no one knew she was reading at a second grade level. How, after all, could someone be reading at so low a level? And she carried around grade level appropriate books and sat quietly during SSR. When pressed, she created a scene of distraction. In trying to fit in, she slipped through the cracks. She is exactly why we have and use data today.

What is the DRP? (Skip Down for Discussion on Validity)

At a previous job, we had used the Degree of Reading Power, or DRP, with 10th graders. Created by Questar, the DRP measures reading comprehension. In the assessment, a passage is provided with certain words removed, and students fill in each blank from a selection of five words. The 7th grade test is 63 questions, while the 10th grade test was 110 when I gave it years ago (I doubt it’s changed). The questions start out easy and get progressively harder as the student goes on.

We use bubble sheets–it’s that dull. But in adopting the DRP, we have a screening tool that shows us who has mastered the basics of reading. From that baseline, we ask follow-up questions, plus have students write reading journals and answer prompts to measure understanding and deeper meaning throughout the year. For the hour we put into it, we get what we need out of it.

The DRP also provides some good data. From the raw score, Questar gives you an Independent Reading Score, or I90. The I90 indicates the level of a book a student can read independently without problems or assistance. So, Harry Potter and the Sorcerer’s Stone is ranked a 56, meaning a student who scores an I90 of 56 should be able to read it unassisted (this does not take into account cultural literacy or maturity, which is why caution should be used with Of Mice and Men’s “easier” score of 53). It also offers I80 and I70 scores, which indicate increasing levels of support needed for understanding, plus a “frustration level”–the point where a student might throw the textbook across the room. Questar also ranks a student’s score against the nation, providing national percentiles and stanines. I’ve never asked what database they draw this from (is it other users of the DRP, or larger pools?), or whether it updates every year, but it’s a larger sample size nonetheless.
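The I90-to-book matching can be sketched in a few lines. The Harry Potter and Of Mice and Men scores come from the discussion above; the third title and its score are invented for illustration:

```python
# Sketch: matching students to books by I90 readability score.
# The first two scores are from the discussion above; the third
# title and score are invented.
books = {
    "Harry Potter and the Sorcerer's Stone": 56,
    "Of Mice and Men": 53,
    "Charlotte's Web": 50,
}

def independent_reads(i90, library):
    """Titles a student can likely read unassisted: DRP <= their I90."""
    return sorted(title for title, drp in library.items() if drp <= i90)

choices = independent_reads(53, books)  # student with an I90 of 53
```

The same caveat from above applies in code as in print: readability is not maturity, so the list is a starting point, not a prescription.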

When I first did this, there was a booklet filled with tables that converted all of this for you, but they later came out with a computer database. The company also provided a directory of popular classroom texts and their DRP scores, so you could match students with books. None of these CD-ROMs ever really worked for our computers. Questar seemed locked in the 1970s. The online information today smells like a dying company or division being run out of habit, where each year someone has an idea to update stuff but never to revamp the entire test for the NCLB age. Even the name Questar sounds like one of the lesser computers of the early ’80s competing with Tandy and Commodore. I think they know what they have and keep plugging.

The neatest thing about the DRP, though, is that the I90 score measures across grades. You can compare an I90 score taken on the 2nd grade test with an I90 taken on the 7th grade test. So, if a student scores a 43 as a 2nd grader and a 45 as a 7th grader, you know they have not progressed over five years of schooling. You can also give a struggling 7th grade reader the 5th grade test, and they will not meet their frustration point until much deeper in, providing a more accurate result. In the end, I like to measure growth. The DRP is great for that.

What Does the DRP Really Measure?

Every September we give the DRP to our 7th grade, and every May we give it again. Because of the design, we can measure I90 growth over the year. We can also measure it against their 6th grade result. If we compare the 6th grade spring results against the 7th grade fall, we are able to measure gain (or loss) over the summer. We can do the same thing when we measure the same kids in 8th grade.

But the DRP is dull. And, remember, the questions start out easy, and get progressively harder as the student goes on. For some students, the first hard question throws them. Then, they just color in dots. One way Questar makes money is that they sell their bubble sheets and then correct them for you (and put the results on a disk, ready for manipulation). Instead of paying for that, we took our answer key and made an overlay (overhead transparency sent through the photocopier). In correcting it ourselves we can see where kids give up from a series of wrong answers.
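The payoff of hand-correcting, seeing where kids give up, can be sketched as finding a long run of wrong answers at the end of the sheet. The answer string below is invented, and the five-wrong threshold is my assumption, not anything from the test’s documentation:

```python
# Sketch: flag where a student likely gave up by finding a trailing
# run of wrong answers. '1' = correct, '0' = wrong; data invented.
def gave_up_at(answers, run=5):
    """Return the index where a trailing run of at least `run` wrong
    answers begins, or None if the student worked to the end."""
    tail = 0
    for a in reversed(answers):
        if a == "0":
            tail += 1
        else:
            break
    return len(answers) - tail if tail >= run else None

sheet = "1111011010" + "0" * 8  # fades, then pure blanks/guessing
point = gave_up_at(sheet)
```

A sheet that ends in a wall of wrong answers tells a different story than one with errors scattered throughout, which is exactly what the overlay lets you see at a glance.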

Which leads us to the question we’ve wondered about for a while: Does the DRP measure reading comprehension or stamina?

To answer that question, last September I broke the 63-question DRP I usually give my 7th graders into three parts of 21 questions each. Then, I measured growth (or not) against their 6th grade scores from the previous May. In the end, nothing significant showed up except that one group got better at reading over the summer: overscheduled kids. I had read that high-achieving middle-class kids who participate in a lot of activities–soccer, music, the school play–cannot find time to read during the school year. My data showed that, but nothing about stamina. In fact, the ups and downs over the summer made little sense.

But discoveries often happen by accident. This May, I went back to the old administration of the test–63 questions in one sitting. Our 8th graders were taking an NCLB-mandated Science assessment, so I used that time to give the 7th graders the DRP. Because the 8th graders were monopolizing our aides and classrooms, I set the 7th graders up in the cafeteria while the kitchen staff whipped up lunch. My hope was that the blowers and the smell of bacon would act as calming white noise as the students and their DRP assessments spread out across the antiseptic tables of the grey room. Some finished quickly, while others lingered over an hour.

The results were not inspiring. I had been unhappy with my reading program–I’m unhappy every year with both my reading and writing programs, but this year I had the weak results to prove it. I uploaded my scores into a spreadsheet, looked at growth, ranked and sorted. The high kids stayed high, and the middle kids stayed in the middle, with a few growing or dropping a bit. Even that assessment is a bit inflated, if I’m honest. It was not a good year.

Then there were the kids at the bottom. About ten students had dropped between ten and thirty points over the school year (on a scale topping out at 80). This was significant. Our entire Tier II intervention placement was based on these scores. Several students who had moved out of Tier II were looking at returning in 8th grade. Those receiving Tier II were seeing regression. What, I wondered, was I doing wrong? (I had ideas, and they started with sacrificing SSR time for any distraction that came down the pipe.)

In looking at the names of the students, I realized that those who either had a diagnosis of ADD or ADHD, or whom we suspected of having ADD or ADHD, had tanked. Our literacy group had often wondered to what extent the DRP was a test of stamina as much as a test of reading.

In looking at their answer sheets, I noticed that around the 20th question these students began to get questions wrong. Not just a few as the questions got harder, but a string of wrong answers, then another string. The unbroken string was the tell, I suspected, because even a student who is guessing will get some answers correct by simple probability. These students had given up and were just filling in bubbles. Bad data.
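That "string of wrong answers" pattern is easy to flag programmatically when correcting sheets by hand gets tedious. Here is a minimal sketch; the run length of eight is my own assumed threshold, not anything from Questar (on a multiple-choice test, even pure guessing rarely produces a long unbroken run of wrong answers):

```python
def giveup_point(answers, run_length=8):
    """answers: per-question results in test order (True = correct).
    Return the index where the first run of `run_length` straight
    wrong answers begins, or None if no such run exists."""
    wrong_run = 0
    for i, correct in enumerate(answers):
        wrong_run = 0 if correct else wrong_run + 1
        if wrong_run == run_length:
            return i - run_length + 1
    return None

# Hypothetical answer sheet: a student does fine for 20 questions,
# then stops trying and everything after is wrong.
sheet = [True] * 20 + [False] * 43
print(giveup_point(sheet))  # 20
```

A student who is still engaged but struggling will show scattered wrong answers rather than one unbroken run, so this flags only the give-up pattern, not ordinary difficulty.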

The next week, I had a few of these students redo questions 22 through 42. They were placed in a quiet room in two groups of four. I explained my belief in them and personally appealed to their sense of pride and their control over the environment. In short, I was trying to get them to focus on the task while setting up an environment that fostered focus. Six of the eight did significantly better, answering 5 to 12 more questions correctly. When I had them redo the last twenty questions, I saw the same results. Five of the students went from the 4th or 5th stanine for reading nationally to the 8th.

Of the two students who showed little improvement, one is not ADD or ADHD. The other is suspected of having ADHD, was even more hyperactive than during the first administration, and was openly hostile to the retake. Either the six had learned to read in a week, or I had been measuring stamina before.

Why does it matter beyond the one assessment? Our school uses the DRP data to decide who gets Tier II help and who has “graduated” to Tier I. Tier II instruction is scheduled against World Language, so it can be a reward or a punishment depending on the family. There is pressure on some students to take a World Language (often from their parents), and a desire by other students to “flunk” into Tier II so they can a) avoid the hard work of learning French and b) be with their Tier II friends. These numbers weigh heavily in the court of “what’s best for the child”.

It also matters for how we take other, higher-stakes assessments. For their NCLB assessment, Vermont uses the SBAC. Taken entirely online, it gives students a lot of control–if they choose to use it. Those who click through quickly and then take a long break find those answers locked when they return. They can, though, work slowly through a small number of questions, break, and return for a few more. This is different from past assessments, which means we need to retrain and empower students. These results tell us that we need to instruct some students in how to take a test–instruction that is tailored to individual students and is different from just attacking the questions themselves. The results also tell us we need to create a different environment–one in which students can move about without disturbing others and are less tied to a clock.

All of this leads me to a more outlandish proposition that I am still thinking about: our school uses the DRP to measure where students are, but I’d like assessment to be more predictive of potential. Why? Because when an assessment just measures, I find the school’s reaction is to address what they think it measures. So, those who tank the DRP get put into the standard Tier II reading program. But if we can measure the elements that go into that result–like stamina–we have a better idea of what to address. The potential is there. The fix, then, might be more about Habits of Mind than more phonics. At present, we are not sure.

Of course, our support services respond by offering more assessments. But that is often guesswork, and time consuming. If a cafeteria with bacon wafting through the room is not conducive to good results, I cannot imagine the forty minutes a special educator can give me to do a “quick” BES is much better. And the coordinator who battered kids with an AIMS-Web in a noisy hallway (the only space available) produced little that was useful. And, if anything is found, the student is often dumped into a program with a promise that “we’ll work on that” when there is time after the reading instruction is done. No one has time. In identifying causes, we might find the solution can be had with greater efficiency.

My hope is for assessments that are more predictive and that work by empowering students. When students value the assessment and understand the consequences of their choices, they own it. When we give them the tools to do their best work, they use them. In the end, the measure becomes about reading.