Data Champions: How Do You Measure Student Growth?

This month’s question was not posed by an educator but was spurred on by an article I recently read related to measuring student growth. This may be a timely topic for many of you, as December typically represents the beginning of the winter interim assessment testing window. For those that administer interim assessments in the winter, this gives teachers (and their students) a chance to reflect on the growth their students have made from their previous assessment and to see if they are on track to meeting their end of year goals. With the passing of the Every Student Succeeds Act in December of 2015, states have been tasked with developing their own plans for being accountable for student growth and success. But, what kind of growth should we be measuring? Great question! Unfortunately, the answer to this question is… It’s complicated. Fortunately, some individuals working for the Council of Chief State School Officers (CCSSO) have attempted to shed some light on the different types of growth that can be measured and what purposes they serve as they relate to academic performance of students. This article represents my own summary of a 117-page research report related to measuring student growth which can be found here. [Castellano, K. E., & Ho, A. D. (2013). A Practitioner’s Guide to Growth Models. Council of Chief State School Officers.]

What is growth?

In education, teachers are tasked with helping to mold the growth of an ever-growing population – their students. Throughout a lifetime, an individual may grow in any number of ways: in size/stature, in age, in maturity, spiritually, athletically, and in intelligence, just to name a few. Although these types of growth are vastly different, they all share a common root – change over a period of time. In essence, teachers are the change agents who help students grow into functioning members of the larger society.

Before we can go into analyzing what types of growth should be measured, we must first know what growth is/represents. Growth, as the authors of the research report have defined it, “describes the academic performance of a student or group (a collection of students) over two or more time points” (pg. 13 of full report). Essentially, growth represents a change in state or status from one point in time to another. When analyzing the growth of students over time (and the effectiveness of the teachers and schools), growth models are commonly used. A growth model represents a “collection of definitions, calculations, or rules that summarizes student performance over two or more time points and supports interpretations about students, their classrooms, their educators, or their schools” (pg. 16 of full report). Basically, a growth model is a way to provide evidence that growth either has or has not occurred between two or more points in time, and that the resultant changes (or non-changes) are due to a specific cause or situation.

What kinds of growth models are out there?

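The authors of this report identified seven different growth models that relate to measuring change in academic achievement: the Gain Score, Trajectory, Categorical, Residual Gain, Projection, Student Growth Percentile (SGP), and Multivariate models.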

Before we can go into analyzing how each model works, it is important to recognize the three different functions of growth models: 1) growth description; 2) growth prediction; and 3) value added. A model that is targeted towards growth description is intended to measure the magnitude (how much) of growth that occurred. A model that is targeted towards growth prediction is intended to estimate how much growth should occur between two or more points in time, based on current and/or past achievement. A value-added model is intended to identify what caused the growth (or non-growth) to occur and to make growth comparisons based on different factors. A breakdown of the different growth models and their various characteristics is included in the table below.

Gain Score Model

The Gain Score model is a basic model that is intended to describe the raw scale score growth for an individual student (or average scale score growth for a group) on an assessment between two time periods. Interpretively, this model is intended to answer the question “How much has a student learned on an absolute scale?” In layman’s terms, this model defines growth relative to oneself. To illustrate this in an example, if a student scored 200 on the state math assessment in grade 3 and scored 250 on the state math assessment in grade 4, then this model indicates that the student grew positively by 50 points (i.e. 250 – 200). For this statement (and model) to be valid, the scores measured at each point in time must be on the same common scale. This means that the total scale points must be the same (i.e. you can’t compare an assessment that has a scale of 100-500 with another assessment that has a scale of 200-400) AND that the score on the first test/assessment is equivalent to the same score on the second test/assessment (i.e. represent the same amount of learning mastery). Illustrating this in a non-educational example, if a student weighed 50 pounds at the beginning of grade 3 and then weighed themselves at the beginning of grade 4 (on the same weight scale) which showed that they weighed 60 pounds, then this model would suggest that the student grew in weight by 10 pounds. Because they were using the same scale, a weight of 60 pounds in grade 3 is equal to a weight of 60 pounds in grade 4.

This model is especially useful in situations where pre/post tests have been conducted because the questions are the same and they are both scored on the same scale. However, in situations where there are differences in the assessments taken (i.e. differences in assessment questions/content), this type of model would only be appropriate if each assessment were rigorously tested and the content mastery was on a developmental continuum. Because the difficulty/complexity increases and the type of content changes as a student progresses across grade levels, it is difficult to place scores from different grade levels on a single common scale. To put this to an example, a common scale would mean that a score of 200 on a particular assessment in grade 3 represents the same amount of learning mastery as a score of 200 on the assessment in grade 4. To illustrate this in a non-educational example, if a person’s weight is being compared at the start of grade 3 and grade 4, then a weight of 70 pounds in grade 3 is equivalent to weighing 70 pounds in grade 4. Also fundamental to this model is the concept of equal intervals. Essentially, this means that a growth in assessment score of 25 points from 50 to 75 represents an equivalent increase in knowledge as an increase from 75 to 100 (i.e. an increase of 25 points represents the same amount of additional learning content mastered regardless of position on the scoring scale). Bringing it back to the example about a student’s weight, an increase of 10 pounds from 70 to 80 pounds represents the same magnitude of increase (in weight) as an increase from 90 to 100 pounds.

Although it is difficult and takes a lot of time to create an assessment with a common vertical scale, this does not mean that it has not been done. The North Dakota State Assessment (NDSA) is an assessment for which student growth could potentially be analyzed using the Gain Score Model because it is based on a vertical scale. NWEA Measures of Academic Progress (MAP) assessments could also potentially be analyzed using the Gain Score Model. “Every test item is anchored to a vertically-aligned equal-interval scale that covers all grades” (“Measures of Academic Progress”, pg. 3).[1] In fact, “MAP supports extensive use of the true gain score model, a growth model that defines how much true learning growth has occurred in the intervening time measured by the true difference between two test scores. The resulting growth measure offers a direct measure of how much a student has progressed over a given time period” (pg. 6). STAR Math and Reading assessments can also be used to analyze growth using the Gain Score Model. “STAR scores range from 0 to 1400 and use a vertical scale to create a grade-independent test score” (“Converting Measures of Academic Progress”, pg. 3).[2]

It is important to note that a student may show growth (an increase in scale score) on an assessment and still decrease in percentile or proficiency level, depending on the assessment scoring scale and the expectations for content mastery at each grade level. Ultimately, this type of model is intended to describe the overall magnitude of change (or no change) in student learning mastery.

The Gain Score Model can also be used to identify whether the change in an individual’s (or group’s average) scale score is low, medium, or high compared to others. This can be analyzed in 3 different ways: 1) scale-based standard setting; 2) norm-referenced standard setting; and 3) target-based standard setting.

Scale-based standard setting occurs when categories of scale score growth are established using cut points on the scoring scale. For example, an increase of 1-10 scale points may be considered “low”, 11-20 scale points “medium”, and 21+ scale points “high.” This type of standard setting may not be appropriate across grade levels as the difficulty in moving up scale points may not be consistent across all grades.
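For readers who like to see the arithmetic spelled out, here is a minimal sketch of a gain score and a scale-based category in Python. The cut points are the example values from the paragraph above; the function names and the "no gain" label for zero or negative gains are my own assumptions, since the example bands only cover positive gains.

```python
def gain_score(earlier_score, later_score):
    """Raw scale-score gain between two assessments on a common scale."""
    return later_score - earlier_score


def scale_based_category(gain):
    """Classify a gain using the example cut points (1-10, 11-20, 21+).

    Assumption: zero or negative gains are labelled "no gain", because
    the example bands do not say how to treat them.
    """
    if gain <= 0:
        return "no gain"
    if gain <= 10:
        return "low"
    if gain <= 20:
        return "medium"
    return "high"


# The student from the Gain Score example: 200 in grade 3, 250 in grade 4
gain = gain_score(200, 250)
print(gain, scale_based_category(gain))
```

Real cut points would need to be set for each assessment and, as noted above, possibly for each grade level.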

Norm-referenced standard setting occurs when categories of scale score growth are established using the distribution of gain scores from a reference or norm group of individuals who have taken the same assessment. Each person’s gain score would be ranked in comparison to the norm or reference group and different categories of growth would be identified based on ranges of growth percentiles (e.g. 1-33rd percentile = “low”; 34th-67th percentile = “medium”, and 68th-99th percentile = “high”). This type of standard setting is more effective in situations where there is not a consistent linear trend of expected growth across grade levels.
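A rough Python sketch of norm-referenced categories might look like the following. The norm-group gains are made up for illustration, and the percentile rank used here (the percent of the norm group with a strictly smaller gain, clamped to 1-99) is only one of several possible definitions, so treat it as an assumption rather than the report's method.

```python
def percentile_rank(gain, norm_gains):
    """Percent of the norm group with a gain strictly below `gain`,
    clamped to the 1-99 range used by the example bands."""
    below = sum(1 for g in norm_gains if g < gain)
    pct = round(100 * below / len(norm_gains))
    return min(99, max(1, pct))


def norm_referenced_category(gain, norm_gains):
    """Map a gain to low / medium / high using the example percentile bands."""
    pct = percentile_rank(gain, norm_gains)
    if pct <= 33:
        return "low"
    if pct <= 67:
        return "medium"
    return "high"


# Hypothetical gain scores from a norm group of ten students
norm_gains = [2, 5, 8, 10, 12, 15, 18, 20, 25, 30]
print(percentile_rank(19, norm_gains), norm_referenced_category(19, norm_gains))
```

A gain of 19 points exceeds seven of the ten hypothetical norm-group gains, so it lands in the "high" band here even though the same gain would be "medium" under the scale-based cut points.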

Target-based standard setting involves identifying a student (or students) as making adequate growth if they are on track to meet a certain score in the future. The student’s scale score growth from the previous to the current assessment period is projected into the future, and if their projected future score is at/above the future target, then they would be categorized as “on track.” For example, suppose John scored a 300 in grade 3 on his state math assessment (an increase of 25 points from his score in grade 2). It is expected that in grade 6, students should be scoring 350 or higher on the state math assessment to be considered proficient. To be considered “on track” to meet the goal of 350 or above in grade 6, John would need to increase his score by 50 points over the next 3 years (about 16.67 points per year). Because his growth of 25 points is higher than the required growth of about 16.67 points per year, John would fall into the category of being “on track” to reach his target based on his grade 3 score.
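John's case can be checked with a few lines of Python. This sketch assumes his most recent yearly gain simply repeats each year until the target grade; the function names are mine, not the report's.

```python
def projected_score(current_score, yearly_gain, years_ahead):
    """Project a future score, assuming the latest yearly gain repeats."""
    return current_score + yearly_gain * years_ahead


def is_on_track(current_score, yearly_gain, years_ahead, target_score):
    """True if the projected score is at or above the future target."""
    return projected_score(current_score, yearly_gain, years_ahead) >= target_score


# John: 300 in grade 3, a 25-point gain since grade 2, target of 350 in grade 6
years_to_target = 6 - 3
print(projected_score(300, 25, years_to_target))   # projected grade 6 score
print(is_on_track(300, 25, years_to_target, 350))  # on track?
```

Under that assumption John's projected grade 6 score is 375, which is above the 350 target.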

Trajectory Model

The Trajectory Model is intended to predict student growth as a function of the growth/change in scale score from the previous to current assessment assuming consistent growth over time. For example, if a student increased their scale score on the state math assessment by 50 points from grade 2 to grade 3, then the trajectory model would project that this student would increase their score by an additional 50 points in grade 4, another 50 points in grade 5, and so on. Based on the growth that has occurred in the past, this model creates a growth trajectory for expected performance in the future. This model is useful if 1) the scoring scale is consistent across assessments, and 2) equal growth is expected across grade levels (or periods of time). This model would not be useful in situations where there is not a common scale across grade levels, or where variable growth is expected across grade levels (or periods of time). Assuming both of these conditions are met, this model is useful for determining if a student (or a group of students) is “on track” for meeting a future score target. An example of one way to use the trajectory model to assess student growth is included in the image below.

In this example, a student grew by 25 points on their assessment score from grade 3 to grade 4. Based on the growth that this student made from Grade 3 to Grade 4, a trajectory line was created to predict this student’s growth on future assessments (assuming 25 point increases each year). Additionally, a target score in grade 6 was mapped on the graph to see if this student’s growth trajectory would place the student “on track” (above the gold line) by grade 6 or “not on track” (below the gold line) by grade 6. In this scenario, the student would be “on track” for meeting the grade 6 target score assuming consistent yearly growth of 25 points.
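To make the trajectory line concrete, here is a small Python sketch that lays out the projected score for each future grade. The 25-point gain matches the example above, but the actual scores (250 and 275) and the grade 6 target (320) are hypothetical values I picked for illustration.

```python
def trajectory(prev_score, curr_score, curr_grade, final_grade):
    """Return {grade: projected score}, repeating the latest gain each year."""
    yearly_gain = curr_score - prev_score
    return {
        grade: curr_score + yearly_gain * (grade - curr_grade)
        for grade in range(curr_grade, final_grade + 1)
    }


# Hypothetical scores: 250 in grade 3 and 275 in grade 4 (a 25-point gain)
path = trajectory(250, 275, curr_grade=4, final_grade=6)
target_grade6 = 320  # hypothetical grade 6 target score
on_track = path[6] >= target_grade6
print(path, on_track)
```

With these made-up numbers the trajectory reaches 325 in grade 6, so the student would plot just above the target line.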

Categorical Model

The Categorical Model is similar to the Gain Score Model except that instead of using change in scale scores it uses change in performance level categories. Positive gains are associated with increases (movement in an upward direction) in performance level category, and negative gains are associated with decreases (movement in a downward direction) in performance level category. For example, if a student scored in the “Partially Proficient” performance category on the state assessment in grade 5, and then scored in the “Proficient” performance category on the state assessment in grade 6, this would represent positive growth according to the Categorical Model. One potential drawback of this model is that it lumps multiple scores or bands of scores into a single category, which may disregard changes in scale score that fall within a single performance category. Essentially, a student could score at the bottom score of the “Proficient” performance category during Time 1 and reach the highest score within the “Proficient” category during Time 2, but since the student’s performance category did not change no growth would be reported. This model is less useful in situations where the definitions (score ranges) for the performance categories change. For instance, in the image below, two students grew the same amount (green arrows), but only one student grew in regard to their performance category (the student in red).
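The drawback described above is easy to reproduce in a short Python sketch. The cut scores and two of the level names below are hypothetical (only "Partially Proficient" and "Proficient" come from the example), and a score exactly on a cut is assumed to fall in the higher category.

```python
from bisect import bisect_right

# Hypothetical cut scores separating four performance levels
CUT_SCORES = [300, 350, 400]
LEVELS = ["Novice", "Partially Proficient", "Proficient", "Advanced"]


def performance_level(score):
    """Index of the performance level; a score on a cut counts as the higher level."""
    return bisect_right(CUT_SCORES, score)


def categorical_growth(earlier_score, later_score):
    """Number of performance levels gained (positive) or lost (negative)."""
    return performance_level(later_score) - performance_level(earlier_score)


# Two students who both gain 20 scale-score points
print(LEVELS[performance_level(335)], "->", LEVELS[performance_level(355)],
      categorical_growth(335, 355))
print(LEVELS[performance_level(355)], "->", LEVELS[performance_level(375)],
      categorical_growth(355, 375))
```

Both students gained 20 points, but only the first crosses a cut score, so only the first shows growth under this model.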

Residual Gain Model

The Residual Gain Model uses linear regression to determine what a student’s expected score would be for the current assessment based on their past score(s). This is also known as a conditional status model because it predicts the student’s expected score conditional on their previous score. The residual gain represents the difference between the student’s actual scale score and the student’s expected scale score based on the prediction of the linear regression model. If the student’s actual score is above the expected score, then the student grew more than expected given their previous score, and vice versa. The model is created by taking prior students’ scores in the student of interest’s current grade and previous grade and plotting them on a scatterplot. Then, calculate and plot the linear regression line using statistical software or manually using the equation y = a + bx, where y is the predicted score in the current grade, x is the score in the previous grade, a is the intercept of the line, and b is its slope.

Once the line has been plotted on the graph, find the student of interest’s score on the x-axis and go straight up until you intersect with the line calculated by the linear regression model (solid line). At the point where you intersect with the line, go straight left to the y-axis and that will tell you what score the student is predicted to get during the current assessment period. After the student completes the assessment for the current year, calculate the residual by subtracting their expected score from the actual score (similar to the image below). In the example below, a student who scored a 350 on their grade 3 assessment would have been expected to score a 364 on the assessment in grade 4. The student’s actual score in grade 4 was 375, which exceeded that student’s expected score by 11 points. In conclusion, this student grew at a higher rate than what was expected based on their previous score.
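If you would rather compute than read values off a graph, the same steps can be sketched in Python. The prior-cohort scores below are hypothetical and were chosen only so that the fitted line predicts 364 for a grade 3 score of 350, matching the example above; real cohort data would give different coefficients.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residual_gain(prior_score, actual_score, intercept, slope):
    """Actual score minus the score predicted from the prior score."""
    expected = intercept + slope * prior_score
    return actual_score - expected


# Hypothetical prior cohort: grade 3 scores and the same students' grade 4 scores
grade3 = [340, 345, 350, 355, 360]
grade4 = [336, 345, 368, 375, 396]

a, b = fit_line(grade3, grade4)
expected = a + b * 350                   # expected grade 4 score for a 350 in grade 3
residual = residual_gain(350, 375, a, b)  # student actually scored 375
print(expected, residual)
```

A positive residual means the student scored above expectation, and a negative residual means the student scored below it.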

This model is only useful in situations where 1) the variation in spread of the points above and below the regression line is consistent across scores, and 2) the scale scores for each grade level are linearly related (e.g. linear growth is expected from year to year). If either criterion is not fulfilled, an alternate regression model such as those used for Student Growth Percentiles (covered later) may be used.

Projection Model

The Projection Model uses linear regression to predict or project scores in a future grade (or time period), and it operates as a conditional status model. This model is similar in nature to the Residual Gain Model (highlighted in the previous section), except that this model is used to predict a student’s (or a group of students’) scores in a future grade. In application, the Projection Model may be used to assess whether a student is “on track” to meeting a future goal or if the student is making “adequate growth.” In order to calculate the linear regression equation, the model uses scores of students with similar score histories as the student(s) of interest as well as their scores in future grades to identify the expected growth a student should make given their score in the current grade.

One example of an interpretation suggested by the authors of the report would be “On average, students with a score of 110 on the grade 3 mathematics test and 250 on the grade 4 mathematics test have a predicted grade 5 mathematics score of 275.” By comparing the student’s expected score in the future with a set future goal score/benchmark, one should be able to identify if a student is on track to meet the future goal. If, in the example above, the goal score to meet “proficiency” in mathematics in grade 5 is a score of 260, and this student is projected to have a score of 275, then teachers may conclude that this student is on pace to reach “proficiency” in mathematics by grade 5 based on growth of students with similar scoring histories. In comparison to the Trajectory Model for growth prediction highlighted earlier in this article, the Projection Model is expected to have greater predictive accuracy. An example of the Projection Model being used to predict grade 4 assessment scores based on grade 3 assessment scores is included in the image below.

Specifically, in this example, a grade 3 student who scores a 350 is expected to score 364 on the assessment in grade 4. A student who scores a 356 in grade 3 is expected to score 382 in grade 4. The Projection Model equation for this example calculated using prior cohorts of student data in grades 3 and 4 is:

Predicted Grade 4 Score = -677.667 + (2.974*Observed Grade 3 Score)

All someone needs to do to predict a student’s score in grade 4 is to insert their grade 3 score in place of “Observed Grade 3 Score” and complete the equation. The linear regression equation may use multiple data points to predict a student’s future score. For example, if a teacher would like to predict a student’s score in grade 8 based on their scores through grade 5, they may use student scores in grades 3, 4, and 5 in their model:

Predicted Grade 8 Score = Intercept + (a*Grade 3 Score) + (b*Grade 4 Score) + (c*Grade 5 Score)

In this equation, the placeholders a, b, and c would be replaced with the estimated regression weights for each grade level.

It is important to note that this model is more flexible compared to other models because it does NOT require a vertical scale, and because there can be multiple predictor variables, including variables that are not previous test scores. Additionally, this model may be continuously updated based on new data that is collected (however, be sure to explain changes that occurred in the model if comparing growth over multiple years). Although this model can include multiple predictors, it also requires that each of the students who are used to create the model have data for ALL of the predictor variables. Basically, if the model included students’ grades 3, 4, and 5 scores to predict student scores in grade 8, then each student used in the model would need to have scores in grades 3, 4, and 5. Another important thing to note is that this model is based on an assumption of a linear relationship between the outcome variable (the predicted future score) and the predictors (students’ previous scores). If a non-linear relationship exists between the outcome and the predictors, then this model may lose predictive accuracy. Lastly, this model may be applied to groups of students to identify what percentage of students are “on track” to reach (or predicted to reach) a certain score.
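As a rough illustration, the single-predictor equation given earlier (Predicted Grade 4 Score = -677.667 + 2.974*Observed Grade 3 Score) can be applied to a whole group in Python to estimate the percentage of students "on track." The group's grade 3 scores and the grade 4 target of 365 are hypothetical. Note that with the coefficients as printed, a grade 3 score of 350 gives about 363.2 and a 356 gives about 381.1, slightly below the 364 and 382 quoted earlier; I assume that small gap comes from rounding in the published equation.

```python
# Coefficients from the example equation in this article
INTERCEPT = -677.667
SLOPE = 2.974


def predict_grade4(grade3_score):
    """Predicted grade 4 score from an observed grade 3 score."""
    return INTERCEPT + SLOPE * grade3_score


def percent_on_track(grade3_scores, target):
    """Percent of students whose predicted grade 4 score meets the target."""
    on_track = sum(1 for s in grade3_scores if predict_grade4(s) >= target)
    return 100 * on_track / len(grade3_scores)


grade3_scores = [340, 345, 350, 356, 360]  # hypothetical group of students
GRADE4_TARGET = 365                        # hypothetical target score

for score in grade3_scores:
    print(score, round(predict_grade4(score), 1))
print(percent_on_track(grade3_scores, GRADE4_TARGET))
```

In this made-up group, two of the five students are predicted to reach the target.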

Student Growth Percentile (SGP) Model

The Student Growth Percentile (SGP) model is used to describe “the relative location of a student’s current score compared to the current scores of students with similar score histories” (pg. 89 of full report). The location of the student’s score is calculated based on their percentile rank in relation to other students who have taken the same assessment and had the same/similar prior scores as the student of interest. For instance, if a student scored a 350 on their state math assessment in grade 3, and increased their score to 375 in grade 4, this student’s growth of 25 points would be compared to the growth of other students who scored a 350 on the state math assessment in grade 3. If there were 10 students who scored a 350 on the state math assessment in grade 3, and this particular student’s growth was higher than that of seven of the other students, this student’s growth percentile (SGP) would be at the 70th percentile for students with a score of 350 in grade 3.
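The arithmetic in this example can be sketched in Python as follows. This is a simplified percentile rank that follows the example above (students outscored divided by the size of the whole group, including the student), not the regression-based calculation used in practice, and the peers' grade 4 scores are hypothetical.

```python
def simple_growth_percentile(student_score, peer_scores):
    """Percent of the same-prior-score group that the student outscored.

    `peer_scores` holds the current scores of the OTHER students who had
    the same prior score; the student is added to the group size.
    """
    outscored = sum(1 for s in peer_scores if s < student_score)
    group_size = len(peer_scores) + 1
    return 100 * outscored / group_size


# Hypothetical grade 4 scores of the nine other students who scored 350 in grade 3
peers = [352, 358, 360, 363, 366, 370, 372, 380, 388]
sgp = simple_growth_percentile(375, peers)
print(sgp)
```

Because every student in the group started at 350, comparing grade 4 scores is the same as comparing growth.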

Student growth percentile may also be calculated across all students who took the assessment (regardless of score received at Time 1). It is important to note, however, that this percentile may be less accurate compared to using only students who received the same/similar scores. This inaccuracy may be due to different growth expectations depending on a student’s initial score. For instance, if a student scored in the 80th percentile with a score of 385 on the state math assessment in grade 3, there may be less opportunity to grow to reach the expected target score (e.g. 400) in grade 4, compared to a student who scored in the 40th percentile with a score of 350 in grade 3.

Typically, the SGP model is based on median quantile regression, which takes all of the students who had a specific score on an assessment at Time 1 and finds the median student score for these students on the assessment in the following year. This process would be replicated for each of the scores observed at Time 1. Then a simple linear regression line would be calculated based on the median scores received at Time 2 for each of the scores observed at Time 1. This line shows the expected score (i.e. median growth) which a student who received a specific score on the assessment at Time 1 must receive at Time 2 to have grown at the 50th percentile. An example of a graph of the SGP model is included below.

In the graph above, the dashed line represents the scores students must receive in grade 4 to have grown at the 50th percentile from their grade 3 score. As an example, if a student scored 345 on the assessment in grade 3, their expected score in grade 4 would be 354 (a growth of 9 scale score points). This means that of the students who scored a 345 on this assessment in grade 3, the student who scored in the middle in grade 4 increased their score by 9 points. On the graph, if the student scored above the dashed line, this indicates that they grew at a rate higher than 50% of their peers who scored 345 in grade 3, and if it is lower than the dashed line that means that their growth was lower than 50% of their peers who scored a 345 in grade 3.
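Here is a minimal Python sketch of the "median student" idea: group prior students by their grade 3 score and take the median grade 4 score in each group. The score pairs are hypothetical, chosen so the median for a grade 3 score of 345 is 354 as in the example above. It covers only the grouping-and-median step, not a full quantile regression.

```python
from collections import defaultdict
from statistics import median


def median_by_prior_score(pairs):
    """Map each Time 1 score to the median Time 2 score of students who had it."""
    groups = defaultdict(list)
    for prior, current in pairs:
        groups[prior].append(current)
    return {prior: median(scores) for prior, scores in groups.items()}


# Hypothetical (grade 3 score, grade 4 score) pairs from prior students
pairs = [
    (345, 348), (345, 351), (345, 354), (345, 358), (345, 362),
    (350, 352), (350, 357), (350, 361), (350, 366), (350, 370),
]
medians = median_by_prior_score(pairs)

# A student who scored 345 in grade 3 and 360 in grade 4
above_median = 360 > medians[345]
print(medians, above_median)
```

A student above the group median would plot above the dashed line, and a student below it would plot below the line.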

Another example of a particular student’s percentile growth across grade levels is provided in the image below. As you can see, the green/orange/red shaded bar above “Next Year” on the x-axis shows the dispersion of the scores received by prior students who scored a 609 on the reading assessment in grade 6. Although this graph does not show the specific score the student must achieve to grow at the 50th percentile, it does categorize the student’s growth according to bands of student growth percentiles (1st-35th percentile = Low; 36th-65th percentile = Typical; 66th-99th percentile = High). As you can see, if this student grew within the “typical” growth band for students with a score of 609 in grade 6, their scale score may even decrease from the prior year (and their score category may even drop from “proficient” to “partially proficient”).

For more information about the Student Growth Percentile model, check out these short videos:

Multivariate Model

The Multivariate Model is intended to compare the amount of value added between one or more factors for a specific situation. As it relates to education, this model is used to express “…a teacher’s students’ performances in terms of their average distance from expectations. These expectations are set by considering students’ other test scores, average district performance, and the other teachers that the students have had” (pg. 104 of the full report). Overall, this model incorporates information over time, across subjects, and across teachers.

Specifically, this model may address this question given the following circumstances:

To put this model to an example, if we were to estimate the added value of being in a specific teacher’s classroom at a specific grade level, we would take all of the students who were enrolled in this teacher’s class and compare them with students who are similar (i.e. shared similar scores on other tests or shared similar teachers in the past). The average difference in scores between students in this specific teacher’s classroom and similar students outside of the teacher’s classroom would show how much of an additional impact this teacher made in comparison to other teachers at that same grade level with a similar group of students.
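The comparison described above boils down to an average difference, which can be sketched in a few lines of Python. This is a deliberate oversimplification and only illustrates the basic idea: operational value-added systems adjust for many factors at once, and the scores below are hypothetical.

```python
from statistics import mean


def average_difference(class_scores, comparison_scores):
    """Mean score of the teacher's students minus the mean of similar students."""
    return mean(class_scores) - mean(comparison_scores)


# Hypothetical scores for one teacher's class and for similar students elsewhere
class_scores = [362, 370, 375, 381, 387]
comparison_scores = [355, 360, 368, 372, 380]

print(average_difference(class_scores, comparison_scores))
```

A positive difference would suggest students in this classroom scored higher, on average, than similar students in other classrooms. How "similar" is defined matters a great deal and is left out of this sketch.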

The multivariate model is flexible in the sense that it can use a wide variety and amount of data, and that it does not need to rely on a vertical scale. However, it is important to understand that this model assumes standard deviations are consistent across grades and subjects. Essentially, this means that the effects of teachers in prior grades are constant across test scales for future grades and subjects. Because this model is based on standard deviations, the interpretations from this model cannot be made on an absolute scale. Instead, an analysis is made using a top or bottom proportion for the teacher effects estimates or identifying the number of teachers a certain number of standard deviation units away from a reference point.

If you are interested in real-life applications of the multivariate model in education, check out the videos for the Education Value-Added Assessment System (EVAAS – 5:00) developed by the Statistical Analysis System (SAS), and the Tennessee Value-Added Assessment System (TVAAS – 12:30; 9:05 = Student Growth Percentile Model).

Conclusion

My goal for this article was to provide you with a quick snapshot of seven different models that may be used for measuring student growth, with the intention of increasing your awareness of how student growth is measured using various models and in what contexts each model would be appropriately used. The article is by no means a comprehensive summary of each model. If you are looking for more information, I strongly encourage you to read the full report. My hope is that you will be able to use this information to identify the ways in which student growth is measured in your school, and to ensure that the growth model your school currently uses is appropriate for the assessment data that your school collects.

Collaboration Column

If you have any feedback or tips/resources related to measuring student growth, please post them in the comments section at the end of the article!

If anything in this article was confusing or unclear please, please, PLEASE let me know by emailing me at Chris.Thompson@k12.nd.us with the subject line “Data Champions” and I will do my best to clarify my unclarity!

ABOUT

South East Education Cooperative

The South East Education Cooperative (SEEC) is one of eight Regional Education Associations (REAs) in North Dakota. Its membership includes 36 public school districts and four private schools in the southeast portion of N.D. Through these members the SEEC serves over 34,800 N.D. students. REAs strive to offer consistent high-quality programs and services in the areas of professional development, technology support, data systems support, school improvement support, and curriculum enrichment that reflect the needs of its region.