
Grade inflation-3: How do we independently measure learning?

Recall (see here and here for previous postings) that to argue that grade inflation has occurred, it is not sufficient to simply show that grades have risen. It must be shown grades have risen without a corresponding increase in learning and student achievement. And that is difficult to do because there are really no good independent measures of student learning, apart from grades.

Some have argued that the SAT scores of matriculating classes could be used as a measure of student ‘ability’ and could thus be used to see if universities are getting ‘better’ students, thus justifying the rise in grades.

But the use of SAT scores as a measure of student quality or abilities has always been deeply problematic, so it is not even clear that any rise in SAT scores of incoming students means anything. One reason is that the students who take the SAT tests are a self-selected group and not a random sample, so one cannot infer much from changes in SAT scores. Second, SAT scores have not been shown to be predictive of anything really useful. There is a mild correlation of SAT scores with first year college grades but that is about it.

Even at Case, not all matriculating students have taken the SAT. The average total SAT score was 1271 for 1985-1992 and 1321 for 1993-2005. This rise in the SAT scores of incoming students at Case would be affected by two factors, the first being the re-centering of SAT scores that occurred in 1995. It is not known whether the pre-1995 scores we have at Case are the original ones or have been raised to adjust for re-centering. This lack of knowledge makes it hard to draw conclusions about how much, if at all, SAT scores have risen at Case.

Alfie Kohn cites “Trends in College Admissions” reports that say that the average verbal-SAT score of students enrolled in all private colleges rose from 543 in 1985 to 558 in 1999. The second factor is that around 1991 Case instituted merit scholarships based on SAT scores and started aggressively marketing them as a recruiting tool. So it is tempting to argue that there has been a genuine rise in SAT scores for students at Case.

Another local factor at Case that would influence GPAs is the practice of “freshman forgiveness” that began in 1987. Under this program, students in their first year would be “forgiven” any F grades they received and this F would not be counted towards their GPA. This is bound to have the effect of increasing the overall GPA, although a very rough estimate suggests only a 1-2% increase. This practice was terminated in 2005.
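To see why dropping a few F grades moves the overall average only slightly, here is a minimal sketch of the arithmetic. The grade counts are invented purely for illustration and are not Case data, and the function names (gpa, forgiveness_effect) are hypothetical; the sketch only assumes a 4.0 scale on which an F counts as 0.0.

```python
def gpa(grade_points):
    """Mean grade points on a 4.0 scale."""
    return sum(grade_points) / len(grade_points)


def forgiveness_effect(first_year_grades, later_grades):
    """Compare GPA with and without first-year F grades counted.

    Returns (gpa_counting_all_grades, gpa_with_first_year_Fs_dropped,
    percent_increase). An F is assumed to be worth 0.0 grade points.
    """
    baseline = gpa(first_year_grades + later_grades)
    # "Forgiveness": first-year F grades are left out of the average.
    kept_first_year = [g for g in first_year_grades if g > 0.0]
    adjusted = gpa(kept_first_year + later_grades)
    percent_increase = 100.0 * (adjusted - baseline) / baseline
    return baseline, adjusted, percent_increase


# Hypothetical cohort: 100 first-year grades (2 of them F), 300 later grades.
first_year = [0.0] * 2 + [3.0] * 98
later = [3.0] * 300

baseline, adjusted, pct = forgiveness_effect(first_year, later)
print(f"all grades counted: {baseline:.3f}")
print(f"first-year Fs forgiven: {adjusted:.3f}")
print(f"increase: {pct:.2f}%")
```

The size of the effect depends entirely on how many first-year grades are Fs and on how large the first year looms in the overall record, so different assumed inputs will give different percentages.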

The Rosovsky-Hartley monograph points to the fact that many more students in colleges are now enrolled in remedial courses than was the case in the past, arguing that this implies that students are actually worse now. But again, that inference is not clear. Over the recent past there has been a definite shift in emphasis, with colleges now wanting to retain the students they recruit. The old model, in which colleges recruited more students than they needed and then ‘weeded’ them out using certain first-year courses, is no longer in vogue (assuming that the model ever had substance and is not just folklore).

Now universities go to great lengths to provide assistance to their students, beefing up their advising, tutoring, and other programs to help students stay in school. So the increased enrollment of students in remedial courses may simply be the consequence of universities taking a much more proactive attitude to helping students, rather than a sign of declining student quality. All these measures are aimed at improving student performance and are another possible benign explanation for any rise in grades. In fact, all these remedial and assistance programs could be used to argue that a rise in grades could be due to actual improved student performance.

Alfie Kohn argues that, taking all these things into account, there is no evidence for grade inflation, and that the issue has been blown way out of proportion by those who have a very narrow concept of the role of grades in learning. Kohn says there are many reasons why grades could rise:

Maybe students are turning in better assignments. Maybe instructors used to be too stingy with their marks and have become more reasonable. Maybe the concept of assessment itself has evolved, so that today it is more a means for allowing students to demonstrate what they know rather than for sorting them or “catching them out.” (The real question, then, is why we spent so many years trying to make good students look bad.) Maybe students aren’t forced to take as many courses outside their primary areas of interest in which they didn’t fare as well. Maybe struggling students are now able to withdraw from a course before a poor grade appears on their transcripts. (Say what you will about that practice, it challenges the hypothesis that the grades students receive in the courses they complete are inflated.)

The bottom line: No one has ever demonstrated that students today get A’s for the same work that used to receive B’s or C’s. We simply do not have the data to support such a claim.

In addition to the factors listed by Kohn, psychologist Steve Falkenberg points out a number of other reasons why average grades could rise. His essay is a particularly thoughtful one that is worth reading.

Part of the problem in judging whether grade inflation exists is that we don’t know what the actual grade distribution in colleges should be. Those who argue that it should be a bell curve (or ‘normal’ distribution) with an average around C are mixing up a normative approach to assessment (as is used for IQ tests and SATs) with an achievement approach.

IQ tests and SATs are designed so that the results are spread out over a bell curve. They seek to measure a characteristic (called “intelligence”) that is supposedly distributed randomly in the population according to a normal distribution. (This assumption and the whole issue of what constitutes intelligence is the source of a huge controversy that I don’t want to get into here.) So the goal of such tests is to sort students into a hierarchy, and they design tests that spread out the scores so that one can tell who is in the top 10% and so on.

But when you teach a class of students, you are no longer dealing with a random sample of the population. First of all, you are not giving your assessments to people off the street. The students have been selected based on their prior achievements and are no longer a random sampling of the population. Secondly, by teaching them, you are deliberately intervening and skewing the distribution. Thirdly, your tests should not be measuring the same random variable that things like the SATs measure. If they were, you might as well give your students their grades based on those tests.

Tests should not be measures of some intrinsic ability, even assuming that such a thing exists and can be measured and a number assigned to it. Tests are (or at least should be) measuring achievement: how much and how well a selected group of students has learned as a result of your instruction. Hence there is no reason at all to expect a normal distribution. In fact, you would expect to have a distribution that is skewed towards the high end. The problem, if it can be considered a problem, is that we don’t know a priori what that skewed distribution should look like or whether there is a preferred distribution at all. After all, there is nothing intrinsically wrong with everyone in a class getting As, if they have all learned the material at a suitably high level.

In fact, as Ohmer Milton, Howard Pollio, and James Eison write in Making Sense of College Grades (Jossey-Bass, 1986): “It is not a symbol of rigor to have grades fall into a ‘normal’ distribution; rather, it is a symbol of failure — failure to teach well, failure to test well, and failure to have any influence at all on the intellectual lives of students.”
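The difference between the normative approach and the achievement approach discussed above can be made concrete with a small sketch. Everything specific here is an assumption for illustration: the cutoffs (90 for an A, and so on), the quota percentages for the curve, the function names, and the class scores are all hypothetical. The point is only that the same set of scores produces very different grades under the two approaches.

```python
def achievement_grades(scores, cutoffs=((90, "A"), (80, "B"), (70, "C"), (60, "D"))):
    """Achievement approach: each score is compared with fixed cutoffs."""
    grades = []
    for score in scores:
        for minimum, letter in cutoffs:
            if score >= minimum:
                grades.append(letter)
                break
        else:
            # No cutoff was reached.
            grades.append("F")
    return grades


def normative_grades(scores, quotas=(("A", 10), ("B", 20), ("C", 40), ("D", 20), ("F", 10))):
    """Normative approach: rank the class and hand out letters by quota.

    quotas holds (letter, percent of class) pairs, best letter first.
    """
    n = len(scores)
    # Indices of students from highest score to lowest.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    grades = [quotas[-1][0]] * n  # anyone left over gets the last letter
    start = 0
    cumulative = 0
    for letter, percent in quotas:
        cumulative += percent
        end = cumulative * n // 100
        for i in ranked[start:end]:
            grades[i] = letter
        start = end
    return grades


# Hypothetical class in which every student scored 90 or above.
scores = [98, 97, 96, 95, 95, 94, 93, 92, 91, 90]
print(achievement_grades(scores))
print(normative_grades(scores))
```

Under the fixed cutoffs every student in this invented class earns an A, while the quota version forces the same ten students across the whole range from A to F, even though the lowest score is only eight points below the highest.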

There is nothing intrinsically noble about trying to keep average grades unchanged over the years, which is what those who complain about grade inflation usually want to do.

On the other hand, one could make the reasonable case that as we get better at teaching and in creating the conditions that make students learn better, and as a consequence we get students who are able to learn more, then perhaps we should raise our expectations of students and provide more challenging assignments, so that they can rise to greater heights. This is a completely different discussion. If we do so, this might result in a drop in grades. But this drop is a byproduct of a thoughtful decision to make learning better, not caused by an arbitrary decision to keep average grades fixed.

This approach would be like car manufacturers and consumers raising their standards over the years so that we now expect a lot more from our cars than we did fifty years ago. Even the best cars of fifty years ago would not be able to meet the current standards of fuel efficiency, safety, and emissions. But the important thing to keep in mind is that standards have been raised along with the ability to make better cars able to meet the higher standards.

But taking this approach in education requires teachers to think carefully about what and how we assess, what we can reasonably expect of our students, and how we should teach so they can learn more and learn better. Unfortunately much of the discussion of grade inflation short-circuits this worthwhile aspect of the issue, choosing instead to go for a quick fix like putting limits on the number of grades awarded in each category.

It is perhaps worthwhile to remember that fears about grade inflation, that high grades are being given for poor quality work, have been around for a long time, especially at elite institutions. The Report of the Committee on Raising the Standard at Harvard University said: “Grades A and B are sometimes given too readily — Grade A for work of no very high merit, and Grade B for work not far above mediocrity. … One of the chief obstacles to raising the standards of the degree is the readiness with which insincere students gain passable grades by sham work.”

That statement was made in 1894.

POST SCRIPT: Cindy Sheehan in Cleveland tomorrow

Cindy Sheehan will speak at a Cleveland Town Hall Meeting Saturday, March 25, 1-3 pm

Progressive Democrats of Ohio present Gold Star Mother and PDA Board Member Cindy Sheehan at a Town Hall Meeting on Saturday, March 25, 2006 from 1 – 3 p.m. at the Beachland Ballroom, 15711 Waterloo Road in Cleveland’s North Collinwood neighborhood.