SAT I: A Faulty Instrument For Predicting College Success

Promotional claims for the SAT I frequently tout the test's important place in the "toolbox" of college admissions officers trying to distinguish between students from vastly different high schools. Yet the true utility of the SAT I is often lost in this rhetoric as admissions offices search for a fair and accurate way to compare one student to another. Many colleges and universities around the country, in dropping their test score requirements, have recently confirmed what the research has shown all along - the SAT I has little value in predicting future college performance.

What is the SAT I supposed to measure?

The SAT I is designed to predict first-year college grades - it has not been validated to predict grades beyond the freshman year, graduation rates, or pursuit of a graduate degree, nor for placement or advising purposes. Moreover, according to research done by the test's manufacturers, class rank and high school grades are each still better predictors of college performance than the SAT I.

How well does the SAT I predict first-year college grades?

The College Board and ETS conduct periodic studies of the SAT I. These usually involve examining the relationship between test scores and first-year college grades, generally expressed as the correlation coefficient (or r value). The College Board's Handbook for the SAT Program 2000-2001 claims the SAT-V and SAT-M have correlations of .47 and .48, respectively, with freshman GPA (FGPA). These numbers are deceptive, however. To determine how much of the difference in first-year grades between students the SAT I really predicts, the correlation coefficient must be multiplied by itself. The result, called r squared, is the share of the variation among college freshman grades that the predictor accounts for. Thus, the predictive ability (or r squared) of the SAT I is just about .22, meaning the test explains only about 22% of the variation in freshman grades. With a correlation of .54, high school grades alone do a better job, explaining almost 30% of the variance in first-year college performance.
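The squaring step described above can be checked with a few lines of Python. This is only an arithmetic sketch using the correlations quoted in the text; the function name `variance_explained` is an illustrative choice, not something taken from the College Board's materials.

```python
# Correlations with freshman GPA (FGPA) as quoted in the text above.
correlations = {
    "SAT-V": 0.47,
    "SAT-M": 0.48,
    "High school grades": 0.54,
}


def variance_explained(r: float) -> float:
    """Return r squared: the share of variance a single predictor accounts for."""
    return r * r


for name, r in correlations.items():
    print(f"{name}: r = {r:.2f}, r squared = {variance_explained(r):.1%}")
```

Run as written, this reports roughly 22% for SAT-V, 23% for SAT-M, and 29% for high school grades, matching the figures in the paragraph above.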

What do the SAT I validity studies from major colleges and universities show?

Validity research at individual institutions illustrates the weak predictive ability of the SAT. One study (J. Baron & M. F. Norman in Educational and Psychological Measurement, Vol. 52, 1992) at the University of Pennsylvania looked at the power of high school class rank, SAT I, and SAT II in predicting cumulative college GPAs. Researchers found that the SAT I was the weakest of the three predictors, explaining only 4% of the variation in college grades, while SAT II scores accounted for 6.8% of the differences in academic performance. The most useful tool proved to be class rank, which predicted 9.3% of the variation in cumulative GPAs. Combining SAT I scores and class rank inched this figure up to 11.3%, leaving almost 90% of the variation in grades unexplained.

Another study of 10,000 students at 11 selective public and private institutions of higher education found that a 100-point increase in combined SAT scores, holding race, gender, and field of study constant, was associated with a gain of one-tenth of a grade point in college GPA (Vars, F. & Bowen, W. in The Black-White Test Score Gap, 1998). This offered about the same predictive value as looking at whether an applicant's father had a graduate degree or her mother had completed college.

After a three-year validity study analyzing the power of the SAT I, SAT II, and high school grades to predict success at the state's eight public universities, University of California (UC) President Richard Atkinson presented a proposal in February 2001 to drop the SAT I requirement for UC applicants. The results from the UC validity study, which tracked 80,000 students from 1996 to 1999, highlighted the weak predictive power of the SAT I, with the test accounting for only 12.8% of the variation in FGPA. SAT II scores and high school GPA (HSGPA) explained 15.3% and 14.5% of the variation, respectively. After taking SAT II and HSGPA into account, SAT I scores improved the share of variation explained by a negligible 0.1 percentage points (from 21.0% to 21.1%), making it a virtually worthless additional piece of information. Furthermore, SAT I scores proved to be more susceptible to the influence of the socioeconomic status of an applicant than either the SAT II or HSGPA.
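The incremental comparison in the UC figures can be laid out as simple arithmetic. The sketch below uses only the percentages reported above; the variable and function names are illustrative assumptions, and no attempt is made to reproduce the study's underlying regression.

```python
# Shares of FGPA variance explained, as reported for the UC study above.
r2_single = {"SAT I": 0.128, "SAT II": 0.153, "HSGPA": 0.145}
r2_without_sat1 = 0.210  # SAT II and HSGPA together
r2_with_sat1 = 0.211     # SAT II, HSGPA, and SAT I together


def incremental_gain(r2_base: float, r2_full: float) -> float:
    """Extra share of variance explained when a predictor is added."""
    return r2_full - r2_base


gain = incremental_gain(r2_without_sat1, r2_with_sat1)
print(f"Added by SAT I: {gain * 100:.1f} percentage points")

# The single-predictor shares add up to far more than the combined figure,
# which is what one would expect when the predictors overlap heavily.
print(f"Sum of single-predictor shares: {sum(r2_single.values()):.1%}")
```

The second printed line is a reminder that the three measures largely capture the same information, which is why adding the SAT I on top of the other two contributes so little.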

Bates College, which dropped all pre-admission testing requirements in 1990, first conducted several studies to determine the most powerful variables for predicting success at the college. One study showed that students' self-evaluation of their "energy and initiative" added more to the ability to predict performance at Bates than did either Math or Verbal SAT scores. In comparing five years of enrollees who submitted SAT scores with those who didn't, Bates found that while "non-submitters" averaged 160 points lower on the SAT I, their freshman GPA was only five one-hundredths of a point lower than that of "submitters."

How well does the SAT I predict success beyond the freshman year?

If one looks beyond college grades, information from The Case Against the SAT by James Crouse and Dale Trusheim actually points to the SAT I's poor utility in forecasting long-term success. Data they analyzed demonstrated that using the high school record alone to predict who would complete a bachelor's degree resulted in "correct" admissions decisions 73.4% of the time, while using the SAT I and high school GPA forecast "correct" admissions in 72.2% of the cases.

How well does the SAT I predict college achievement for females, students of color, and older students?

The poor predictive ability of the SAT I becomes particularly apparent when considering the college performance of females. Longstanding gaps in scores between males and females of all races show that females on average score 35-40 points lower than males on the SAT I, but receive better high school and college grades. In other words, the test consistently under-predicts the performance of females in college while over-predicting that of males.

Measuring the SAT I's predictive ability for students of color is more complicated since racial classifications are arbitrary. For students whose first language isn't English, test-maker research shows the SAT I frequently under-predicts their future college performance. One study at the University of Miami compared Hispanic and non-Hispanic White students. Though both groups earned equivalent college grades, the Hispanic students received on average combined SAT I scores that were 91 points lower than their non-Hispanic White peers. This gap existed despite the fact that 89% of the Hispanic students reported English as their best language.

Extensive research compiled by Derek Bok and William Bowen in The Shape of the River highlights the SAT I's questionable predictive power for African-American students. The ability of SAT I scores to predict freshman grades, undergraduate class rank, college graduation rates, and attainment of a graduate degree is weaker for African-American students than for Whites. Such discrepancies call into question the value of using the SAT I to assess African-American students' potential.

The SAT I also does a poor job of forecasting the future college performance of older students. ETS acknowledges that the test's predictive power is lower for "non-traditional" students who may be out of practice taking timed, multiple-choice exams. For this reason, many colleges and universities do not require applicants who have been out of high school for five years or more, or those over age 25, to submit test scores.

How should a college or university go about conducting its own validity study?

Through its Validity Study Service (VSS), the College Board analyzes the SAT I's predictive power for colleges and universities. In a 1991 Harvard Educational Review article, James Crouse and Dale Trusheim propose a major overhaul of the College Board's traditional validity studies. Claiming that the VSS's methods "are significantly flawed, and this leads colleges to misleadingly positive conclusions" about the SAT I, Crouse and Trusheim argue for the inclusion of two categories aimed at measuring the real-world ways standardized tests are used.

The first addition - a "Crosstabulation of Predicted Grades" - would show the extent to which a college would admit the same students regardless of whether it uses the high school record alone or the high school record plus test scores. The second - a table predicting "College Outcomes" - would show to what degree adding SAT I scores to the high school record improves the rate of college graduation, the average high school grades of admitted students, and the percentage of admitted students with first-year college grade-point averages above 2.5. They also recommend including separate prediction and tabulation tables broken down by gender, ethnicity, family income, age, and whatever other criteria colleges believe would affect performance. In addition, schools should look at how well the SAT I predicts other outcomes such as graduation rates and four-year grades.
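The idea behind the crosstabulation can be illustrated with a small sketch. It is a simplified stand-in, not Crouse and Trusheim's actual table layout: it assumes a college admits the applicants with the highest predicted first-year grades, and it counts how many applicants receive the same decision under each prediction rule. The applicant IDs, predicted grades, and function names are all hypothetical.

```python
from collections import Counter


def admit_top(predicted, n_admit):
    """Return the IDs of the n_admit applicants with the highest predicted grades."""
    ranked = sorted(predicted, key=predicted.get, reverse=True)
    return set(ranked[:n_admit])


def crosstab_admissions(pred_hs, pred_hs_sat, n_admit):
    """Count applicants by (admitted using HS record, admitted using HS record + SAT)."""
    admit_a = admit_top(pred_hs, n_admit)
    admit_b = admit_top(pred_hs_sat, n_admit)
    table = Counter()
    for applicant in pred_hs:
        table[(applicant in admit_a, applicant in admit_b)] += 1
    return table


# Hypothetical predicted first-year GPAs for six made-up applicants.
pred_hs = {"A": 3.4, "B": 3.1, "C": 2.9, "D": 2.7, "E": 2.4, "F": 2.2}
pred_hs_sat = {"A": 3.5, "B": 3.0, "C": 2.6, "D": 2.8, "E": 2.3, "F": 2.1}

table = crosstab_admissions(pred_hs, pred_hs_sat, n_admit=3)
same = table[(True, True)] + table[(False, False)]
print(f"Same decision under both rules: {same} of {len(pred_hs)} applicants")
```

In this made-up example, four of the six applicants get the same decision either way, and only one admission slot changes hands. A real study would run the same comparison on an institution's own applicant records.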

What's the alternative?

The weak predictive power of the SAT I, its susceptibility to coaching, examples of test score misuse, and the negative impact test score use has on educational equity all lead to the same conclusion - test scores should be optional in college admissions. The nearly 400 colleges and universities that already admit substantial numbers of freshman applicants without regard to test scores have shown that class rank, high school grades, and rigor of classes taken are better tools for predicting college success than any standardized test. The ACT and SAT II are often viewed as alternatives to the SAT I. While they are more closely aligned with high school curricula, they are not necessarily better tests.