Assessment

“Why do we have such trouble telling faculty what they are going to do?” said the self-identified administrator, hastening to add that he “still thinks of himself as part of the faculty.”

“They are our employees, after all. They should be doing what we tell them to do.”

Across a vast number of models for assessment, strategic planning, and student services on display at last month’s IUPUI Assessment Institute, it was disturbingly clear that assessment professionals have identified “The Faculty” (beyond the lip service to #notallfaculty, always as a collective body) as the chief obstacle to successful implementation of campuswide assessment of student learning. Faculty are recalcitrant. They are resistant to change for the sake of being resistant to change. They don’t care about student learning, only about protecting their jobs. They don’t understand the importance of assessment. They need to be guided toward the Gospel with incentives and, if those fail, consequences.

Certainly, one can find faculty members of whom these are true; every organization has those people who do just enough to keep from getting fired. But let me, at risk of offending the choir to whom keynote speaker Ralph Wolff preached, suggest that the faculty-as-enemy trope may well be a problem of the assessment field’s own making. There is a blindness to the organizational and substantive implications of assessment, hidden behind the belief that assessment is nothing more than collecting, analyzing, and acting rationally on information about student learning and faculty effectiveness.

Assessment is not neutral. In thinking of assessment as an effort to determine whether students are learning and faculty are being effective, it is imperative that we unpack the implicit subject doing the determining. That should make clear that assessment is first and foremost a management rather than a pedagogical practice. Assessment not reported to the administration meets the requirements of neither campus assessment processes nor accreditation standards, and is thus indistinguishable from non-assessment. As a fundamental principle of governance in higher education, assessment is designed to promote what social scientist James Scott has called “legibility”: the ability of outsiders to understand and compare conditions across very different areas in order to facilitate those outsiders’ capacity to manage.

The Northwest Commission on Colleges and Universities, for example, requires schools to practice “ongoing systematic collection and analysis of meaningful, assessable, and verifiable data” to demonstrate mission fulfillment. That is not simply demanding that schools make informed judgments. Data must be assessable and verifiable so that evaluators can examine the extent to which programs revise their practices using the assessment data. They can’t do that unless the data make sense to them. Administrators make the same demand on their departments through campus assessment processes. In the process a hierarchical, instrumentally rational, and externally oriented management model replaces one that has traditionally been decentralized, value rational, and peer-driven.

That’s a big shift in power. There are good (and bad) arguments to be made in favor of (and opposed to) it, and ways of managing assessment that shift that power more or less than others. Assessment professionals are naïve, however, to think that those shifts don’t happen, and fools to think that the people on the losing end of them will not notice or simply give in without objection.

At the same time, assessment also imposes substantive demands on programs through its demand that they “close the loop” and adapt their curriculums to those legible results regardless of how meaningful those results are to the programs themselves. An externally valid standard might demand significant changes to the curriculum that move the program away from its vision.

In my former department we used the ETS Major Field Test as such a standard. But while the MFT tests knowledge of political science as a whole, in political science competence is specific to subfields. Even at the undergraduate level students specialize sufficiently to be, for example, fully conversant in international relations and ignorant of political thought. The overall MFT score does not distinguish between competent specialization and broad mediocrity. One solution was to expect that students would demonstrate excellence in at least one subfield of the discipline. The curriculum would then have to require that students took nearly every course we offered in a subfield, and staffing realities in our program would inevitably make that field American politics.

Because the MFT was legible to a retired Air Force officer (the institutional effectiveness director), an English professor (the dean), a chemist (the provost), and a political appointee with no previous experience in higher education (the president), it stayed in place as a benchmark of progress, but offered little to guide program management. The main tool we settled on was an assessment of the research paper produced in a required junior-level research methods course (that nearly all students put off to their final semester). That assessment gave a common basis for evaluation (knowledge of quantitative research methods) and allowed faculty to evaluate substantive knowledge in a very narrow range of content through the literature review. But it also shifted emphasis toward quantitative work in the discipline, and further marginalized political thought altogether since that subfield isn’t based on empirical methods. We considered adding a political thought assignment, but that would have required students to prioritize that over the empirical fields (no other substantive field having a required assignment) rather than putting it on equal footing.

Evaluating a program with “meaningful, assessable, and verifiable data” can’t be done without changing the program. To “close the loop” based on MFT results required a substantive change in how we saw our mission: from producing well-rounded students to specialists in American politics. To do so with the methods paper required changes in course syllabuses and advising to bring more emphasis on empirical fields, more quantitative rather than qualitative work within those fields, more emphasis on methods supporting conclusions rather than the substance of the conclusions, and less coursework in political thought. We had a choice between these options. But we could not choose an option that would not require change in response to the standard, not just the results.

This is the reality facing those, like the administrator I quoted at the beginning of this essay, who believe that they can tell faculty what to do with assessment without telling them what to do with the curriculum. If assessment requires that a program make changes based on the results of its assessment processes, then the selection of processes defines a domain of curricular changes that can result. Some of these will be unavoidable: a multiple-choice test will require faculty to favor knowledge transmission over synthetic thinking. Others will be completely proscribed: if employment in the subfield of specialization is an assessment measure, the curriculum in political thought will never be reinforced, because people don’t work in political thought. But no process can be neutral among all possible curriculums.

Again, that may or may not be a bad thing. Sometimes a curriculum just doesn’t work, and assessment can be a way to identify it and replace it with something that does. But the substantive influence of assessment is most certainly a thing one way or the other, and that thing means that assessment professionals can’t say that assessment doesn’t change what faculty teach and how they teach it. When they tell faculty members that, they appear at best clueless and at worst disingenuous. With most faculty members having oversensitive BS detectors to begin with, especially when dealing with administrators, piling higher and deeper doesn’t exactly win friends and influence people.

The blindness that comes from belief in organizationally and curricularly neutral assessment is, I think, at the heart of the condescending attitudes toward faculty at the Assessment Institute. In the day two plenary session, one audience member asked, essentially, “What do we do about them?” as if there were no faculty members in the room. The faculty member next to me was quick to tune out as the panel took up the discussion with the usual platitudes about buy-in and caring about learning.

Throughout the conference there was plenty of discussion of why faculty members don’t “get it.” Of how to get them to buy into assessment on the institutional effectiveness office’s terms. Of providing effective incentives — carrots, yes, but plenty of sticks — to get them to cooperate. Of how to explain the importance of accreditation to them, as if they are unaware of even the basics. And of faculty paranoia that assessment was a means for the administration to come for their jobs.

What there wasn’t: discussion of what the faculty’s concerns with assessment actually are. Of how assessment processes do in fact influence what happens in classrooms. Of how assessment feeds program review, thus influencing administrative decisions about program closure and the allocation of tenure lines (especially of the conversion of tenure lines to adjunct positions when vacancies occur). Of the possibility that assessment might have unintended consequences that hinder student learning. These are very real concerns for faculty members, and should be for assessment professionals as well.

Nor was there discussion of what assessment professionals can do to work with faculty in a relationship that doesn’t subordinate faculty. Of how assessment professionals can build genuinely collaborative rather than merely cooptive relationships with faculty members. Of, more than anything, the virtues of listening before telling. When it comes to these things, it is the assessment field that doesn’t “get it.”

Let me assure you, as a former faculty member who talks about these issues with current ones: faculty members do care about whether students learn. In fact, many lose sleep over it. Faculty members informally assess their teaching techniques every time they leave a classroom and adjust what they do accordingly. In fact, that usually happens before they walk back into that classroom, not at the end of a two-year assessment cycle. Faculty members most certainly feel disrespected by suggestions they only care for themselves. In fact, it is downright offensive to suggest that they are selfish when in order to make learning happen they frequently make less than their graduates do and live in the places their graduates talk of escaping.

Assessment professionals need to approach faculty members as equal partners rather than as counterrevolutionaries in need of reeducation. That’s common courtesy, to be sure. But it is also essential if assessment is to actually improve student learning.

Calls for scorecards and rating systems of higher education institutions that have been floating around Washington, if used for purposes beyond providing comparable consumer information, would make the federal government an arbiter of quality and judge of institutional performance.

This change would undermine the comprehensive, careful scrutiny currently provided by regional accrediting agencies and shift the focus to cursory reviews.

Regional accreditors provide a peer-review process that sparks an investigation into the key challenges institutions face, looking beyond symptoms for root causes. They force all providers of postsecondary education to investigate closely every aspect of performance that is crucial to strengthening institutional excellence, improvement, and innovation. If you want to know how well a university is really performing, a graduation rate will only tell you so much.

But the peer-review process conducted by accrediting bodies provides a view into the vital systems of the institution: the quality of instruction, the availability and effectiveness of student support, how the institution is led and governed, its financial management, and how it uses data.

Moreover, as part of the peer-review process, accrediting bodies mobilize teams of expert volunteers to study governance and performance measures that encourage institutions to make significant changes. No government agency can replace this work, can provide the same level of careful review, or has the resources to mobilize such an expert group of volunteers. In fact, the federal government has long recognized its own limitations and, since 1952, has used accreditation by a federally recognized accrediting agency as a baseline for institutional eligibility for Title IV financial-aid programs.

Attacked at times by policy makers as an irrelevant anachronism and by institutions as a series of bureaucratic hoops through which they must jump, the regional accreditors’ approach to quality control has instead become increasingly cost-effective, transparent, and data- and outcomes-oriented.

Higher education accreditors work collaboratively with institutions to develop mutually agreed-upon common standards for quality in programs, degrees, and majors. In fact, in the Southern region, accreditation has addressed public and policy maker interests in gauging what students gain from their academic experience by requiring, since the 1980s, the assessment of student learning outcomes in colleges. Accreditation agencies also have established effective approaches to ensure that students who attend institutions achieve desired outcomes for all academic programs, not just a particular major.

While the federal government has the authority to take actions against institutions that have proven deficient, it has not used this authority regularly or consistently. A letter to Congress from the American Council on Education and 39 other organizations underscored the inability of the U.S. Department of Education to act with dispatch, noting that last year the Department announced “it would levy fines on institutions for alleged violations that occurred in 1995 -- nearly two decades prior.”

By contrast, consider that in the past decade, the Southern Association of Colleges and Schools Commission on Colleges stripped nine institutions of their accreditation status and applied hundreds of sanctions to all types of institutions (from online providers to flagship campuses) in its region alone. But, when accreditors have acted boldly in recent times, they have been criticized by politicians for going too far, giving accreditors the sense that we’re “damned if we do, damned if we don’t.”

The Problem With Simple Scores

Our concern about using rating systems and scorecards for accountability is based on several factors. Beyond tilting the system toward the lowest common denominator of quality, rating approaches can create new opportunities for institutions to game the system (as with U.S. News & World Report ratings and rankings) and introduce unintended consequences as we have seen occur in K-12 education.

Over the past decade, the focus on a few narrow measures for the nation’s public schools has not led to significant achievement gains or closing achievement gaps. Instead, it has narrowed the curriculum and spurred the current public backlash against overtesting. Sadly, the data generated from this effort have provided little actionable information to help schools and states improve, but have actually masked -- not illuminated -- the root causes of problems within K-12 institutions.

Accreditors recognize that the complex nature of higher education requires that neither accreditors nor the government should dictate how individual institutions can meet desired outcomes. No single bright line measure of accountability is appropriate for the vast diversity of institutions in the field, each with its own unique mission. The fact that students often enter and leave the system and increasingly earn credits from multiple institutions further complicates measures of accountability.

Moreover, setting minimal standards will not push institutions that think they are high performing to get better. All institutions – even those considered “elite” – need to work continually to achieve better outcomes and should have a role in identifying key outcomes and strategies for improvement that meet their specific challenges.

Accreditors also have demonstrated they are capable of addressing new challenges without strong government action. With the explosion of online providers, accreditors found a solution to address the challenges of quality control for these programs. Accrediting groups partnered with state agencies, institutions, national higher education organizations, and other stakeholders to form the State Authorization Reciprocity Agreements, which use existing regional higher education compacts to allow for participating states and institutions to operate under common, nationwide standards and procedures for regulating postsecondary distance education. This approach provides a more uniform and less costly regulatory environment for institutions, more focused oversight responsibilities for states, and better resolution of complaints without heavy-handed federal involvement.

Along with taking strong stands to sanction higher education institutions that do not meet high standards, regional accreditors are better equipped than any centralized governmental body at the state or national level to respond to the changing ecology of higher education and the explosion of online providers.

We argue for serious -- not checklist -- approaches to accountability that support improving institutional performance over time and hold institutions of all stripes to a broad array of criteria that make them better, not simply more compliant.

Belle S. Wheelan is president of the Southern Association of Colleges and Schools Commission on Colleges, the regional accrediting body for 11 states and Latin America. Mark A. Elgart is founding president and chief executive officer for AdvancED, the world’s largest accrediting body and parent organization for three regional K-12 accreditors.

"Competency-based” education appears to be this year’s answer to America’s higher education challenges, judging from this week's news in Washington. Unlike MOOCs (last year’s solution), there is, refreshingly, greater emphasis on the validation of learning. Yet, all may not be as represented.

On close examination, one might ask if competency-based education (or CBE) programs are really about “competency,” or are they concerned with something else? Perhaps what is being measured is more closely akin to subject matter “mastery.” The latter can be determined in a relatively straightforward manner, using various forms of examinations, projects and other forms of assessment.

However, an understanding of theories, concepts and terms tells us little about an individual’s ability to apply any of these in practice, let alone doing so with the skill and proficiency which would be associated with competence.

Deeming someone competent, in a professional sense, is a task that few competency-based education programs address. While doing an excellent job, in many instances, of determining mastery of a body of knowledge, most fall short in the assessment of true competence.

In the course of their own education, readers can undoubtedly recall the instructors who had complete command of their subjects, but who could not effectively present to their students. The mastery of content did not extend to their being competent as teachers. Other examples might include the much-in-demand marketing professors who did not know how, in practice, to sell their executive education programs. Just as leadership and management differ one from the other, so too do mastery and competence.

My institution has been involved in assessing both mastery and competence for several decades. Created by New York’s Board of Regents in the early 1970s, it is heir to the Regents’ century-old belief in the importance of measuring educational attainment (New York secondary students have been taking Regents Exams, as a requirement for high school graduation, since 1878).

Building on its legacy, the college now offers more than 60 subject matter exams. These have been developed with the help of nationally known subject matter experts and a staff of doctorally prepared psychometricians. New exams are field tested, nationally normed and reviewed for credit by the American Council on Education, which also reviews the assessments of ETS (DSST) and the College Board (CLEP). Such exams are routinely used for assessing subject matter mastery.

In the case of the institution’s competency-based associate degree in nursing, a comprehensive, hands-on assessment of clinical competence is required as a condition of graduation. This evaluation, created with the help of the W.K. Kellogg Foundation in 1975, takes place over three days in an actual hospital, with real patients, from across the life span -- pediatric to geriatric. Performance is closely monitored by multiple, carefully selected and trained nurse educators. Students must demonstrate skill and ability to a level of defined competence within three attempts or face dismissal or transfer from the program.

In developing a competency-based program as opposed to a mastery-based one, there are many challenges that must be addressed if the program is to have credibility. These include:

Who specifies the elements to be addressed in a competency determination? In the case of nursing, this is done by the profession. Other fields may not be so fortunate. For instance, who would determine the key areas of competency in the humanities or arts?

Who does the assessing, and what criteria must be met to be seen as a qualified assessor of someone’s competency?

How will competence be assessed, and is the process scalable? In the nursing example above, we have had to establish a national network of hospitals, as well as recruit, train and field a corps of graduate-prepared nurse educators. At scale, this infrastructure is limited to approximately 2,000 competency assessments per year, which is far less than the number taking the College’s computer-based mastery examinations.

Who is to be served by the growing number of CBE programs? Are they returning adults who have been in the workplace long enough to acquire relevant skills and knowledge on the job, or is CBE thought to be relevant even for traditional-aged students?

(It is difficult to imagine many 22-year-olds as competent within a field or profession. Yet, there is little question that most could show some level of mastery of a body of knowledge for which they have prepared.)

Do prospective students want this type of learning/validation? Has there been market research that supports the belief that there is demand? We have offered two mastery-based bachelor’s degrees (each for less than $10,000) since 2011. Demand has been modest because of uncertainty about how a degree earned in such a manner might be viewed by employers and graduate schools (this despite the fact that British educators have offered such a model for centuries).

Will employers and graduate schools embrace those with credentials earned in a CBE program? Institutions that have varied from the norm (dropping the use of grades, assessing skills vs. time in class) have seen their graduates face admissions challenges when attempting to build on their undergraduate credentials by applying to graduate schools. As for employers, a backlash may be expected if academic institutions sell their graduates as “competent” and later performance makes clear that they are not.

The interest in CBE has, in large part, been driven by the fact that employers no longer see new college graduates as job-ready. In fact, a recent Lumina Foundation report found that only 11 percent of employers believe that recent graduates have the skills needed to succeed within their work forces. One CBE educator has noted, "We are stopping one step short of delivering qualified job applicants if we send them off having 'mastered' content, but not demonstrating competencies."

Or, as another put it, somewhat more succinctly, "I don't give a damn what they KNOW. I want to know what they can DO."

The move away from basing academic credit on seat time is to be applauded. Determining levels of mastery through various forms of assessment -- exams, papers, projects, demonstrations, etc. -- is certainly a valid way to measure outcomes. However, seat time has rarely been the sole basis for a grade or credit. The measurement tools listed here have been found in the classroom for decades, if not centuries.

Is this a case of old wine in new bottles? Perhaps not. What we now see are programs being approved for Title IV financial aid on the basis of validated learning, not for a specified number of instructional hours; whether the process results in a determination of competence or mastery is secondary, but not unimportant.

A focus on learning independent of time, while welcome, is not the only consideration here. We also need to be more precise in our terminology. The appropriateness of the word competency is questioned when there is no assessment of the use of the learning achieved through a CBE program. Western Governors University, Southern New Hampshire, and Excelsior offer programs that do assess true competency.

Unfortunately, the vast majority of the newly created CBE programs do not. This conflation of terms needs to be addressed if employers are to see value in what is being sold. A program whose determination of “competency” does not include an assessment of one’s ability to apply theories and concepts cannot be considered “competency-based.”

To continue to use “competency” when we mean “mastery” may seem like a small thing. Yet, if we of the academy cannot be more precise in our use of language, we stand to further the distrust which many already have of us. To say that we mean “A” when in fact we mean “B” is to call into question whether we actually know what we are doing.

John F. Ebersole is the president of Excelsior College, in Albany, N.Y.

In their effort to improve outcomes, colleges and universities are becoming more sophisticated in how they analyze student data – a promising development. But too often they focus their analytics muscle on predicting which students will fail, and then allocate all of their support resources to those students.

That’s a mistake. Colleges should instead broaden their approach to determine which support services will work best with particular groups of students. In other words, they should go beyond predicting failure to predicting which actions are most likely to lead to success.

Higher education institutions are awash in the resources needed for sophisticated analysis of student success issues. They have talented research professionals, mountains of data and robust methodologies and tools. Unfortunately, most resource-constrained institutional research (IR) departments are focused on supporting accreditation and external reporting requirements.

Some institutions have started turning their analytics resources inward to address operational and student performance issues, but the question remains: Are they asking the right questions?

Colleges spend hundreds of millions of dollars on services designed to enhance student success. When making allocation decisions, the typical approach is to identify the 20 to 30 percent of students who are most “at risk” of dropping out and throw as many support resources at them as possible. This approach involves a number of troubling assumptions:

The most “at risk” students are the most likely to be affected by a particular form of support.

Every form of support has a positive impact on every “at risk” student.

Students outside this group do not require or deserve support.

What we have found over 14 years working with students and institutions across the country is that:

There are students whose success you can positively affect at every point along the risk distribution.

Different forms of support impact different students in different ways.

The ideal allocation of support resources varies by institution (or more to the point, by the students and situations within the institution).

Another problem with a risk-focused approach is that when students are labeled “at risk” and support resources are directed to them on that basis, asking for or accepting help becomes seen as a sign of weakness. When tailored support is provided to all students, even the most disadvantaged are better off. The difference is a mindset of “success creation” versus “failure prevention.” Colleges must provide support without stigma.

To better understand impact analysis, consider Eric Siegel’s book Predictive Analytics. In it, he talks about the Obama 2012 campaign’s use of microtargeting to cost-effectively identify groups of swing voters who could be moved to vote for Obama by a specific outreach technique (or intervention), such as a piece of direct mail or a knock on their door -- the “persuadable” voters. The approach involved assessing what proportion of people in a particular group (e.g., high-income suburban moms with certain behavioral characteristics) was most likely to:

vote for Obama if they received the intervention (positive impact subgroup)

vote for Obama or Romney irrespective of the intervention (no impact subgroup)

vote for Romney if they received the intervention (negative impact subgroup)

The campaign then leveraged this analysis to focus that particular intervention on the first subgroup.

This same technique can be applied in higher education by identifying which students are most likely to respond favorably to a particular form of support, which will be unmoved by it and which will be negatively impacted and drop out.
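To make the three subgroups concrete, here is a minimal sketch in Python. It assumes an imagined randomized study in which some students in each group were offered one form of support and others were not. The group names, the counts, the 5-point margin and the helper functions are all hypothetical, invented purely for illustration. A real analysis would need far larger samples and proper statistical testing.

```python
from collections import defaultdict


def cohort(group, supported, persisted_count, total):
    """Build hypothetical student records: (group, supported, persisted)."""
    return [(group, supported, i < persisted_count) for i in range(total)]


# Made-up outcomes from an imagined randomized study of one support service.
records = (
    cohort("group_a", True, 8, 10) + cohort("group_a", False, 5, 10)
    + cohort("group_b", True, 7, 10) + cohort("group_b", False, 7, 10)
    + cohort("group_c", True, 4, 10) + cohort("group_c", False, 6, 10)
)


def impact_by_group(records):
    """Persistence rate with support minus persistence rate without it."""
    # tallies[group][supported] = [number persisted, number of students]
    tallies = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for group, supported, persisted in records:
        tallies[group][supported][0] += int(persisted)
        tallies[group][supported][1] += 1
    impact = {}
    for group, arms in tallies.items():
        (with_yes, with_n), (without_yes, without_n) = arms[True], arms[False]
        if with_n and without_n:  # both arms are needed to estimate impact
            impact[group] = with_yes / with_n - without_yes / without_n
    return impact


def label(impact, margin=0.05):
    """Sort groups into positive, no, or negative impact subgroups."""
    labels = {}
    for group, delta in impact.items():
        if delta > margin:
            labels[group] = "positive impact"
        elif delta < -margin:
            labels[group] = "negative impact"
        else:
            labels[group] = "no impact"
    return labels


impact = impact_by_group(records)
labels = label(impact)
for group in sorted(labels):
    print(group, round(impact[group], 2), labels[group])
```

With these invented numbers, the first group lands in the positive impact subgroup, the second in the no impact subgroup and the third in the negative impact subgroup. The comparison depends on having both a supported and an unsupported arm for each group, which is why impact modeling requires controlled studies rather than simple tracking of who completed.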

Of course, impact modeling is much more difficult than risk modeling. Nonetheless, if our goal is to get more students to graduate, it’s where we need to focus analytics efforts.

The biggest challenge with this analysis is that it requires large, controlled studies involving multiple forms of intervention. The need for large controlled studies is one of the key reasons why institutional researchers focus on risk modeling. It is easy to track which students completed their programs and which did not. So, as long as the characteristics of incoming students aren’t changing much, risk modeling is rather simple.

However, once you’ve assessed a student’s risk, you’re still left trying to answer the question, “Now what do I do about it?” This is why impact modeling is so essential. It gives researchers and institutions guidance on allocating the resources that are appropriate for each student.

There is tremendous analytical capacity in higher education, but we are currently directing it toward the wrong goal. While it’s wonderful to know which students are most likely to struggle in college, it is more important to know what we can do to help more students succeed.

Dave Jarrat is a member of the leadership team at InsideTrack, where he directs marketing, research and industry relations activities.

Congratulations on your MVP award at the NBA Celebrity All-Star game: 20 points, 8 boards, 3 assists and a steal -- you really filled up that stat sheet. Even the NBA guys were amazed at your ability to play at such a high level -- still. Those hours on the White House court are paying off!

Like you, I spent some time playing overseas after college and have long been a consumer of basketball box scores -- they tell you so much about a game. I especially like the fact that the typical box score counts assists, rebounds and steals — not just points. I have spent many hours happily devouring box scores, mostly in an effort to defend my favorite players (who were rarely the top scorers).

As coaches of young players, my wife Michele and I (she is the real player in the family) expanded the typical box score: we counted everything in the regular box score, then added “good passes,” “defensive stops,” “loose ball dives” and anything else we could figure out a way to measure. This was all part of an effort to describe for our young charges the “right way” to play the game. I think you will agree that “points scored” rarely tells the full story of a player’s worth to the team.

Mr. Secretary, I think the basketball metaphor is instructive when we “measure” higher education, which is a task that has taken up a lot of your time lately. If you look at all the higher education “success” measures as a basketball box score instead of a golf-type scorecard, it helps clarify two central flaws.

First, exclusivity. Almost every single higher education scorecard fails to account for the efforts of more than half of the students actually engaged in “higher” education.

At Mount Aloysius College, we love our Division III brand of Mountie basketball, but we don’t have any illusions about what would happen if we went up against those five freshman phenoms from Division I Kentucky (or UConn/Notre Dame on the women’s side) -- especially if someone decided that half our points wouldn’t even get counted in the box score.

You see, the databases for all the current higher education scorecards focus exclusively on what the evaluators call “first-time four-year bachelor’s-degree-seeking students.” Nothing wrong with these FTFYBDs, Mr. Secretary, except that they represent fewer than half of all students in college, yet are the only students the scorecards actually “count.”

None of the following “players” show up in the box score when graduation rates are tabulated:

Players who are non-starters (that is, they aren’t FTFYBDs) — even if they play every minute of the last three quarters, score the most points and graduate on time. These are students who transfer (usually to save money, sometimes to take care of family), spring enrollees (increasingly popular), part-time students and mature students (who usually work full-time while going to school).

Any player on the team, even a starter, who has transferred in from another school. If you didn’t start at the school from which you graduated, then you don’t “count,” even if you graduate first in your class!

Any player, even if she is the best player on the team, who switches positions during the game: Think two-year degree students who switch to a four-year program, or four-year degree students who instead complete a two-year degree (usually because they have to start working).

Any player who is going to play for only two years. This covers every single student in a community college, as well as graduates who earn a registered-nurse degree in two years and go right to work at a hospital (even if they later complete a four-year bachelor’s degree, they still don’t count).

Any scoring by any player that occurs in overtime: Think mature and second-career students who never intended to graduate on the typical schedule because they are working full time and raising a family.

The message sent by today’s flawed college scorecards is unavoidable: These hard-working students don’t count.

Mr. Secretary, I know that you understand how essential two-year degrees are to our economy; that students who need to transfer for family, health or economic reasons are just as valuable as FTFYBDs; and that nontraditional students are now the rule, not the exception. But current evaluation methods almost universally lag behind readily available data and are out of sync with the real lives of many students who simply don’t have the economic luxury of a fully financed four-year college degree. All five types of students listed above just don’t show up anywhere in the box score.

“Scorecards” should look more like box scores and include total graduation rates for both two- and four-year graduates (the current IPEDS overall grad rate), all transfer-in students (it looks like IPEDS may begin to track these), as well as transfer-out students who complete degrees (current National Student Clearinghouse numbers). These changes would provide a more accurate result for the student success rate at all institutions.
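As a rough illustration of the difference, the sketch below computes a success rate twice: once counting only the first-time, full-time cohort, and once counting every group in the "box score." The `Cohort` class, the group labels, and all the counts are hypothetical. None of the numbers come from IPEDS or the Clearinghouse, and the groups are treated as non-overlapping for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Cohort:
    """Hypothetical counts for one group of students at one institution."""
    label: str
    entered: int
    completed: int  # finished any credential, here or elsewhere


def success_rate(cohorts):
    """Share of all counted students who completed a credential."""
    entered = sum(c.entered for c in cohorts)
    completed = sum(c.completed for c in cohorts)
    return completed / entered if entered else 0.0


# Illustrative numbers only.
first_time_full_time = Cohort("first-time, full-time", 400, 200)
everyone_else = [
    Cohort("transfer-in", 250, 150),
    Cohort("part-time", 200, 80),
    Cohort("transfer-out", 150, 90),
]
all_students = [first_time_full_time] + everyone_else

narrow = success_rate([first_time_full_time])  # what scorecards count today
box_score = success_rate(all_students)         # every student counted
counted_share = first_time_full_time.entered / sum(c.entered for c in all_students)
print(narrow, box_score, counted_share)
```

In this made-up example the narrow measure sees only 40 percent of the students, and the two rates differ because the 600 students outside the first-time, full-time cohort finally show up in the tally.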

Another relatively easy fix would be to break out cohort comparisons that would allow Scorecard users to see how institutions perform when compared to others with a similar profile (as in the Carnegie Classifications).

The second issue is fairness.

Current measurement systems make no effort to account for the difference between (in basketball terms) Division I and Division III, between “highly selective schools” that “select” from the top echelons of college “recruits” and those schools that work best with students who are the first in their families to go to college, or low-income, or simply less prepared (“You can’t coach height,” we used to say).

As much as you might love the way Wisconsin-Whitewater won this year’s Division III national championship (last-second shot), I don’t think even the most fervent Warhawks fan has any doubt about how they would fare against Coach Bo Ryan’s Division-I Wisconsin Badgers. The Badgers are just taller, faster, stronger — and that’s why they’re in Division I and why they made it to the Final Four.

The bottom line on fairness is that graduation rates track closely with family income, parental education, Pell Grant eligibility and other obvious socioeconomic indicators. These data are consistent over time and truly incontrovertible.

Mr. Secretary, I know that you understand in a personal way how essential it is that any measuring system be fair. And I know you already are working on this problem, on a “degree of difficulty” measure, much like the hospital “acuity index” in use in the health care industry.

The classification system that your team is working on right now could assign a coefficient that weighs these measurable mitigating factors when posting outcomes. Let us hope that your team succeeds, so that schools can be scored more fairly. Such a coefficient would also help identify those institutions that are doing the best job of serving the students with the fewest advantages.

In the health care industry, patients are assigned “acuity levels” (based on a risk-adjustment methodology), numbers that reflect a patient’s condition upon admission to a facility. The intent of this classification system is to consider all mitigating factors when measuring outcomes and thus to provide consumers accurate information when comparing providers. A similar model could be adopted for measuring higher education outcomes.

This would allow consideration of factors such as (1) Pell eligibility rates, (2) income relative to poverty rates, (3) the percentage of students who are first-generation-to-college and (4) SAT scores. A coefficient that factors in these “challenges” could best measure higher education outcomes. Such “degree of difficulty” factors, like “acuity levels,” would provide consumers accurate information for purposes of comparison.
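One way such a coefficient might work, sketched under loud assumptions: estimate the graduation rate a school would be expected to post given its student profile, then score the school by how its actual rate compares. The baseline, the weights, and both school profiles below are invented for illustration and reflect no real methodology. Only two of the four factors are modeled to keep the sketch short.

```python
def expected_grad_rate(pell_share, first_gen_share,
                       baseline=0.80, pell_weight=0.30, first_gen_weight=0.20):
    """Expected rate after docking a baseline for each challenge factor.

    The baseline and weights are hypothetical placeholders, not estimates.
    """
    expected = (baseline
                - pell_weight * pell_share
                - first_gen_weight * first_gen_share)
    return max(expected, 0.05)  # floor keeps the ratio below well-defined


def difficulty_adjusted_score(actual_rate, pell_share, first_gen_share):
    """Actual rate relative to expectation; above 1.0 beats the expectation."""
    return actual_rate / expected_grad_rate(pell_share, first_gen_share)


# Two made-up schools: a selective one and an access-oriented one.
selective = difficulty_adjusted_score(0.75, pell_share=0.10, first_gen_share=0.10)
access = difficulty_adjusted_score(0.57, pell_share=0.60, first_gen_share=0.50)
print(round(selective, 3), round(access, 3))
```

On raw graduation rates the selective school wins, 75 percent to 57 percent. Once the degree of difficulty is factored in, the access-oriented school comes out ahead, which is the kind of reversal an acuity-style adjustment is meant to reveal.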

Absent such a calculation, colleges will continue to have every incentive to “cream” their admissions, and every disincentive against serving the students you have said are central to our economic future, including two-year, low-income and minority students. That’s the “court” that schools like Mount Aloysius and 16 other Mercy colleges play on. We love our FTFYBDs, but we work just as hard on behalf of the more than 50 percent of our students whose circumstances require a less traditional but no less worthy route to graduation. We think they count, too.