Are Student Ratings Unfair to Women? by Neal Koblitz, University of Washington

In the March-April [1990] issue of the AWM Newsletter, I
asked for information on whether or not student ratings tend to discriminate
against women. The purpose of this article is to report briefly on the response
to my query.

I was extremely pleased to receive a
large number of quite varied responses. Some people wrote their general
impressions and described their personal experiences. Others generously sent me
reprints of papers on the subject, or gave me advice on where to look for more
material. To my surprise, it turns out that quite a lot has been written on
this question, but not in journals which mathematicians normally read (see the
bibliography below).

I will not attempt a systematic survey of the research and
opinions on the subject. For this the reader is referred to the short list of
references below, which includes the papers which I
found to be the most interesting (more extensive lists of papers can be found
in their bibliographies). Rather, I will summarize my own conclusions based on
the material that was sent to me.

A few of the letters I received and some of the early studies
indicate that often women receive equal or higher student rating numbers than
men. In many situations students perceive (probably correctly) that the women
instructors tend to be more sensitive to their needs, more concerned and
caring, and more dedicated to teaching than the male instructors (it also helps
if the woman is thought to be lenient), and as a result reward them with
higher ratings. This causes some people to conclude that there is little or no
discrimination against women in student ratings.

However, a more careful examination of the question shows that the
reality is more complex. Note that the traits listed in the last paragraph
which may lead to high ratings for women are compatible with sex-stereotyped
expectations of women as mother figures. According to
Kierstead et al. [6], Taken as a whole, [our]
results suggest that if female instructors want to obtain high student ratings,
they must be not only highly competent with regard to factors directly related
to teaching but also careful to act in accordance with traditional sex role
expectations. In particular, male and female instructors will earn equal
student ratings for equal professional work only if the women also display
stereotypically feminine behavior.

Thus, the difficulty for women would tend to occur in cases where
instructors have to adopt a "get-tough" approach. Such a situation is
much more likely to arise in a math department than, for example, in psychology
or sociology, because (1) mathematics departments typically are called upon to
perform the role of enforcer of academic standards, with service courses acting
as a "weeding out" device for the engineering and science
departments, and (2) the discrepancy between students' high school
preparation and study habits and the demands of college work is especially
glaring in mathematics.

If an instructor feels compelled to put students under pressure
(assigning a lot of homework, giving challenging exams), then only the most
serious and mature students are at all likely to respond with high ratings at
the end of the course. Most students are inclined to punish the
instructor. There is considerable evidence that the punishment is
more severe if the instructor is female.

"[According to] Susan Kay's classroom studies, male
students were far more likely to give lower ratings to those female faculty
perceived to be hard graders. This finding is consistent with a
series of experiments at the University of Dayton that indicated that college
students of both sexes judged female authority figures who engaged in punitive
behavior more harshly than they judged punitive [males]." ([8], pp. 484-485)

Bennett, in particular, found that women will be rated highly only
if they are especially accessible to the students and spend a lot of time with
them, while men can receive equally high ratings while remaining more aloof. In
other words, students tend to allow men but not women to spend most of their
time on research and other non-teaching activities without penalizing them in
the ratings: "male instructors are judged independently of
students' personal experiences of contact and access, whereas female
instructors are judged far more closely in this regard. In this sense women are
negatively evaluated when they fail to meet this gender appropriate
expectation" ([3], pp. 177-178).

One of the most interesting studies was made in the 1970s by
Ellyn Kaschak [5]. Fifty male and fifty female students were
given a set of descriptions of the teaching methods and practices of professors
in various specialties. In the forms received by half of the students (25 males
and 25 females) the professors were given names of the opposite gender from the
professors in the forms received by the other half of the students. Kaschak
found that the male students were biased against women, while the female
students were not.

The possibility of sex discrimination is one complex and
controversial aspect of the broader question of the validity of student ratings
as a measure of teaching effectiveness. It would take us too far afield to
discuss some of the other problems identified in the many studies that have
been conducted. But it is worth noting that math
departments are usually put at a special disadvantage if administrators and
faculty in other departments have excessive confidence in the meaning of
student rating numbers and in the value of cross-department comparisons. A
larger proportion of our students take courses as requirements rather than
electives and view the subject as difficult. This tends to bring down math
department ratings across the board and leads to an unjustified belief on
campus that the math department has worse teachers than other departments.

People outside of the mathematical sciences often have a naive
faith in the value of numbers and are less aware than we are of the pitfalls in
taking raw statistics at face value.

"[S]tudent rating scales are a form of measurement and,
according to American Psychological Association standards, should be
accompanied by information about the meaning, interpretation, and limitations
of the scores ... yet most student ratings are not accompanied by such
information; [in fact,] promotion and tenure decisions are usually made by an
array of administrators and faculty committees who are naive about the standard
criteria for measurement instruments, and hence do not know how to interpret
the results or do not realize their limitations." ([9], p. 88)

In practice, the treatment of student ratings by college
administrations varies considerably. On the one hand, McMaster University
(Hamilton, Ontario) is among the institutions that have conducted careful
studies of the validity of student ratings and seem to have adopted a cautious
and sophisticated approach to the subject. At the other extreme, I received
letters from two different women in the mathematical sciences at a university
in western Canada, complaining bitterly of the unfair and cynical way that
administrators at their university are using student ratings as a weapon
against the faculty, especially the female faculty.

And at the University of Arizona, the director of an office of
Instructional Research and Development circulated a tract
[1] to faculty members purporting to correct certain
"myths" held by sceptics. Myth 7 is: "Gender of the
student and the instructor affect[s] student ratings." The article
proceeds to refute this myth by means of a highly selective and
distorted citing of the literature. Of course, someone in the math department
at the University of Arizona is not likely to be aware of the numerous studies
that give convincing support to "Myth 7" (none of which are mentioned in
[1]), and so could easily be taken in by the self-serving
and intellectually dishonest propaganda.

Some Conclusions

Student ratings can provide valuable feedback to the
instructor her/himself, but they cannot be properly understood by someone who
is not familiar with the nature of the course being rated, the characteristics
of the students, and the pedagogical objectives of the instructor.

On the student rating forms, questions which are very specific
(e.g., promptness in correcting exams, availability for
office hours) are less likely to invite biased responses than questions
of a general nature ("rate the instructor overall").

In certain teaching situations which are frequently
encountered in math departments (especially in introductory-level courses),
students tend to discriminate against women instructors on the rating forms.

Math departments and administrators have an ethical and legal
obligation not to base promotion and salary decisions on data which are biased
against women.