Column One: Research

Lincoln, Neb.--When officials of the Crowley Independent School
District in Texas were looking for a new test to measure the adaptive
behaviors of prospective special-education students, they turned first
to their bookshelves.

There, they found a volume--little known outside the testing
world--that for test-buyers has become as indispensable as Consumer
Reports is for appliance shoppers: the Mental Measurement Yearbook,
published by the Buros Institute of Mental Measurement here.

Just as the consumer publication provides objective information on
the reliability and usefulness of various models of appliances, the
yearbook offers independent scholarly evaluations of new educational
and psychological tests. And it does not mince words when reviewers
think a test is no good.

"The state has a list of approved instruments in special
education," says Edith Heil, administrator for special programs for
the Crowley schools. "Some are more appropriate than others. We
carefully research Buros to make certain we choose the right
instrument."

"It's on the shelf with the dictionary and the encyclopedia as one
of the research tools we use quite often," Ms. Heil adds.

In their 53-year history, the yearbook and the institute have become
fixtures in the measurement community. In addition to the testimony
of school administrators like Ms. Heil, the publication has
earned the respect of test publishers, who generally regard the reviews
as constructive criticism, and of researchers, who have lined up to
write the reviews.

Now, as institute officials are putting together the 11th volume of
the Yearbook, which is expected to be published late this year, they
are also looking for ways to make their products more useful to
consumers.

"People who use tests are singularly uninformed about
psychometrics," says Jane C. Conoley, co-editor of the Yearbook. "The
consumer has a legitimate goal in mind, but right-thinking people can
be led down the garden path. It takes sophisticated psychometrics to
show them the way."

Not a 'Good Housekeeping Seal'

Founded in 1938 by the late Oscar K. Buros, an educational
psychologist who taught at Rutgers University, the Buros Institute of
Mental Measurement has for more than 50 years been casting a critical
eye at commercially available educational and psychological tests.

It moved here to the University of Nebraska when Mr. Buros died in
1978, and his widow, Louella, opened a nationwide competition to find a
permanent home for the institute.

In undertaking the venture, Mr. Buros took the risk of offending the
commercial publishers he relied on for his studies, according to
Barbara S. Plake, the institute's director.

"Publishers said, 'I thought you were my friend,'" she notes.

Since then, she adds, the institute has gained a reputation as
scholarly and sound, to the point where publishers routinely send the
institute copies of new tests on the market.

In fact, she adds, although the test-makers are prohibited by
copyright restrictions from using any of the yearbook's information in
their advertising, they have found that even inclusion in the book helps
sales. But she cautions that, since the institute reviews every new and
substantially revised test published, mere inclusion does not
constitute a "Good Housekeeping seal of approval."

"Customers rely on us," Ms. Plake says. "We really are the principal
source of information on tests. If a test isn't in the Mental
Measurement Yearbook, a publisher would have a hard time selling it,
even if the reviews are not great."

Nancy S. Cole, executive vice president of the Educational Testing
Service, the nation's largest testing firm, says she has no evidence
that the volume affects sales. To do so, it would have to be more
widely used than it is now, she says.

But she agrees that publishers pay attention to the reviews in the
yearbook.

In compiling its publication, the Buros Institute first solicits
from publishers three copies of every commercially available test that
is new or has been substantially revised since the last volume was
published.

The institute then selects two reviewers, usually academic experts,
who will write five-page evaluations of the tests. The publication also
includes all research literature on the tests under review.

Two copies of the tests go to the reviewers; the third stays here in
Lincoln. Over the years, the institute has collected one of the largest
test libraries in the world, Ms. Conoley notes.

The library provided an invaluable resource for one University of
New Orleans researcher, who spent a summer there and conducted a
comprehensive study of the documentation on tests provided by test
publishers, notes Lawrence M. Rudner, director of the ERIC
Clearinghouse on Testing and Measurement.

"That would only be possible with a library like that," Mr. Rudner
says.

The reviewers--who are unpaid--analyze the test and compare it to
the publisher's claims. Sometimes they will field-test the test
themselves to evaluate it, Ms. Conoley says.

Although institute officials give reviewers free rein to consider any aspect of the
tests they review, most pay closest attention to issues of
validity--whether the test measures what it is supposed to measure--and
reliability--whether it is likely to obtain the same result with
different test-takers or with the same test-taker at different times.
But the reviewers also bring their own knowledge and critical skills to
bear, Ms. Plake says.

"If all we wanted was a factual review, we could do that," she says.
"We don't want just the facts. We want an evaluation."

As a result, some of the reviews are quite scathing, Ms. Conoley
notes. One reviewer of a new version of the Wechsler intelligence test
for children--one of the most commonly used such tests anywhere--called
it "an albatross around the neck of applied psychology," she
recalls.

Ms. Cole of the ETS, who wrote some reviews for the Yearbook before
joining the testing firm, also remembers one review she wrote that was
so critical that "the publisher should have pulled the test off the
market."

Perhaps surprisingly, because they have attracted so much criticism
from other quarters, standardized achievement tests generally earn high
marks in the Yearbook, Ms. Conoley notes.

"Several represent the state of the art in item development and test
development," she says.

Ms. Plake acknowledges that standardized multiple-choice tests often
fail to measure all the qualities educators want to know about
students, as critics contend. But the proposed alternatives--so-called
performance-based assessments--have yet to prove themselves valid and
reliable measures of student abilities, she says.

"There is a lot of research needed to be done to assess the
improvement in the quality of information from performance-type
appraisals [compared with] multiple-choice types," she says.

Interpreting the Reviews

In addition to putting together the Yearbook, institute
officials also consult with readers to help interpret the reviews.

For example, Ms. Conoley notes, lawyers often call the institute to
find out if a test referred to in testimony, such as a Rorschach "ink
blot" test, is valid and reliable.

School officials also frequently call for help in selecting new
tests, she notes.

"If they're faced with choosing a new testing series, they really
look for input," she says. "We get calls asking if there is anything
new in the pipeline and to interpret reviews."

To make their products more useful, institute officials have decided
to step up their publication schedule. Unlike during Mr. Buros's day,
when the Yearbook was published every five or six years, officials now
intend to move toward an annual schedule. The 10th edition was
published in 1989; the 11th is expected to be published this year.

In addition, the institute has placed its products on a computerized
data base, so that readers with access to a bibliographic retrieval
service can read them sooner.

"As soon as a pair of reviews is processed, they are immediately
available," Ms. Conoley says. "Readers don't have to wait for the
yearly publication."

Ms. Plake notes that the institute may also publish materials
to help teachers and school officials understand testing issues. Last
year, for example, in conjunction with the ERIC clearinghouse, the
institute published such a book, entitled Understanding Standardized
Tests.

In general, says Ms. Plake, who is president of the National Council
on Measurement in Education, psychometricians need to do a better job
of explaining tests to a broader audience.

"The measurement community has done a wonderful job of refining
their science," she says. "They do things effectively. But they have
done a horrible job of communicating what they do."

"If we're going to have a consumer base that understands testing,"
she says, "we have to be part of the training."
