Reliability Of Repeated Tests

Will the same tests give consistent results when used repeatedly with the
same subject? In general we may say that they do. The result depends
somewhat, however, on the age and intelligence of the subject and on the
time interval between the examinations.

Goddard proves that feeble-minded individuals whose intelligence has
reached its full development continue to test at exactly the same mental
age by the Binet scale, year after year. In their case, familiarity with
the tests does not in the least improve the responses. At each retesting
the responses given at previous examinations are repeated with only the
most trivial variations. Of 352 feeble-minded children tested at
Vineland, three years in succession, 109 gave absolutely no variation,
232 showed a variation of not more than two fifths of a year, while 22
gained as much as one year in the three tests. The last group, presumably,
were younger children whose intelligence was still developing.

Goddard has also tested 464 public-school children for three successive
years. Approximately half of these showed normal progress or more in
mental age, while most of the remainder showed somewhat less than normal
progress.

Bobertag's retesting of 83 normal children after an interval of
a year gave results entirely in harmony with those of Goddard.
The reapplication of the tests showed absolutely no influence of
familiarity, the correlation of the two tests being almost perfect
(.95). Those who tested "at age" in the first test had advanced, on
the average, exactly one year. Those who tested _plus_ in the first
test advanced in the twelve months about a year and a quarter, as we
should expect those to do whose mental development is accelerated.
Correspondingly, those who tested _minus_ at the first test advanced
only about three fourths of a year in mental age during the
interval.

Our own results with a mixed group of normal, superior, dull and
feeble-minded children agree fully with the above findings. In this case
the two tests were separated by an interval of two to four years, and
the correlation between their results was practically perfect. The
average difference between the I Q obtained in the second test and that
obtained in the first was only 4 per cent, and the greatest difference
found was only 8 per cent.
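
As an illustration only, the three figures reported above (the correlation
between the two tests, the average I Q difference, and the greatest I Q
difference) could be computed as in the following Python sketch. The scores
are invented for the example and are not the data of the study; the function
names pearson_r and iq_differences are likewise hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)


def iq_differences(first, second):
    """Return (average, greatest) absolute difference between paired I Qs."""
    diffs = [abs(b - a) for a, b in zip(first, second)]
    return sum(diffs) / len(diffs), max(diffs)


# Hypothetical I Q values for eight children (NOT the study's data):
# first_iq is the earlier examination, second_iq the retest.
first_iq = [62, 78, 85, 94, 100, 108, 117, 131]
second_iq = [65, 75, 88, 90, 104, 105, 121, 128]

r = pearson_r(first_iq, second_iq)
mean_diff, max_diff = iq_differences(first_iq, second_iq)

print(f"test-retest correlation: {r:.2f}")
print(f"average I Q difference:  {mean_diff:.1f}")
print(f"greatest I Q difference: {max_diff}")
```

A correlation near 1 together with small average and greatest differences is
the pattern the text describes as "practically perfect" agreement.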

The repetition of the test at shorter intervals will perhaps affect the
result somewhat more, but the influence is much less than one might
expect. The writer has tested, at intervals of only a few days to a few
weeks, 14 backward children of 12 to 18 years, and 8 normal children of
5 to 13 years. The backward children showed an average improvement in
the second test of about two months in mental age, the normal children
an average improvement of little more than three months. No child varied
in the second test more than half a year from the mental age first
secured. On the whole, normal children profit more from the experience
of a previous test than do the backward and feeble-minded.

Berry tested 45 normal children and 50 defectives with the Binet 1908
and 1911 scales at brief intervals. Berry does not state which
scale was applied first, but the mental ages secured by the two scales
were practically the same when allowance was made for the slightly
greater difficulty of the 1911 series of tests.

We may conclude, therefore, that while it would probably be desirable
to have one or more additional scales for alternative use in testing the
same children at very brief intervals, the same scale may be used for
repeated tests at intervals of a year or more with little danger of
serious inaccuracy. Moreover, results like those set forth above are
important evidence as to the validity of the test method.