Every time a student takes a test there is a possibility that the raw score obtained (observed score) may be lower or higher than the score the student should actually have received (true score). The difference between the true score and the observed score is called the error score:

S_true = S_observed + S_error

For example, Student A has an observed score of 82. His true score is 88, so his error score is 6. Student B has an observed score of 109. His true score is 107, so his error score is -2.
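As a quick check, here is a minimal Python sketch that applies this relationship to the two students above:

```python
# Error score = true score - observed score, per the formula above.
students = {"A": {"observed": 82, "true": 88},
            "B": {"observed": 109, "true": 107}}

for name, s in students.items():
    error = s["true"] - s["observed"]
    print(f"Student {name}: observed={s['observed']}, true={s['true']}, error={error}")
# Student A: observed=82, true=88, error=6
# Student B: observed=109, true=107, error=-2
```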

If you could add all of the error scores and divide by the number of students, you would have the average amount of error in the test. Unfortunately, the only score we actually have is the observed score (S_observed).

The true score is hypothetical and could only be estimated by having the person take the test many times and averaging the scores; across, say, 100 administrations, the scores would cluster in a range around the true score. This is not a practical way of estimating the amount of error in the test.
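Purely to illustrate the thought experiment, the sketch below simulates 100 administrations of a test, assuming the observed score is the true score plus normally distributed error; the true score of 88 and the error spread are invented values:

```python
import random
import statistics

random.seed(42)

TRUE_SCORE = 88   # hypothetical true score
ERROR_SD = 4.0    # hypothetical spread of the random error

# One hundred simulated administrations: observed = true + random error.
observed = [TRUE_SCORE + random.gauss(0, ERROR_SD) for _ in range(100)]

# Averaging the observed scores gives an estimate of the true score.
print(round(statistics.mean(observed), 1))  # close to 88
```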

Since repeated testing is impractical, the amount of error in a test must be estimated from other statistics. One of these is the standard deviation. The larger the standard deviation, the more variation there is in the scores; the smaller the standard deviation, the more closely the scores are grouped around the mean and the less variation there is.
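For instance, two invented sets of scores with the same mean show how the standard deviation reflects the spread:

```python
import statistics

tight  = [84, 85, 85, 86, 85]    # grouped tightly around the mean of 85
spread = [70, 95, 60, 100, 100]  # same mean of 85, far more variation

print(statistics.mean(tight), round(statistics.stdev(tight), 1))    # 85, 0.7
print(statistics.mean(spread), round(statistics.stdev(spread), 1))  # 85, 18.7
```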

Another such statistic is the reliability of the test. The reliability coefficient (r) indicates the amount of consistency in the test, so subtracting r from 1.00 gives the amount of inconsistency. For example, a test with a reliability of .88 has .88 consistency and therefore .12 inconsistency, or error.

Combining these two statistics in the formula SEM = SD_o x sqrt(1 - r), where SD_o is the standard deviation of the observed scores and r is the reliability coefficient, gives the Standard Error of Measurement (SEM). This provides an estimate of the amount of error in the test from statistics that are readily available for any test.
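The formula translates directly into code. Here is a small sketch using the reliability of .88 from above and an invented standard deviation of 10:

```python
import math

def sem(sd_observed: float, reliability: float) -> float:
    """Standard Error of Measurement: SEM = SD_o * sqrt(1 - r)."""
    return sd_observed * math.sqrt(1 - reliability)

# SD of 10 is an invented value; r = .88 matches the example above.
print(round(sem(10.0, 0.88), 2))  # 3.46
```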

The relationship between these statistics can be seen in the following examples. In the first case there is a low standard deviation (SD_o) and good reliability (.79). In the second case the SD_o is larger, and the result is a higher SEM of 1.18. In the last case the reliability is very low and the SEM is larger still. As the SD_o gets larger, the SEM gets larger; as r gets smaller, the SEM gets larger.
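The same function reproduces this pattern. The standard deviations below are invented; the reliabilities echo the examples (.79 in the first two cases, a very low value in the last):

```python
import math

def sem(sd_observed, reliability):
    return sd_observed * math.sqrt(1 - reliability)

# Illustrative values only: SDs are invented, reliabilities echo the text.
cases = [(1.5, 0.79),   # low SD, good reliability -> small SEM
         (2.6, 0.79),   # larger SD, same r        -> larger SEM
         (2.6, 0.40)]   # very low reliability     -> largest SEM

for sd, r in cases:
    print(f"SD_o={sd}, r={r}, SEM={sem(sd, r):.2f}")
```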

The most common use of the SEM is the construction of confidence intervals. Because the SEM is an estimate of how much error there is in a test, it can be interpreted in the same way as a standard deviation. Sixty-eight percent of the time the true score will lie within plus or minus one SEM of the observed score, so we can be 68% sure that the student's true score falls within +/- one SEM. Within +/- two SEM the true score will be found about 95% of the time. Put another way, if the student took the test 100 times, about 68 times the true score would fall within +/- one SEM.
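As a sketch, a confidence interval is simply the observed score plus or minus a multiple of the SEM; the observed score of 82 and the SEM of 3.46 here are invented values:

```python
# Hypothetical values: observed score of 82 with an SEM of 3.46.
observed, sem = 82, 3.46

ci_68 = (observed - sem, observed + sem)          # ~68% confidence
ci_95 = (observed - 2 * sem, observed + 2 * sem)  # ~95% confidence

print(f"68% CI: {ci_68[0]:.1f} to {ci_68[1]:.1f}")  # 78.5 to 85.5
print(f"95% CI: {ci_95[0]:.1f} to {ci_95[1]:.1f}")  # 75.1 to 88.9
```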

The SEM can be added to and subtracted from a student's observed score to estimate the range in which the student's true score lies. For a given SEM and observed score the confidence interval can be tabulated, as in the sketch below; the most notable pattern is that the larger the SEM, the wider the range of scores in the confidence interval.
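A table of that kind can be generated the same way; the SEM values and the observed score of 100 are invented for illustration:

```python
observed = 100  # invented observed score

# Larger SEMs produce visibly wider confidence intervals.
for sem in (1.0, 2.5, 5.0):
    low, high = observed - sem, observed + sem
    print(f"SEM={sem:4.1f}  68% CI: {low:6.1f} to {high:6.1f}")
# SEM= 1.0  68% CI:   99.0 to  101.0
# SEM= 2.5  68% CI:   97.5 to  102.5
# SEM= 5.0  68% CI:   95.0 to  105.0
```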

While a test as a whole will have a SEM, many tests also report a separate SEM for different parts of the test.