Rosner's Outlier Test

Rosner's test for multiple outliers is used by VSP to detect up to 10
outliers among the selected data values. This test will detect outliers
that are either much smaller or much larger than the rest of the data.
Rosner's approach is designed to avoid the problem of masking, where an
outlier that is close in value to another outlier can go undetected.

Rosner's test is appropriate only when the data, excluding the suspected
outliers, are approximately normally distributed, and when the sample
size is greater than or equal to 25.

Data should not be excluded from analysis solely on the basis of the
results of this or any other statistical test. If any values are flagged
as possible outliers, further investigation is recommended to determine
whether there is a plausible explanation that justifies removing or replacing
them.

Performing Rosner's Test

The \( n \) observed values are ordered from smallest to largest. We
specify the maximum number of suspected outliers \( k \), where \( k \)
is between 1 and 10. We then calculate a series of \( k \) test statistics.
At each step, the statistic is computed for the datum (large or small) that
is farthest from the mean of the remaining data, that datum is removed, and
the calculation is repeated on the reduced data set:

$$ \large R_{i+1} = \frac{|x^{(i)} - \bar x^{(i)}|}{s^{(i)}} $$

where, for \( i = 0, 1, \dotsc, k-1 \), \( \bar x^{(i)} \) is the sample
mean and \( s^{(i)} \) is the standard deviation of the data after the
\( i \) most extreme observations have been removed, and \( x^{(i)} \) is
the observation in that subset of the data that is farthest from
\( \bar x^{(i)} \).
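
As an illustration, the statistics defined above can be computed with a
short Python sketch. The function name `rosner_statistics` and its return
format are hypothetical and are not part of VSP. The sketch assumes that
\( s^{(i)} \) is the sample standard deviation with an \( n - i - 1 \)
denominator (`statistics.stdev`); some references divide by \( n - i \)
instead, in which case `statistics.pstdev` would be the substitute.

```python
from statistics import mean, stdev


def rosner_statistics(values, k):
    """Return [(R_1, removed_1), ..., (R_k, removed_k)].

    Hypothetical helper for illustration; not VSP's implementation.
    """
    if not 1 <= k <= 10:
        raise ValueError("k must be between 1 and 10")
    remaining = sorted(values)
    if len(remaining) < k + 1:
        raise ValueError("need at least k + 1 values")
    results = []
    for _ in range(k):
        center = mean(remaining)
        spread = stdev(remaining)  # assumption: n - i - 1 denominator
        if spread == 0:
            raise ValueError("remaining values are all identical")
        # Observation farthest from the current mean (large or small).
        extreme = max(remaining, key=lambda x: abs(x - center))
        results.append((abs(extreme - center) / spread, extreme))
        # Drop it before computing the next statistic.
        remaining.remove(extreme)
    return results
```

Each tuple pairs a statistic \( R_{i+1} \) with the value removed at that
step, so the flagged observations can be reported once the number of
outliers has been decided.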

Once all of the test statistics \( R_1, \dotsc, R_k \) are computed, a
series of hypothesis tests is performed. We first test the hypothesis
that there are \( k \) outliers by comparing \( R_k \) to the critical
value \( \lambda_k \), obtained from a table (Table A-4, EPA) for the
specified significance level \( \alpha \).

If \( R_k > \lambda_k \), then the test is significant: we reject the
null hypothesis that there are no outliers in the data and conclude that
the \( k \) most extreme values are outliers.

If \( R_k \leq \lambda_k \), we move on to test the hypothesis that
there are \( k-1 \) outliers by comparing \( R_{k-1} \) to the critical
value \( \lambda_{k-1} \). This process continues until one of the tests
is significant, in which case we conclude that the number of outliers
equals the index of that test, or until all \( k \) tests have been
performed. If none of the tests is significant, we conclude that there
are no outliers in the data.
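
The backward sequence of comparisons can be sketched in Python as follows.
The function name `count_outliers` is hypothetical. The critical values are
assumed to be supplied by the caller, for example looked up in the table for
the chosen sample size and \( \alpha \); the sketch does not compute them.

```python
def count_outliers(r_stats, critical_values):
    """Return the number of outliers indicated by the sequential tests.

    r_stats[i - 1] holds R_i and critical_values[i - 1] holds lambda_i.
    Hypothetical helper for illustration; not VSP's implementation.
    """
    if len(r_stats) != len(critical_values):
        raise ValueError("need one critical value per test statistic")
    # Test i = k first, then k - 1, and so on down to 1.
    for i in range(len(r_stats), 0, -1):
        if r_stats[i - 1] > critical_values[i - 1]:
            return i  # the i most extreme values are flagged
    return 0  # no test was significant


# Placeholder numbers for illustration only; not taken from Table A-4.
r = [2.9, 3.4, 2.1]
lam = [3.0, 3.0, 2.9]
print(count_outliers(r, lam))  # 2
```

In this made-up example \( R_1 \) does not exceed its critical value, yet two
outliers are indicated because \( R_2 \) does. Starting from \( k \) and
working downward is what lets the procedure flag outliers that would
otherwise mask one another.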