A Brief History of Model II Regression Analysis

Karl Pearson [1] was the "first" to address the problem of fitting a line when both the X and Y variables are subject to measurement error. He called his solution the "major axis" of the data ellipse; it describes how X and Y co-vary.

Kermack and Haldane [2] later showed that the major axis is not uniquely determined: when the units of the X and Y variables are changed, the slope and intercept vary even after correcting for the new scales. They proposed the use of a "reduced major axis," in which both X and Y are first converted to standardized variables (mean = 0, standard deviation = 1).
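Kermack and Haldane's objection is easy to demonstrate numerically. The sketch below (the data and the rescaling factor are illustrative, not from any of the cited papers) computes the major-axis slope from the usual sums of squares and cross-products, then re-expresses X in different units; the reduced major axis rescales exactly, the major axis does not:

```python
import math

def axes_slopes(x, y):
    """Return (major-axis slope, reduced-major-axis slope) through the centroid."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    # Major axis: slope of the first principal axis of the covariance ellipse.
    ma = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    # Reduced major axis: slope is sign(r) * (sy / sx).
    rma = math.copysign(math.sqrt(syy / sxx), sxy)
    return ma, rma

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 3.0, 5.0, 6.0]

ma1, rma1 = axes_slopes(x, y)
# Re-express X in different units (e.g. cm instead of m): x -> 100 * x.
ma2, rma2 = axes_slopes([100.0 * xi for xi in x], y)

# Converting the rescaled slopes back to the original units (multiply by 100),
# the RMA slope is recovered exactly; the major-axis slope is not.
print(rma1, 100 * rma2)  # identical
print(ma1, 100 * ma2)    # differ
```

This is exactly the arbitrariness Kermack and Haldane removed by standardizing the variables first.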

York [3] developed a method of weighting the data points in both X and Y for those cases where one wants to find the major axis but the uncertainties of the two measurements differ. He called his method the "least squares cubic" because it requires the solution of a cubic equation to find the slope of the regression, not because it fits a cubic curve.

Ricker [4] showed that the geometric mean regression is identical to the reduced major axis but far easier to compute.
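Ricker's point can be seen directly: the geometric mean of the two model-I slopes (Y-on-X, and the reciprocal of X-on-Y) collapses algebraically to sign(r) * sy/sx, the reduced-major-axis slope, with no eigen-decomposition or standardization needed. A minimal sketch (the helper name and test data are illustrative):

```python
import math

def gm_slope(x, y):
    """Geometric mean regression slope, computed from the two model-I slopes."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b_yx = sxy / sxx   # model-I slope of Y on X
    b_xy = sxy / syy   # model-I slope of X on Y
    # Geometric mean of b_yx and 1/b_xy; algebraically this is sign(r) * sy/sx.
    return math.copysign(math.sqrt(b_yx / b_xy), sxy)

slope = gm_slope([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
print(slope)  # 2.0 for this perfectly linear data
```

The fitted line passes through the centroid, so the intercept is simply mean(y) - slope * mean(x).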

Jolicoeur [5] took great exception to some of Ricker's comments, most notably:

he pointed out that "l'axe majeur des variables réduites" is more accurately translated as standard major axis than as reduced major axis, the literal translation from the French that Kermack and Haldane [2] had used;

that formulae for the asymmetrical confidence limits for the slope of the geometric mean regression already existed;

and that, because the slope of the standard major axis is insensitive to the strength of the relationship, the bivariate structural relationship was to be preferred.

In his reply, Ricker [6] took great pains to address each of Jolicoeur's complaints:

he presented the formulae for the asymmetrical confidence limits for the slope of the geometric mean regression and agreed that they provide better limits than his approximate symmetrical ones. Because of their computational complexity, I have not included them here.

Subsequently, Sprent and Dolby [7] took exception to the ad hoc use of the geometric mean regression in model II cases. They argued that an equally strong case can be made for the line that bisects the minor angle between the two model I regressions, Y-on-X and X-on-Y (call this line the least squares bisector). While the difference in slope between the geometric mean regression and the least squares bisector is small and probably not statistically significant, this new "regression" line is included here for the sake of completeness.
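How close the two lines are is easy to check. The sketch below (illustrative data; it assumes a positive correlation, so the minor angle is bisected by simply averaging the two line angles) computes both slopes from the same sums of squares:

```python
import math

def model_ii_slopes(x, y):
    """Return (geometric mean slope, least-squares-bisector slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx   # Y-on-X model-I slope
    b2 = syy / sxy   # X-on-Y model-I regression, as a slope in the Y-vs-X plane
    # Bisector of the (minor) angle between the two model-I lines.
    bisector = math.tan((math.atan(b1) + math.atan(b2)) / 2)
    # Geometric mean (= reduced major axis) slope for comparison.
    gm = math.copysign(math.sqrt(syy / sxx), sxy)
    return gm, bisector

gm, bis = model_ii_slopes([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 5.0, 6.0])
print(gm, bis)  # the two slopes agree to several decimal places
```

Both lines pass through the centroid and both lie between the two model-I slopes; for data of ordinary scatter the difference is in the third or fourth decimal place, consistent with Sprent and Dolby's remark.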

Laws and Archie [8] presented a very illustrative (biological) example of the pitfalls of using a model-I regression when a model-II regression is required.

Sokal and Rohlf's [9] textbook "Biometry" discusses the issues of model-I vs. model-II regression in great detail.

Recently, Laws [10] has written a book collecting the mathematical and statistical methods commonly used by oceanographers. It includes an extensive chapter presenting various aspects of model-II regression techniques.

And, finally, York et al. [11] have derived unified equations for the slope, intercept and standard errors of the best straight line for model-II cases showing that the least-squares estimation (LSE) and maximum likelihood estimation (MLE) methods yield identical results. Furthermore, they show that all known correct regression solutions in the literature can be derived from the original York equations [3].

References

[1] Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. Phil. Mag. 2(6): 559-572.