More than 30 years ago, studying for my
doctorate in human behavioral genetics, I came to the
same conclusions (in retrospect, Weiss, 1986) as Velden
for exactly the same reasons. The biometrical doctrine seeks to measure all the
variation in a character and to partition the differences observed into
fractions (variances, heritabilities) ascribable to the effects of genetic and environmental phenomena. The biometrical paradigm asserts that continuous variation implies the determination of characters by many genes with small effects. However, the usual notion of small, additive effects of many genes in polygenic inheritance is violated by the hierarchical nature of biochemical conversions in metabolic pathways and will eventually be superseded by Mendelian molecular genetics (Weiss, 1995, 2000).

From 1974, in the former East Germany, I was involved in basic research for the selection of top athletes. Selective breeding was not an aim, but in personnel selection we had the task of long-range prediction for different specialities. The problem was especially difficult when we had to select children at the age of 5 or 10, for example, and to predict their final stature, arm length, or achievement in the long jump or endurance run (Weiss, 1979a), which are important in athletics and ice skating (or to predict IQ, which is important for team players), since children and youths must be trained for 10 years or more to reach top performance and Olympic gold. Ideally, we should have a single classification battery of measurements and tests which, by differential weighting according to the multiple correlations between criterion and predictor variables, would enable us to predict success in each of a variety of specialities. In order to calculate true final scores and to minimize error variance, a second weighting was applied, which maximized the heritability of the battery. Why, and in what way?

There exists an analogy between test theory and quantitative genetics. In classical test theory, the following linear model is assumed:

X = T + E,

where X is the observed test score of an individual, T is the true score, and E is the error of measurement. Quantitative genetics assumes the analogous model

P = G + E,

where P is the observed phenotypic value, G is the genotypic value, and E is the environmental deviation (including error of measurement).

Two tests are said to be parallel if they yield identical true scores and if the errors are uncorrelated. The analogous situation in quantitative genetics is the case of monozygotic twins reared in random environments. In such a case, the members of a twin pair would have identical genotypic values, and their environmental deviations would be uncorrelated. From the above model of test theory it follows, when T and E are uncorrelated, that

Var (X) = Var (T) + Var (E),

where Var denotes variance. The proportion of the variance in observed scores owing to variation in true scores is defined as the reliability $\rho_{xx}$ of a test X. It may be shown that the reliability has the following equivalent forms:

$\rho_{xx} = \mathrm{Var}(T) / \mathrm{Var}(X) = r_{xt}^2 = b_{tx} = r_{xx}$,

where $r_{xt}$ is the correlation between true and observed scores, $b_{tx}$ is the regression of true score on observed score, and $r_{xx}$ is the correlation between performances on parallel forms of a test.

From the model of quantitative genetics follows

Var (P) = Var (G) + Var (E),

when G and E are uncorrelated (if they are correlated, see Weiss, 1979b). The proportion of the phenotypic variance owing to variation in genotypic values is defined as heritability in the broad sense, $h_{pp}$, and has the following equivalent forms:

$h_{pp} = \mathrm{Var}(G) / \mathrm{Var}(P) = r_{gp}^2 = b_{gp} = r_{p_x p_{x'}}$,

where $p_x$ and $p_{x'}$ are measures on the two members of monozygotic twin pairs. Because the environmental deviation E includes the error of measurement, test-retest reliability provides an upper-bound estimate of heritability.
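The identity between the twin correlation and Var(G)/Var(P) can be illustrated with a small simulation. The sketch below is only an illustration under the model's own assumptions (monozygotic twins share G, and their environmental deviations are independent draws). The function names, the normal distributions, and the variance values 0.6 and 0.4 are hypothetical choices, not data from the studies cited.

```python
import random


def pearson(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_mz_twins(n_pairs, var_g, var_e, seed=0):
    """Simulate phenotypes P = G + E for monozygotic twin pairs.

    Both twins of a pair receive the same genotypic value G; each twin
    receives an independent environmental deviation E (random environments).
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    sd_g = var_g ** 0.5
    sd_e = var_e ** 0.5
    first, second = [], []
    for _ in range(n_pairs):
        g = rng.gauss(0.0, sd_g)
        first.append(g + rng.gauss(0.0, sd_e))
        second.append(g + rng.gauss(0.0, sd_e))
    return first, second


def twin_correlation(first, second):
    """Correlation between co-twins, read as a broad-sense heritability estimate."""
    return pearson(first, second)


# Hypothetical example: Var(G) = 0.6 and Var(E) = 0.4, so Var(G)/Var(P) = 0.6.
twin_a, twin_b = simulate_mz_twins(3000, var_g=0.6, var_e=0.4)
print("twin correlation:", round(twin_correlation(twin_a, twin_b), 3))
```

With a sample of this size the printed correlation should lie close to the simulated ratio Var(G)/Var(P), apart from sampling error.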

If the time span between test and retest is 1, 2, or even 10 years, we no longer speak of test-retest reliability but of the longitudinal correlation between the two measurements. Evidently, in long-range prediction it is better to weight the variables with these longitudinal correlations than with the test-retest reliabilities at the starting time of prediction. However, longitudinal correlations over 10 years can only be measured in longitudinal studies lasting 10 years, and quicker results are often required. From the analogy between test theory and quantitative genetics we can conclude that longitudinal correlation and heritability in the broad sense are equivalent expressions for reliability in the long run. Heritabilities, however, can be measured immediately, without a longitudinal study. In East Germany the total population at the age of 10 was measured each year, and a representative sample of 3000 twin pairs was drawn in 1974. This basic research was not secret (unlike some later applications) but was published in detail (Weiss, 1977).

From the information about heritabilities and the matrix of intercorrelations of the variables, the score of the heritability index I (Weiss, 1980) was calculated for each child:

$I = a_1 h_1 X_1 + a_2 h_2 X_2 + \dots + a_n h_n X_n$,

where $X_1, \dots, X_n$ are the test scores, $h_1, \dots, h_n$ are heritabilities, and $a_1, \dots, a_n$ are weights depending on the regression between predictor and criterion variables. For different chronological ages the heritabilities are different, too. By using standardized scores and by standardizing I-scores, we can predict not only the relative performance, compared with other individuals, but also the absolute performance. Because it was very important to predict some final values (Weiss, 1979a) as accurately as possible (for example, the height of ice skating pairs, separately for male and female partners, or the IQ of team players), information about the scores of parents (and even sibs) was included in the calculations. From the mid-parent score P (arithmetical mean of the parents) the genotypic value G of a child can be estimated as follows:

$G = h_{f1} (P - S) + S$,

where S is the mean of the subpopulation or type to which the parents belong, and $h_{f1}$ is the heritability calculated from parent-offspring correlations (or the heritability of sib pairs). Such results can also be included in an overall index for personnel selection.
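The two calculations can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used: the function names and all numbers are hypothetical, and the mid-parent estimate is read here as the usual regression toward the subpopulation mean, that is, the heritability multiplied by the deviation (P - S), which is an assumption about the intended formula.

```python
from statistics import mean, pstdev


def standardize(values):
    """Convert raw scores to z-scores (mean 0, standard deviation 1).

    Assumes the scores are not all identical (non-zero standard deviation).
    """
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


def heritability_index(scores, heritabilities, weights):
    """Index I = a_1*h_1*X_1 + ... + a_n*h_n*X_n for one child.

    scores         -- standardized test scores X_i of the child
    heritabilities -- heritability h_i of each variable at the child's age
    weights        -- regression-based weights a_i (hypothetical values here)
    """
    if not (len(scores) == len(heritabilities) == len(weights)):
        raise ValueError("scores, heritabilities and weights must have equal length")
    return sum(a * h * x for a, h, x in zip(weights, heritabilities, scores))


def genotypic_value_from_midparent(father, mother, heritability, subpop_mean):
    """Estimate G by regressing the mid-parent score toward the mean S.

    Assumed reading of the formula: G = h * (P - S) + S.
    """
    midparent = (father + mother) / 2
    return heritability * (midparent - subpop_mean) + subpop_mean


# Hypothetical example: two standardized predictors for one child.
print(heritability_index([1.0, -0.5], [0.8, 0.5], [0.6, 0.4]))

# Hypothetical example: parental statures of 190 and 170 cm, subpopulation
# mean 170 cm, parent-offspring heritability 0.8.
print(genotypic_value_from_midparent(190.0, 170.0, 0.8, 170.0))
```

With a heritability below 1, the estimate always lies between the mid-parent score and the subpopulation mean, which is why the choice of the reference subpopulation S matters for the prediction.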

With the collapse of the East bloc countries and their highly sophisticated sport research (Kovár, 1981; Wolanski & Siniarska, 1984), the genie went back into the bottle and was replaced again by the purely theoretical and unfruitful debate (see all the citations by Stelzl) about the sense and nonsense of heritability, which has already lasted half a century.