IQ and socioeconomic development across Regions of the UK: a reanalysis

A reanalysis of (Carl, 2015) revealed that the inclusion of London had a strong effect on the S loading of crime and poverty variables. S factor scores from a dataset without London and redundant variables was strongly related to IQ scores, r = .87. The Jensen coefficient for this relationship was .86.

Introduction

Carl (2015) analyzed socioeconomic inequality across 12 regions of the UK. In my reading of his paper, I thought of several analyses that Carl had not done. I therefore asked him for the data and he shared it with me. For a fuller description of the data sources, refer back to his article.

Redundant variables and London

Including (nearly) perfectly correlated variables can skew an extracted factor. For this reason, I created an alternative dataset where variables that correlated above |.90| were removed. The following pairs of strongly correlated variables were found:

median.weekly.earnings and log.weekly.earnings r=0.999

GVA.per.capita and log.GVA.per.capita r=0.997

R.D.workers.per.capita and log.weekly.earnings r=0.955

log.GVA.per.capita and log.weekly.earnings r=0.925

economic.inactivity and children.workless.households r=0.914

In each case, the first of the pair was removed from the dataset. However, this resulted in a dataset with 11 cases and 11 variables, which is impossible to factor analyze. For this reason, I left in the last pair.

Furthermore, because capitals are known to sometimes strongly affect results (Kirkegaard, 2015a, 2015b, 2015d), I also created two further datasets without London: one with the redundant variables, one without. Thus, there were 4 datasets:

A dataset with London and redundant variables.

A dataset with redundant variables but without London.

A dataset with London but without redundant variables.

A dataset without London and redundant variables.

Factor analysis

Each of the four datasets was factor analyzed. Figure 1 shows the loadings.

Figure 1: S factor loadings in four analyses.

Removing London strongly affected the loading of the crime variable, which changed from moderately positive to moderately negative. The poverty variable also saw a large change, from slightly negative to strongly negative. Both changes are in the direction towards a purer S factor (desirable outcomes with positive loadings, undesirable outcomes with negative loadings). Removing the redundant variables did not have much effect.

As a check, I investigated whether these results were stable across 30 different factor analytic methods.1 They were, all loadings and scores correlated near 1.00. For my analysis, I used those extracted with the combination of minimum residuals and regression.

Mixedness

Due to London’s strong effect on the loadings, one should check that the two methods developed for finding such cases can identify it (Kirkegaard, 2015c). Figure 2 shows the results from these two methods (mean absolute residual and change in factor size):

Figure 2: Mixedness metrics for the complete dataset.

As can be seen, London was identified as a far outlier using both methods.

S scores and IQ

Carl’s dataset also contains IQ scores for the regions. These correlate .87 with the S factor scores from the dataset without London and redundant variables. Figure 3 shows the scatter plot.

Figure 3: Scatter plot of S and IQ scores for regions of the UK.

However, it is possible that IQ is not really related to the latent S factor, just the other variance of the extracted S scores. For this reason I used Jensen’s method (method of correlated vectors) (Jensen, 1998). Figure 4 shows the results.

Jensen’s method thus supported the claim that IQ scores and the latent S factor are related.

Discussion and conclusion

My reanalysis revealed some interesting results regarding the effect of London on the loadings. This was made possible by data sharing demonstrating the importance of this practice (Wicherts & Bakker, 2012).