Using the pre-2012 data, the paired t-test estimate of the expected number of medals for Britain in 2012 was 60, with 95% CI [53 67]. Using the Wilcoxon signed rank test, the Hodges-Lehmann estimate of the 2012 medal count was 59.5 medals, with a Tukey 96% CI for that expected value of [52 68.5].
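
A minimal sketch of how an additive prediction of this kind could be computed in Python, assuming NumPy and SciPy are available. The function name `paired_t_prediction` and the medal counts in the usage lines are hypothetical placeholders, not the medal table analyzed here.

```python
import numpy as np
from scipy import stats

def paired_t_prediction(before, host, prior_count, level=0.95):
    """Additive model: predicted count = prior_count + mean paired gain.

    before / host are medal counts for past host countries in the Games
    before hosting and in the host year (hypothetical inputs).
    Returns (prediction, (lower, upper)) using a t-based CI on the mean gain.
    """
    d = np.asarray(host, dtype=float) - np.asarray(before, dtype=float)
    n = d.size
    mean_gain = d.mean()
    se = d.std(ddof=1) / np.sqrt(n)            # standard error of the mean gain
    tcrit = stats.t.ppf(0.5 + level / 2, df=n - 1)
    lower = prior_count + mean_gain - tcrit * se
    upper = prior_count + mean_gain + tcrit * se
    return float(prior_count + mean_gain), (float(lower), float(upper))

# Hypothetical usage: four past hosts, and a new host with 47 medals last time.
pred, (lo, hi) = paired_t_prediction([10, 20, 30, 40], [15, 28, 41, 50], 47)
print(pred, lo, hi)
```

Note that this interval is for the expected gain shifted by the prior count, not a prediction interval for a single country's result.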

A real test for this simple additive model will be for Brazil in 2016, since Brazil only won 17 medals in 2012. This paired model assumes a simple additive effect, not a multiplicative effect. With the additional year of data and using the paired t-test, the expected number of medals for Brazil in 2016 is 30.4 with CI [24 37]. Using the Wilcoxon signed rank test, the Hodges-Lehmann expectation is for Brazil to win 30 medals with 96% Tukey CI: [22.5 38].

Now, using that same medal table, it can be seen that every host country's medal total declines in the next Olympic year. Using the paired t test, I can predict that Britain's medal total should decline from 65 to 58 with 95% CI [55 60]. Using the Wilcoxon signed rank test and the Hodges-Lehmann estimate of the effect size and Tukey confidence intervals for the effect size, Britain's medal total can be expected to decrease from 65 to 58 with 96% CI: [54 61].
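
The Hodges-Lehmann estimate and Tukey interval can be sketched from the Walsh averages of the paired differences, under the assumption of no ties or zero differences (so the exact signed-rank distribution applies). The function names below are hypothetical. Because the exact distribution is discrete, the achieved confidence level usually differs a little from the nominal 95%, which is presumably why the intervals here are quoted at 96%.

```python
import numpy as np

def signed_rank_cdf(n):
    """Exact null CDF of the signed-rank statistic T+ for n untied differences."""
    m = n * (n + 1) // 2
    counts = np.zeros(m + 1)
    counts[0] = 1.0
    for r in range(1, n + 1):
        # Rank r is either excluded (old count) or included (shift by r).
        counts[r:] = counts[r:] + counts[:-r]
    return np.cumsum(counts) / 2.0 ** n

def hodges_lehmann_ci(d, level=0.95):
    """Return (HL estimate, (lower, upper), achieved confidence level)."""
    d = np.asarray(d, dtype=float)
    n = d.size
    i, j = np.triu_indices(n)                   # all pairs with i <= j
    walsh = np.sort((d[i] + d[j]) / 2.0)        # Walsh averages
    m = walsh.size
    cdf = signed_rank_cdf(n)
    alpha = 1.0 - level
    # Smallest k with P(T+ <= k) >= alpha/2, but at least 1.
    k = max(int(np.searchsorted(cdf, alpha / 2, side="left")), 1)
    achieved = 1.0 - 2.0 * cdf[k - 1]
    ci = (float(walsh[k - 1]), float(walsh[m - k]))
    return float(np.median(walsh)), ci, float(achieved)

# Hypothetical differences (host year minus following Games).
print(hodges_lehmann_ci([1, 2, 3, 4, 5]))
```

Adding the estimate and the two limits to a country's previous total gives the predicted count and its interval in the additive model.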

Both the paired t test and the Wilcoxon signed rank test can be performed on log-transformed data (transformed before taking the differences) to create a multiplicative model rather than an additive model. The test then becomes a test of whether the medal ratios are greater than or less than 1, rather than whether the differences differ from 0. The expectation and 95% CI can be calculated on the log-transformed data and then converted back to the natural scale using antilogarithms.
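
As a rough sketch of the multiplicative version for the paired t test, assuming NumPy and SciPy and strictly positive medal counts, the following takes logs, builds the t-based interval on the log scale, and back-transforms with the exponential. The name `paired_t_ratio` and the example counts are hypothetical.

```python
import numpy as np
from scipy import stats

def paired_t_ratio(before, host, level=0.95):
    """Multiplicative model: geometric-mean ratio host/before with a t-based CI.

    Logs are taken before differencing; results are back-transformed
    with exp() so they read as ratios on the natural scale.
    """
    d = np.log(np.asarray(host, dtype=float)) - np.log(np.asarray(before, dtype=float))
    n = d.size
    mean = d.mean()
    se = d.std(ddof=1) / np.sqrt(n)
    tcrit = stats.t.ppf(0.5 + level / 2, df=n - 1)
    lower, upper = mean - tcrit * se, mean + tcrit * se
    return float(np.exp(mean)), (float(np.exp(lower)), float(np.exp(upper)))

# Hypothetical usage: two past hosts that doubled and quadrupled their totals.
ratio, (lo, hi) = paired_t_ratio([10, 10], [20, 40])
print(ratio, lo, hi)
```

Multiplying a country's previous medal total by the ratio and by each limit gives the predicted count and its interval.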

The histograms for these data indicate that the differences in medals are roughly symmetric on the natural scale, so there is no reason to log transform the data other than the feeling that the additive model might set too high an expectation for countries with low medal totals. Why should Brazil with just 17 medals this year be expected to increase by 13.4 medals, the same additive increase used to model Britain with a 47-medal tally 4 years earlier?

Log transforming the data actually makes the model worse, producing a strongly positively skewed distribution. The key assumption of the Wilcoxon signed rank test and paired t test is symmetry (the paired t test assumes normality, which includes symmetry). These Olympic medal data have a couple of screwy results when log transformed or converted to ratios. Mexico in 1968 increased from 1 to 9 medals, a modest 8-medal increase in the additive model but a ninefold ratio (900% of its previous total) in the multiplicative model. Similarly, Spain's host-year total was 550% of its total 4 y earlier, and Australia's was 318%.
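
One quick way to check the symmetry assumption on both scales is to compare the sample skewness of the raw differences with that of the log ratios. The sketch below assumes NumPy and SciPy; the function name `compare_skewness` is hypothetical, and apart from the 1-to-9 pair taken from the Mexico example above, the counts are made up for illustration.

```python
import numpy as np
from scipy import stats

def compare_skewness(before, host):
    """Return (skewness of differences, skewness of log ratios)."""
    before = np.asarray(before, dtype=float)
    host = np.asarray(host, dtype=float)
    diff_skew = stats.skew(host - before)
    log_skew = stats.skew(np.log(host) - np.log(before))
    return float(diff_skew), float(log_skew)

# First pair is the 1 -> 9 case from the text; the rest are hypothetical.
before = [1, 20, 30, 40, 50]
host = [9, 26, 38, 50, 58]
print(compare_skewness(before, host))
```

In this toy example the differences are symmetric, while the single small-count country dominates the log ratios and produces strong positive skew.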

Analyzing differences of log-transformed medals, with results back-transformed to ratios, the paired t test predicts that Brazil's medal total will double from 17 medals to 34 medals, with a 95% CI of 23 to 50. Using the Wilcoxon signed rank test on the log-transformed medal counts, the Hodges-Lehmann estimate of the ratio is 170%, with a 95% CI of 140% to 350%. So Brazil would be expected to increase its medal total from 17 in 2012 to 28 medals in 2016, with a 95% CI of 23 to 59 medals. These predictions are similar to those obtained using the additive gain model, except that the upper 95% confidence limit is much higher with the multiplicative model.

In general, the Wilcoxon signed rank test and paired t test produce very similar results. The asymptotic relative efficiency of the Wilcoxon signed rank test compared to the paired t test is known to be over 95% (3/π ≈ 0.955) if the differences are normally distributed. That means that, to obtain similar power (and hence similar p values), about 100 samples would be required for the Wilcoxon signed rank test for every 95 samples for the paired t test. For some types of non-normal data, the Wilcoxon signed rank test can be more powerful than the paired t test. The major virtue of the paired t test is that it is easy to express the effect size using the 95% CI. The 95% CI for the Wilcoxon signed rank test, attributed to Tukey in a 1953 paper by Walker & Lev, is now available in most computer packages. The upper and lower confidence limits are the amounts that need to be added to each observation in the sample with the lower median to produce two-tailed p values of alpha/2.

Gene Gallagher
UMass Boston