A problem of multiplicity

One thing a scientist doesn’t want to mess up is the problem of multiplicity (also known as ‘field significance’). It’s just like rolling a die 600 times, and then getting excited about getting roughly 100 sixes. However, sometimes it’s much more subtle than just rolling dice.
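To make the die example concrete, here is a minimal sketch of what a fair die is expected to give. It assumes independent rolls with probability 1/6 for a six; the function name `expected_sixes` is introduced here purely for illustration.

```python
import math

def expected_sixes(n_rolls, p=1 / 6):
    """Mean and standard deviation of the number of sixes (binomial model)."""
    mean = n_rolls * p
    std = math.sqrt(n_rolls * p * (1 - p))
    return mean, std

mean, std = expected_sixes(600)
# Roughly 100 sixes is simply the expected outcome, give or take about 9.
print(f"expected sixes: {mean:.0f} +/- {std:.1f}")
```

In other words, about 100 sixes in 600 rolls is exactly what chance alone delivers, so it is nothing to get excited about.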

In this study, a range of different so-called ‘solar proxies’ (describing the state of the sun – in this case monthly sunspot number, the aa-index, as well as the vertical Z and horizontal H component of the magnetic field measured at Eskdalemuir) is examined and compared with some climate indices – some rather obscure quantities that were assumed to represent the state of European climate, namely ‘mean-squared interannual temperature variations’ (MSITV) and ‘lifetime’.

Moreover, the similarity between the solar proxies and the climate indices was tested for different seasons.

The Le Mouel study found a ‘decent match’ only for one solar proxy and only in winter: the vertical Z component from Eskdalemuir.

Although the paper was not clear on this, it appeared that a number of estimates for MSITV and ‘lifetime’ were explored using different lengths of time intervals (sliding ‘window size’) and for different seasons. In other words, by searching for one particular choice that gives the best match for one solar proxy, they may have ended up (unintentionally) ‘cherry picking’ the data.
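The effect of searching over many combinations can be illustrated with a small simulation. This is only a sketch under stated assumptions: every series is pure random noise, and the numbers of proxies, seasons and window sizes are made up for illustration rather than taken from the paper.

```python
import random
import statistics

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def moving_average(series, window):
    """Sliding-window mean, mimicking a choice of 'window size'."""
    return [statistics.fmean(series[i:i + window])
            for i in range(len(series) - window + 1)]

def search_best_match(n_years=80, n_proxies=4, n_seasons=4,
                      windows=(3, 5, 7, 11), seed=0):
    """Correlate unrelated noise series for every season/proxy/window choice."""
    rng = random.Random(seed)
    results = []
    for season in range(n_seasons):
        climate = [rng.gauss(0, 1) for _ in range(n_years)]  # noise 'climate index'
        for proxy in range(n_proxies):
            solar = [rng.gauss(0, 1) for _ in range(n_years)]  # noise 'solar proxy'
            for w in windows:
                r = pearson(moving_average(climate, w), moving_average(solar, w))
                results.append((abs(r), season, proxy, w))
    return results

results = search_best_match()
best = max(results)  # the combination one would be tempted to report
typical = statistics.median([r for r, _, _, _ in results])
print(f"combinations tried: {len(results)}")
print(f"typical |r|: {typical:.2f}, best |r|: {best[0]:.2f} (window {best[3]})")
```

Even though none of the series are related, the best of the combinations will look considerably better than a typical one, which is why the number of choices explored needs to be reported.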

There were furthermore substantial differences between the solar proxy and lifetimes of T(2m), SLP, and wind direction before 1940 – and the paper forgot to even point this out.

However, a break-down of correlation outside a limited interval is typical of a fortuitous match – there are plenty of examples throughout the history of science of similar alleged links between solar activity and climate that eventually turned out not to hold up.

The mismatch before 1940 further points to my suspicion of a problem of multiplicity. The paper also neglects to discuss the statistical significance levels associated with the analysis – hence the validity of these conclusions is at best an open question.

Another weakness is that the paper offers little discussion on how the solar activity may affect local/regional temperature/pressure. E.g. how does Z affect the wind directions and the ‘lifetime’, and why is there a strong dependency of the ‘signal’ on the season? What is the hypothesised physical link? As long as there is no hypothetical mechanism, there is no way of confirming whether the interpretations are correct.

It’s interesting to note that the time evolution of the Z or H components of the magnetic field shows no clear trend over the period 1920-2000, and if anything, it seems as if there are opposite trends in H and Z (these proxies have the highest correlations with the climate indices). Thus, it would be difficult to generalize these results to explain the past global mean temperature trends – as it is to make deductions for the global mean from a regional set of measurements.

The paper also contains a number of sweeping statements (e.g. the alleged strong evidence of the influence of solar variability on time scales as long as 100,000,000 years!) based on a dubious selection of citations that could be debated.

The use of the MSITV and lifetime is interesting – why not just look at the temperature, precipitation and sea level air pressure? Had there been a ‘solar signal’ in these, I’d expect those aspects would be discussed, rather than these unusual derivatives. It’s therefore interesting to note that the Le Mouel paper, on the one hand, demonstrates that there essentially is no relationship between several indicators for European climate and the aa-index or the monthly sunspot number, and on the other, argues in the introduction and discussion that these are important for the state of our climate…

38 Responses to “A problem of multiplicity”

Thanks, given that Europe is mostly above 45th parallel this is quite an unsurprise. The winter signal could be linked to ozone levels in winter or the northern lights, because sun isn’t shining too much. I’ll shut up now, or should I also get my coat?

I wish the solar physicists would shut the heck up with their “It’s the Sun!” garbage. When did they decide it was their mission to prove their field was superior by discrediting climatology and planetary astronomy? A scientist outside his field is no more qualified than anyone else. You’d think they’d learn from examples like William Shockley gassing about race and IQ.

I wouldn’t call it a smear. One can legitimately be of different opinions about including / excluding the Arctic in computing global trends. The article is not too bad actually, it leaves absolutely no doubts as to where all scientists except the village fool know temperatures are going in the longer term.

And Stefan puts his money where his mouth is:
Climatologist Stefan Rahmstorf is so convinced that his predictions will be correct in the end that he is willing to back up his conviction with a €2,500 ($3,700) bet. “I will win,” says Rahmstorf.

His adversary Latif turned down the bet, saying that the matter was too serious for gambling. “We are scientists, not poker players.”

You wrote: “The mismatch before 1940 further points to my suspicion of a problem of multiplicity. ”
Is that because if there were a correlation, the period before 1940 would be the most likely timeframe to find it? (since that is when the solar output indeed increased a little bit, whereas it hasn’t thereafter?)

[Response: More because if there is a strong relationship based on real physics, I’d expect it to be valid for the entire time period – unless it can be explained why this mechanism doesn’t work all the time. -rasmus]

Can’t see anything too exciting here. Perhaps there is an indirect link via cloud cover affecting albedo or as jyyh says, something to do with changes to ozone level, due to solar magnetic link to polar vortex?

@ Leonard Weinstein: you seem to be somewhat confused. Peer review is a junk filter: its goal is to *begin* separating out good science from bad science. Passing peer review is a first barrier to pass on the way to being established as a good piece of science, not a final barrier. Thus, the failure to reach the level of a peer-reviewed paper can be damning even while success in reaching this level does not guarantee the quality or correctness of the work in question.

An analogy might be to a decent e-mail spam filter: messages caught as spam are with extremely high probability low-quality; messages that make it through the filter have passed some basic tests of quality, but it doesn’t actually guarantee that they are of interest to you.

Sounds like they were convinced they’d find something, and then continued until they did.

As for the lack of a mechanism, looks like they say the data period is too short for them to think of one. I say if they have enough data to claim they have found a correlation, then they have enough data to make a preliminary hypothesis as to the mechanism. They do note Svensmark’s GCR business at the end, but don’t seem to tie it in with their own results.

Having done several (unrelated) research experiments collecting data the two golden rules that I keep coming across are ‘Keep it simple’ and ‘Pay attention to detail’. No point in complicating things by hunting for non-existent links and probabilities. This only ends up tripping you up with ever more complex calculations which are so easy to get wrong. Bit like the data on Iron content of spinach. Misplaced decimal point apparently. But hey that was a simple maths calculation! Wonder how accurate the big calculations are then?
I have found that several papers and theories that have been presented as the ‘Emperor’s New Clothes’ have been simply that: ‘nothing to write home about’, and have been quickly discredited or, worse, ridiculed. The last option can do untold damage to a scientist’s future credibility.
However I personally find it annoying to read stuff that is so complicated it’s almost as if the person writing it is trying to be deliberately highbrow to make everyone think ‘Oh wow! Just listen to all those big words! He/she must know what he/she is talking about and I haven’t a clue.’
I’m probably being cynical about the above paper, but to my mind making it all the more complex does little to convince ordinary people, let alone one’s peers, that what they are doing is going to kill them and the planet if they don’t stop…
So how about we keep it simple for a change and educate the population in ‘layman’s’ terms so that they can understand and feel like they are as much responsible for climatic disruption as the rest of us who actually know we are as much to blame? It is the only way we can go if we are to secure a popular consensus on the much needed changes to ensure we ‘All’ survive and that includes the natural environment too.
Is it really that difficult to admit that we have a lot to learn about how we communicate this sort of information?

“As long as there is no hypothetical mechanism, there is no way of confirming whether the interpretations are correct.” you mean like the assumed 3 times amplification of the warming effect of CO2? Never really observed or validated?

Correlation between indices has always been problematic. Apply this same reasoning to much of the multi-decadal oscillation claims, and similar problems pop out.

In this study, a range of different so-called ‘solar proxies’…is examined and compared with some climate indices – some rather obscure quantities that were assumed to represent the state of European climate…

Multidecadal oscillation claims also rest on similar matchups between ‘ocean proxies’ and ‘climate indices.’

In physics, these kinds of statistical approaches are indeed widely used, but only with well-characterized physical systems. The reason that physicists settled on statistical approaches to physical problems has to do with quantum mechanics, which is really a probabilistic theory, not a deterministic theory – that’s the 20th century vs. the 19th century.

Quantum or classical, if you have a poor understanding of what is going on in your system, then a statistical analysis can be misleading due to systematic biases in the approach.

For example, if you weigh the same thing a hundred times, and the answer is identical each time to within a few micrograms, you might think you have the right weight – but if your scale was miscalibrated, you have a precise wrong answer.

Likewise, if you select different sets of ‘solar proxies’, ‘ocean proxies’, and ‘climate indices’ and play mix and match, you can find a statistical correlation of some significance. This can easily turn into selective use of data to reach a desired conclusion, however.

How do you avoid these traps? Comprehensive data collection!

People used to propose such cyclic themes for atmospheric phenomena all the time (1950s etc.) until the complete network of data collection was in place, and then it became clear that the atmosphere was not periodic, and that the old sci-fi dream of space-based weather control was not going to be possible – one might be able to affect the weather, but the result of that perturbation was usually not very predictable.

Similarly, by collecting atmospheric data you can see that the stratosphere is cooling, as expected due to increased energy trapping at lower levels due to greenhouse gases, but not what you’d see from an increase in solar radiation. Do the same for the oceans, and who knows? Put up Triana and get a direct reading of the planetary energy balance – why not?

In any case, statistical analysis of a poorly characterized (unconstrained) physical system is a fool’s game – professional scientists, faced with such a problem, immediately focus their efforts on gaining more comprehensive data, not on hand-waving.

P.S. for another good discussion of probability in climate, see the record high / record low analysis by UCAR, here’s a press release:

If temperatures were not warming, the number of record daily highs and lows being set each year would be approximately even. Instead, for the period from January 1, 2000, to September 30, 2009, the continental United States set 291,237 record highs and 142,420 record lows, as the country experienced unusually mild winter weather and intense summer heat waves.
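As a quick check on those figures, the ratio of record highs to record lows can be computed directly from the two counts quoted above. The variable names are just illustrative.

```python
# Counts quoted in the press release for 1 Jan 2000 to 30 Sep 2009.
record_highs = 291_237
record_lows = 142_420

# With no warming the ratio would be expected to hover around 1.
ratio = record_highs / record_lows
print(f"record highs per record low: {ratio:.2f}")
```

That works out to roughly two record highs for every record low.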

Media coverage of this has been pretty ridiculous – as soon as Andy Revkin at Dot Earth heard about it, he rushed to get commentary from Pat Michaels at the Cato Institute. Spin, spin, spin.

Spurious correlations are a common problem in data mining and there are a number of good ways of accounting for them including the stability of the response over time (as mentioned), penalizing the p-value using a Bonferroni correction, or methods using jack-knifing and bootstrapping. Some physicists are very good at this kind of stuff (e.g., those who work in probabilistic frameworks) and some are pretty ignorant (e.g., those who measure stuff and report the values). No need to arm wave – just do the test right.
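For readers unfamiliar with the Bonferroni correction, a minimal sketch follows. The p-values are invented for illustration and are not from the paper; the helper name `bonferroni` is likewise just an assumption of this sketch.

```python
def bonferroni(p_values, alpha=0.05):
    """Flag which tests remain significant after a Bonferroni correction.

    Each p-value is compared with alpha divided by the number of tests.
    """
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]

# Hypothetical p-values from four separate proxy/index comparisons.
p_values = [0.001, 0.02, 0.04, 0.30]
significant = bonferroni(p_values)
print(significant)  # [True, False, False, False]
```

Three of the four made-up results pass an uncorrected 0.05 threshold, but only one survives once the number of tests is taken into account. The correction is conservative, which is one reason resampling methods are sometimes preferred.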

Leonard @Actually at #8 says
“Your comments make sense, but every paper or blog taking issue on AGW claims is dissed based on the fact that not as many peer reviewed papers support their position.”

Really? I don’t know of a single one that has passed peer review, publishing and post-publication review that in any way refutes the anthropogenic nature of the warming trend since 1850. In fact, I don’t know of a single one that hasn’t been found to have some major flaws, as with the above-discussed paper.

When I was in grad school, some decades ago, one of my fellow students had come up with a significant (statistically) 6-way interaction during an analysis of variance. He was asking our mentor how he could possibly explain it in the writeup. Our mentor asked something like, “Is this one of the effects you were looking for?”
“No.”
“Then don’t.”

I think the conversation went on to some advice not to mention it at all, but if you must, mention it only as something that might be intriguing for further study.

The mistake of dredging data is nothing new.
Peer review isn’t perfect; it just increases the odds that what comes through has some value.

It’s been said before in different ways, but to put it simply, no matter what cutoff you use for statistical significance, if you throw enough tests at a data set, some relationship is likely to cross that threshold just because of random variations. It’s only interesting if the relationship is repeated in another, or multiple other, independent data sets.
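The point can be put in numbers with a short sketch. It assumes the tests are independent, which is a simplification for real climate series, and the function name is only illustrative.

```python
def prob_any_false_positive(n_tests, alpha=0.05):
    """Chance that at least one of n independent tests crosses the
    significance threshold purely by chance: 1 - (1 - alpha)^n."""
    return 1.0 - (1.0 - alpha) ** n_tests

for n in (1, 20, 100):
    print(n, round(prob_any_false_positive(n), 3))
```

With a 5% threshold, twenty independent tests already give close to a two-in-three chance of at least one spurious "hit".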

If Gavin’s interpretation of the paper is correct, someone like my mentor should have been involved in the review process of this paper.

“But a few scientists simply refuse to believe the British calculations. ‘Warming has continued in the last few years,’ says Stefan Rahmstorf of the Potsdam Institute for Climate Impact Research (PIK). However, Rahmstorf is more or less alone in his view.”

Then:

“Marotzke and Leibniz Institute meteorologist Mojib Latif are even convinced that the fuzzy computing done by Rahmstorf is counterproductive.”

I mean, give me a break! If this is not a smear, then what is?

And the text you cited is on page 2 — the damage was already done.

People have to push back against this type of junk, or is everyone just so inured to it that they think this is nothing?

I have always believed that you first have a hypothesis, then you test it. Finding a “pattern” is like the old chestnut in archeology – you pick up a stone, juggle it around, find a part that fits the hand, and then say “ah ha, this is a tool, see, I can hold it in my hand, ancient man must have done so too”.

In this case if they had started out saying – we believe that this element of solar variation should have this effect on this aspect of climate because of mechanism “X”. Now, let us see if the data supports that hypothesis at an appropriate level of significance – it would mean something. Otherwise, sorry, no prizes.

This “lifetime” concept they focus on: Where else is it used, and what for? (I gather from Rasmus’s qualifier “obscure” that the short answer is “not for much”, but I’m still curious.)

The GCR-cloud hypothesis, for all its flaws, at least has the merit of being straightforward: more solar activity -> less GCR -> less cloud -> warmer. But what does it mean if a solar activity proxy correlates with the lifetime of disturbances in temperature or wind direction? Does solar activity correlate with warmer or colder, north wind or south? Or doesn’t it matter?

BTW, their best matches seem to be restricted not only to one solar proxy and only to winter, as Rasmus noted; their “most suggestive” graphs (fig. 10) are for one country, the Netherlands. Any thoughts why the Dutch in particular get such a good solar-climate correlation? Bart…?

Thanks for the comments, Rasmus. I’m sorry i don’t get (1) how the “problem of multiplicity” may apply to this study, and (2) what you mean by “they may have ended up (unintentionally) ‘cherry picking’ the data”.
I have few comments:
(3) not sure the “vertical Z component from Eskdalemuir” was clear: this is the vertical component of the geomagnetic field at this station, whose short-term variations are dominated by heliomagnetism (i.e., the solar magnetic influence on the terrestrial magnetic field). The Aa index is similar, but calculated between two distant stations in order to smooth out local variations of this influence. So, a geomagnetic component from one particular station will poorly represent global heliomagnetic variations. The fact that the best correlation levels are found with the Eskdalemuir component (rather than with the Aa index, their figure 8) suggests that any link with heliomagnetism should be local.
(4) As you note, the size of the smoothing window (parameter theta in their Equation 1) probably matters, and is poorly detailed. In their figure 8, this size sets the number of data used for each correlation calculation, and we have no idea what this number is.
(5) Still, their figures 9 & 10 are impressive because some series display a high level of correlation. On the other hand, the fact that pressure is not correlated at all (figure 9d) seems odd (suspect) to me.
(6) The temperature data are not discussed at all: does anyone know whether these are homogenized datasets? If these are raw measurements, there may be some spurious shifts and/or long-term urban warming trends.
(7) Their conclusion (p.1316, §4) that they “have provided evidence of significant solar forcing of short-term variations in European temperature” is dishonest: one can speak of ‘forcing’ only if a physical and quantified causal link is demonstrated, not just correlation.
All the best.

Egyptian mythology is famed for its “multiplicity of approaches”. That is, the Egyptians often juxtaposed incompatible narratives, e. g. about the sun: in the space of a few verses of the same hymn, they’d describe how every morning the sun is born anew from a heavenly cow, and in the next breath hail the sun as the self-begotten creator of all else.

Perpetuating a bunch of mutually contradictory myths offering fanciful explanations of observed natural phenomena; ascribing magical powers to the sun… Blog Science: It’s As Old As The Pyramids.

I wouldn’t be totally down on data dredging (er, data mining), provided one keeps in mind that most things that come out of such screening will, after further analysis or data, be shown to be false positives. Of course few outside the science or math communities can avoid being misled by the false positives. So, as a technique for misinformation it can be useful; I suspect that is what we are concerned about here.

But, honestly, I can imagine having started such a study and, because of sunk costs, feeling compelled to keep going until coming up with something, just so a publication can make good on the sunk costs.

“10JBL says:
20 November 2009 at 9:48 AM
@ Leonard Weinstein: you seem to be somewhat confused. Peer review is a junk filter: its goal is to *begin* separating out good science from bad science. Passing peer review is a first barrier to pass on the way to being established as a good piece of science, not a final barrier.”

I wouldn’t say peer review is the first filter. The content of most papers is reviewed numerous times at conferences (papers and presentations), departmental seminars, or in student dissertations, grant proposals, etc. before the papers are formulated and sent for review. A lot of very critical eyes with no skin in the game look at the content long before it ever gets sent to a journal.

Le Mouel et al. (2009) in the Journal of Atmospheric and Solar-Terrestrial Physics did not investigate the Parkinson effect (1959, 1962), a conductivity contrast between land and sea which leads to characteristic perturbations of the fluctuating magnetic field at the coastline extending several hundred kilometers inland. Ocean currents do have an effect on coastal observatories, modulating the response. The vertical component of the local geomagnetic field of coastal stations is noticeably affected.

Didier, why do you say this was not investigated — do you mean just because those papers aren’t cited in this particular paper? From the citations I find (knowing nothing about the field, I’m just looking with Scholar) it appears this Parkinson Effect is referred to quite often; have a look at these: http://scholar.google.com/scholar?q=%22Le+Mouel%22+1975+Parkinson

I think what this post is saying is there is no theory or physics behind the supposed association.

I use the ole positive association between level of storks and birth rates as a spurious correlation for my methods class… turns out it isn’t the storks that increase birth rates, but that both higher numbers of storks and higher birth rates are found in rural areas. Who’d have guessed.
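A toy simulation shows how such a confounder produces a correlation out of nothing. Everything here is made up for illustration: a hypothetical 0/1 "rural" flag raises both the stork count and the birth rate, while storks and births are otherwise independent noise.

```python
import random
import statistics

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(42)
rural = [i % 2 for i in range(2000)]                 # confounder: 1 = rural, 0 = urban
storks = [10 * u + rng.gauss(0, 1) for u in rural]   # depends only on rurality
births = [10 * u + rng.gauss(0, 1) for u in rural]   # depends only on rurality

overall = pearson(storks, births)
within_rural = pearson([s for s, u in zip(storks, rural) if u == 1],
                       [b for b, u in zip(births, rural) if u == 1])
within_urban = pearson([s for s, u in zip(storks, rural) if u == 0],
                       [b for b, u in zip(births, rural) if u == 0])

print(f"overall r: {overall:.2f}")
print(f"rural only r: {within_rural:.2f}, urban only r: {within_urban:.2f}")
```

The pooled data show a strong correlation, yet it essentially vanishes once rural and urban areas are looked at separately.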

Another thing I found interesting was how my horoscope jived so well with my life….tho I only followed it a few years back in the early 70s, so, who knows, there may have been many years of mismatch.

Hank, you are quite right. Following your thread it seems Le Mouel does know about the effect. I am wondering how Le Mouel et al. did disentangle the coastal effects from the purely ‘solar’ driven ones. This is not discussed in this particular paper, it seems, although they claim that the choice of observatory does not change the results. I have not checked the given reference (Le Mouel et al. 2005), I may add. I am just worried that the ‘North Atlantic signal’ may contaminate their ‘solar proxy’ and correlate with European winter temperatures.

I have been misunderstood. I was mocking the point of peer review. In fact it has now been established that all the pro AGW peer reviewed papers (not GW) are suspect. The skeptic papers are probably more near correct, although with the corrupted data the warmers have made available (as gate keepers to many of the data sources) it limits what they can count on to what they independently obtained. Peer review has little to do with the validity of a paper except that it applies an additional filter to try to catch totally wrong papers and clear obvious errors.

I still keep seeing the claims that global warming has not been disproved. What a straw man. That is NOT the issue! It is the human cause that is the issue, and the prediction of major future increases in temperature and flooding that are the result of the issue. Those have NEVER been supported with falsifiable real world data. The actual BELIEVABLE historical temperatures indicate the present temperature is a typical high from natural variation. The lack of temperature change the last 10 years or so and the up and down trends every 30 years prior to that support the claim that the trend will be generally down the next 20 years or so. If there is any human caused warming, it is of small effect and no problem.

[Response: Proof by declaration is not proof. On the other side, there are dozens and dozens of studies on the attribution of climate change (See chapter 8 of IPCC), not one of which supports your contention. – gavin]

Hmm. You say only one proxy: “The Le Mouel study found a ‘decent match’ only for one solar proxy and only in winter: the vertical Z component from Eskdalemuir.” You don’t really mean to say that the vertical Z component from the magnetic field of the sun is dependent on the place where it was measured??? Be it Eskdalemuir or anywhere else on the planet. I would say this is suggestive writing which has its effects on the other people responding here. The Z component is a very important proxy for the sun’s activity; nothing wrong with that. Total magnetic field would probably do too but this gives a stronger signal. What I like about this study is that it does observations and not much explaining. As I am living in Holland and like to skate and am old enough, I can observe the correlations that are shown here, though I am surprised they have been able to get the correlations significant. It also corresponds with the findings done in peat bog research and Beryllium and the observations for the MWP and Maunder minimum: Western Europe seems to be more sensitive to the solar signal, probably through changes in circulation that do not need to have a global character.

By now it must be clear that chapter 8 and the conclusion of the IPCC report are now questionable. They may in the end be validated, but are now clearly unsupportable with present references. If you do not agree that the released e-mails and other material clearly require a review of the entire result, there is no point communicating with you, as that is proof of a predetermined opinion.

[Response: Well that must be it then. Why climate models need to be re-evaluated because the emails of a group of people almost wholly divorced from the process of developing climate models have been hacked is a mystery to me. But I’m obviously just biased. – gavin]