Overconfidence in IPCC’s detection and attribution: Part I

Arguably the most important conclusion of IPCC AR4 is the following statement:

“Most of the observed increase in global average temperatures since the mid-20th century is very likely due to the observed increase in anthropogenic greenhouse gas concentrations.”

where “very likely” denotes a confidence level of >90%. The basis for this statement is comparison of global climate model simulations with observations for the 20th century, for simulations conducted with natural forcing (solar and volcanic) only and natural plus anthropogenic forcing (greenhouse gases and anthropogenic aerosol). See Figure SPM.4. The agreement between 20th century global surface temperature observations and simulations with natural plus anthropogenic forcing provides the primary evidence to support this conclusion.

I have made several public statements that I think the IPCC’s “very likely” confidence level for attribution is too high, and I have been chastised in the blogosphere for making such statements. Here I lay out my arguments in support of my claim of IPCC overconfidence.

After digging deeply into this topic, I have finally diagnosed the cause of my recent head spinning symptoms: overexposure to circular reasoning by the IPCC.

Many direct quotes from these three documents are used here, including entire paragraphs. For ease of reading, I have not blocked or italicized the quotes, but indicate them with quotation marks and a parenthetical citation at the end of the paragraph. (clarification for the plagiarism police :) )

Overview of IPCC’s detection and attribution

“The response to anthropogenic changes in climate forcing occurs against a backdrop of natural internal and externally forced climate variability that can occur on similar temporal and spatial scales. Internal climate variability, by which we mean climate variability not forced by external agents, occurs on all time-scales from weeks to centuries and millennia. Slow climate components, such as the ocean, have particularly important roles on decadal and century time-scales because they integrate high-frequency weather variability and interact with faster components. Thus the climate is capable of producing long time-scale internal variations of considerable magnitude without any external influences. Externally forced climate variations may be due to changes in natural forcing factors, such as solar radiation or volcanic aerosols, or to changes in anthropogenic forcing factors, such as increasing concentrations of greenhouse gases or sulphate aerosols. “ (IPCC TAR)

The presence of this natural climate variability means that the detection and attribution of anthropogenic climate change is a statistical “signal-in-noise” problem. Detection is the process of demonstrating that an observed change is significantly different (in a statistical sense) than can be explained by natural internal variability. (IPCC TAR)

“An identified change is detected in observations if its likelihood of occurrence by chance due to internal variability alone is determined to be small, for example, <10%. Attribution is defined as the process of evaluating the relative contributions of multiple causal factors to a change or event with an assignment of statistical confidence. Attribution seeks to determine whether a specified set of external forcings and/or drivers are the cause of an observed change in a specific system.” (IPCC 2009)

“[U]nequivocal attribution would require controlled experimentation with our climate system. Since that is not possible, in practice attribution is understood to mean demonstration that a detected change is “consistent with the estimated responses to the given combination of anthropogenic and natural forcing” and “not consistent with alternative, physically-plausible explanations of recent climate change that exclude important elements of the given combination of forcings” (IPCC 2009)

Let me clarify the distinction between detection and attribution, as used by the IPCC. Detection refers to change above and beyond natural internal variability. Once a change is detected, attribution attempts to identify external drivers of the change.

“Information about the expected responses to external forcing, so-called ‘fingerprints’, is usually derived from simulations by climate models. The consistency between an observed change and the estimated response to a forcing can be determined by estimating the amplitude of a ‘fingerprint’ from observations and then assessing whether this estimate is statistically consistent with the expected amplitude of the pattern from a model.” (IPCC 2009) Details of the optimal fingerprint method employed by the AR4 are given in Appendix 9A. This method is a generalized multivariate regression that uses a maximum likelihood method to estimate the amplitude of externally forced signals in observations. Bayesian approaches are increasingly being used in this method.
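For readers who want the flavor of the regression at the heart of fingerprinting, here is a minimal sketch. It is not the AR4’s actual optimal fingerprinting (which weights the data by a noise covariance estimated from control runs); it simply estimates a single scaling factor for a synthetic fingerprint, with every number invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "fingerprint": the model-predicted forced response over
# 100 years (here just a smooth warming curve, purely illustrative).
t = np.arange(100)
fingerprint = 0.008 * t  # degrees per year of forced signal

# Synthetic "observations": the forced signal with true amplitude 1,
# plus internal variability represented as AR(1) red noise.
noise = np.zeros(100)
for i in range(1, 100):
    noise[i] = 0.6 * noise[i - 1] + 0.1 * rng.standard_normal()
y = 1.0 * fingerprint + noise

# Ordinary-least-squares estimate of the scaling factor beta (optimal
# fingerprinting would instead weight by the inverse noise covariance).
X = fingerprint
beta = (X @ y) / (X @ X)

# Residual-based standard error of beta (white-noise approximation;
# serial correlation in the residuals would widen this interval).
resid = y - beta * X
se = np.sqrt((resid @ resid) / (len(y) - 1) / (X @ X))

print(f"beta = {beta:.2f} +/- {2 * se:.2f} (2-sigma)")
```

In this framework, “detection” corresponds to the confidence interval on beta excluding zero, and “consistency” with the model response corresponds to the interval including one.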

I haven’t delved into the statistics of the fingerprinting method, but “eyeball analysis” of the climate model results for surface temperature (see Figure SPM.4 and 9.5) is sufficient to get the idea. Note, other variables are also examined (e.g. atmospheric temperatures) but for simplicity the discussion here is focused on surface temperature.

(IPCC AR4) “Climate simulations are consistent in showing that the global mean warming observed since 1970 can only be reproduced when models are forced with combinations of external forcings that include anthropogenic forcings (Figure 9.5). This conclusion holds despite a variety of different anthropogenic forcings and processes being included in these models. In all cases, the response to forcing from well-mixed greenhouse gases dominates the anthropogenic warming in the model. No climate model using natural forcings alone has reproduced the observed global warming trend in the second half of the 20th century. Therefore, modelling studies suggest that late 20th-century warming is much more likely to be anthropogenic than natural in origin, a finding which is confirmed by studies relying on formal detection and attribution methods (Section 9.4.1.4).”

“Modelling studies are also in moderately good agreement with observations during the first half of the 20th century when both anthropogenic and natural forcings are considered, although assessments of which forcings are important differ, with some studies finding that solar forcing is more important (Meehl et al., 2004) while other studies find that volcanic forcing (Broccoli et al., 2003) or internal variability (Delworth and Knutson, 2000) could be more important. . . The mid-century cooling that the model simulates in some regions is also observed, and is caused in the model by regional negative surface forcing from organic and black carbon associated with biomass burning. Variations in the Atlantic Multi-decadal Oscillation (see Section 3.6.6 for a more detailed discussion) could account for some of the evolution of global and hemispheric mean temperatures during the instrumental period; Knight et al. (2005) estimate that variations in the Atlantic Multi-decadal Oscillation could account for up to 0.2°C peak-to-trough variability in NH mean decadal temperatures.” (IPCC AR4)

In summary, the models all agree on the attribution of warming in the latter half of the 20th century, but do not agree on the causal factors for the early century warming and the mid-century cooling.

Detection and the issue of natural internal variability

My assessment of the IPCC’s argument for detection and attribution starts with the issue of detection, which relates to the background of natural internal variability against which forced variability is evaluated. Detection (ruling out that observed changes are only an instance of internal variability) is thus the first step in the process of attribution. The issue of detection receives much more attention in the TAR; it appears that the AR4 bases its arguments on the detection analysis done in the TAR.

The modes of natural internal variability of greatest relevance are the Atlantic modes (AMO, NAO) and the Pacific modes (PDO, often referred to as IPO) of multidecadal climate variability, with nominal time scales of 60-70+ years. A number of studies (journal publications and blogospheric analyses) have attributed 20th century regional and/or global surface temperature variability to the PDO and AMO; no attempt is made here to document these studies (this will be the topic of a future post), but see Roy Spencer and appinsys for the general idea.

There are three possible methods for assessing the background of natural internal variability: examination of the historical data record, examination of the paleoclimatic proxy data record, and long-term climate model simulations.

There are several problems with using the historical surface temperature observations. The time period (~150 years) is short relative to the ~70 year oscillations of interest. Further, the method of constructing the global sea surface temperature data sets uses EOFs to infer missing data and smooth the available observations, making assumptions about the statistical properties of the observations based on the relatively data-rich period 1960-1990. This procedure almost certainly damps the longer internal multidecadal oscillations, particularly in the data-sparse Pacific Ocean (this will be discussed in detail in a future post). Another problem is that in order to infer natural internal variability from the historical data set, the forced variability must be removed. This is accomplished using a climate model; however, the accuracy of this method is limited by incomplete knowledge of the forcings and by the accuracy of the climate model used to estimate the response.

The problems with the paleoclimate data are well known and will not be summarized here; however, the issue of interest in this context is not the “blade” of the hockey stick, but rather the modes of variability and their magnitude seen in the stick handle. Getting these multidecadal variations correct in the reconstructions would be very valuable in understanding the modes of natural internal climate variability, and to what extent such variations might explain 20th century climate variability. Interpretation of the natural modes of variability from the paleoclimate record suffers from the same challenge as for the historical data set; the forced variability (e.g. solar, volcanic) must be removed.

Owing to the problems in using both historical and paleo data to document the magnitude of natural internal climate variability, climate models seem to be the best option. Several modeling groups have conducted 1000 year unforced control simulations, which are described in the IPCC TAR. Figure 12.2 shows the power spectra of global mean temperatures in terms of period. For modes with periods exceeding 60 years (the range of relevance for the PDO and AMO), the models all have less power than the spectra for the historical observations. And recall that the historical spectra at these periods are likely to be damped for two reasons: the EOFs damp variability at longer time scales, and assumptions about forced variability are made in the model calculations used to separate forced from internal variability.
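To make the spectral comparison concrete, here is a sketch of how one computes a power spectrum “in terms of period” as in Figure 12.2, using synthetic series: red noise standing in for a control run, and the same red noise plus a weak ~70-year oscillation standing in for observation-like multidecadal variability. All parameters are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_years = 1000

# "Control run" analogue: AR(1) red noise only, no multidecadal mode.
control = np.zeros(n_years)
for i in range(1, n_years):
    control[i] = 0.7 * control[i - 1] + 0.1 * rng.standard_normal()

# "Observation-like" analogue: the same red noise plus a weak ~70-year
# oscillation standing in for AMO/PDO-type internal variability.
t = np.arange(n_years)
obs_like = control + 0.1 * np.sin(2 * np.pi * t / 70)

def power_by_period(x):
    """Periodogram of x, returned as (periods in years, power)."""
    x = x - x.mean()
    freqs = np.fft.rfftfreq(len(x), d=1.0)[1:]   # cycles/year, drop f=0
    power = np.abs(np.fft.rfft(x))[1:] ** 2 / len(x)
    return 1.0 / freqs, power

periods, p_control = power_by_period(control)
_, p_obs = power_by_period(obs_like)

# Compare total power at periods longer than 60 years.
low = periods > 60
print("power at >60 yr periods, control-like:", p_control[low].sum())
print("power at >60 yr periods, obs-like:   ", p_obs[low].sum())
```

The point of the comparison: a series containing a genuine multidecadal mode shows a clear excess of power at the >60 year periods relative to pure red noise, which is the mismatch the TAR’s Figure 12.2 displays between models and observations.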

The summary from the TAR is: “These findings emphasise that there is still considerable uncertainty in the magnitude of internal climate variability.” The AR4 did little to build upon the TAR analysis of natural internal variability. Relevant text from the AR4 Chapter 8: “Atmosphere-Ocean General Circulation Models do not seem to have difficulty in simulating IPO-like variability . . . [T]here has been little work evaluating the amplitude of Pacific decadal variability in AOGCMs. Manabe and Stouffer (1996) showed that the variability has roughly the right magnitude in their AOGCM, but a more detailed investigation using recent AOGCMs with a specific focus on IPO-like variability would be useful.” “Atmosphere-Ocean General Circulation Models simulate Atlantic multi-decadal variability, and the simulated space-time structure is consistent with that observed (Delworth and Mann, 2000).” Note, these models capture the general mode of variability; they do not simulate the timing of the observed 20th century oscillations, which reflects ontic uncertainty.

In spite of the uncertainties associated with documenting natural internal variability on time scales of 60-70+ years, natural internal variability plays virtually no role in the IPCC’s explanation of 20th century climate variability (which relies solely on natural and anthropogenic forcing).

Increasing attention is being paid to IPCC misrepresentations of natural oceanic variability on decadal scales (Compo and Sardeshmukh 2009): “Several recent studies suggest that the observed SST variability may be misrepresented in the coupled models used in preparing the IPCC’s Fourth Assessment Report, with substantial errors on interannual and decadal scales (e.g., Shukla et al. 2006, DelSole, 2006; Newman 2007; Newman et al. 2008). There is a hint of an underestimation of simulated decadal SST variability even in the published IPCC Report (Hegerl et al. 2007, FAQ9.2 Figure 1). Given these and other misrepresentations of natural oceanic variability on decadal scales (e.g., Zhang and McPhaden 2006), a role for natural causes of at least some of the recent oceanic warming should not be ruled out.”

Rethinking detection

The relative lack of attention to natural internal variability given by the AR4 leads to the inference that the IPCC regards natural internal variability as noise that averages out in an ensemble of simulations and on the timescales of interest. However, the primary modes of interest are those having timescales 60-70+ years, which is comparable to the time scale of the main features of the 20th century global temperature time series.

The temperature “bump” in the 1930’s and 1940’s, which has its greatest expression in the Arctic (Polyakov et al. 2003 Fig 2), is ambiguously explained by the IPCC as a combination of solar forcing, volcanic forcing, and anthropogenic aerosols. The AMO and PDO are at least as plausible an explanation for this feature as the IPCC’s.

The climate community and the IPCC need to work much harder at clarifying natural internal variability on timescales of 60-100 years. The historical sea surface temperature data needs expanding and cleaning up. A focus of paleo reconstructions for the past 2000 years should be detecting multidecadal variability, rather than trying to demonstrate that the recent decade is the warmest decade, etc.

The experimental design for elucidating internal variability from climate models needs rethinking. A single 1000 year simulation is inadequate for ~70 year oscillations. An ensemble of simulations is needed, for at least 2000 years. Owing to computational resource limitations, it seems that relatively low resolution models could be used for this (without flux corrections).
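A quick Monte Carlo sketch of the sampling problem: even when the underlying process is simple red noise with known statistics, the estimate of multidecadal (>60 year) power from a single 1000-year realization varies substantially from one realization to the next, which is why an ensemble of long runs is needed. All parameters here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

def ar1_run(n, phi=0.7, sigma=0.1):
    """One unforced 'control run' analogue: AR(1) red noise."""
    x = np.zeros(n)
    for i in range(1, n):
        x[i] = phi * x[i - 1] + sigma * rng.standard_normal()
    return x

def multidecadal_power(x, cutoff=60):
    """Total periodogram power at periods longer than `cutoff` years."""
    x = x - x.mean()
    freqs = np.fft.rfftfreq(len(x), d=1.0)[1:]
    power = np.abs(np.fft.rfft(x))[1:] ** 2 / len(x)
    return power[1.0 / freqs > cutoff].sum()

# How much does the >60-yr power estimate vary from one 1000-year
# realization to the next, purely from sampling?
estimates = np.array([multidecadal_power(ar1_run(1000)) for _ in range(200)])
spread = estimates.std() / estimates.mean()
print(f"relative spread of >60-yr power across single runs: {spread:.0%}")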

Spectral studies such as Figure 12.2 in the IPCC TAR need to be expanded, but using climate models to eliminate the forced behavior in the observed time series introduces circular reasoning into the detection/attribution argument (more on this in Part II).

And finally, attribution studies can’t simply rely on model simulations, since model simulations (even if they capture the correct spectrum of variability) won’t match the observed realization of the multidecadal modes in terms of timing.

Part II: forthcoming

The uncertainty monster associated with IPCC’s detection and attribution argument is of the hydra variety: the more I dig, the more heads the monster develops. In Part II, we will examine issues surrounding the forcing data and model inadequacy as they relate specifically to the attribution problem (and how this then feeds back onto the detection problem).

102 responses to “Overconfidence in IPCC’s detection and attribution: Part I”

The sad fact is that the community of climatologists had absolutely no idea about the nature of Earth’s heat source because of misunderstandings of experimental data from analysis of two key extraterrestrial samples that suddenly became available in 1969 after:

a.) Lunar samples were returned by the Apollo Missions to the Moon, and

b.) The Allende meteorite fell near the village of Pueblito de Allende.

This brief summary of experimental data and observations explains the Iron Sun theory – a concept that was apparently unfamiliar to Al Gore, the UN’s IPCC and most climatologists.

“…..the amplitude and even the sign of cloud feedbacks was noted in the TAR as highly uncertain…”

…..as an albedo decrease of only 1%, bringing the Earth’s albedo from 30% to 29%, would cause an increase in the black-body radiative equilibrium temperature of about 1°C, which is the same amount a doubling of CO2 will give without the unproven feedbacks.

“… the amplitude and even the sign of cloud feedbacks was noted in the TAR as highly uncertain, and this uncertainty was cited as one of the key factors explaining the spread in model simulations of future climate for a given emission scenario. This cannot be regarded as a surprise: that the sensitivity of the Earth’s climate to changing atmospheric greenhouse gas concentrations must depend strongly on cloud feedbacks can be illustrated on the simplest theoretical grounds, using data that have been available for a long time. Satellite measurements have indeed provided meaningful estimates of Earth’s radiation budget since the early 1970s (Vonder Haar and Suomi, 1971). Clouds, which cover about 60% of the Earth’s surface, are responsible for up to two-thirds of the planetary albedo, which is about 30%. An albedo decrease of only 1%, bringing the Earth’s albedo from 30% to 29%, would cause an increase in the black-body radiative equilibrium temperature of about 1°C, a highly significant value, roughly equivalent to the direct radiative effect of a doubling of the atmospheric CO2 concentration. Simultaneously, clouds make an important contribution to the planetary greenhouse effect. …”
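The quoted ~1°C figure is easy to verify from the zero-dimensional radiative balance, sigma*T^4 = S*(1 - albedo)/4. A quick check (standard constants; the result is the black-body equilibrium temperature, not the surface temperature with greenhouse effect):

```python
# Zero-dimensional radiative equilibrium: sigma*T^4 = S*(1 - albedo)/4.
SIGMA = 5.67e-8   # Stefan-Boltzmann constant, W m^-2 K^-4
S = 1361.0        # total solar irradiance, W m^-2

def equilibrium_temp(albedo):
    """Black-body equilibrium temperature for a given planetary albedo."""
    return (S * (1.0 - albedo) / (4.0 * SIGMA)) ** 0.25

t30 = equilibrium_temp(0.30)
t29 = equilibrium_temp(0.29)
print(f"T(albedo=0.30) = {t30:.1f} K")
print(f"T(albedo=0.29) = {t29:.1f} K")
print(f"warming from a 1% albedo decrease: {t29 - t30:.2f} K")
```

The 0.30-to-0.29 albedo change yields roughly 0.9 K of black-body warming, consistent with the quote’s “about 1°C.”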

“..Between 1968 and 1972…the surface temperature of oceans in the northern hemisphere plummeted by 0.3C…… Researchers had thought that the drop was due to the build-up of sulphur aerosols in the atmosphere from fossil fuel burning. These cool the planet by reflecting sunlight…. “The work in the paper questions this somewhat,” said Jones. “We didn’t know that ocean temperatures in the northern North Atlantic cooled so rapidly before this paper.” Previously, it was thought that sea surfaces in the northern hemisphere cooled gradually and steadily after the second world war before heating up abruptly in the 1970s…”

It’s all very well attributing changes to the PDO and AMO, but these are phenomena themselves, not fundamental causes. We need to delve much deeper to understand what, in turn, drives the ocean cycles.

Alex, the PDO, AMO, etc. are natural internal modes, unforced, arising from the nonlinear dynamics of the coupled atmosphere/ocean system (manifestations of the spatio-temporal chaos of the system). So our current understanding is that there is no actual forcing. But I agree all this needs further investigation.

See Scafetta’s appendix for coupled oscillator theory as one explanation for how small cosmic forcings can cause major impacts on climate. Gravitational forcing, TSI, and solar magnetosphere modulation of cosmic rays are examples of weak cosmic forcings that could underlie the PDO etc.

the PDO, AMO, etc. are natural internal modes, unforced, arising from the nonlinear dynamics of the coupled atmosphere/ocean system (manifestations of the spatio-temporal chaos of the system). So our current understanding is that there is no actual forcing.

But the system is dissipative, so if you don’t have some sort of forcing, won’t the oscillations damp out fairly quickly? Or are you just talking about anthropogenic sorts of forcings? I may just be getting tripped up on terminology that climate folks use a little differently.

I’m reminded of Richardson’s little ditty:

Big whorls have little whorls
That feed on their velocity,
And little whorls have lesser whorls
And so on to viscosity.

The system is both driven and dissipative – the differences in heating with latitude drive convection continually in both atmosphere and oceans. Large scale convection drives turbulence with a wide frequency spectrum at extra-tropical latitudes, and this can either feed energy randomly into resonant systems, or the chaotic turbulence can flip from one domain of behaviour to another at random. No change in forcing is required to get major and persistent changes in behaviour.

(Just for fun, try the chaotic sequence x(n+1) = x(n) + 0.01Sin(2Pi x(n)) + 0.3584Sin(4Pi x(n)) for x(0) between 0 and 1 as an example of what I mean by ‘flipping’. I’m not claiming it represents the climate system.)
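The map in the parenthetical above can be made runnable; a minimal sketch (as the commenter says, it is not claimed to represent the climate system):

```python
import numpy as np

def step(x):
    """One iteration of the map suggested in the comment above."""
    return x + 0.01 * np.sin(2 * np.pi * x) + 0.3584 * np.sin(4 * np.pi * x)

x = 0.3
traj = np.empty(5000)
for i in range(5000):
    x = step(x)
    traj[i] = x

# The iterate wanders irregularly within the unit interval; transitions
# between the lower and upper halves are possible but infrequent, which
# is the "flipping between domains of behaviour" being described.
upper = traj > 0.5
flips = int(np.count_nonzero(upper[1:] != upper[:-1]))
print("fraction of iterates above 0.5:", upper.mean())
print("flips between halves:", flips)
```

No stochastic input is involved: any flips in the printed count arise entirely from the deterministic dynamics, which is the point about behaviour changes without changes in forcing.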

Incidentally, I understand AR(2) stochastic processes can also give pseudo-periodic behaviour when the roots of the characteristic equation are complex. There are lots of ways that changes in output can arise without changes in forcing.
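A sketch of the AR(2) point: when a1^2 + 4*a2 < 0 the roots of the characteristic equation z^2 - a1*z - a2 = 0 are complex, and the process oscillates with a quasi-period of 2*pi/theta where cos(theta) = a1/(2*sqrt(-a2)). The coefficients below are purely illustrative, not fitted to any climate series.

```python
import numpy as np

rng = np.random.default_rng(3)

# AR(2) process: x_t = a1*x_{t-1} + a2*x_{t-2} + white noise.
a1, a2 = 1.9, -0.95
assert a1**2 + 4 * a2 < 0  # complex characteristic roots -> oscillatory

# Quasi-period implied by the complex roots.
theta = np.arccos(a1 / (2.0 * np.sqrt(-a2)))
print(f"quasi-period: {2 * np.pi / theta:.0f} time steps")

# Simulate: the output looks pseudo-periodic even though the only
# input is unstructured white noise -- no periodic forcing anywhere.
x = np.zeros(1000)
for t in range(2, 1000):
    x[t] = a1 * x[t - 1] + a2 * x[t - 2] + rng.standard_normal()
```

With these coefficients the quasi-period is roughly 28 steps, so a purely stochastic process with no periodic forcing produces multidecadal-looking swings if a step is read as a year.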

One could conclude that there are significant arguments for another null hypothesis. The IPCC’s claims for areas said to be well understood rarely stand the test of time, e.g. for solar forcing (Haigh et al. 2010) or volcanic forcing (Stenchikov 2007, 2009).

New analyses of both satellite and radiosonde data give increased confidence in changes in stratospheric temperatures between 1980 and 2009. The global-mean lower stratosphere cooled by 1–2 K and the upper stratosphere cooled by 4–6 K between 1980 and 1995. There have been no significant long-term trends in global-mean lower stratospheric temperatures since about 1995. The global-mean lower-stratospheric cooling did not occur linearly but was manifested as downward steps in temperature in the early 1980s and the early 1990s.

Michael Ghil (2001) makes some convincing arguments.

To conclude, we briefly address Problem 9. More precisely, we ask whether the impact of human activities on the climate is observable and identifiable in the instrumental records of the last century-and-a-half and in recent paleoclimate records? The answer to this question depends on the null hypothesis against which such an impact is tested. The current approach that is generally pursued assumes essentially that past climate variability is indistinguishable from a stochastic red-noise process (Hasselmann, 1976), whose only regularities are those of periodic external forcing (Mitchell, 1976). Given such a null hypothesis, the official consensus of IPCC (1995) tilts towards a global warming effect of recent trace-gas emissions, which exceeds the cooling effect of anthropogenic aerosol emissions. Atmospheric and coupled GCM simulations of the trace-gas warming and aerosol cooling buttress this IPCC consensus. The GCM simulations used so far do not, however, exhibit the observed interdecadal regularities described at the end of Sect. 3.3. They might, therewith, miss some important physical mechanisms of climate variability and are, therefore, not entirely conclusive. As northern hemisphere temperatures were falling in the 1960s and early 1970s, the aerosol effect was the one that caused the greatest concern. As shown in Sect. 2.2, this concern was bolstered by the possibility of a huge, highly nonlinear temperature drop if the climate system reached the upper-left bifurcation point of Fig. 1.

The global temperature increase through the 1990s is certainly rather unusual in terms of the instrumental record of the last 150 years or so. It does not correspond, however, to a rapidly accelerating increase in greenhouse-gas emissions or a substantial drop in aerosol emissions. How statistically significant is, therefore, this temperature rise, if the null hypothesis is not a random coincidence of small, stochastic excursions of global temperatures with all, or nearly all, the same sign? The presence of internally arising regularities in the climate system with periods of years and decades suggests the need for a different null hypothesis.
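Ghil’s point that statistical significance depends on the chosen null hypothesis can be illustrated with a small Monte Carlo sketch (all parameters invented): the same “observed” trend is tested against a white-noise null and against a red-noise null with matched variance, and the p-values differ.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100  # years

def trend(y):
    """Least-squares linear trend (units per time step)."""
    return np.polyfit(np.arange(len(y)), y, 1)[0]

# An "observed" series: pure red noise, with no external forcing at all.
phi, sigma = 0.8, 0.1
obs = np.zeros(n)
for i in range(1, n):
    obs[i] = phi * obs[i - 1] + sigma * rng.standard_normal()
obs_trend = abs(trend(obs))

def p_value(phi_null, n_surrogates=1000):
    """Fraction of null-hypothesis surrogates whose trend is at least as
    large as the observed one.  Surrogate variance is matched to the
    observations, so the two nulls differ only in their persistence."""
    innov = obs.std() * np.sqrt(1.0 - phi_null**2)
    count = 0
    for _ in range(n_surrogates):
        x = np.zeros(n)
        for i in range(1, n):
            x[i] = phi_null * x[i - 1] + innov * rng.standard_normal()
        if abs(trend(x)) >= obs_trend:
            count += 1
    return count / n_surrogates

p_white = p_value(0.0)  # white-noise null
p_red = p_value(0.8)    # red-noise null with persistence
print("p-value under white-noise null:", p_white)
print("p-value under red-noise null:  ", p_red)
```

A trend that looks highly significant against a white-noise null can be unremarkable against a persistent red-noise null, which is exactly why the choice of null hypothesis matters for detection.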

Ghil is well known in nonlinear dynamics, even if I would not rank him among “experts” like Li, Yorke or Ruelle. I think that the paper linked is not very enlightening as such, and especially Ghil’s belief that there is a necessary consistency within the hierarchy of models (e.g. from 0D to full 3+1 D) is more than subject to caution. As it uses specific tools of nonlinear dynamics (normal forms and bifurcations), it is unreadable by anybody who hasn’t studied nonlinear dynamics and temporal chaos. However I would like to stress several points in his conclusion that I think are most worthy of notice.

“we ask whether the impact of human activities on the climate is observable and identifiable in the instrumental records of the last century-and-a-half and in recent paleoclimate records? The answer to this question depends on the null hypothesis against which such an impact is tested. The current approach that is generally pursued assumes essentially that past climate variability is indistinguishable from a stochastic red-noise process … Given such a null hypothesis, the official consensus of IPCC (1995) tilts towards a global warming effect of recent trace-gas emissions, which exceeds the cooling effect of anthropogenic aerosol emissions.”

Of course the discussion about the invariant statistical properties (if any!) of weather/climate, which Ghil frames as the necessary use of “the ergodic theory of non linear systems”, is the single most important part of testing this hypothesis, as I have been saying on this blog since the very first post.
It is clear that whether one may or may not assume that “weather cancels” changes completely the picture and the conclusions.

“The GCM simulations used so far do not, however, exhibit the observed interdecadal regularities described at the end of Sect. 3.3. They might, therewith, miss some important physical mechanisms of climate variability and are, therefore, not entirely conclusive.”

Said by somebody who believes in the internal consistency of the model hierarchy, this is a rather strong statement despite the “mights” and “not entirelys”.

“The presence of internally arising regularities in the climate system with periods of years and decades suggests the need for a different null hypothesis. Essentially, one needs to show that the behaviour of the climatic signal is distinct from that generated by natural climate variability in the past, when human effects were negligible, at least on the global scale. As discussed in Sects. 2.1 and 3.3, this natural variability includes interannual and interdecadal cycles, as well as the broadband component. These cycles are far from being purely periodic. Still, they include much more persistent excursions of one sign, whether positive or negative in global or hemispheric temperatures, say, than does red noise.”

I am afraid my idea of the use of models, and yours, Judith, are completely different. Validated models are mainly confined to engineering. Non-validated models are extremely useful in physics, and hence climate science, but for just about one purpose only: the design of the next experiment. Since we are not talking about the design of experiments here, non-validated models should have no place in the discussion.

If we examine variations in models, we may learn something about how the models work, but we will get no new information as to what is happening in the real world.

Jim,
I believe you have summarized a very big distinction between how engineering and climate science research are practiced. Climate scientists like to use the terminology of engineering software validation. Using it leads folks outside their field to misjudge and overestimate the quality and skill of the underlying modeling. That is a problem.

Gary, I hope not. I hope I have described how engineers and people who follow the classical “scientific methodology”, as developed by Galileo and Newton, use models, as compared with climate scientists who believe that what the IPCC has done is, somehow, physics.

I am not sure I have fully understood the methodology here, but we have a chaotic system. The IPCC then removes chaotic factors, as they cancel out over time, leaving just a few external variables to rule the whole system?

It seems like a total disregard for the concept of chaos. Only with extremely accurate measurements over a very long time can patterns of internal forcings (to a certain degree) be identified and “canceled out”. And this is far from being the case today.

Q: How do you find a needle in a haystack?

IPCC: It’s easy. Just make sure the haystack consists of maximum 3 straws.

IPCC: We observed the falling of the last few straws of a haystack being formed and therefore assume that all haystacks are formed in a similar manner. We will also assume that a needle was put into the haystack.

Despite knowing that straw moisture affects hay’s drop speed, density, and the amount that actually ends up on the pile vs. blown away by the wind, we assume it does not matter much so we don’t bother modeling straw moisture or wind speed.

Based on our best haystack model we can conclude that the needle is very likely to be 1 foot above the ground and 4 feet in from the Northwest corner (plus or minus 2 feet).

Since we know very likely where the needle is, we don’t have to look for it or verify that it is actually there. If you are skeptical and you don’t believe our best model, we also ran a few other models with an ensemble of needle deposit scenarios, and between them we have a 90% certainty that there is a needle located in the haystack.

If you do actually bother to look for the needle (not performed; more funding is needed to study the location of the needle), as long as you find a needle in the haystack then your results are “consistent with” the haystack models.

A very good read; it’s nice to see someone actually taking the time to try to understand not only the uncertainties, but the detection of signal over noise and its attribution to cause, something that’s been a permanent bugbear of mine since reading the report.

Perhaps someone could clarify something for me, the IPCC report is trying to establish a signal above the baseline ‘noise’ and then attribute this to anthropogenic causes (glossing over the perils of looking for a specific outcome prior to starting your work). I see some major issues with this which will take some explaining as they’re all (predictably) heavily interlinked….

Given the incomplete understanding of the climate and all the forcings (both internal and external, but especially clouds), how is it that the IPCC are able to attribute the recent very small temperature rise to an anthropogenic factor over an undiscovered/under-understood natural factor? It seems every few weeks a new study is released that further refines our understanding of the climate (a recent example is the work showing increased climatic impact from solar minima), and although these individual factors themselves may not have enough ‘whack’ to explain the recent temp rise, a combination thereof may.

I guess my initial point here, is that before we even get onto the uncertainties over the fingerprint they are trying to locate, they should first find out whether this fingerprint could actually be something natural and as yet under-understood masquerading as an anthropogenic signal. I know they pay this lip service but until you fully understand the system, surely complete attribution is impossible? If you follow, have yourself a celebratory biscuit.

It is my very real worry that the attribution of this fingerprint is being given to something that is in fact barely related. For example, I am as yet unconvinced regarding the causal relationship between co2 and temp. Blink tests on the paleoclimatic data suggest a reverse relationship to the one presented by the IPCC, and it calls to mind a rather simple analogy:
Water in a pot on a stove: The stove heats (temp) and the water temp rises accordingly (co2 levels). The stove goes off and cools (temp falls). The water in the pot cools, but lagging behind (co2).
To my (very basic) mind, the IPCC are trying to suggest that a Hot water pot proves that it heats the stove, not the other way around.

Which brings me to my second point/question of this long (sorry!) post.
To me, given my scientific background, the more I study the subject the more I get the impression that the IPCC et al decided on the ‘fact of the matter’ long ago (CO2 = bad) and have since been trying to shore that theory up, not test it, using various tortured models to provide the proof that they have until now been unable to generate using normal methods.

I do not accept that models can be used to simulate/produce data. Models are exceptionally useful in tightly controlled, very well understood systems, but the climate is neither. I also do not accept that models can be used to prove a theory; you can use them to refine one via corroborative experimentation of course, but not as an ‘end’ point.

I guess what I’m trying to say is that I am very confused by the approach of the IPCC and the core climate scientists and just cannot reconcile it with what I recognise as the proper scientific process. Which is probably what I should have just written in the first place…

The John Ray Initiative (JRI) is an educational charity with a vision to bring together scientific and Christian understandings of the environment in a way that can be widely communicated and lead to effective action. It was formed in 1997 in recognition of the urgent need to respond to the global environmental crisis and the challenges of sustainable development and environmental stewardship.

Plus a repeated criticism (accusation) of Professor Lindzen; it would be interesting to see Professor Lindzen’s response to Sir John Houghton.

“A widely quoted statement of Lindzen’s about IPCC Reports is that the chapters are fine but that the Policymakers’ Summaries contain different
messages and do not faithfully represent the content or conclusions of the chapters. Despite his constant repetition of this comment, he has never to my knowledge presented a single concrete example to support it. In any case discrepancies between the chapters and the Summary would never survive the scrutiny of the final intergovernmental IPCC plenary”

Another Extract:

The ‘balancing-out process’

“… A key example of this balancing process concerns the best value of what is known as the climate sensitivity, that is the increase in global average temperature associated with a doubling of atmospheric carbon dioxide that, unless severe mitigating action is taken, is likely to occur during the second half of the 21st century. The likely value of climate sensitivity has large relevance for consideration of the likely magnitude of impacts from climate change in the future.

Relevant information regarding its value comes from observations of past climates (including the ice age period) and from model simulations. The IPCC 1990 report estimated its value as 2.5 ºC with an uncertainty range of 1.5 to 4.5 ºC. The largest uncertainty arises from the lack of knowledge of clouds, in particular the average magnitude of cloud-radiation feedback. Subsequent reports all gave detailed consideration to the value of climate sensitivity. The 1995 and 2001 Reports maintained the same best value and range as in 1990.

The 2007 Report increased the best estimate to 3 ºC and reduced the range to 2.0 to 4.5 ºC, considering that evidence now makes it unlikely the value will be less than 2 ºC.”

Sir John Houghton: Oct 2010
“Uncertainty in IPCC Reports
Throughout IPCC Reports, statements of levels of uncertainty are frequently made; in the latest report many of these are quantified. For instance, the increased risk of extremes may be described as likely (67% likelihood) or very likely (90% likelihood). Some who look at IPCC reports interpret these references to uncertainty as an indication that the science of climate change is as a whole very uncertain.

Such a superficial interpretation is quite false; the IPCC has always sought to distinguish clearly those conclusions that are relatively certain from those where there is large uncertainty. Few if any scientific conclusions concerning the climate can be completely certain but to ignore those that appear likely or very likely would be highly irresponsible.
[End of quoted extract]

Excellent summary of the problem. The claim that models “can” replicate multidecadal fluctuations (PDO etc.) does not mean that they actually do so in the attribution studies for the late 20th century. In contrast, I show that these cycles match the satellite record nicely and explain the warm bump of this period:
Loehle, C. 2009. Trend Analysis of Satellite Global Temperature Data. Energy & Environment 20: 1087-1098
a result that was scorned and reviled, if noticed at all. The claim by advocates seems to be that the only way to do an attribution study is with a climate model, which, as Judith notes, is a circular argument.

This is not a criticism of your paper; what I have to say is much more general in scope and applies to numerous analyses, both published in journals and on blogs.

It is the assumption of “redness” to characterise the noise spectrum. I have not seen this justified. I might say (and actually believe) that it is predominantly “pink”.

Also, the mathematical treatment relying on reduced or effective degrees of freedom based on a “red” assumption is not sound even if the noise is red. It is indicative, but I would not rely on results obtained by using the reduced-degrees-of-freedom approach to scale the slope/residual ratio (which is not uncommon), as the resulting statistic is not Normal/χ²; the numerator and denominator become entangled, i.e. they are no longer independent. I believe this is true, but I have never seen it noted as a limitation. There are also some rather nasty arguments based on priors that may not be adequately addressed.

The reduced-DoF issue is, I believe, the less important one; the assumption of red over pink for the characterisation of the noise is the greater issue.

I suspect that under a pink assumption one might find it difficult to show that it is “very likely” that less than half of the warming since 1970 is due to random fluctuations, on the assumption that all we have is a linear trend plus noise, without having to bother with other natural cyclic trends.

Pink noise is much more “edgy”, with sharper short-term gradients, but it also has true LTP (long-term persistence), which red noise doesn’t.

Like I said, this is a very general criticism; the red assumption is the way these things are argued, and it seems to go unchallenged. The way you have used it is probably as valid as the way it is used elsewhere, and it seems to be commonly used.

I can, I hope, argue these points in a lot more depth, but it is not in the scope of this topic; if there is a thread on another blog etc., I could flesh this out a bit more. My arguments come from considering the nature of the impulse response function, i.e. the characteristic filter of the system. In simple terms, the red assumption is compatible with a slab-ocean thermal model, and I don’t think that is an appropriate model; it seems incompatible with the data.
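The practical difference between the two noise assumptions can be sketched numerically. The generator below is a generic illustration, not taken from any published attribution analysis: it shapes white noise so that power falls off as 1/f² for “red” and 1/f for “pink” (one common convention; the commenter’s “red” may instead mean AR(1)), then measures the spread of spurious linear trends each produces in trend-free data.

```python
import numpy as np

rng = np.random.default_rng(0)

def colored_noise(n, exponent, rng):
    """Filter white noise so that power ~ 1/f**exponent (1 = pink, 2 = red)."""
    spec = np.fft.rfft(rng.standard_normal(n))
    f = np.fft.rfftfreq(n)
    f[0] = f[1]                      # avoid dividing by zero at DC
    spec = spec / f ** (exponent / 2.0)
    series = np.fft.irfft(spec, n)
    return series / series.std()     # normalise to unit variance

def ols_slope(y):
    """Least-squares slope of y against its time index."""
    x = np.arange(len(y))
    return np.polyfit(x, y, 1)[0]

n, trials = 480, 1000                # e.g. 40 years of monthly anomalies
red = [ols_slope(colored_noise(n, 2.0, rng)) for _ in range(trials)]
pink = [ols_slope(colored_noise(n, 1.0, rng)) for _ in range(trials)]
print("spread of spurious trends, red :", np.std(red))
print("spread of spurious trends, pink:", np.std(pink))
```

The point of the exercise: the wider the spread of trends that pure noise can generate, the harder it is to call an observed trend “very likely” non-random, so the choice of noise colour directly moves the significance level.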

Well, the question was meaningful, because there is nothing stronger for a stochastic process than knowing the distribution precisely. Generally we only dream about that :)
It describes everything one needs for any imaginable analysis and there is plenty of things to tell about a random variable whose distribution one knows.

You could, for example, generate any number of time series with this distribution and run all the filters you want on them.
Obviously it is not something that one can answer just like that in five minutes, but the question is meaningful.

I was interested, just out of curiosity, in what somebody who thinks of noise in terms of red/pink concepts (spectral analysis) would say if given a “noise” like the one I gave.

OK. The question is now meaningful as you state what the distribution is, but some things are still not clear.

You say:

“It describes everything one needs for any imaginable analysis and there is plenty of things to tell about a random variable whose distribution one knows.”

Except, of course, for the properties of the filter, so I cannot see that your statement can be true. To analyse a series generated by filtering a series drawn from a distribution, one must have regard to the properties of the filter.

In general, any series of values randomly drawn from any distribution, where the drawing is therefore independent of past behaviour, is in a sense white; it has a property that coloured noise does not have, namely that if you scramble the order of the series you still have a series with the same essential qualities, notably its colour. But white noise from a Gaussian distribution is still different from white noise from a different distribution.

If you take a coloured series and scramble the order thoroughly enough, it will become white in the sense above, i.e. further scrambling will no longer affect its essential qualities, notably its colour.
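The scrambling point is easy to verify numerically. A minimal sketch, using an AR(1) series as the coloured example (my choice of example, not anything specified in the thread): shuffling leaves the marginal distribution untouched but collapses the autocorrelation, i.e. destroys the colour.

```python
import numpy as np

rng = np.random.default_rng(1)

# A strongly 'red' (AR(1)) series with high persistence
n, phi = 4096, 0.95
x = np.zeros(n)
eps = rng.standard_normal(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + eps[t]

def lag1(y):
    """Sample lag-1 autocorrelation."""
    y = y - y.mean()
    return float(np.dot(y[:-1], y[1:]) / np.dot(y, y))

shuffled = rng.permutation(x)        # same values, scrambled order
print("lag-1 autocorrelation, original:", lag1(x))         # near phi
print("lag-1 autocorrelation, shuffled:", lag1(shuffled))  # near zero
print("same marginal distribution:", np.allclose(np.sort(x), np.sort(shuffled)))
```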

To discuss coloured noise fully one needs to know both the generating function and the filter. For some aspects one only needs to know the qualities of the filter.

Determination of the proportion of the variance that a vector A will recover from a series is not, I think, a function of the generating distribution but only of the filter. For white noise the amount of variance recovered is independent of the choice of vector A, provided it is not null: for a series of length N, it would be expected to recover 1/N of the variance of the series, irrespective of the form of A.
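The 1/N claim for white noise can be checked by simulation. The vector A below is an arbitrary, hypothetical choice, and “variance recovered” is interpreted here as the squared projection onto A as a fraction of the series’ total sum of squares (one reasonable reading of the claim):

```python
import numpy as np

rng = np.random.default_rng(2)
N, trials = 100, 5000

A = rng.standard_normal(N)
A /= np.linalg.norm(A)                  # arbitrary non-null unit vector

fractions = np.empty(trials)
for i in range(trials):
    y = rng.standard_normal(N)          # white noise, length N
    fractions[i] = np.dot(A, y) ** 2 / np.dot(y, y)   # fraction of variance along A

print("mean fraction recovered:", fractions.mean(), "vs 1/N =", 1 / N)
```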

When calculating the expectation of the dot product of a vector A with a series generated by filtering white noise, one must know some features of the generating distribution prior to filtering: certainly its mean and variance, but not, I think, anything more.

To your original question: a noise series drawn from the distribution you posed cannot be white, as the mean will not be zero, all values being in the positive interval ]0,1[. Nor will it be red, as (apart from the non-zero mean) its amplitude will not decline with increasing frequency according to 1/f. So it is neither white nor red.

In general, any series of values randomly drawn from any distribution, where the drawing is therefore independent of past behaviour, is in a sense white; it has a property that coloured noise does not have, namely that if you scramble the order of the series you still have a series with the same essential qualities, notably its colour.

And I do not quite agree.
Drawing from a known distribution still doesn’t make a (sort of) white noise. The autocorrelation properties are encoded in the distribution, because if X2 often follows X1 in the original series then their probabilities cannot be very different.

So if you reconstruct many series from a known distribution, it is more probable that a reconstructed series will be piecewise near to the original series than very far from it.
E.g. clusterings, which are a property of the spectral power distribution (e.g. “colour”), will also be reproduced by drawing from a known distribution.
But as spectral analysis is not something that I have practised often, while you seem to have used it much, I was curious whether there are methods (papers) describing how one could extract information about the spectral properties of a variable from knowledge of its distribution.

The colour is associated with a time series (a joint distribution taken over all times). You can take any time series, apply a Fourier transform, and take the squared magnitude to get the power spectrum, which shows how the variance is spread across frequencies. A peak at low frequencies constitutes ‘redness’.

Time series from a wide class of stochastic processes may be considered to be the output of a linear filter applied to white noise, but the terminology also applies to processes that are not the result of any filter.
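That recipe can be written out directly. The sketch below applies it to an AR(1) series (a common stand-in for “red” noise; the choice of process and bands is mine, purely for illustration) and compares average periodogram power in a low-frequency band against a high-frequency band:

```python
import numpy as np

rng = np.random.default_rng(3)

# Red noise as filtered white noise: AR(1) with persistence phi
n, phi = 8192, 0.9
eps = rng.standard_normal(n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + eps[t]

# Periodogram: squared magnitude of the Fourier transform
spec = np.abs(np.fft.rfft(x - x.mean())) ** 2
f = np.fft.rfftfreq(n)[1:]       # drop the zero frequency
spec = spec[1:]

low = spec[f < 0.05].mean()      # power near zero frequency
high = spec[f > 0.45].mean()     # power near the Nyquist frequency
print("mean power, f < 0.05 :", low)
print("mean power, f > 0.45 :", high)   # 'redness' = low-frequency dominance
```

For AR(1) the theoretical spectrum is S(f) ∝ 1/(1 + φ² − 2φ·cos 2πf), so with φ = 0.9 the low band should dominate by roughly two orders of magnitude.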

The assumption that the PDO and other variability is random and will cancel out over a long time period, if true, would mean that ONLY for time spans exceeding 150 years (more or less) would they average out in a detection study, so that one could discount 60-yr cycles as causative. While this might lead one to ascribe the past 150 years’ 0.7 °C of warming to something other than internal cycles (though 150 years starts before GHGs were elevated, creating a time-causality paradox), it is not possible to use this argument to detect/ascribe cause to the 1980s/1990s warming, which is too short a period. Can’t have it both ways, guys, sorry.

We must remember that the IPCC has only shown us what they want us to see, and even so the models almost completely fail to represent the 1910 to 1945 temperature increase. There is next to nothing in the TAR on precipitation modelling, yet the cycle of evaporation > water vapour > clouds > precipitation is at the heart of quantifying the anthropogenic impact.

Dr Curry
Detection and attribution – surely the two most important aspects of climate change. And of course, without detection, the question of attribution does not arise.
I gather from your post that detection relies entirely on computer modelling. So here is a thought: if the world did not possess these multi-million-dollar playstations, would the average citizen of the world, going about his daily life, have had the slightest idea that climate was changing any more than it ever did?
For practical reasons, we have to focus on surface temperatures as proxies for weather/climate. What was so unusual about surface temperatures in the latter part of the 20th century? In a statement to the British House of Lords, Lord Hunt of King’s Heath said ‘Observations collated at the Met Office Hadley Centre and the University of East Anglia Climate Research Unit indicate that the rate of increase in global average surface temperature between 1975 and 1998 was similar to the rates of increase observed between 1860 and 1880 and between 1910 and 1940 (approximately 0.16 °C per decade).’ This has subsequently been confirmed by Phil Jones.
The level of scientific understanding is not at a level where computers can usefully be used to model climate, so why are we wasting this massive resource on a futile exercise? It’s a travesty!
Gary Mirada
PS Did I hear some giggling at the back of the class when I mentioned the Met Office and CRU?

Various posters make reference to the 2AR. According to Christopher Monckton (Viscount Monckton of Brenchley to you):

‘The scientists’ final draft of the 1995 IPCC report contained five clear statements to the effect that humankind’s influence on global temperature was not yet discernible. They are as follows –
“None of the studies cited above has shown clear evidence that we can attribute the observed [climate] change to the specific cause of increases in greenhouse gases.”
“No study to date has positively identified all or part [of observed climate change] to anthropogenic causes.”
“While none of these studies has specifically considered the attribution issue, they often draw some attribution conclusions, for which there is little justification.”
“Any claims of positive detection of significant climate change are likely to remain controversial until uncertainties in the total natural variability of the climate system are reduced.”
“When will the anthropogenic effect on climate be identified? It is not surprising that the best answer to this question is, ‘We do not know.’”

However, the IPCC bureaucracy did not find the scientists’ repeatedly-stated conclusion acceptable. Without reference back to all of the scientists who had collaborated in producing that final draft, the bureaucracy invited an accommodating scientist to excise these five conclusions, to make numerous other alterations, and to replace the deleted conclusions with the following:

“The body of … evidence now points to a discernible influence on global climate.”

And that has been the official position of the UN’s climate panel ever since. On any view, the process by which the conclusions of the scientists who drafted the IPCC’s 1995 report were tampered with after the scientists had finalized it, and without reference back to all of the scientists, was not a scientific process.’

Bob Carter also reports on this in his excellent book ‘Climate: The Counter Consensus’

The relevant statements in the SAR main text, and also the TAR main text, are not unreasonable. Yes, there were shenanigans with the Summary for Policymakers. But as far as I can tell, from top to bottom, AR4 is unreasonable and shows much more confidence than the TAR, without any apparent reason.

There seems to be an undue emphasis on models. One would expect instead a full-on assault on gathering the data that would enable the calculation of climate sensitivity by more or less direct observation and measurement.

When a model doesn’t work – that is, predict accurately what will happen – it must be scrapped. But what we see is the ‘climate’ model-makers adding more and more ‘fiddle-factors’ to make the answers come out right. We must be thankful that aeronautical and other engineers don’t operate on such a paradigm!

In my view, the overconfidence that has been the hallmark of the IPCC has culminated in egregious flaws of logic which damage, if not destroy, the certainty of the IPCC’s hypothesis (viz., that anthropogenic CO2 is the dominant driver of climate change). At bottom the hypothesis is an argument from ignorance which I have seen repeated numerous times (see below*): Our models cannot identify any natural causes for recent warming, so it must be caused by CO2.

This conclusion is a logical fallacy. Our understanding of natural causes is only partial at best. Partial knowledge allows us to make only partial affirmations and partial negations. In fact the probability of certainty cannot even be estimated since the full impact of natural variability cannot be quantified by global climate models.

*As was recently stated in Nature, “Climate: The real holes in climate science” 463(7279):284 (2010): “Such holes do not undermine the fundamental conclusion that humans are warming the climate, which is based on the extreme rate of the twentieth-century temperature changes and the inability of climate models to simulate such warming without including the role of greenhouse-gas pollution.” Juxtapose this with a later statement in the article by one specialist: “We really don’t know natural variability that well, particularly in the tropics”.

It’s very important to note this logical fallacy; it is something that is far too often ignored, hidden or dismissed as irrelevant. Part of the presented ‘proof’ behind the cAGW theory is the fact that they don’t know what else it could be. This is not proof.

I humbly submit that they don’t even know what they don’t know, so confidence limits are completely irrelevant and totally misleading.

I may be willfully showing my ignorance here, but does the IPCC say who decides on these likelihoods, the evidence behind them, and why? I mean, aside from the sentence or two in the report? Is there a detailed procedure to it, or is it just the best guess of the in-house experts?

What’s the upshot? Are more decades of data necessary? Or just more thorough work with what is available today? Or does the confidence need to be ratcheted down some number of points?

Perhaps those answers come in later installments….

That “90% very likely” estimate jumped out at me the first time I read the IPCC summary. How should one think about that? I think of it as 95% +/- 5%.

I once read an account of how the IPCC came up with those percentages. Contrary to what some people imagine, those numbers aren’t plucked out of the air. As I recall, each item in a critical logical path was rated for likelihood and error bars and then they were all multiplied together. Or something like that.

But geez, if that’s the case, the IPCC scientists must have rated almost everything at 98%, 99% or 100% certainty, because it doesn’t take many sub-100% numbers to knock the overall likelihood down.
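The arithmetic behind that observation is easy to illustrate. The five link values below are purely hypothetical, not the IPCC’s actual ratings, and the multiplication assumes the links are independent:

```python
# Hypothetical confidence levels for five independent links in a chain of reasoning
links = [0.99, 0.98, 0.99, 0.97, 0.98]

overall = 1.0
for p in links:
    overall *= p

print(f"overall likelihood: {overall:.3f}")   # about 0.913, already near the 90% line
```

So even with every link rated at 97% or better, five links are enough to bring a chained conclusion down to roughly the “very likely” threshold.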

‘That “90% very likely” estimate jumped out at me the first time I read the IPCC summary. How should one think about that?’

It’s a political statement, not a scientific one. That’s why IPCC has no credibility in my house.

As an aside, I have tried to point out to my MP the failings of the IPCC process. His response is to tell me he is happy to rely on the ‘2,500’ scientists who produced AR4. With that sort of naivety, it’s no wonder UK PLC is kaput.

Lab
I think Judith’s tonite is our early morning. And you really should get the missus fired up on ‘climate disruption’ and get her posting here and on Richard Black’s blog. Much more interesting than soaps……
Mmmmm there’s a thought. Climate change is a soap! Duh
Gary

And finally, attribution studies can’t simply rely on model simulations, since model simulations (even if they capture the correct spectrum of variability) won’t match the observed realization of the multidecadal modes in terms of timing.

This reads like a criticism of the IPCC or of somebody, but if so, it’s off target. Attribution studies require both model simulations and observations, because they compare observed patterns to model-simulated ones. In the comparison, the 3-D spatial patterns are most important.

The incorrect amplitude of natural variability in climate models is not as important as would be incorrect spatial patterns in climate models. One does not imply the other — ENSO patterns in the models are better than ENSO amplitudes, for example — but I would agree that incorrect amplitude is not a source of confidence for the correct patterns, especially when the historical record is too short to accurately define the near-century-scale modes, whatever they might be.

For late-20th century warming to be attributed to natural variability, you’d need (a) a previously unknown mode (b) that shared all or most of the observed structure of recent temperature changes (c) with perhaps observation error accounting for the rest of it. Item (a) seems quite possible; the challenge is with item (b) since it’s very hard to imagine an ocean-driven mode of variability that would cool the stratosphere as much as it has cooled.

To first order, the level of confidence in the attribution of recent warming to anthropogenic greenhouse gases is governed by the extent to which one believes the simultaneous tropospheric heating and stratospheric cooling could be caused by something other than anthropogenic greenhouse gases.

[Aside #1: the tropical tropospheric hot spot is not a fingerprint of anthropogenic warming. Many otherwise intelligent people have been fooled by a color scale with insufficient resolution. Judy, add that one to your list of blog post ideas.]

[Aside #2: The absolute most common misunderstanding among AGW believers in the general public, in my observations, is that they can’t understand why the IPCC didn’t express absolute certainty that the recent warming was human-caused.]

“For late-20th century warming to be attributed to natural variability …”

Is quite different from “For half of late-20th century warming to be attributed to natural variability …”

It would not be necessary to show that all of it is natural, but merely that one cannot show that it is “very likely” that less than half of it could be natural, in order to dispute:

“Most of the observed increase in global average temperatures since the mid-20th century is very likely due to the observed increase in anthropogenic greenhouse gas concentrations.”

Further the burden of proof seems not to favour the IPCC statement, in that:

It is not necessary to show that it is very likely that half is natural, i.e. can be attributed to “known” natural causes, but merely that it cannot be shown that it is very likely that less than half of it could possibly be due to natural causes either known or unknown.

The IPCC statement as posed is open to dispute, if it had been characterised differently it would be less open to dispute, e.g.:

“Given our understanding of both the climate system as embodied in the current range of models, and the known forcings both natural and anthropogenic, it is very likely that not more than half of the observed increase in global average temperatures since the mid-20th century is due to causes other than the observed increase in anthropogenic greenhouse gas concentrations.”

That seems to me to be a statement that the evidence supports, but it is a much weaker statement.

Dr. N-G,
Very interesting points.
If the models have missed the amplitude of natural variability by a substantial margin, how do you sustain the argument that no new mechanisms are needed to explain the past decades?

Yes, the “most” is an imprecise statement; apparently this word (a heritage from earlier IPCC assessments) was negotiated in the SPM plenary discussion with policy makers. The model simulations imply that anthropogenic forcing accounts for all of the warming in the latter half of the 20th century.

For late-20th century warming to be attributed to natural variability, you’d need (a) a previously unknown mode (b) that shared all or most of the observed structure of recent temperature changes (c) with perhaps observation error accounting for the rest of it. Item (a) seems quite possible; the challenge is with item (b) since it’s very hard to imagine an ocean-driven mode of variability that would cool the stratosphere as much as it has cooled.

This really seems to be oversimplified.
I have not looked at the stratospheric temperatures much, but what I saw was:

– ozone is a strong driver of stratospheric temperature variations, and ozone concentrations are not constant
– the variation of the stratospheric temperature is different at different altitudes
– in the 1960s–80s the tropospheric temperatures went down but the stratospheric temperatures didn’t go up
– drawing conclusions from some 20 years of data, when one knows that the tropospheric dynamics are governed by oscillations of period much longer than that, is at least hazardous

As for the 3-D patterns, this is precisely a point where the models diverge wildly in everything but the most basic, very-large-scale spatial features, which 1-D models can reproduce too.
Precipitation? A mess. Cloudiness and humidity? A mess. Pressures? A mess.
I was never interested enough to look at the ice/snow cover, but I gather that it is a mess too.
Even regional temperatures don’t agree, with the exception of the obligatory poles, which must be very red by definition.

With the IPCC terminology, the operative word is “most”; it is certainly plausible that a combination of natural internal variability and solar forcing could have produced more than 50% of the late-20th-century warming; the IPCC statement already allows up to 49% based on its definition of “most”.

– in the 1960s–80s the tropospheric temperatures went down but the stratospheric temperatures didn’t go up

Exactly!

This illustrates that tropospheric aerosols and/or natural variability (the two mechanisms proposed to have counteracted CO2 increases during that period) do not affect stratospheric temperatures in the same way that CO2 does.

Wrt detection, where does the non-temperature evidence fit in? Of course there is the melting ice, but also the vast empirical evidence of phenological changes. Temperature is really just the tip of the iceberg. After all, “climate” is more than just temperature.

Dean – Note that “detection” refers to detection of anthropogenic warming, not warming in general. (If I could throw a pie at the face of the person who developed that definition, I would.) Ice and phenology aren’t so useful for detection in that sense because they are qualitative measures and it’s hard to extract relative magnitudes or spatial patterns from them. In other words, they’re not easily plugged into a computer algorithm.

One of the flaws in using glacial melt as evidence for AGW (attribution) is that once you raise the temperature, the ice will gradually melt for hundreds of years. So, if the Little Ice Age was exceptionally cold, and then it got warmer 100 years ago, the melting ice in Greenland says nothing useful about temperatures in the 1990s, except that they are warmer than in 1700. But 1700 was way, way too early to attribute that warming to GHGs. My newest paper sheds light on possible long-term climate cycles:
Loehle, C. and S.F. Singer. 2010. Holocene Temperature Records Show Millennial-Scale Periodicity. Can J. Earth Sciences 47: 1327-1336
anyone can email me at cloehle at ncasi dot org for a reprint.

John N-G – Thanks for the reply. Are we talking about detecting warming, or detecting climate change? And while I see your point that this type of evidence is not easily plugged into an equation to get a likelihood percentage, nonetheless, considering the issue in a broad sense, whatever percentage you get from strictly temperature-based methods would have to be considered a minimum for the broader issue of detecting anthropogenic climate change. So, in the sense that detection precedes attribution, that kind of detection would need to include the broader measures in some way, because it is the broader climate change that we are working to attribute a cause to.

Dean – Neither in this case. “Detecting” means identifying a change in climate that’s distinct from unforced natural variability. Finding evidence that the earth has warmed over the past 30 years is very easy to do, and the ice/phenological evidence supports that. Saying it’s different from natural variability is harder.

Craig notes that there’s a time scale issue too. Knowing that ice is melting in Greenland is not as useful as knowing that the rate of melt has increased (apparently it has) or that the Arctic sea ice (much faster response time) is melting.

Yes, saying it’s different from natural variability is harder, but the phenological record would seem to support that. With a very large group of data sets and a clear trend that is strongly statistically significant, it can’t be random variation. So are oceanic oscillations causing animal migrations to change and the summer season to expand so consistently (by which I mean across many different measures)?

I don’t know how this record can be quantified so as to be included in likelihood measures of detection, but it does add a lot of strength to the case overall.

Dr. N-G,
Since the state of the science admits that it does clouds poorly, is finding significant changes in our understanding of the solar-Earth relationship, and relies on a database with a large margin of error, why is ‘detection’ not a demonstration of ‘expectation fulfilment’?

“The general acceptance of the ‘global warming dogma’ is one of the costliest and least democratic mistakes of recent time, Czech President Vaclav Klaus said in a lecture at The Global Warming Policy Foundation in London yesterday.

Klaus went on to criticise ‘powerful special interests’ that advocated the ‘global warming doctrine.’

Not climate, but governments, politics and lobbyists are the core of the problem as the latter want to gain more power and more opportunities to decide, Klaus said.

Scientists should help ‘politicians and public to separate environmentalists’ myths from reality,’ he added.”

1. Every time I see the PDO being discussed I wonder whether the authors are referring to the classic representation (calculated as the leading PC of the North Pacific SST anomalies north of 20N) or to the cycle in the SST anomalies of the same region. They are inversely related.

The cycle in the North Pacific SST anomalies runs in and out of phase with the North Atlantic (the AMO) and at an amplitude that’s similar to but not quite as strong as the AMO.

2. Most analyses of these phenomena fail to examine the processes that create them. The AMO is typically thought of as a response to AMOC/thermohaline circulation. But how much of it is caused by the North Atlantic integrating the impacts of ENSO events? Di Lorenzo is close to explaining the variability in the North Pacific, but he is treating the PDO in the classic sense and not simply as detrended SST anomalies of the North Pacific, so the importance of his findings may be overlooked.

3. ENSO is treated as noise. But it is a self-recharging process that occasionally releases vast amounts of warm water from below the surface of the PWP. An El Niño does not discharge all of the available heat, and some of the warm water is returned to the surface of the western Pacific, where it gets swept poleward into the western boundary current extensions and into the gyres.

4. It is assumed that La Niña events “cancel out” El Niño events. Wrong. There are epochs when the frequency and magnitude of El Niño events are greater than those of La Niña events, and vice versa. El Niño events dominated from about 1915 to the early 1940s and from the mid-1970s to present. La Niña events dominated (slightly) from the early 1940s to the mid-1970s. (Keep in mind while pondering the timing of those epochs that global temperatures lag ENSO.) Yet the long-term (1900 to present) trend of NINO3.4 SST anomalies is flat.

There are approximately three solar cycles in each phase of the PDO, and successive solar cycles alternate the shape of the peak of cosmic rays from rounded to pointed. This allows two cycles of one type of peak and one of the other in each phase of the PDO. If there is a relationship between the shapes of the peaks of cosmic rays and heat accumulation in the oceans, which Leif Svalgaard considers unlikely, then a mechanism for the alternating cooling and heating of the two phases of the PDO is manifest.

I’m an amateur here, but the problem seems to arise from the IPCC’s artificial separation of internal variation and external forcing when looking at the climate history. It strikes me the noise is a signal of something. I can see why it’s done, and why in other circumstances it would be a rational approach to take, but given our present level of understanding it seems like a dangerous process. It seems to relegate internal variability to secondary status and separate it off as something different from external forcing.

Until we measure the energy build-up in all the compartments of the system, and understand the movement of this energy around the system and the feedbacks that come from all this, it strikes me that internal and external forcing can’t be separated, and each needs to be continuously re-assessed.

Can somebody give me some insights into the mechanism by which these internal changes were relegated to second-hand status? Even this year I’ve noticed plenty of papers coming out with data and theories supporting internal variation as an important aspect of climate change, yet it seems to have been put to bed by the IPCC and the fervent supporters of a CAGW theory. It actually seems to be acknowledged in the first IPCC quote that Judith uses, but I know from the few years I’ve been looking at this subject that it’s usually quickly dismissed, generally with the argument that it doesn’t affect the net energy balance of the system.

In comments on another post, Dr Curry challenged me to show where she presents misinformation in her posts. This is a good place to start:

“I haven’t delved into the statistics of the fingerprinting method, but “eyeball analysis” of the climate model results for surface temperature (see Figure SPM.4 and 9.5) is sufficient to get the idea. Note, other variables are also examined (e.g. atmospheric temperatures) but for simplicity the discussion here is focused on surface temperature.”

Let me be clear, Dr Curry thinks it absurd to attribute a 90% plus confidence to the claim that the late twentieth century warming was not due to internal variability in the climate system solely on the basis of the size of the rise; and I agree with her. So, I suspect, would the various authors of the IPCC chapters on detection and attribution and their various scientific supporters. And certainly, the IPCC reports do not say anything different. They do attribute greater than 90% confidence to the claim, but not on a single line of evidence. In fact, in AR4 (Table 9.4), the IPCC lists five lines of evidence supporting the conclusion that “Warming during the past half century cannot be explained without external radiative forcing”.

As I have pointed out elsewhere, consilience of inductions is very important in science; and it is no less important here. Taking this example, if each of the five lines of evidence were such that the probability of a late 20th century temperature rise being entirely unforced, given that evidence, were 50% or less, then the combined evidence would give us over 96% confidence that the rise was not unforced (since 0.5^5 is about 3%). In fact, given that any evidence that the temperature rise was due to CO2 forcing is ipso facto evidence that it was not due to natural variability, the confidence of that statement is supported by far more than just those five lines of evidence. Of particular importance here is the evidence of spatial and temporal distributions of warming trends ( http://www.ipcc.ch/publications_and_data/ar4/wg1/en/ch9s9-2-2.html ), which combine to give a clear signature of greenhouse warming.

Taking one example of these: while the troposphere has been warming, the stratosphere has been cooling. That is a feature expected of warming due to an enhanced greenhouse effect, and also of warming due to reduced reflectance from aerosols. Confusing the issue, it is also to be expected from reduced ozone levels in the stratosphere. It is not to be expected from increased SST due to natural variability, so any theory attributing the rise in temperatures at the end of the twentieth century to that variability must therefore attribute the entirety of the stratospheric cooling to ozone loss. However, the expected amount of cooling due to ozone loss is constrained by theory and experiment. Any theory attributing the entire cooling to ozone loss must then assume that cooling to be far stronger than expected by theoretical and observational constraints, and therefore assume the cooling is statistically improbable given those constraints. Conversely, the denial of that possibility, and hence the affirmation of forcing as a driver of the temperature increase, will have a relatively high probability; thus providing strong support for the IPCC’s conclusion.

There are around 20 different lines of evidence canvassed in the IPCC chapter on detection and attribution. Should the probability of the IPCC conclusion, given each of these lines, be as low as 15%, then the probability of the conclusion (given independence) still rises above 95%.

So, even if I agreed with Dr Curry’s criticisms of the one line of evidence she in fact discusses (which I do not); she would have no basis for her conclusion that the IPCC is overconfident. That, however, is a mere disagreement. What I find objectionable is her declining to discuss the full gamut of evidence “for simplicity”. Certainly it is simpler for her discussion to ignore relevant evidence. It is, however, not ethical – and represents a gross misrepresentation of the IPCC position.

I take on the “consilience of evidence” approach in Part III. Exactly what gamut of evidence did I neglect? I neglected a few people’s analyses of the evidence (which I find wholly unconvincing). I’m still waiting for some “misinformation.”

The example I used was specifically stratospheric cooling coupled with tropospheric warming. And yes, greenhouse warming could be due to increased humidity, which in turn could be due to normal variability in the ocean surface temperature.

However, thermal feedback is thermal feedback – it does not vary its response based on the original source of the heating. So, if you wish to commit yourself to a net positive feedback to ocean warming resulting in greater overall warming as a result of internal variations in SST, then you are also committed to increased humidity as a net positive feedback to warming due to CO2 forcing. That is simply a matter of consistency.

For what it is worth, the article cited by Dr Curry above attributes the influence of the PDO and AMO (to which I would add ENSO) on global temperatures to just such a mechanism. I think some such mechanism is necessary to account for the variations in mean global temperature associated with the oceanic oscillations.

This is sophistry. What evidence? If it was referenced in the IPCC and I neglected it, it was because I thought it was unconvincing. If you would like me to refute any specific piece of evidence, let me know which one. I am already devoting close to 10,000 words to this issue before I am finished with the series.

“Should the probability of the IPCC conclusion, given each of these lines, be as low as 15%, then the probability of the conclusion (given independence) still rises above 95%.”

How does that work? By exactly the same argument, each of those lines has 85% for the alternative (100% – 15%), so given independence, the alternative conclusion gets more than 99%.

It’s best to accumulate confidence using log-likelihood ratios. 95% confidence is about 4.25 bits; spread over the 20 lines, you need about 0.21 bits from each independent line of evidence, or a probability of 53.67%. Or did I misunderstand what you meant?
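A quick numerical check of those figures (my own sketch in Python; the 20-line count and the 95% target are taken from the comment above):

```python
import math

# Target: 95% confidence, i.e. 19:1 odds, expressed as log-odds in bits.
bits_needed = math.log2(0.95 / 0.05)   # log2(19), about 4.25 bits

# Spread evenly over 20 independent lines of evidence:
bits_per_line = bits_needed / 20       # about 0.21 bits each

# Convert 0.21 bits back to a probability: p / (1 - p) = 2**bits_per_line
odds = 2 ** bits_per_line
p_per_line = odds / (1 + odds)         # about 0.5367

print(f"{bits_needed:.2f} bits total, {p_per_line:.4f} per line")
# prints "4.25 bits total, 0.5367 per line"
```

This reproduces both numbers in the comment: ~4.25 bits total and ~53.67% per line of evidence.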

“In fact, in AR4 (Table 9.4), the IPCC lists five lines of evidence…”
I think it’s funny that the table you cite (but don’t bother linking) groups the evidence into a.) surface temperature and b.) everything else (which also includes surface temp of the ocean btw, and last time I checked ‘volcanic forcing’ isn’t anthropogenic!).

The IPCC and Dr Curry are clearly part of a conspiracy to mislead…

The argument that attribution by GCM and other “evidence” is independent is bizarre if you view GCMs as our attempt at encoding the best of our understanding of the climate system and its processes. How could that be independent of the rest of the field that it is built on?

“Warming during the past half century cannot be explained without external radiative forcing. [Global; Extremely likely (>95%)] [1] Anthropogenic change has been detected in surface temperature with very high significance levels (less than 1% error probability). [2] This conclusion is strengthened by detection of anthropogenic change in the upper ocean with high significance level. [3] Upper ocean warming argues against the surface warming being due to natural internal processes. Observed change is very large relative to climate-model simulated internal variability. [4] Surface temperature variability simulated by models is consistent with variability estimated from [a] instrumental and [b] palaeorecords. Main uncertainty from forcing and internal variability estimates (Sections 9.4.1.2, 9.4.1.4, 9.5.1.1, 9.3.3.2, 9.7).” http://www.ipcc.ch/publications_and_data/ar4/wg1/en/ch9s9-7.html#table-9-4

That is five lines of evidence on the issue of detection alone. The count of five different lines of evidence does not include the GCM modelling, which is not technically evidence at all. Rather, in each case it is based on observations.

Finally, I have not claimed that the various lines are independent (which is a vexed issue). Rather, I have treated them as independent for the sake of exposition only. That is why I used independence as a supposition (“given independence”) rather than asserting it.

I suspect Tom’s argument was based on saying that 15% for the IPCC meant 85% against, but that 85%^20 is less than 5%, so more than 95% in favour of the IPCC. I’m not sure, though. I don’t wish to put up a strawman.

The problem with this sort of compounding of probabilities is that the outcome becomes very low for *both* hypotheses, for and against. You can’t observe that the probability of the observations given the hypothesis against your favoured one is low, and conclude that the probability in its favour must therefore be high.

The approach I took is a fairly standard one in decision theory, and is normally derived from Bayes’ theorem.

If we assume two complementary hypotheses H0 and H1, an experimental outcome O, known values of P(O|H0) and P(O|H1), and an assumed prior probability ratio P(H0)/P(H1), we can calculate the posterior probability ratio as follows:
P(H0|O)/P(H1|O) = (P(O|H0)/P(O|H1)) * (P(H0)/P(H1))
Take logs to convert that multiplication to an addition
log(P(H0|O)/P(H1|O)) = log(P(O|H0)/P(O|H1)) + log(P(H0)/P(H1))
and we interpret this as the confidence in H0 over H1 after the observation is equal to the evidence inherent in the result of the experiment plus our confidence before the observation.

I usually like to use base 2 logs and express my confidences in bits (it’s more intuitive), but natural logarithms are common with other mathematicians.

It is usual practice to start with log(P(H0)/P(H1)) = 0, representing no opinion, or a 50% probability. This question of priors is the big weakness of Bayesian methods, though – there is often no rational way to start the process off.

Log-likelihood is the closest we can come to quantifying the “evidence” obtained from an observation. In a more sophisticated setting, it is also known as the Kullback-Leibler information.

Another way to think of it is to calculate the ratio of probabilities of each hypothesis directly. Thus, we have 53.67% in favour, 46.33% against. We raise both probabilities to the 20th power to get 0.0003932% and 0.00002076% probabilities for and against, and the probability of the whole set of 20 observations for the hypothesis in favour is now 19 times bigger than that against, as required for a 95% probability.
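That arithmetic can be reproduced directly (a sketch in Python, using the same 53.67%/46.33% split and 20 observations as above):

```python
p_for, p_against = 0.5367, 0.4633

# Raise both probabilities to the 20th power, as for 20 independent observations.
joint_for = p_for ** 20        # about 3.93e-6
joint_against = p_against ** 20  # about 2.08e-7

# Normalise: the favoured hypothesis ends up about 19 times more probable,
# i.e. roughly 95% posterior probability.
ratio = joint_for / joint_against
posterior = joint_for / (joint_for + joint_against)
```

Both joint probabilities are tiny, but their ratio is what matters: about 19:1, i.e. roughly 95%, as stated.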

It’s sometimes called the likelihood ratio test, and by the Neyman-Pearson lemma is the most powerful against a given significance level. There’s another version where the costs of errors in each direction are incorporated to minimise the expected cost of the decision.

As I understand it, the IPCC used a process of “expert judgement” where the experts tried to express how confident they felt. I will leave you to imagine how I as a mathematician feel about that.

I can probably try to answer any further questions tomorrow, if you have any. (You probably will. It’s hard to explain an entire maths lecture this compactly.) Hope that helps.

Oh, by the way. On re-reading that, I realised there was a point that could be potentially confusing.

The log-likelihood ratio is log(P(O|H0)/P(O|H1)) meaning the log of the probability of the observations given hypothesis H0 is true, divided by the probability of the observations given hypothesis H1 is true. I assumed, following Tom’s example as closely as I could, that if P(O|H0) was 0.15, that P(O|H1) would be 0.85. However, this is not necessarily the case. The probabilities could be any values. It’s perfectly possible even to have the exact same probability whichever hypothesis is true, in which case the ratio is one, the log-likelihood is zero, and the observation tells you absolutely nothing about your hypotheses.

This is a particularly important point in the case of the IPCC, because it is not clear how you would calculate the probability of a warming globe in the event that their anthropogenic CO2 hypothesis was false. We don’t know enough about all the alternative possible mechanisms to say.

I might have time to add some more tomorrow. Excellent blog! I look forward to Part III.

Nullius, you are entirely correct. I had hoped to make my logical point without breaking out the full Bayesian apparatus which, not being a mathematician, does not come naturally to me. Unfortunately, as a result I produced some mathematical nonsense. Duly chastened, I will break out Bayes to show my logical point is still sound.

By Bayes’ theorem, the probability that a theory is true, given a set of observations, equals the a priori probability of the theory, multiplied by the probability of the observations given the theory, divided by the a priori probability of the observations. In mathematical notation: P(H|O) = P(H) * P(O|H)/P(O), where H stands for the hypothesis (theory) and O for the observation.

For example,
Let P(H) = 0.1251, P(O) = 0.2, and P(O|H) = 0.3; then P(H|O) = 0.18765. An improvement on the a priori probability, but hardly likely to inspire confidence.

But suppose we have 5 different observations, each independent (for simplicity) and such that P(Oi|H) = 0.3 and P(Oi) = 0.2 for each of the observations (O1, O2, …, O5).

Substituting in, we then get P(H|O1&O2&…&O5) = (0.1251 * 0.3^5)/(0.2^5) ≈ 95%.
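Those numbers check out (a quick Python sketch; the prior 0.1251 and the 0.3/0.2 likelihoods are taken from the example above):

```python
p_h = 0.1251          # a priori probability of the hypothesis
lr = 0.3 / 0.2        # P(Oi|H) / P(Oi), the update factor per observation

p_after_one = p_h * lr        # 0.18765, the single-observation case
p_after_five = p_h * lr ** 5  # about 0.95 after five independent observations
```

Each independent observation multiplies the probability by 1.5, so five of them lift a weak 12.5% prior to roughly 95%.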

This fact, that multiple supporting observations very quickly lead to high levels of statistical confidence, is fundamental to the practice of science (and of polling). It is the basis for the demand for consilience of inductions as the gold standard in science. It is the reason we want multiple repeated observations of an experiment, preferably using independent data sets under different circumstances. It is only by the multiplication of numbers, and better yet types, of observations that we can have reasonable confidence in the validity of our theories.

As you correctly point out, contrary observations will very quickly reduce confidence. There are also issues as to what counts as an a priori probability in different contexts, and whether the probabilities can even be formalized. But even if we cannot formalize the probabilities, multiple independent supporting lines of evidence give us good grounds for epistemic confidence, and do so because of this property of probabilities.

And that is my point. It is insufficient, as a criticism of a theory, to point out that one line of evidence only justifies a 30% probability of the theory being true when the theory is supported by multiple independent lines. As illustrated above, even five lines of evidence such that P(Oi|H) = 0.3 and P(Oi) = 0.2 can still justify a 95% probability of the theory being true. When the theory is supported by around 20 lines of evidence, discussing only one and dismissing the rest as “unconvincing” amounts to evasion of the issues rather than discussion of them – whatever the merits of your particular discussion.

Sure. If P(O) = 0.2 and P(O|H) = 0.3 (with P(H) = 0.1251, as above), then P(O|¬H) = 0.1857. The ratio of 0.3 to 0.1857 is about 1.6155. In other words, the observation is about 62% more likely with the hypothesis than with its complement. That’s stronger evidence than I suggested. Had we started with a zero-information prior, one such observation would have got us to 62%. By comparison, my example offered only 54% in the same circumstances.
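For reference, a minimal sketch of that calculation (Python; P(H) = 0.1251 is carried over from the earlier example):

```python
p_h = 0.1251
p_o, p_o_given_h = 0.2, 0.3

# Law of total probability: P(O) = P(O|H)P(H) + P(O|not-H)(1 - P(H)),
# solved for P(O|not-H).
p_o_given_not_h = (p_o - p_o_given_h * p_h) / (1 - p_h)  # about 0.1857

# Likelihood ratio, and the posterior from a flat (50/50) prior.
lr = p_o_given_h / p_o_given_not_h       # about 1.6155
p_from_flat_prior = lr / (1 + lr)        # about 0.62
```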

With the stronger evidence per observation, you can get to 95% in five steps even starting from the much weaker prior. You can make it even more extreme if you want. You can propose a situation where the probability of the IPCC’s hypothesis after a single observation is just one in a million, while the probability after two observations is 99%. It just requires an extremely weak prior combined with extremely strong evidence. However, it is not *necessarily* the case in the conditions stated.

The question of what prior to use regarding the CAGW hypothesis is itself an interesting one – the discussion could easily get very philosophical. But I think the major issues/disputes here are more likely to be with the calculation of the weight of evidence, as it seems to me that P(O|¬H) is essentially unknown.

Which leaves open the main question here – did the IPCC use such a quantitative method in coming to its probabilistic judgements? On this topic, the IAC report says: “The IPCC uncertainty guidance provides a good starting point for characterizing uncertainty in the assessment reports. However, the guidance was not consistently followed in the fourth assessment, leading to unnecessary errors. For example, authors reported high confidence in statements for which there is little evidence, such as the widely-quoted statement that agricultural yields in Africa might decline by up to 50 percent by 2020. Moreover, the guidance was often applied to statements that are so vague they cannot be falsified. In these cases the impression was often left, quite incorrectly, that a substantive finding was being presented.”

I’m not inclined to take inquiry reports at their word, so I don’t regard that as in any way definitive. But it would be interesting to know how the IPCC did calculate that 90% number in their conclusion.

There seems to be an issue in gaining information about attribution from the models.

The problem is that we are not asking a general question. We are asking about a specific instance: the last half of the 20th century as it actually occurred.

What were the chances of the component due to GHGs being >50% during that period? The component due to GHGs is a system property that has an unknown value; it is not a variable.

In the models it is a model property, but there we could ascertain its value by running the models again and again, both with and without GHGs.

Now if the models were trained in any way, or selected during development, so as to reproduce the observed temperature record during the period in question, we would have a problem. That training would result in models being encouraged to support the claim that GHGs produced >50% of the warming, given that they show that the record would otherwise have been basically flat or falling.

There might be a temptation to try to infer information from the no-GHGs model runs but that would not necessarily be valid. There is no guarantee that the range of natural variation without GHGs would be the same as with GHGs.

That may seem a biased view, but the wording of the statement does not refer to how little of what did occur would have occurred anyway, but to how much of what did occur was due to GHGs; they are not necessarily the same thing.

Also when arguing from the non-GHG runs you may have no information as to how that run would have developed for the with GHGs case and hence the proportion due to GHGs, or whether they would have matched the observed record which is the instance that we are required to consider.

Personally I think it matters not one jot. I think it is a bit of a bizarre and unnecessary claim in that it seems almost unprovable when arguing from the models unless we know what credence to give the models.

The ensemble mean matches the observed record during the period in question with extraordinary fidelity, which is both an encouraging and a worrying sign. Worrying because the match in the prior period, which is also the baseline period, is much worse; it would be tempting to infer that some training had occurred to produce the brilliant fit. If the problem in the baseline period were a data error, it could impact the baseline and hence the fit during the period of interest.

Regarding the fit of the ensemble mean to the observed record post 1950, it would appear that the observed temperatures track the ensemble mean very closely in fact there is very little variation at all post Agung.

If the ensemble mean has correctly followed the underlying trend in the observed record one would still expect a divergence due to natural variation. It would appear that there is almost no natural variation when viewed in that way except for ENSO events. That seems very strange.

Alternatively the ensemble mean does not track the underlying trend so closely but the observed record has by chance reproduced the ensemble mean due to natural variation, which would be a rare event if the natural variance is much bigger than the observed variance between the ensemble mean and the trend.

Can someone say for certain whether the models have been trained to the observed record, or whether there is a risk that the development process selected for the observed record?

If so, one might expect individual models to vary quite considerably from the observed record but the ensemble mean to track it rather closely. The individual models might be producing instances of the training data plus error, with the ensemble mean revealing the common training data, which would be a match for the observed data. I really will go spare if that is the case.

“Based upon the precautionary principle, the United Nations Convention on Climate Change (UNFCCC) established a qualitative climate goal for the long term: stabilization of the concentrations of greenhouse gasses in the atmosphere.”

Obviously the PP didn’t anticipate the run-away inflationary bubble of a limited resource. Heck, it was created and sustained and floated by a decades long media campaign.

Maybe the inflating bubble was just supposed to be confined to the well being of the IPCC and those who benefited from promoting the hype.