Sunday, May 25, 2008

Once more unto the breach dear friends, once more...

...or fill up the bin with rejected manuscripts.

I wasn't going to bother blogging this, as there is really not that much to be said that has not already been covered at length. But I sent it to someone for comments, and (along with replying) he sent it to a bunch of other people, so there is little point in trying to pretend it doesn't exist.

It's a re-writing of the uniform-prior-doesn't-work stuff, of course. Although I had given up on trying to get that published some time ago, the topic still seems to have plenty of relevance, and no-one else has written anything about it in the meantime. I also have a 500 quid bet to win with jules over its next rejection. So we decided to warm it over and try again. The moderately new angle this time is to add some simple economic analysis to show that these things really matter. In principle it is obvious that by changing the prior, we change the posterior and this will change results of an economic calculation, but I was a little surprised to find that swapping between U[0,20C] and U[0,10C] (both of which priors have been used in the literature, even by the same authors in consecutive papers) can change the expected cost of climate change by a factor of more than 2!
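For concreteness, here is a toy version of that calculation. Everything in it is illustrative rather than taken from the paper: the likelihood is a Gaussian in feedback (L = 3.7/S, centre and width invented for this sketch), and the damage function is an arbitrary quadratic capped at 100%. The point is just that the prior's upper bound leaks straight into the expected cost:

```python
import numpy as np

def gauss(x, mu, sd):
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

S = np.linspace(0.05, 20.0, 4000)        # sensitivity grid (C), uniform spacing
like = gauss(3.7 / S, 2.3, 0.7)          # toy likelihood: Gaussian in feedback L = 3.7/S

def expected_cost(upper):
    """Posterior-expected damage fraction under a U[0, upper] prior on S."""
    post = like * (S <= upper)           # a uniform prior just truncates the likelihood
    post = post / post.sum()             # normalise (grid spacing cancels)
    damage = np.minimum((S / 10.0) ** 2, 1.0)   # toy damage, capped at 100% of GDP
    return float((post * damage).sum())

c10 = expected_cost(10.0)
c20 = expected_cost(20.0)
print(round(c20 / c10, 2))               # noticeably above 1: the bound alone moves the answer
```

Because the Gaussian-in-L likelihood is bounded below at large S, the extra posterior mass between 10 and 20 is small but sits at maximal damage, so it shifts the expectation disproportionately.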

We have also gone much further than before in looking at the robustness of results when any reasonable prior is chosen. This was one of the more useful criticisms raised over the last iteration, and we no longer have the space limitations of previous manuscripts. The conclusion seems clear - one cannot generate these silly pdfs which assign high probability to very high sensitivity, other than by starting with strong (IMO ridiculous) prior belief in high sensitivity, and then ignoring almost all evidence to the contrary. Whether such a statement is publishable (at least, publishable by us) remains to be seen. I'm not exactly holding my breath, but would be very happy to have my pessimism proved wrong.

46 comments:

Probably I am missing something... it says on page 10 that "a gaussian likelihood in feedback space has the inconvenient property that f(O|L=0) is strictly greater than zero, and so for all large S, f(O|S) is bounded below by a constant."

No, those are likelihood functions (the notation is a bit ambiguous, but I think it is standard practice). So f(O|L=x) = f(O|S=1/x) is true by definition for all x. At the bottom of page 3 we mention that the cost function is always truncated at 100% (a rather simplistic assumption, but it doesn't really matter how we do it).

Two problems with the "just give the likelihood" idea (which has been suggested in the past) are: (1) Everyone demands that climate scientists produce "the answer" or at least, an answer. After all, if we cannot choose sensible priors for S, then why should some politician or economist be able to do better? With reference to my previous comment you quote, it seems that the Bayesian analysis, or something like it, is required for (conventional) decision support. (2) It would be bogus to pretend that the likelihood can stand alone in objective glory anyway - it is based on all sorts of implicit and occasionally explicit choices.

It shouldn't matter what the prior is written in terms of, so long as we think about it and make a sensible choice. Indeed it probably makes more sense to think about it in terms of feedbacks, since we have the Stefan-Boltzmann law as a starting point, and significant physical understanding of the other feedback terms (although there may be some debate about how much of this is truly independent of recent observational evidence). But basically, any prior broadly centred on a reasonable value for feedback will give results equivalent to those we present.

Just to test my understanding, in Figure 1 we see plotted p(S), an actual probability density function, obtained by multiplying the likelihood function f(O|S) with the uniform (in S) prior, right?

My proposed calculation would be produced be the same likelihood function multiplied by a prior that is uniform in L, i.e., 1/(S^2) when mapped to S ... right?
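Right, and that change of variables is easy to check numerically. A minimal sketch (the uniform range is arbitrary, and I take S = 1/L as in the discussion above): sample L from a uniform, map to S, and compare the empirical density of S with the analytic uniform-density-times-Jacobian form:

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 0.5, 5.0
L = rng.uniform(a, b, 1_000_000)   # prior uniform in feedback L
S = 1.0 / L                        # map to sensitivity, S = 1/L

# Empirical density of S near a test point vs the analytic 1/(b-a) * S^-2 form
s0, ds = 0.5, 0.01
empirical = np.mean((S > s0 - ds) & (S < s0 + ds)) / (2 * ds)
analytic = (1.0 / (b - a)) / s0 ** 2
print(round(empirical, 3), round(analytic, 3))   # should agree closely
```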

So, in the Bayesian approach it's the chosen prior that converts likelihoods into probability densities?

No offense, but don't you think it's a bit arbitrary, and may I say, unnatural, to combine a uniform prior on S with a gaussian likelihood on L? I suspect that you're only getting the trouble you're asking for ;-)

(edit proposal: "must be truncated" -> "must be truncated -- as we did --" )

You claim that a negative sensitivity (to CO2) leads to an unstable world. Without data, can you really rule out the possibility of a small negative sensitivity, with the stability of climate coming from something else (e.g. a Gaia-like daisyworld effect)?

How bad would you consider a uniform from -2 to +8 as a prior to be? (Yes I know that -2 to -1, 2 to 3 and 7 to 8 each having a 10% chance is pretty ridiculous when 8 to 9 has no chance at all.)

P13, 3.2 Expert priors: "Having demonstrated how the widely-used approach of a uniform prior fails to adequately represent “ignorance” and generates rather pathological results which depend strongly on the selected upper bound"

I think that you want to make clear you want to complain about three things:

1. That the priors used in the literature like U(0,10) are pathological (with U(0,20) of Frame et al being an extreme example to demonstrate a point similar to you using U(0,50)).

2. If you use a uniform prior approach, you need to carefully select the bounds (both of them AFAICS).

3. The approach of using a uniform prior. Errors can be, and in your opinion have been, made in selecting appropriate bounds, and even if bounds were successfully chosen it doesn't really make sense: per the '-2 to -1, 2 to 3 and 7 to 8 each having a 10% chance is pretty ridiculous when 8 to 9 has no chance at all' argument.

Given these problems why use a uniform approach?

Is that justifiable, or is it more a case of: if there are uniforms that are as extremely different as is plausible and these do not matter to the result, then a uniform prior approach is OK?

Or is it difficult to generate uniforms that are as extremely different as is plausible, so that different distributions are likely to be needed, and a uniform approach is unlikely to be appropriate?

So in what order would you place the following problems?

a) The priors actually used (i.e. U(0,10))

b) The failure to show the effects of different priors (which should be plausible)?

c) The uniform approach?

I think I would go for the order a b c.

From the above I think you should see that I would like to see the conclusions include recommending that other papers should show the results of at least two different priors. A uniform prior is the least of the problems but you would recommend against it. However, if researchers insist on using a uniform prior then the bounds of all ranges should be selected so that P(S>6) is less than 30% and P(S>8) is less than 10%. Once a paper has done this, the same bounds should be used for other papers for comparability.

> Two problems with the "just give the likelihood" idea (which has been suggested in the past) are: (1) Everyone demands that climate scientists produce "the answer" or at least, an answer. After all, if we cannot choose sensible priors for S, then why should some politician or economist be able to do better? With reference to my previous comment you quote, it seems that the Bayesian analysis, or something like it, is required for (conventional) decision support.

In the present context, I would find an answer in the form of a confidence interval much more informative than a prior-dependent answer. Such an answer could provide a solid basis for policy making.

> (2) It would be bogus to pretend that the likelihood can stand alone in objective glory anyway - it is based on all sorts of implicit and occasionally explicit choices.

Well - any analysis uses some model and thus some assumptions. The assumptions going into the prior are made in addition to those going into the likelihood, and are much more arbitrary.

> It shouldn't matter what the prior is written in terms of, so long as we think about it and make a sensible choice.

What I am wondering is whether a prior that looks "sensible" when presented as a distribution over S still seems sensible when represented as the equivalent distribution over L.

> But basically, any prior broadly centred on a reasonable value for feedback will give results equivalent to those we present.

This is an artifact of L being constrained to be positive. What happens if you use a broad prior, symmetric in log(L)? I think in that case you would find very small values of L a-priori probable, and thus also a-posteriori probable.

It appears to me that the Bayes formula at the top of page 2 would apply to probabilities. I am not aware of this formula being used for likelihoods(?). I would suggest systematically using p() for probability densities, and some other symbol, say q() (not f) for likelihoods.

Then, as we have

p(O|L)dL = -p(O|S)dS,

it follows using the definition of likelihood (and "reverse" notation) in the Wikipedia article:

q(s|O) = \alpha p(O|S=s) and q(l|O) = \beta p(O|L=l), so

\alpha q(S|O)dS = -\beta q(L|O)dL,

where the constants \alpha, \beta cannot depend on L or S (as otherwise the whole notion of likelihood ratios between pairs of L or S values goes out the window).

Again it follows that

q(S|O) = const * q(L|O) * S^{-2}

as in my first post.

Sorry to be a pain, but the article's notation could be more explicit ;-)

"You claim that a negative sensitivity (to CO2) leads to an unstable world. Without data, can you really rule out the possibility of a small negative sensitivity"

With negative sensitivity, a loss of CO2 would lead to a warmer climate, faster weathering, and faster removal of CO2, a process that would feed back to give a CO2-free atmosphere at some elevated temperature that is inexplicable without CO2.

Similarly, an increase in CO2 would eventually yield a venus type atmosphere over a snowball earth.

And daisy type effects don't explain climatic stability during the 92% of Earth history that predates the colonization of land.

Martin, f(O|S) is a pdf when S is fixed and O is treated as a variable. But when O is held fixed (at the observed value) and S is considered a variable, its integral has no particular meaning, although when a uniform prior is used it then coincides with the posterior pdf f(S|O), up to a normalisation constant at least.

No offense, but don't you think it's a bit arbitrary, and may I say, unnatural, to combine a uniform prior on S with a gaussian likelihood on L? I suspect that you're only getting the trouble you're asking for ;-)

Sure, but note that this is not actually what I am asking for, but rather what (almost) the entire field of climate science has been doing for the last few years. We are describing the reasons for its failure.

Chris, the lower bound doesn't really matter in practice, so this is one occasion where we don't really need to be precise. I wouldn't object to a small prior probability of negative values but the effect on the results would be negligible and it is common practice to exclude it.

I presume that U[0,10] was deliberately chosen by the IPCC authors as a "reasonable" compromise in terms of giving scary enough, but not absurd, results (at least in their view). It also happens to coincide with a recent publication of one of the lead authors :-) I assume (and hope) this problem had been talked about for some time before I got involved, but by the time I was making a nuisance of myself I expect they were mainly focussed on thrashing out some reasonable compromise in time for publication. I suppose if I had to choose my own preferred uniform range, it would be something lower, but I don't think there is much point in choosing a uniform range just so as to give similar results to a reasonable expert judgement. It is important to remember that no-one has ever (to my knowledge) actually defended any uniform prior as representing a reasonable judgement about anything at all. Even as late as the 2nd draft of the IPCC they were claiming it "represented no knowledge or assumptions about the parameters apart from the range of possible values".

I think it is sensible to investigate the sensitivity of the results to the prior (and other decisions for that matter) but the range of priors used still has to be reasonable - you can get any answer you want from a sufficiently unreasonable prior. But if people just start to use a range of priors including a wide uniform one, and then point to those results and say "we cannot rule out..." then it will hardly be an improvement.

James, OK, I see what you're trying to do... but you seem to do so more by demonstrating the absurdity of consequences than questioning the theory (at least I don't see much of that). That would make the argument even more forceful.

If all you want is a confidence interval, we can give up all this costly climate science. Toss a coin 5 times, if it comes up 5xH then the CI is the empty set and otherwise it's the whole number line. Voila, a 97% CI for sensitivity. I'm not sure quite what policy that would support, but I am pretty confident it would be the wrong one!
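(For anyone checking: the coverage of that cheap trick really is about 97%, since the only way the reported interval can miss the truth is five heads in a row:)

```python
# Coin-toss "confidence procedure": report the empty set on HHHHH (prob 1/32),
# otherwise report the whole line, which always covers the true value.
coverage = 1.0 - 0.5 ** 5
print(coverage)   # 0.96875, a valid ~97% confidence procedure that tells you nothing
```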

OK, that is a cheap trick. But actually, I think for most analyses, the natural confidence intervals would seem pretty reasonable (basically equivalent to the credible intervals arising from a uniform prior in feedback). However, some might go to infinity "and beyond", especially at the higher confidence levels.

I have not done the sums, but I think a prior that is lognormal in L would still give reasonable results, even though this would certainly increase the probability of high values of S.

OK, I have just done the sums (v quickly - may be wrong). I think that a lognormal prior of N(0,1) in base 10 (and SI units) gives a posterior 5-95% range for S of 1.2-5.5C.

This prior for L has a 68% range of -1 to 1 in log units, or 0.1 to 10 in SI units (W/(m^2K)), which means a sensitivity of 0.37 to 37C with the median at 3.7C. The 95% limits are 1/10x and 10x those of course. I think that is an extraordinarily pessimistic prior (high median, and 2.5% probability of S greater than 370C!), but it still seems to give less extreme results than U[0,10] (although it will have a thicker tail above 10C...).
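For anyone who wants to redo these sums, here is a rough numerical sketch. The likelihood N(2.3, 0.7) in L is the Forster/Gregory-style value quoted elsewhere in this thread, S = 3.7/L, and the grid details are arbitrary:

```python
import numpy as np

def gauss(x, mu, sd):
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

L = np.linspace(1e-3, 20.0, 200_000)        # feedback grid, W/(m^2 K); S = 3.7/L
prior = gauss(np.log10(L), 0.0, 1.0) / L    # lognormal (base 10, sd 1) prior on L
like = gauss(L, 2.3, 0.7)                   # Gaussian likelihood in L
post = prior * like
post = post / post.sum()

cdf = np.cumsum(post)
L05 = L[np.searchsorted(cdf, 0.05)]         # low L  -> high S
L95 = L[np.searchsorted(cdf, 0.95)]         # high L -> low S
s_lo, s_hi = 3.7 / L95, 3.7 / L05
print(round(s_lo, 1), round(s_hi, 1))       # roughly the 1.2-5.5C range quoted above
```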

"I don't think there is much point in choosing a uniform range just so as to give similar results to a reasonable expert judgement."

I don't think there is much point in doing this in practice for arriving at a pdf either. I was doing this as a device to see if it clarified whether the draft paper was directing too much disdain at the uniform approach when the main problems lay elsewhere (the priors used and not testing sensitivity to reasonable priors). If there are good reasons for disliking the uniform approach itself, rather than the unreasonable priors used, are these reasons made clear in the paper?

Chuck: "And daisy type effects don't explain climatic stability during the 92% of Earth history that predates the colonization of land."

Thanks, good points. I did mention 'Gaia like', which I intended to include ocean life releasing chemicals to affect cloud cover in response to uncomfortable radiation levels, but I didn't go to the trouble of explaining that. How good is our knowledge of climate before ocean life?

OK, a uniform infinite prior is the truly ignorant one; so you are correctly arguing that some information has to go into the prior, whether it be data or physical limits. At that point the question becomes: what is the minimum amount of information needed to reasonably constrain the result?

The problem with this argument is not that it is a cheap trick but that it is an argument against uninformative likelihood models rather than against CIs.

Of course, the likelihood model is used in the Bayesian approach as well. You could "give up all this costly climate science" and use the degenerate likelihood model you suggested with the Bayesian approach. Then, you will get a posterior distribution that is identical to the prior distribution (whatever that may be). This would, of course, be as useless a basis for policy as the degenerate CI you constructed.

> the natural confidence intervals would seem pretty reasonable

True, and that is also why the lognormal prior that I suggested gives results that seem reasonable to you. The problem with the uniform priors in S is not that they give high probability to high S, but that they give low probability to anything but high S. The lognormal prior distributes its mass more evenly on a wide range of S values.

> (basically equivalent to the credible intervals arising from a uniform prior in feedback)

Absolutely untrue. A uniform prior on L artificially puts very little mass on low L values in the same way that a uniform prior on S artificially puts low mass on low values of S.

Imagine, for example, that the likelihood was proportional to N(2.3,1.4), instead of to N(2.3,0.7). Then a standard 95% CI would include L=0. The posterior probability based on a U[0,10]-in-L prior (say), however, would still give P(L < .1) < 2.5%.
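Both halves of that claim can be checked with a couple of normal CDF evaluations (same numbers as in the comment; a quick sketch):

```python
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

mu, sd = 2.3, 1.4   # the hypothetical wider likelihood for L

# Classical 95% interval: mu +/- 1.96*sd, which here extends below zero
lo = mu - 1.96 * sd
print(lo < 0.0)     # True: the CI includes L = 0

# Posterior P(L < 0.1) under a U[0,10] prior on L (likelihood truncated to [0,10])
p = (Phi((0.1 - mu) / sd) - Phi((0.0 - mu) / sd)) / \
    (Phi((10.0 - mu) / sd) - Phi((0.0 - mu) / sd))
print(p < 0.025)    # True: low L stays improbable under the uniform-in-L prior
```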

> I think that is an extraordinarily pessimistic prior (high median, and 2.5% probability to S greater than 370C!)

I don't know exactly why. It seems quite reasonable to me. The median (3.7C) is not that high, and can easily be shifted downward (to, say, 2C) without changing the results materially.

> I think that is an extraordinarily pessimistic prior (high median, and 2.5% probability to S greater than 370C!)

> I don't know exactly why. It seems quite reasonable to me. The median (3.7C) is not that high, and can easily be shifted downward (to, say, 2C) without changing the results materially.

Well yes, I would agree. For a prior this seems reasonable: I know one very Earth-like planet that has suffered runaway feedback. Actually two, counting Snowball Earth. Where do you philosophically draw the line between prior and observed?

The Bayesian literature makes a fairly big thing out of "background information". So I think we are entitled to use ancient physics; Tyndall's measurements, Arrhenius' calculation, Stefan's Law; thermo if needed.

So here is a guess at an uninformed prior. Arrhenius said climate sensitivity to just a doubling of CO2 is 6 K (hope I have that right). From Tyndall, water vapor doubles that (hope I have that right). But then warming the oceans doubles that again (hope I have that right). So S = 24 K, around which I'll most pessimistically draw a Cauchy distribution, being quite uncertain about these values.

>"Well yes, I would agree. For a prior this seems reasonable: I know one very Earth-like planet that has suffered runaway feedback. Actually two, counting Snowball Earth. Where do you philosophically draw the line between prior and observed?"

James has made clear in the paper that he has used a prior from expert opinion as at 1979 and updated with data entirely post 1979.

Now you might want to argue that to arrive at that prior you need a more ignorant prior and update it with knowledge to 1979. But this really isn't necessary - James is arguing that the precise details of his prior don't matter too much as long as something plausible is used. So going back to an ignorant prior and expert knowledge to 1979 isn't necessary.

A sensitivity of over 90C would presumably lead to oceans boiling. I suspect a sensitivity of 40C would be enough to restrict life to only near the poles several times in Earth's history and we would have known about that happening by 1979. This should be ruled out at the 2.5% level.

I don't see much wrong with a median of 3.7C but a 2.5% chance of S greater than 370C is crazy unless you are trying to get back to a prior with only knowledge to 1850 or earlier or something (James has indicated the split between data used to update and all other knowledge doesn't need to be timewise.)

What is more important to ask is how well scientists in 1979 could have predicted the warming rate over the next 30 years for the greenhouse gas levels that have actually existed. If they could have done very well, would this imply some double counting of the (predicted/actual) data to limit the remaining uncertainty?

Eli, why ask about 'minimum amount of information' when it is clear that all information must be used either in the prior or in the data used to update else you end up with something that isn't a credible probability distribution?

For confidence intervals you don't need to make assumptions about the distributions of the unknown parameters: they don't have any.

I used the FG measurement of L (with its associated Normal uncertainty) as it appears in the A&H paper to generate a CI for L. One convenient property of CIs is that they allow easy transformation of the parameters.

Well, it seems that Chris has answered most points pretty well - thanks!

Yoram, it is not in dispute that one can generate an arbitrarily large range of possible posteriors by selecting extreme enough priors. The question is what sort of priors are reasonable, and no-one, not even the most extreme advocate of uniform priors, has ever suggested that 370C is a sensible value. But even in that case (lognormal in L) the posterior does not actually appear to be that unreasonable. This seems to me like a strong demonstration of robustness, not a show of arbitrariness and methodological weakness.

I don't know what you found to disagree with me about confidence intervals - the range you present in your later comment is precisely the 95% probability interval that Forster and Gregory presented (for this, they explicitly assumed a uniform prior in L, noting that this choice was unconventional) and your point about wider intervals extending to L=0 is precisely what I meant with my statement "However, some might go to infinity "and beyond", especially at the higher confidence levels." Of course a confidence interval will just be routinely misinterpreted as a probability interval anyway...

David, there could be a case for using Arrhenius' 6C, but note that this was based on a calculation that is known to have substantial inaccuracies (and does not need doubling and redoubling). The canonical 3C can be viewed as a more careful and credible version of the same calculation. But there is also the issue Chris emphasises of what information is contained in the prior versus likelihood - anyone arguing for a really wide and high prior based on ancient history will have to also consider how to take account of all that we have learnt in the meantime, not just the last couple of decades of some satellite observations that I used.

James: great paper. I have one worry, though. If there's a chance of a 100% world GDP loss (which would represent human extinction or the collapse of civilization or whatever), then from a utilitarian point of view it won't do to take the expected value, as it seems like the effects of a 100% loss would be permanent in a way that other losses probably wouldn't be.

So my question to you is -- do you have any idea how to estimate the probability of a doubling of CO2 causing permanent collapse/extinction? I'd put it at significantly less than 1% but I'm not sure how much less exactly.

One thought that has crossed my mind: would the distribution of sensitivities from the 22 or so models included in the AR4 report constitute a valid prior? These models are supposed to be based on the physics only, untuned to observations.

That's effectively what the original NAS report (Charney) did with the two main models around at the time (Hansen and GFDL). So it's certainly not an unreasonable idea! One still needs to think about what sort of tail to add outside the range of models, though, as it would not (IMO) be reasonable to claim a priori that S cannot lie outside that range. And using modern models brings up the question of double counting, which the 1979 NAS report avoids much more straightforwardly.

Steven, there is no reasonable scenario I can think of in which 2xCO2 causes the end of civilisation (unless society is so brittle that any minor disruption triggers a nuclear holocaust, in which case the next flu pandemic or peak oil will do the job first). In general though, your question points towards issues of nonlinear utility - this is a standard economic issue but one that we preferred not to touch on in this paper.

James... beginner's question... what is a sane way to interpret confidence intervals?

E.g. Statistical Analysis, Kachigan, p. 141:

"In other words, we are 95% sure that the true mean weight of the ball-bearings, had we measured every single one of the day's production, would be somewhere between 149.21 and 150.39 grams. More technically, we are 95% sure that the procedure for creating the obtained confidence interval would produce an interval encompassing the population mean. For the actual interval produced, the true mean either is or is not in it, so strictly speaking we cannot say there is a .95 probability that the mean falls in the interval. This is a moot point from a practical standpoint, but has importance in a theoretical context in which no quantitative probability-type statement is allowable for a specific interval. We will adhere to the more practical view that "95% sure" or "95% confident" are meaningful common sense statements with respect to specific intervals, and are as legitimate as statements which invoke "odds" or phrases such as "substantial assurance" to circumvent the probability issue."

... in other words, is "confidence" used as a hand-waving way of saying "probability", because saying "probability" would be wrong?... if it is likely (but unstated) that a person intended to convey a probability when they wrote a confidence interval... should the interval be interpreted as a confused frequentist statement, or as bayesian with uniform prior (regardless of what one thinks the individual "intended").... if the frequentist frame does not give a location for the population parameter, how can a frequentist frame be useful when interpreting confidence intervals?

Crandles, my point is that the amount of information in the prior affects the result. Given that, is there only an arbitrary way of separating the two (1979, 1980, 1981, etc??).

If so, prior construction becomes an art rather than an algorithm, and one can easily argue that assigning uniform probability from 0 to 100C is OK, even if the oceans boil (priors are supposed to be ignorant!). My preference would be to separate models and observational information, using the former to build the prior.

I don't know. Well, I know the technically correct interpretation (eg an interval generated according to a random process such that p% of intervals so generated will contain the true parameter value), but I also know that almost everyone misinterprets them as probability intervals in practice. There is a huge discussion on the Wikipedia page about this (here on down). The technically correct interpretation seems rather useless in practice... and the term "confidence" seems to sometimes be used as a con-trick in full knowledge that the unwary will indeed interpret it incorrectly as "probability". So in general I am not a fan of them, although they do have the advantage of simplicity.
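A small simulation makes the technically correct reading concrete: the 95% describes the long-run behaviour of the interval-generating procedure, not any one interval. (The setting here is arbitrary: a known-sigma Gaussian for simplicity.)

```python
import numpy as np

rng = np.random.default_rng(1)
true_mu, sd, n, trials = 5.0, 2.0, 25, 20_000

covered = 0
for _ in range(trials):
    x = rng.normal(true_mu, sd, n)         # one "day's" sample
    half = 1.96 * sd / np.sqrt(n)          # known-sigma 95% interval half-width
    covered += (x.mean() - half) <= true_mu <= (x.mean() + half)

print(covered / trials)   # close to 0.95: 95% of intervals cover, which is all it means
```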

Eli,

If you are going to persist in claiming that a uniform prior represents "ignorance", you are going to have to address the question of how one can be ignorant about x but knowledgeable about 1/x, or vice-versa. That doesn't correspond to any plausible definition of "ignorant" IMO (in either technical or common usage).

Eli: "Given that, is there only an arbitrary way of separating the two (1979, 1980, 1981, etc??). If so prior construction becomes an art rather than an algorithm"

Priors are meant to be personal beliefs that differ from person to person. So even if two people agreed to use expert opinion to 1979 their priors would be different.

You are welcome to have a go at it using a different split.

Anyway yes it is more of an art of extracting your own beliefs than an algorithm.

Does this mean that you 'can easily argue that assigning uniform probability from 0 to 100 C is ok'?

Simple answer - No. If I said that I believed there was a 99% chance that aliens would come and steal the moon tomorrow, would this make that a reasonable expectation? I think you would dismiss me as mad much more readily than you would accept that as a reasonable belief.

Where precisely to draw the line between reasonable beliefs and unreasonable ones may not be very clear. However if there is enough of a gap then the precise position does not need to be determined. I doubt you would disagree with me saying any belief that the odds are greater than 25% is an unreasonable belief (other than to suggest a lower percentage could be substituted).

Uniform probability from 0 to 100C implies a 50% probability of sensitivity over 50C. That seems crazy to me. Could any intelligent and well-informed (as of 1979) person reasonably believe that the chance of sensitivity being over 50C is greater (let alone 5 times greater) than the chance of sensitivity being between 0 and 10C?

which states "We find that the climate sensitivity is reduced by at least a factor of 2 when direct and indirect effects of decreasing aerosols are included, compared to the case where the radiative forcing is ascribed only to increases in atmospheric concentrations of carbon dioxide."

Blogged here. It is obviously wrong, and happens to have been published in the same special issue (guest editor: P Chylek) that spawned the Schwartz nonsense. I'm not aware of any attempt at a peer-reviewed comment on it, but it will certainly not influence the field.