Willis on GISS Model E

Willis Eschenbach has done three posts recently at WUWT on the linear relationship between input forcings and global temperature output (here, here, here), with a useful contribution by WUWT reader Paul here. Mosher observed that Isaac Held has also posted recently on the same topic in connection with the GFDL model.

Willis’ posts were accompanied by Excel spreadsheets rather than R scripts – a decision that, in my opinion, makes it much harder both to decode what is going on and to reconcile with the original data. (Isaac Held provided neither.) I’ll publish a short R script below that retrieves the original data and does the analysis.

The idea that global temperature arising from a GCM can be expressed as a simple expression of the forcings was discussed on a number of occasions in early CA days, notably in some posts discussing Kaufmann and Stern’s related (but different) observation that a simple linear relationship between forcings and global temperature outperformed the corresponding GCM output. They observed:

none of the GCM’s have explanatory power for observed temperature additional to that provided by the radiative forcing variables that are used to simulate the GCM…

They had experienced extreme difficulty in trying to run the gauntlet protecting climate science doctrine. They turned up at realclimate here, but, rather than continuing this interesting but subversive discussion online, Schmidt asked that they take the conversation offline. See CA discussions here and here.

The idea of using much simpler models to emulate GCM output of global temperature was the idea behind the MAGICC emulator of Wigley and Raper, used in IPCC TAR to simulate GCM output, a point noted up at CA at the time here:

if your interest is primarily in global temperature, it’s hard to see why you need the terabytes of operations – maybe you do, I haven’t delved into the matter. I notice that, from time to time, they model GCM outputs by simpler models and use these simpler models to project GCM outputs. That’s what they did in IPCC TAR – see Wigley and Raper.

Wigley’s MAGICC algorithm featured in the first Climategate discussion of FOI, with Wigley and Jones discussing methods of evading the legislation:

From their wording, computer code would be covered by the FOIA. My concern was if Sarah [Raper] is/was still employed by UEA. I guess she could claim that she had only written one tenth of the code and release every tenth line.

So it was nice to see fresh discussion of the issue at WUWT. (And while the Excel spreadsheets were ungainly, they did at least show what was being done.) I spent some time parsing the spreadsheets into (what I think are) more functional tools.

Retrieval of GISS Forcing Data: Willis provided an Excel spreadsheet showing his collation. But the direct retrieval is equally easy. Volcanic and nonvolcanic forcing were separated in his analysis and are retrieved here.

The parameters are estimated by minimization of the sum of squares relative to the target. This can be implemented by defining a utility function in the parameters only and then using the R-function nlm as follows (starting values are required):
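The actual script is in R and uses nlm on the retrieved GISS series; purely as a hedged illustration of the same procedure, here is a Python sketch with scipy’s Nelder-Mead minimizer. The one-box lambda/tau recursion, the synthetic forcing series, and the starting values are all stand-ins, not the post’s script:

```python
import numpy as np
from scipy.optimize import minimize

def emulate(params, forcing):
    """One-box emulator: temperature relaxes toward lambda * forcing
    with an e-folding time of tau years."""
    lam, tau = params
    T = np.zeros(len(forcing))
    for n in range(1, len(forcing)):
        T[n] = T[n - 1] + (lam * forcing[n] - T[n - 1]) / tau
    return T

def sse(params, forcing, target):
    """Utility function in the parameters only: sum of squared errors
    relative to the target series."""
    return np.sum((emulate(params, forcing) - target) ** 2)

# Synthetic stand-ins for the GISS forcing series (W/m^2) and the GCM
# global temperature output; the real script retrieves both from GISS.
rng = np.random.default_rng(0)
forcing = np.cumsum(rng.normal(0.02, 0.05, 120))
target = emulate([0.33, 2.6], forcing)

# As with R's nlm, starting values are required
fit = minimize(sse, x0=[0.5, 2.0], args=(forcing, target), method="Nelder-Mead")
lam_hat, tau_hat = fit.x
```

Since the synthetic target was generated from the same recursion, the minimizer recovers the generating parameters, which is a useful sanity check before pointing the utility function at real data.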

Willis, run a search on the phrase “Friends don’t let friends use spreadsheets.” While problems with spreadsheets aren’t limited to Excel, Excel will commonly be the most mentioned. I quit using Excel for stats in the ’90s after it gave me a negative variance. These days I export to csv and use R for anything serious.

The article is particularly interesting in climate contexts because it addresses the reliability of results in linear and nonlinear regression operations. IIRC, the hazards are primarily in using canned routines.

I’m still curious as to why volcanic forcings are arbitrarily reduced to make the “match”.

Steve, perhaps you can replicate the obs temp record with obs forcings, using the same methodology as Willis, Paul, and others have. I mean, if one can replicate surface temps better than the GCMs with the GCMs’ own forcings, then it should be generalizable and replicate one of the temp obs time series with relevant observations of the real forcings.

Ellis,
What makes you think volcanic forcings were reduced? I don’t see that in what Steve did or what Willis did. What am I missing?
Steve: this was done in the manner of Willis and Paul. I don’t vouch for it. Nor do I have any reason to believe that the volcanic forcings are known with much precision.

It’s not surprising that Willis is able to replicate the complex GISS model with a very simple function.

It is very likely that in the beginning phase of development, the GISS programmers started with a very simple equation such as:
T(n) = T(n-1) + sensitivity constant * (CO2(n) – CO2(n-1))

As that would not model real temperature fluctuations at all, they would have tinkered with the equation, gradually adding additional factors.
That would have given a better fit in some years but worse in others, so they fiddled and added some more.
In the end they have a very complex model, requiring a large computer costing many millions of dollars to run.

But their model could not vary too far from replicating actual temperature, so for every ADD they would have had to include a DEDUCT. So all their fiddle faddle and augmentation could not significantly change the overall shape of the output.
Which is why it so closely resembles Willis’ very simple model.

Their basic problem is that T does not vary with CO2, so regardless of sophistication, their model does not replicate reality.

A better model for the period 1850 to 1998 would have been a simple linear model such as:
T(n) = T(n-1) * (1 + (0.7 / 100))

It is perhaps not entirely likely that this simple model can be used to project temperatures into the future to say the year 2100 or thereabouts, with any degree of accuracy.

“It is very likely that in the beginning phase of development, the GISS programmers started with a very simple equation such as:
T(n) = T(n-1) + sensitivity constant * (CO2(n) – CO2(n-1))”

it is highly UNLIKELY that they started with anything of the sort. You can go research GCM development and see this.

As I’ve said before, it’s not very surprising that the low order output of a GCM can be linearized. Given the underlying simplicity of the system, nobody should be shocked. What’s difficult is producing a higher order emulational.

Steve, if one can linearize a GCM output, then where is the chaotic system? the uncontrolled heating? the synergistic positive feedbacks? the tipping point(s)? Explain to me again why you are not shocked.

Re: Steve McIntyre (May 16 00:23), The interesting thing is that Held argues that over a 100 year period we only see the TCR – the transient climate response.
Held also notes that this statistic should be compiled for all the models.

Ron, I’m not shocked because climate is a boundary value problem, not an initial values problem. I WOULD be shocked if you could predict 10 day weather with a simple linearization.

Feedbacks? Well, there are two kinds: fast feedbacks and slow feedbacks. Basically, in the 100 year period you see here you don’t have any of the slower feedbacks that cause the major problems.

The other thing is that people forget that a chaotic system doesn’t have to be chaotic in all of its metrics. The average surface temperature is just one dimension of the system. Actually, it’s an index of a proxy for heat.

Finally, most people get mesmerized by this one metric (it’s not an output) of the system. Don’t be most people.

Re: Frank K. (May 16 07:57), Sorry, I think Pielke is wrong, but this is not the place to discuss it. The long term average of weather (we call this climate) is controlled by the boundary conditions. The average of OHC over the next 1000 years is not sensitive to the exact state of the PDO today.

Now if climate were defined that way we’d all have to conclude that there is currently no climate change. However, when most folk refer to climate change they MUST mean only global temperature because every other long term weather metric is not significantly changing.

Steven,
Your response regarding the chaotic system is reasonable. I can accept that. The answer regarding slower feedbacks is contrary to my understanding. According to Hansen, the heat in the pipeline will not take 100 years to show up on the surface. You did not address the tipping point issue at all. And, as Frank notes, you are mistaken about it being a boundary value problem.

When I read modelers talking informally about their models, they use words like “we are dealing with something real.” They describe computer runs as “experiments.” They treat the models as though they have real predictive power. I think they would be shocked to learn the temperature output can be matched by a simple formula. And I am shocked that you are not shocked.

The climate system is not JUST a boundary OR an initial values problem. The GCMs went to boundary values as they had a bad tendency to lose their minds (go outside reasonable expectations) and had to be limited. They ended up with something that appears to generally follow the climate direction.

The lack of initial values means that they do not match the squiggles of the actual climate. The recent flat temps were a surprise to the modellers until they went back and INITIALIZED the models with realistic inputs. They then saw the flat temps we are experiencing.

So, we have initial and boundary values both needed, yet the GCMs still cannot match reality. Why is that? Oh yeah, the largest forcing in the system, the sun, has been determined to be a VARIABLE star, and we are currently getting some eddimuhkashun in what that actually means to the earth system. If you don’t KNOW where your wife set the thermostat in the house and how many windows/doors the kids left open, you won’t KNOW what the temperature will be when you get home. (snicker)

“Im not shocked because climate is a boundary value problem not an initial values problem.”

The argument over this will no doubt be endless; however, I believe many are talking past each other here. Perhaps you would be happier with “Total energy content of the climate system is a boundary values problem. The near surface temperature is somewhat related to this, but is not linearly transformable from it”, or something like that.

Like Mosher, I’m unsurprised that there is a low-order relationship between forcings and global temperature and surmised this quite early on. Nor am I particularly enamored of chaos arguments (and have discouraged them here). The low-order relationship seems to me to be closely related to the climate sensitivity of a model. The properties of the relationship for individual models seem well worth describing.

I think if you look into the CAM3 model for even a few hours, you will find that several potentially complex factors are quietly reduced to far simpler results. It is necessary for some aspects of the calculations; however, it leaves little room for higher order ‘unexpected’ results.

What’s a “high order emulational”? I missed that in my college numerical methods courses…

In any case, most AOGCMs start with time-dependent partial differential equations supposedly derived from the principles of conservation of mass, momentum and energy – and then they go downhill from there. Auxiliary models are layered on top of the basic “core” models in an attempt to get all of the relevant “physics” in the codes. The source terms (i.e. “forcings”) and boundary/initial conditions dominate the solution, which probably explains why low order metrics like the GMST follow the forcings in a relatively simple way. Amusingly, it is these very same low order metrics that the modelers trot out to demonstrate that their models are “validated” by real-world data…

What is a higher order emulation? Very simple. A GCM step by step calculates 3 dimensional states. So, for example, at a given time step you have a state for a large number of atmosphere boxes and ocean boxes. Such a state has a high dimension. Now, select a very small subset of the atmosphere boxes (the air at 1 meter) and the sea boxes at the surface. Average all of those measures to 1 number. There you have a low dimension: 1 number.

Now, many many many different higher order configurations will result in the same one number.

If the world’s average temp is 14C, that tells you nothing about the mass of data that was averaged to get 14C. You’ve gone from a description with many dimensions to a system with 1.

The real challenge that folks are working on is emulation models that get the higher order states correct. So, emulators that can take input forcings and generate a map of temps rather than just a point estimate of the area average.
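The many-to-one nature of the averaging can be shown with a toy example (array sizes and numbers are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Two different 3-D 'atmosphere' states (lat x lon x level), sizes invented
state_a = rng.normal(14.0, 10.0, (8, 16, 4))
state_b = state_a[::-1, ::-1, ::-1].copy()   # same values, spatially rearranged

# The low-order metric: average everything down to one number
gmst_a = state_a.mean()
gmst_b = state_b.mean()
# Very different high-dimensional states, the same single-number summary
```

Any permutation of the boxes leaves the average unchanged, so the single number cannot distinguish among a vast set of distinct 3-D states.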

Thanks Steve. I agree with your statement – averaging 3-D solutions to obtain a 0-D result (the ubiquitous GMST) tells you nothing about the quality of the GCM solution. Why then is this the “popular” way of conveying the “accuracy” and “robustness” of the models?

That doesn’t of course stop countless, costly local impacts papers that use multiple model runs as if they were useful for that purpose. So while they certainly should all know that models are inadequate for such impacts assessment tasks, in practice they either ignore that salient (and undisputed) fact, or they don’t really know it at all.

AusieDan,
I don’t think that’s what they did. From what I understand they started with weather forecasting models and expanded them to include additional factors that they decided were important in modeling climate, like the various forcings. I don’t believe that makes them any better than what you suggest, however I think it’s important to report these things accurately and to correct false assumptions.

I’ve read the advanced simulations are 3 dimensional box models. Each box has an internal state and an IO interface with its immediate neighbours. This was how I pictured a program would do it and confirmed with some googling. You then run a turn based simulation where each turn moves forward in time and results in a changed property state of each box. At the vertical bottom the surface can be defined as it actually appears on the earth. Obviously rotational dynamics affect solar input.

Not sure if the model in question is like that, but that’s how an advanced model would do it. Smaller boxes provide greater detail, kind of like what polygons are to a 3D CGI model.
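The turn-based box scheme the commenter describes can be sketched in a few lines of Python. This is nothing like a real GCM: the grid size, diffusivity, cooling term and ‘solar’ profile are all invented for illustration.

```python
import numpy as np

def step(grid, solar_in, diffusivity=0.2, cooling=0.1):
    """One 'turn' of a toy box model: each box exchanges with its four
    horizontal neighbours, gains external solar input, and loses heat
    in proportion to its temperature (a crude outgoing radiation)."""
    neighbours = (np.roll(grid, 1, 0) + np.roll(grid, -1, 0) +
                  np.roll(grid, 1, 1) + np.roll(grid, -1, 1)) / 4.0
    return grid + diffusivity * (neighbours - grid) + solar_in - cooling * grid

grid = np.zeros((6, 12))  # 6 latitude bands x 12 longitude boxes
# More input near the 'equator' (middle rows), little at the 'poles'
solar = 0.1 * np.cos(np.linspace(-np.pi / 2, np.pi / 2, 6))[:, None] * np.ones((1, 12))
for _ in range(50):
    grid = step(grid, solar)
```

Each turn moves the state forward in time, and smaller boxes would resolve more detail, exactly as with polygons in a CGI model; the wrap-around neighbours are only sensible east-west on a real sphere, which is one of many ways this toy cuts corners.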

Confessing I’ve lost count of versions, does this version of GISS-E agree well with the raw data of the original countries/bodies supplying the data? It would be a shame to spend time on a hypothetical construct. All of these reconstructions need an audit, still. In the early days, there was a decline between about 1945 and 1970, but its shape keeps changing. More ocean data, I suppose.

if one can linearize a GCM output, then where is the chaotic system? the uncontrolled heating? the synergistic positive feedbacks? the tipping point(s)?

The short answer is: in the real world.
The longer one would be that you must not forget that this exercise compares only 2 MODELS and has no relevance to the reality.
This is basically saying that the yearly temperature averages (or anomalies) produced by a GCM whose internal workings are too complex to be understood in detail can be perfectly (accuracy > 99%) reproduced by a simple model with 2 parameters.

The obvious conclusion is that (all?) GCMs, despite their apparent complexity, behave just like sensitivity/lag 1-D models as far as temperature averages are concerned.
Following this demonstration, and if you are interested in the temperature averages, it is valid to substitute the simple formula for any GCM run.
Then in a second phase you may compare the simplistic formula to the reality and observe that it cannot correctly represent it.
For instance, the sensitive dependence on initial conditions (also called deterministic chaos), which is, as we know, present in the real world for time scales at least up to several years, is not in the formula, nor are multidecadal temperature cycles.
I prefer “bifurcation” to “tipping point”, but here too we know by paleo evidence that temperature bifurcations exist in the real world (this time on much longer time scales), yet the formula can’t produce them.
Etc.

As the formula clearly doesn’t describe temperature dynamics in the real world, if you make the jump to equate the formula with GCM production, you necessarily conclude that GCMs don’t correctly describe the temperature dynamics either.
Of course, one should never forget that when the discussion is about dynamics (like here), the correct definition of time scales is paramount.
What may be true for 1 time scale is not necessarily true for another.

Arbitrarily reducing the influence of volcanic forcing is a bodge, the apparent necessity of which clearly indicates that the model is inadequate whether coded in R or Excel. It is patently absurd to try to model climate with a single time scale – the atmosphere, the surface oceans and the deep oceans all have their own time scales. This was recently demonstrated, yet again, over at http://tamino.wordpress.com/2011/05/14/fake-forcing

The change in lambda is from 0.3349366 to 0.33232276 and of tau from 2.6342481 to 3.52347902.

It seems to me that the issue of whether one can represent GISS global temperature as a small-parameter expression of forcings is a totally different issue from whether it makes sense to model climate on a single time scale. You may well be right on this latter point, but it may also be the case (as it appears to be) that GCM output of global temperature can be represented as a relatively simple expression of forcings. Indeed, although unpublicized, as I pointed out in my post (and as Mosher had pointed out at WUWT), this seems to be common ground with Isaac Held, who posted recently, making what seems to me to be more or less the same point as Willis in respect of the representation, though obviously not in respect of the relevance of this observation to the GCM enterprise.

As to Tamino’s post, while he spent considerable energy in criticizing Willis’ handling of volcanic forcing, he didn’t do the obvious step of showing the actual impact of eliminating this parameter – as I did here. Unlike Tamino, I’ve provided turnkey scripts whereby readers can check results if interested.

An additional comment on “fake forcing” – a term used in Tamino’s post. Many people have observed that there is no solid data on historical aerosol forcing, for example. Kiehl observed that the selection of historical data in GCMs has had the opportunistic result of making results more similar, i.e. high-sensitivity models use low historical variation and vice versa. Tamino, though, seems unworried about this.

I would not be in the least surprised if the mean global temperature in a GCM can be represented by a rather simple expression. If the sole value of interest were the mean global temperature, GCMs would perhaps be an excessive tool to use, but it is not. Mean global temperature is only one of many variables of interest, including regional temperature and precipitation variability, sea ice, and hurricane numbers and strength. I doubt these are so easy to fit a simple expression to.

Re: Scott B. (May 16 11:42), They suck at certain things, like sea surface salt concentrations. Things like changing salinity due to rainfall, when you can’t get the rainfall right, because you can’t get the clouds right. Etc., etc.

Good agreement was found with the observed temporal response of solar and thermal radiation, but the observed net radiation change was about 25 percent smaller than calculated. Because of uncertainties in measured radiation anomalies and aerosol properties, that difference is not large enough to define any change to the stratospheric aerosol forcing. It is noteworthy that reduction of volcanic aerosol forcing by that amount would bring simulated temperatures after historic eruptions into better accord with observations, but given the absence of further evidence we employ the above equation for the aerosol forcing.

The 25 percent factor estimated by Hansen here is almost exactly the same as Paul’s calculated amount.

What Willis and Steve achieved here is an emulation of the MODELS, not the climate. The MODELS are essentially linear functions of forcing. They are known not to be able to emulate multi-decadal oscillations such as the PDO. If these oscillations are internal (due to ocean currents, for example), then the models miss this functionality that is not linear. If the oscillations are forced by solar activity in some way (my preferred theory), then the models are missing important inputs. Either way, the graph above is unrealistic and too linear.

I was merely trying to provide a usable tool to understand what was and wasn’t being argued.

I find most discussion of “Oscillations” to be unenlightening as most such “oscillations” seem to me to be 1/f phenomena and not true “oscillations”. And to lack explanatory power. It also seems evident to me that the models simulate a variety of turbulent phenomena.

To the extent that I have an issue, it’s this – if one can represent global temperature as a relatively low-parameter response to forcing, then it seems to me that many interesting and difficult modeling problems in GCMs may well be irrelevant to understanding the impact of doubled CO2. And that this should be relevant to expositions to the public.

Craig,
I think we all understand the formula is emulating the models and not the climate. I am shocked that we have invested something like $80 billion into climate models that are as you describe them “essentially linear functions of forcing.” Is that really what the government thought they were paying for?

Steve – Ron, you’re oversimplifying here. The GCMs obviously do a lot more than a simple linear function. In addition, the parameters of the linear fit are obtained by fitting to GCM outputs and could not simply be pulled out of the air. Nonetheless, I think it’s a reasonable question whether there is over-investment in the GCM sector relative to its usefulness. But that’s a different issue.

Steve,
According to my 2008 FOIA, GISS has not performed NASA standard SQA, so your statement that GCMs “obviously do a lot more than a simple linear function” is tenuous, at least in the case of GISS Model E.

Steve, perhaps I am oversimplifying and I acknowledge the GCMs have several outputs (although I don’t know what all of them are) but the output everyone cares about is the global mean surface temperature.

Perhaps it would be helpful to know all of the GCM outputs because that would give us other ways to assess the GCMs and their errors.

It would be very cool to see a matrix of selected output parameters comparing the different models, the ranges within specific models, and a ranking of how well the output ’emulates’ nature (assuming some of these things can be physically measured).

They find that external forcing can give Atlantic multidecadal fluctuations, but note that the model output in this post completely lacks a PDO-type oscillation (or, to limit the claim, just the mid-century warming and the 1950 to 1975 cooling), as do all the outputs shown by the IPCC. This analysis would suggest that they are thus lacking adequate forcing data (or misusing it).

This reminds me of a situation in macroeconomics where 2 very different models generated the same apparent fit to a macro data set, namely the scatter of aggregate consumption against aggregate income. The simple linear Keynesian marginal propensity to consume (MPC) model fit the data just fine, but so did the Friedman Permanent Income Hypothesis (PIH) model. The MPC model was static, but the PIH is dynamic and has long forward-looking intertemporal decision-making in it. The test between the two arose from predictions about responses to policy changes. The Keynesian MPC model predicts a large, rapid change in consumption due to a one-time deficit-financed stimulus. The PIH predicts little or no change, because households anticipate the future fiscal contraction needed to pay for the present stimulus.

Both the MPC and the PIH can simulate consumption-income curves that look like real-world data. But one of them was conspicuously worse at forecasting the likely effect of policy measures during the 70s stagflation.

I’d be curious to see how the linear insta-GCM forecasts the future temperature line in a zero-GHG-after-2010 experiment. As I understand it the giant models project lots of continued temperature increase due to inertia and in-the-pipeline processes. I suspect the insta-GCM forecast would flatten out pretty quickly.

I’d be curious to see how the linear insta-GCM forecasts the future temperature line in a zero-GHG-after-2010 experiment. As I understand it the giant models project lots of continued temperature increase due to inertia and in-the-pipeline processes. I suspect the insta-GCM forecast would flatten out pretty quickly.

The linear fit shown here is up to 2003, but that isn’t material to what you’re wondering about. As you surmise, it flattens out quickly without further input of forcing with only about 0.1 deg C in the “pipeline” on this emulation.
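For concreteness, here is a hedged Python sketch of that “freeze the forcing” experiment, using a one-box emulator with lambda/tau values of the order quoted in the thread. The forcing ramp itself is invented, so the committed-warming number is illustrative only, not the 0.1 deg C figure from the actual emulation:

```python
import numpy as np

lam, tau = 0.33, 2.6   # emulator parameters of the order quoted in the thread

def emulate(forcing):
    """One-box recursion: relax toward lam * forcing with time constant tau."""
    T = np.zeros(len(forcing))
    for n in range(1, len(forcing)):
        T[n] = T[n - 1] + (lam * forcing[n] - T[n - 1]) / tau
    return T

# Forcing ramps for 150 'years', then is frozen with zero further increase
ramp = np.linspace(0.0, 2.5, 150)                 # W/m^2, invented for illustration
frozen = np.concatenate([ramp, np.full(100, ramp[-1])])
T = emulate(frozen)

# Warming still 'in the pipeline' at the moment the forcing is frozen
committed = lam * ramp[-1] - T[149]
```

Because the time constant is only a few years, the series closes almost all of the small remaining gap to equilibrium within a decade of the freeze – the flattening described above.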

Interesting. What you have are 2 models that take input forcings and generate an output series, which yield very different forecasts under the same future forcing scenario, yet there is (likely) no basis for selecting between them based on statistical test of historical fit since their historical behaviour is nearly identical.

We have the big complicated model that says (say) 2C warming is in the pipeline, while the simple linear insta-GCM says only 0.1 C is in the pipeline. We can’t rank these 2 models based on historical accuracy since they yield historically identical outputs. On what basis should we rank them?

It is tempting to use an a priori criterion: the big model is preferred because it builds in more physical processes. But that also means more places where the model can go wrong. And Occam’s razor would favour the simple model. Either way, any claim about what’s in the pipeline appears to be based on an arbitrary choice between 2 models with equivalent global behaviour.

In Biology, there was a war in the field of systematics during the 20th Century. It was very bitter, with the two sides differing in fundamental approach in how to determine the relationship among species. You had two totally opposed schools at each other’s throats for years.

But when you looked at results, the two opposing approaches pretty well agreed down the line.

Not sure if this has already been asked, but are the GISS E projected forcings up to 2099 available somewhere? How does the insta-GCM with parameters fitted using the hindcast forcings compare to the GISS E forecast if we use these projected forcings up to 2099?

A1B forcings for GHGs are available. I don’t know what they use for aerosols and other things. GISS seems to be the best at making forcing information available. Locating forcings for other models is a nightmare. It would be worthwhile making a thread on this and seeing what readers can turn up. It will probably be necessary to get modelers to fulfil their obligations.

if your interest is primarily in global temperature, it’s hard to see why you need the terabytes of operations – maybe you do, I haven’t delved into the matter. I notice that, from time to time, they model GCM outputs by simpler models and use these simpler models to project GCM outputs. That’s what they did in IPCC TAR – see Wigley and Raper.

Well, you can get examples where large simulations can be used to get at a simple solution. For example, in finance, you can use a Monte Carlo solution to get the same answer as a closed form analytical solution for Black-Scholes.

It would be interesting to set up a short term prediction for global temperatures and see how well it does.

What I would be interested in is a discussion of the ‘meanings’ of the various fitted constants. What is the interpretation of the lag, as an example?

Jim Hansen has frequently talked about a GCM’s response function, which gives the fraction of equilibrium warming as a function of time (as a result of an instantaneous forcing).

He then showed that using this response function one can simulate the global avg temp quite well:

“All we must do is multiply the response function by the annual change of forcing and integrate over time.
In ten to the minus seven seconds, we obtain a predicted global temperature change.
The result, the red curve, agrees very well with the GCM result, which required 10 to the plus seven seconds, or four months.
So we saved a factor of ten to the fourteenth power.”
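Hansen’s recipe – multiply the response function by the annual change of forcing and integrate over time – is just a discrete convolution. A sketch, with an assumed exponential response function and a made-up forcing history (Hansen’s actual response functions are diagnosed from GCM step-forcing runs, not assumed):

```python
import numpy as np

# Assumed exponential response function R(t): fraction of equilibrium
# warming realized t years after an instantaneous forcing
t = np.arange(100)
R = 1.0 - np.exp(-t / 4.0)

lam = 0.75                              # illustrative sensitivity, K per W/m^2
forcing = np.linspace(0.0, 2.0, 100)    # made-up forcing history, W/m^2
dF = np.diff(forcing, prepend=0.0)      # annual change of forcing

# Hansen's recipe: multiply the response function by the annual change
# of forcing and integrate over time, i.e. a discrete convolution
T = lam * np.convolve(dF, R)[:len(forcing)]
```

Since R never exceeds 1, the convolved temperature always lags the equilibrium response lam * F, which is exactly the “in the pipeline” gap discussed elsewhere in the thread.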

Re: Bart Verheggen (May 16 09:26),
Bart, you say that Hansen has “frequently” talked about this phenomenon and while it is unsurprising to specialists, it’s not something that is prominently stated in IPCC reports. (I haven’t parsed IPCC reports looking for mentions but don’t recall it being mentioned. If this is incorrect, I’m happy to amend). Even your reference is not to something in indexed academic literature, but to a talk (though it would be surprising if the phenomenon is not described in indexed academic literature somewhere.)

Apologies if I haven’t got the gist of this, but as I understand it, you have successfully written a model which takes the inputs of a GCM and replicates the outputs by deriving the function which (with lags and different weightings) best matches the output of the GCM. A good start, and given the uncertainty in projecting the forcings, I am of the view that any results you could derive from this “GCM emulator” are just as good as the results from the fully functional GCM.

However, it seems to me that you could go one better. If you took the forcings and found the simple model which gave you the best match to the actual temperature record, then you might well find that you can outdo the GCMs’ accuracy. In a sense you can’t help but do better, because the GCMs are bound by the theories and understanding of science that the modeller has. If you are just taking forcings and finding the best match to the data, you are being agnostic about what actually matters.

If I understand correctly, I think this is similar to what Prof. Dr. Nir Shaviv did, and he got a residual half that of the GCM.

Really curious to know what you think….

Great site and amazing due diligence on a dreadfully sloppy profession.
Steve: you have to be careful in being less critical of this analysis than of GCMs because you “like” the results.

Steve, you are right that I should take care on that point, but this whole thing reminds me hugely of the reservoir engineering discipline. I am a Petroleum Engineer by trade. On a macro level we use material balance and analytical methods to model reservoir performance; to get at the detail we build complex reservoir models with many cells and lots of detail. The physical system is very complex and we can never truly represent it so we use equations that have a physical or experimental basis and fudge factors to make them match reality. The value of the detailed model is that it lets you make detailed decisions like placing a well in a particular spot to drain some unswept oil… they don’t tend to be any better at predicting the overall future than the simple models. Of course young engineers think they are better but those of us who have been around a while are less convinced.

The GCM is analogous to the detailed reservoir model, and if you acknowledge all the inputs properly it should be possible to build a history match that gets the overall answer exactly right. What is difficult with reservoir models is to get all the individual well pressures and flowrates right, and that’s because you haven’t enough fudge factors to tweak… but at an overall level (i.e. gross field oil and water production rates) it’s easy. It should be easy to make the GCMs match the overall surface temperature if 1) the temperature record is sound and 2) all the inputs that matter in reality have been accounted for. The fact that the GCMs gloss over the early 20th C warming and are over sensitive to volcanic inputs is a sign that all is not right.

The simplified model you present here is analogous to the material balance models and analytical methods we use in reservoir engineering. All that is missing is the next step: instead of deriving the model that best fits a GCM output – which is interesting – derive a model that best fits the climate history, and you would have a tool which is likely significantly better than a GCM at predicting the overall global temperature trend. No good for saying which areas are drier or wetter, but then if the GCMs don’t get the basics right I doubt their regional predictions are worth a damn.

I have become convinced that the “cosmic rays impact some cloud cover and thus impact global temperature to some extent” theory is right. The CLOUD experiment at CERN is going to demonstrate that there is a sound physical mechanism for this. But as a reservoir engineer I don’t think I have to derive the equation that links cosmic ray intensity to temperature impact theoretically – I am happy to include it along with a bunch of other forcings, do a history match, and tweak the forcing factors / fudge factors (we normally think up scientific-sounding names like pseudo-relative permeability) to get the best fit – then we have a workable tool…. I thought that Prof. Shaviv was making a first stab at that about 20 mins into his talk.

With your analytical skills I bet you could come up with something that would knock the best of the GCM’s into a cocked hat.

The US Gov’t has been interested in condensing the results of large, multivariate simulations into a simpler, Excel-spreadsheet-type model so they can weigh alternatives more easily. I am not saying this is better; they are just frustrated by spending tons of money and time on the set-up and running of large simulations. Of course, you have to actually run the large sims with likely inputs to get the results needed to build the simple models (much like running GCMs). Over time, if the process you are studying has significant changes to some input variables, you re-run the big sims, then re-do the simpler model to match those results. You can see where it would be tempting when working with the GCMs to assume that all of the inputs are “known”, or that their increase or decrease with time can be estimated (i.e. rises in GHGs, decrease in forest, etc.). From the above discussion, it looks like future efforts may be well spent on reviewing the actual level of the forcers, and less time and money spent on the actual models.

It looks like the Kaufmann and Stern paper (available in DP form here) has never been published. They regress the time series of global average surface temperature (T) on 4 classes of explanatory variables: a GCM prediction of T, the input forcings to the GCM, other forcings not used in the GCM run, and other exogenous variables (time trend, lags etc). Their specific model is an error-correction model, which separately identifies short-run and long-run relationships based on cointegrating vectors.

They can then test whether the GCM output adds any explanatory power over and above a linear combination of the input forcings themselves, in other words whether the structure of the GCM has explanatory power. Likewise they test if the forcings have any explanatory power over and above the GCM structure.

The results are that in 7 of 8 cases, the GCM-generated T has no explanatory power, in the sense that if the forcing inputs are included, the hypothesis of zero explanatory power for the GCM output cannot be rejected. However, if the GCM-generated T series is included, the hypothesis that the forcing variables can be excluded is always rejected.
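The logic of their encompassing test can be sketched with plain OLS and an F-test (their actual model is an error-correction model with cointegrating vectors; the data below are synthetic, with a “GCM” series that tracks the GHG forcing but misses volcanoes):

```python
import numpy as np

# Sketch of the Kaufmann-and-Stern-style encompassing test on made-up data:
# does the GCM output add explanatory power once the forcings are in the
# regression, and do the forcings add power once the GCM output is in?
rng = np.random.default_rng(1)
n = 150
ghg = np.linspace(0.0, 2.5, n)
volc = -np.abs(rng.normal(0.0, 0.3, n))
temp = 0.3 * ghg + 0.2 * volc + rng.normal(0.0, 0.05, n)  # "observed" T
gcm = 0.3 * ghg + rng.normal(0.0, 0.05, n)                # GCM misses volcanoes

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

def f_stat(rss_restricted, rss_full, q, df_full):
    """F statistic for excluding q regressors from the full model."""
    return ((rss_restricted - rss_full) / q) / (rss_full / df_full)

ones = np.ones(n)
rss_full = rss(np.column_stack([ones, ghg, volc, gcm]), temp)

# Does the GCM series add power beyond the forcings?
F_gcm = f_stat(rss(np.column_stack([ones, ghg, volc]), temp), rss_full, 1, n - 4)
# Do the forcings add power beyond the GCM series?
F_forcings = f_stat(rss(np.column_stack([ones, gcm]), temp), rss_full, 2, n - 4)

print(F_gcm, F_forcings)  # expect F_gcm small, F_forcings large
```

By construction the GCM series here contributes nothing beyond the forcings, so the pattern mirrors their 7-of-8 finding; with real series the test is what decides.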

They give no details about how the forcing variables were generated, referring to a 2000 paper that is not in the list of refs, but is probably their 2002 JGR paper, which only says they used the Shine et al. (1991) and Kattenberg (1996) formulas for producing forcings from observed GHG and sulfate levels. So they appear to have shown that the development of the structure of GCMs beyond simple forcing concepts circa 1991-1996 did not yield any explanatory power for global average surface temperature. Ouch!

I must have spent too much time at Science of Doom´s blog trying to bone up on all the radiation physics. I cannot see how CO2 forcing in W/m2 can be an input to a GCM. To calculate the forcing you have to calculate changes to the OLR and that cannot be done without knowing the atmospheric temperature profile and the amount of the various GHGs at all altitudes.

I had always assumed that it was the job of the GCMs to calculate the temperature profile and distribution of water vapour so that the energy exchanges by radiation could be calculated. This obviously has to be an iterative procedure as the temperature profile is affected by the radiation exchange.

I quite understand the concept of forcing as in the IPCC definition but I can´t see how to relate this to the calculations in a GCM of the evolving changes in OLR, temperatures and water content.

Surely the concept of forcing in terms of changes in OLR calculated by a GCM is an emergent property and not an input.

I think the fact that different models (or different runs of the same model) give the same trend but very different absolute values is significant here (see Lucia’s post here: http://rankexploits.com/musings/2009/fact-6a-model-simulations-dont-match-average-surface-temperature-of-the-earth/). This can only happen if there are no feedback responses to temperature, or if the model has become dominated by “non-physical” parameters (i.e. parameters that are real but with unknown values, or “fudge factors” intended to preserve energy balance, etc.). The models are “spun up” until they are stable, then extra forcings are added (volcanoes, anthropogenic CO2 etc). The energy change without any feedback is by definition linear in the forcing, hence the temperature change must also be linear in the forcing. My guess is that any models that show instability are rejected, leaving a subset that are stable. It would be relatively easy to choose the unknown parameters to force stability, compared to the task of choosing them to give an acceptable amount of instability.
To answer Ross McKitrick’s question, the way to choose between different models is to see how well the models represent reality on smaller scales than global averages. You may of course reject all the models on this basis.
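The linearity argument can be illustrated with a one-box energy balance model. The parameter values below are illustrative only, not taken from any GCM:

```python
import numpy as np

# Minimal one-box energy balance sketch: C dT/dt = F(t) - lambda * T.
# Because the equation is linear, doubling the forcing exactly doubles the
# temperature response -- the linearity discussed in the comment above.
def one_box(forcing, lam=1.2, heat_cap=8.0, dt=1.0):
    """Euler-step the box; lam in W/m^2/K, heat_cap in W yr/m^2/K (made up)."""
    temp = np.zeros(len(forcing) + 1)
    for i, f in enumerate(forcing):
        temp[i + 1] = temp[i] + dt * (f - lam * temp[i]) / heat_cap
    return temp

forcing = np.linspace(0.0, 3.0, 100)  # ramping forcing, W/m^2
t1 = one_box(forcing)
t2 = one_box(2.0 * forcing)
print(np.allclose(t2, 2.0 * t1))      # True: response is linear in the forcing
```

Anything nonlinear (a temperature-dependent feedback, say) would break the exact scaling, which is why the quality of the linear emulation is informative about the model.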

This is a Wired article describing the work at Cornell by researcher Hod Lipson. The AI program Eureqa takes data and forcings and, using genetic algorithms, breeds formulas that predict the data from the forcings. The formulae are bred from combinations of simple functions such as lags, exponentials, sinusoids, multiplications, etc. Successive generations of formulae are generated and tested to create useful formulae.

For example, I have seen a video in which Lipson demonstrated how his system took the data for the movement of a double pendulum and from this generated both the Hamiltonian and the Lagrangian. Note that Lipson acknowledges that the system has no understanding of what these mean – i.e. the conservation of energy – only that the formulae can be used to provide accurate renditions of the outputs from the forcings. He also demonstrated work that he did with a biologist on a biological system. Initially he had used the basic functions that he had used for physics. The output formula was awkward and useless. Then, in discussion with the biologist, he recognized that he should supply the system with lags. The formula that the system produced was not only simple and elegant but simpler and just as accurate as the one that the researcher had just published in Science. However, Lipson acknowledged that neither he nor the biologist understood exactly what this formula was describing. That was to be the subject of further research. He also did work for some particle physicists and rediscovered a basic law relating the mass of particles from the data provided. The system does appear to work.

When I was watching the video, I was immediately struck by the question of whether Eureqa could usefully be used to take climate forcings and measured parameters and create a model that links them. Eureqa is available on the web, and Lipson stated that he was looking for interesting problems.
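A toy version of the formula-search idea – nothing like Eureqa’s actual genetic programming, just an exhaustive search over a few hand-picked basis functions with an Ockham-style penalty for formula length, on synthetic data:

```python
import itertools
import numpy as np

# Toy formula search: the "true" data mix a linear trend and a 22-year cycle;
# the search must pick the right basis functions out of a small candidate set.
rng = np.random.default_rng(2)
t = np.arange(200.0)
forcing = 0.01 * t
data = 0.4 * forcing + 0.3 * np.sin(2 * np.pi * t / 22) + rng.normal(0, 0.02, 200)

basis = {
    "linear": forcing,
    "lag50": np.concatenate([np.zeros(50), forcing[:-50]]),  # lagged forcing
    "sin22": np.sin(2 * np.pi * t / 22),
    "exp": np.exp(-t / 50),
}

best = None
for r in (1, 2, 3):
    for names in itertools.combinations(basis, r):
        X = np.column_stack([basis[n] for n in names])
        beta, *_ = np.linalg.lstsq(X, data, rcond=None)
        mse = np.mean((data - X @ beta) ** 2)
        score = mse + 0.001 * r  # Ockham-style penalty: prefer shorter formulas
        if best is None or score < best[0]:
            best = (score, names)

print(best[1])  # which basis functions the search selected
```

The penalty term is the part that mimics Lipson’s preference for short, elegant formulas; without it the search would always pad the formula with useless terms.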

1. We want understanding.
2. The input data is suspect in certain cases. When we have the physics correct and the data and physics don’t match, that is important. It sends us back to the data collection drawing board and the physics drawing board. With this approach the chances of finding a mismatch vanish.
For problems where the data are known to be valid, fitting a black box (either a NN or a genetic-algorithm approach) is fine. Think of an HMM (hidden Markov model) used in speech-to-text as a good example.

And Lipson did distinguish it from neural networks in that there are no hidden layers that no one understands. It creates a discrete formula linking the forcings to the outputs. As I recall, he stated that the evaluation function for each generation used a form of Ockham’s Razor to give preference to short, elegant formulas.

I have no particular knowledge of this. It just seems to me to be a useful venue for further exploration. It is available free on the web.

What I would love to see, and likely will never see, is a comparison of model runs when the input variables are fixed. It drives me nuts that things like the CMIP don’t ask for this kind of comparison. In fact, it drives me nuts that the model programmers all treat each other like they were actors at the Oscars, doing nothing but complimenting each other’s work. In any other field, they’d be saying “my model is the best and here’s the tests that prove it!”

Instead of that, we have a bizarre model “democracy”, one model one vote, that is scientifically indefensible. Hey, guys, get your act together and let us see which one of the models really is better than the others. Propose some standard tests. I’d suggest starting by looking at monthly temperature (absolute, not anomaly) along with its first and second derivatives. Those have to be “lifelike” for the models to be even considered. From there, the tests would get harder.
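The screening test proposed above is easy to sketch. Both series below are synthetic stand-ins: the “model” is given the right seasonal rhythm but the wrong absolute level, which is exactly what an anomaly-based comparison would hide:

```python
import numpy as np

# Compare first and second differences of absolute monthly temperature.
months = np.arange(240)
seasonal = 3.5 * np.sin(2 * np.pi * months / 12)
obs = 14.0 + seasonal + np.random.default_rng(3).normal(0, 0.2, 240)
model = 15.5 + seasonal  # right rhythm, wrong absolute level (offset 1.5 K)

d1_obs, d1_mod = np.diff(obs), np.diff(model)
d2_obs, d2_mod = np.diff(obs, n=2), np.diff(model, n=2)

# Absolute values reveal the level mismatch; the derivatives test "lifelikeness".
print(abs(obs.mean() - model.mean()))    # large absolute-level mismatch
print(abs(d1_obs.std() - d1_mod.std()))  # first-derivative behaviour is close
print(abs(d2_obs.std() - d2_mod.std()))  # so is the second derivative
```

A model could thus pass the derivative tests while failing the absolute-temperature test, which is why all three are worth checking.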

If we had consistent forcings and clear tests, then we could easily see whether the models are adding information or not. I like the Kaufmann and Stern approach, which looks to see whether explanatory power is increased by adding information such as the GCM predictions, other forcings, and the like.

But obviously, none of the modelers want that. If they comment unfavorably on someone else’s model, the other modeler might return the favor … and of course, their model might fail the test.

That is largely misleading. Gavin and I have had public discussions on RC about model scoring and the problems with a democracy of models. In fact, for attribution studies they do not use a democracy of models. Model scoring is an active area of research. I’ve pointed this out to you and others. I’ll do it again; here is just one example:

Well, kinda. That describes the CMIP5 plans to provide forcings and temperatures for what they are curiously calling the “Last Millennium (850-1850 CE)”.

I’m not sure how that came to be the “Last Millennium”, but setting that aside, this seems like a huge attempt to stack the deck. I was asking for a bunch of models run with the same forcings. Here’s their description (emphasis mine):

The experiments described below are thus intended to be as realistic a description of past forcings and response as possible, while trying hard to ensure that the uncertainties in the forcings are properly represented. This has lead us to suggest a number of alternate forcing histories for volcanic and solar changes rather than specifying that all models use a single, arbitrarily chosen, one. This approach is different from the one adopted for previous PMIP experiments, which requested that all modeling groups used similar forcing in order to favor model intercomparisons

So the situation is getting worse rather than better. I have mentioned before that the modelers have little interest in comparing and testing the models. This is another example.

The real issue that I have with generating a set of forcings for the period 850-1850 is … what on earth are they planning to compare their results to? I mean, you can happily run a climate model over estimated forcings for the period … but how will you know if it is giving wrong answers for the year 972? And since we can’t know, what’s the point?

Thanks, Mosh. My point is that if all of the models are using different forcings, it’s really hard to say anything about the results.

Gavin, on the other hand, advises that there should be a variety of forcing histories available for the use of the models.

I just want some tests to weed out the wheat from the chaff … unfortunately, most reasonable tests wouldn’t reveal any wheat, so there’s little interest in the field in comparing models in any realistic manner.

Re: Willis Eschenbach (Jun 8 16:24), I think the engineer in me wants to see the variety of responses to the same forcings. I was also shocked to find that they use models that don’t have volcanic forcing.

Let’s say this: the way you or I would structure the experiment would be to focus on validation and weeding out the less skillful models. They don’t seem to share that bent. I’d also structure SRES differently.

Interesting question, Tom, and interesting program. It seems to me that what Eureqa does is automate the “black box analysis” process. I’ll have to look into that.

Like Mosher, I’m unsurprised that there is a low-order relationship between forcings and global temperature, and surmised this quite early on. Nor am I particularly enamored with chaos arguments (and have discouraged them here). The low-order relationship seems to me to be closely related to the climate sensitivity of a model. The properties of the relationship for individual models seem well worth describing.

This is a recurring problem with climate, people constantly mix up the models and the reality.

I have not shown, nor has anyone to my knowledge, that “there is a low-order relationship between forcings and global temperature”. That’s the question we’re trying to answer, not a statement anyone can make. I don’t think such a relationship exists, any more than there is a relationship between the internal temperature of my house (“global temperature”) and how much gas I burn to heat my house (“forcings”).

I suspect you meant “there is a low-order relationship between forcings and [model simulations of] global temperature”. At least I hope so …

Given the utter simplicity of what has to happen in a climate system (watts out has to, over time, equal watts in), it’s pretty obvious that there will be a low-order relationship between forcings and the temperature. To be sure, on smaller spatial scales and shorter time scales we will find all sorts of complications and twists and turns. But the underlying governing physics and the boundary conditions make it fairly obvious.

I have a hard time with this statement as posited. In the real world it is tantamount to the declaration that Hadley cells, cloud formation, ocean currents, and albedo have near-zero impact WRT measured and assumed inputs. Is it not possible, or even likely, that feedbacks are powerful and non-linear?

That’s the beauty of the simple fits: they show that the models are far less free to solve the unknown than advertised.

Steven,
Our measurements of watts out and watts in are completely unreliable. According to Trenberth, we have a great deal more coming in than going out. Yet the oceans are not warming. All this missing heat is a travesty, remember?

We don’t know what is going on with climate. Negative feedbacks may be the answer, but no one really knows.

Re: Ron Cram (May 16 22:04), you miss the point. The point is that watts out will balance watts in, over time. So again, I see no reason whatsoever to expect that the higher-order outputs cannot be emulated by a linearized system.

Now, on the short term, at the fine spatial scale, I have no such expectation. In the long run the temperature of the earth is very predictable: absolute zero.

Steven, I didn’t express myself well. The second paragraph should have been first, because it is the important one.

We don’t know what is going on with climate. We don’t know the forcings of natural variability, so the GCMs, even simple models, cannot emulate the forcings of natural variability (including the natural negative feedbacks). Even if we did know all of the forcings, it is a leap of logic to assume there would be a linear relationship to the forcings. It may be possible to have years, even decades, of imbalance with more heat coming in than going out offset by years or decades of imbalance in the opposite direction.

All of this uncertainty is expressed by Trenberth’s travesty comment. Our energy accounts don’t balance. We don’t know why. The climate is doing things we don’t understand and the planet is far cooler than cAGW theory expected.

Worst of all, the $80 billion invested in climate models cannot model the holy grail of a “tipping point” which certainly is not linear. No, the models are stuck with a linear output of the metric held most dear by climate scientists – global temperatures. This has to be depressing to climate modelers.

“We don’t know what is going on with climate. We don’t know the forcings of natural variability, so the GCMs, even simple models, cannot emulate the forcings of natural variability (including the natural negative feedbacks). Even if we did know all of the forcings, it is a leap of logic to assume there would be a linear relationship to the forcings. It may be possible to have years, even decades, of imbalance with more heat coming in than going out offset by years or decades of imbalance in the opposite direction.”

The argument from ignorance doesn’t get much traction with me. We always “know” something. Let’s take natural variability. First off, that’s a horrible description. So what do you mean by that? You might mean “internal variability”. That would be the variability (in what measures?) that the system exhibits when external forcing stays constant. So that hinges on the definition of external and internal forcing. While it might be LOGICALLY possible to have years of imbalance, a little understanding of the inertia in the system will give you some confidence that you are not going to see massive non-linear overshoots in the system. When the dominant input (the sun) dims by a little, we do not get instantaneous overshoots into a snowball earth. Not happening on the scale of years. On the largest of scales (spatial and temporal) it’s a heavily damped system.

But if you take into account the outside temperature, you will find a nice linear relation between in/outside temperature difference and gas use, no matter how the inside temperature is controlled…
Something similar to energy balance and ocean heat content, the ultimate resulting parameter of reference (directly correlating with global temperature)…

It also helps to account for the heat generated within the house. Thus the accounting relative to “degree days” is offset to 65 F (for traditionally poorly insulated houses).
I don’t think “climate models” are able to detect the actual anthropogenic heat generated – they just attribute consequences to the accumulation of CO2 (without much understanding of whether warming oceans cause clouds, or thinning clouds warm the oceans, etc.).

How much in these models is actual physics, and how much is turning of knobs to get the desired result? As someone on this blog put it a number of years ago, “pulling back the curtain reveals a bunch of bumpkins pulling levers on GCMs”.

It sure seems like with all the uncertainties, such as cloud dynamics for instance, GCM outputs are more like playing SIM World than replicating real world physics.

“the only parameter settings that are usually allowed to change are the sea ice albedos and a single parameter in the atmosphere component. This is the relative humidity threshold above which low clouds are formed, and it is used to balance the coupled model at the TOA. A few 100 year coupled runs are required to find the best values for these parameters based on the Arctic sea ice thickness and a good TOA heat balance”

This model is not tuned to hit a temperature series. One knob is turned to balance the model at the TOA. Ice albedo is tuned so that the thickness of sea ice matches observations.

That’s a fairly sound practice. If they just wriggled knobs to hit the global average temp, that would be more of an issue.

Mosh, a modeler told me recently that climate sensitivity is a “tunable parameter” – the main knob being cloud parameterization. He seemed to regard this as common knowledge. I expressed surprise, noting that no remotely comparable statement appeared in IPCC. He said that there were references in the literature, eventually referring me to some articles from the early 1990s.

Ellingson, in a dry comment in the 1990s, observed that most models at the time had incorrect radiative schemes (most being corrected by now), but that even models with quite different radiative schemes had very similar results from doubled CO2, leading him to raise his eyebrow slightly.

Re: Steve McIntyre (May 16 23:52), I’m looking at recent statements made by the CCSM folks as to which knobs they turned. See my comment below. They fiddled with a relative humidity/low cloud formation threshold to balance energy at the TOA.

I make my comment also based on an AGU presentation I saw where the modelers were creating an emulation of the GCM so they could explore the parameter space. The inference being that they were unable to explore the parameter space by turning knobs because of run-time constraints.

In fact the emulations allowed them to find bugs in the GCM (like the climate.net case); the point being that very often with huge simulators like this, free parameters are set once and then NOBODY DARES TOUCH THEM, because it worked. So they will settle on a couple of knobs to twist.

Then when they finally do twist the untwisted knobs… crash.

What people should be complaining about is not the twisting of knobs, but rather the fact that the parameter space has not been fully exercised. Twisting a knob is the logical equivalent of exploring uncertainty. We think the parameter lies between .2 and .8, so we set it at .5, and then we never explore the IMPACT of our ignorance.

So as usual, I view knob twisting exactly opposite of the way most people look at it. I wanna see them twist knobs, but I want to know what it looks like at the limits.

But that is a good question. Which knobs did you turn? Ideally, you want to know how a system runs with every min/max setting of the free parameters. That would be a good diagnostic. And you’d wanna know what models twisted what knobs.

Held seems to be an open guy, maybe he would answer such a question. Tim Palmer might as well.

The BBC climateprediction.net experiment has 34 adjustable parameters. Even if they were all only two-state parameters, it would take about 170 million completed runs to explore 1% of the space. It is not really feasible to do such a thing.
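The arithmetic behind that figure:

```python
# Even with only two settings per parameter, 34 parameters give 2**34
# combinations; sampling just 1% of that space needs ~170 million runs.
combos = 2 ** 34
print(combos)  # 17179869184
runs_for_one_percent = combos // 100
print(runs_for_one_percent)  # 171798691, i.e. about 170 million
```

With realistic continuous parameter ranges the space is vastly larger still, which is the commenter’s point.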

I actually asked Gavin on RC why this wasn’t done and he answered that this would not be “useful”.

Of course we all know that the spread of results would be just as in climateprediction.net or larger, so such an exercise could only weaken the argument that GCMs are useful to any degree at all for policy.

They aren’t building bridges here, they are merely guesstimating in the face of huge uncertainties with zero accountability if they are wrong; so they get away with it.

I’m pretty sure that we agree on this. At one time, I read some papers on really stripped down Aquaworld models. I rather wish that there were more presentations based on models with less baroque inputs.

“the only parameter settings that are usually allowed to change are the sea ice albedos and a single parameter in the atmosphere component. This is the relative humidity threshold above which low clouds are formed, and it is used to balance the coupled model at the TOA. A few 100 year coupled runs are required to find the best values for these parameters based on the Arctic sea ice thickness and a good TOA heat balance”

Unfortunately, I fear that is a huge oversimplification that leads to an incorrect conclusion.

The model has four separate and individually operating components—ocean, land, atmosphere, and sea ice. These components are first individually tuned to match the historical record by adjusting their own various parameters. Then they are coupled together.

You left off the first part of the quote, which says (omitted words in bold):

Once the components are coupled, then the only parameter settings that are usually allowed to change …

Clearly, this means that there are an unknown number of other adjustable parameters. All but two of them are held fixed for a given run. However, this says nothing about the adjustment of the various other parameter settings before coupling of the components. Or in between runs, when the individual components may be re-tuned. All it says is that once they are hooked up there’s only two knobs allowed to turn … but what about the host of knobs that are turned before they are coupled? The individual components are tuned by adjusting their own parameters. Then, once they are tuned to reproduce the past and they are coupled together, only two parameters can be changed … sorry, but that doesn’t make me feel better about the number of adjustable parameters. And no … the number of adjustable parameters isn’t “two” as you claim.

Finally, since one of the two adjustable knobs is used to balance the model out regarding TOA radiation … how can they then claim to find out anything about the TOA radiation balance? Which way the balance goes after it is artificially forced into balance depends on which way and how much the knob has to be twisted to balance it all out, and the reasons why it wasn’t balanced to start with.

After it is artificially balanced, we can measure which way the balance moves when something else changes … but it doesn’t mean anything without knowledge of exactly what lack the artificial balancing is making up for. And we don’t have that knowledge.

w.

PS – the claim that “the models take too long to run to tune by twisting a bunch of knobs” is clearly untrue. We know for a fact that the models are tuned … and how do you think they are tuned? They do a run and see how the model did. Then they change some things and try again. Yes, it’s slow … nature of the beast.

Finally, there is also a parameter available in the coupling, viz:

The atmosphere, land and sea ice components communicate both state information and fluxes through the coupler every atmospheric timestep. The only fluxes still calculated in the coupler are those between the atmosphere and ocean, and the coupler communicates them to the ocean component only once a day.

Since the atmosphere and the ocean exchange momentum, mass, and energy across the interface, seems like that would make at least three more adjustable parameters, maybe more …

Re: troyca (May 16 20:01), Re: Willis Eschenbach (May 17 00:54), with 34 knobs to turn (39 if you include state variables like slab ocean on/off, sulfur cycle on/off), it’s pretty clear that they do not spend time twisting a bunch of knobs. Here is a suggestion: spend some time exchanging mail with some of these people. Some of them will take the time to explain what they do. That will help you stick closer to the actual state of affairs. However, I wish they did twist a bunch of knobs.

Mosh, in fairness to Willis, he’s spent lots of time corresponding with CRU trying to get data and getting the runaround. He has recently been refused access to some model data. Also, scientists complain that they don’t want to spend time explaining what they do.

My point is this. I’ve done loads of searching around and pointed people at resources, explained how to get them, written to people and gotten answers, watched their videos, attended their talks, and in all that I’ve seen no evidence whatsoever of some of the things claimed.

“but what about the host of knobs that are turned before they are coupled? The individual components are tuned by adjusting their own parameters. ”

I’ve seen no papers claiming this occurs, no presentations, no discussions, no emails, no nothing.

I’m suggesting that Willis should get some evidence that this occurs before he claims it does occur. To suggest that he not write people, because others have denied him in the past, doesn’t square with me. CRU denied me as well. That didn’t stop me from asking others. Many have no issue sending me info, even when they know who I am. Heck, I even exchange email with guys on the team (second stringers, of course).

So I object to making claims about what they do without some citation.

We know what they say in public and we know that is different to what they say in private.

Anyone who has done any large scale numerical simulation knows that they must tune the models or they just couldn’t replicate anything with these huge uncertainty ranges.

Now you’ve heard that everyone knowledgeable about climate is “surprised” that all these models with all these differing parameter values with huge error bars somehow more or less get similar results.

You’ve also now heard that in private, climate modelers admit they can easily tune models. In practice most knobs have little significant effect; the aerosol uncertainty traditionally swamps everything, and internal variability (of apparently unknown origin) is getting more and more used.

“Also, scientists complain that they don’t want to spend time explaining what they do.”

Well, that explains a lot … I seem to recall Gavin Schmidt stating once that he didn’t spend any significant time writing documentation of his codes because he wasn’t paid to do that … he was paid to do “science”!

Not sure if that was a glitch replying to my original comments, as I didn’t mention anything related to knob turning. I was simply curious whether we can run a regression against the A1B scenario forecast with the forcings, like we did with the hindcast, which would require the forcings for the GISS-E A1B scenario. After looking into it a bit more, it looks like most of the forcings are held constant at 2000 levels, except for a simulated solar cycle and the increase in well-mixed GHGs. Unfortunately, the only forcings estimate data I can find for GHGs is the mixing ratio estimate for the separate gases, and I’m not sure how to get this into a combined W/m^2 figure. Using the efficacy table for conversion I can get reasonable values, though they may not be precise enough for these purposes.
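For the CO2 part at least, the standard simplified expression from Myhre et al. (1998), as used in the IPCC TAR, converts a mixing ratio directly into W/m^2. A minimal sketch (the 278 ppm pre-industrial baseline is my assumption; CH4 and N2O have their own expressions and would still need the efficacy adjustments mentioned above):

```python
import math

def co2_forcing(c_ppm, c0_ppm=278.0):
    """Simplified radiative forcing expression (Myhre et al. 1998):
    dF = 5.35 * ln(C/C0), in W/m^2, from CO2 mixing ratios in ppm."""
    return 5.35 * math.log(c_ppm / c0_ppm)

# A doubling of CO2 gives the familiar ~3.7 W/m^2:
print(round(co2_forcing(2 * 278.0), 2))  # 3.71
```

The other well-mixed gases have analogous logarithmic or square-root fits, so a combined forcing adequate for a regression exercise may be within reach this way.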

Re: troyca (May 16 20:01), Re: Willis Eschenbach (May 17 00:54), with 34 knobs to turn (39 if you include state variables like slab ocean on/off, sulfur cycle on/off) it’s pretty clear that they do not spend time twisting a bunch of knobs.

Last time you said there were only two knobs that were adjusted. Now there are 34 knobs, and your explanation is that there are so many of them, they couldn’t spend time adjusting them.

What am I missing? How is this an explanation?

“but what about the host of knobs that are turned before they are coupled? The individual components are tuned by adjusting their own parameters. ”

“I’ve seen no papers claiming this occurs, no presentations, no discussions, no emails, no nothing.”

“I’m suggesting that willis should get some evidence that this occurs, before he claims it does occur.”

Steven, your own citation provided the evidence that the individual components are tuned individually. It said that once the individual models were coupled together, only two of the parameters were adjusted. That means there’s more than two parameters. Here’s the quote:

Once the components are coupled, then the only parameter settings that are usually allowed to change …

This clearly means:

1. There are more parameters than the two that are allowed to change.

2. Before the components are coupled, those two plus the other parameters are all allowed to change.

On my planet, that’s called the “evidence” that you have accused me of not providing … and you are the one that provided it.

given the utter simplicity of what has to happen in a climate system ( watts OUT has to over time equal watts in)
it’s pretty obvious that there will be a low order relationship between forcings and the temperature.

Given the utter simplicity of what has to happen in my house ( watts OUT has to over time equal watts in or the house would burn down), perhaps you can explain why the consumption of heating gas does NOT have a low order relationship to my house’s internal temperature?

In other words, the fact that a system or sub-system has to be in energetic balance means nothing about the relationship between inputs and the resulting temperature.

What I would love to see, and likely will never see, is a comparison of model runs when the input variables are fixed. It drives me nuts that things like the CMIP don’t ask for this kind of comparison. In fact, it drives me nuts that the model programmers all treat each other like they […]

Re: Willis Eschenbach (May 16 16:43),

But obviously, none of the modelers want that. If they comment unfavorably on someone else’s model, the other modeler might return the favor … and of course, their model might fail the test.

That is largely misleading. Gavin and I have had public discussions on RC about model scoring and the problems with a democracy of models. In fact, for attribution studies they do not use a democracy of models.

It’s not a model democracy? Who knew? I assume, in that case, that you can spell out for us the procedures for the inclusion of only certain models in the IPCC reports, and what tests were used to separate the good models from the bad in AR3, and how those procedures to weed out the bad models were improved for the FAR …

In fact for the IPCC reports in general they do use a democracy of models. And studies like the Santer et al. analysis of tropical tropospheric amplification use a democracy of models.

Now, that may be changing. But to claim it is not happening is not true.

Also, discussions on RC are not public, since many people like myself are prevented from joining in the discussion. Which is why I rarely read it, it’s just like reading badly reported news … and many places, unlike RC, I can comment on the discussion. So I’m sorry to have missed the stimulating interaction, but it would only have sent my blood pressure through the roof.

“it’s not a model democracy? Who knew? I assume, in that case, that you can spell out for us the procedures for the inclusion of only certain models in the IPCC reports, and what tests were used to separate the good models from the bad in AR3, and how those procedures to weed out the bad models were improved for the FAR …”

All science is a work in progress. The way things are handled in the FAR, TAR, AR4, and AR5 will be an evolution. If you read the literature I pointed you at, you’ll see the pathways that people are looking at to secure the results on a statistically defensible footing. I don’t think it’s fair to say that the democracy of models was recognized as the best and only approach; the issue was discussed in the mails, of course. As for model selection in AR4, I’ve pointed out to you several times the interesting issue in the attribution studies. The point is this: there are some who recognize the issues with a democracy of models. It’s an active area of research.

“Also, discussions on RC are not public, since many people like myself are prevented from joining in the discussion. Which is why I rarely read it, it’s just like reading badly reported news … and many places, unlike RC, I can comment on the discussion. So I’m sorry to have missed the stimulating interaction, but it would only have sent my blood pressure through the roof.”

I’ve found that if I ask open and honest questions, Gavin will almost always respond to me. It also helps to retest your assumptions. But I’ve talked about the democracy of models here as well. Still, you could google “gavin schmidt democracy of models” and actually see what Gavin has said about this.

So, you know, when the phrase “democracy of models” came up in the mails, I recalled that Gavin and I had discussed this back in 2007-8, and I wanted to see if he had written anything on it. Just a little research into the state of the science so as to avoid obvious misstatements.

“it’s not a model democracy? Who knew? I assume, in that case, that you can spell out for us the procedures for the inclusion of only certain models in the IPCC reports, and what tests were used to separate the good models from the bad in AR3, and how those procedures to weed out the bad models were improved for the FAR …”

“All science is a work in progress. The way things are handled in the FAR, TAR, AR4, and AR5 will be an evolution. If you read the literature I pointed you at, you’ll see the pathways that people are looking at to secure the results on a statistically defensible footing. I don’t think it’s fair to say that the democracy of models was recognized as the best and only approach; the issue was discussed in the mails, of course. As for model selection in AR4, I’ve pointed out to you several times the interesting issue in the attribution studies. The point is this: there are some who recognize the issues with a democracy of models. It’s an active area of research.”

Steve, you claimed it was not a model democracy. Now, you are saying that it is an “active area of research”, and people are looking at “the pathways” to fix that … change goalposts much? It has been a democracy up to now. Yes, people have talked about changing it, but that’s not what you claimed. You claimed it wasn’t a model democracy, and in the IPCC assessment reports it definitely has been. Might change in AR5 … but I’m guessing not.

Next, I invited you to tell me the difference between the TAR and AR4 democracy. You come back and claim there is an “evolution” … so let us in on the secrets. What evolved? I mean you must know what changed between TAR and AR4, or you wouldn’t claim there’s an “evolution” … so what evolved?

Here’s the simple way to settle this. Give us the names of the models that weren’t allowed to be considered in TAR and AR4, the ones that didn’t make the cut, so we can see for ourselves how the entrance requirements have “evolved” over time.

Or, you could just admit that there was no entrance exam, there was no suite of tests that the models had to pass to be allowed into either TAR or AR4 … in other words, a “model democracy”.

A number of commenters seem to be under the impression that this model represents (just) a slightly more complicated version of a statistical fit of a linear combination of forcings to predict temperature.

I would characterise it differently. It is a physics-based model with two fitted parameters (ignoring the volcano fudge, which is irrelevant). Consequently, it allows a “complete” description of the system, within the limits of validity of the model assumptions.

In this instance, this means, inter alia, that this simple model solves (simultaneously) not only for temperature but also for accumulated heat energy and the TOA radiative difference, whether you want it to or not!

Nobody here seems to have observed that in the same post in which I shared this temperature solution with Willis, I later included a match of the simple model to the gain in OHC in the GISS-E model. This is important!

I promised to do a post for Lucia on ECS within the next week or two, and I believe I have some significant results to add to the above, but I also think that it would be discourteous of me to share too much in advance.

How is the issue of a false correlation being addressed with this analysis?

The analysis takes as its beginning a relationship, expressed in a formula, between temperature and some measured parameters. It then determines a best set of factors linking the input parameters to the output for a set of measurements taken in only one climate regime. That is, it produces a set of factors for the present day. However, how can we be sure that the same factors would be derived for periods such as the LIA or the MWP? I do not see anything in the formula that indicates any feedback between the forcings. For example, there is nothing that indicates the effect of a change in albedo; that seems to be implicitly held constant. That is, there is nothing that would indicate that the forcings themselves are functions of temperature. The amount of methane in the atmosphere, for example, both affects the temperature and depends on it.

It is these feedbacks that would produce the natural variations that are of concern. It is not surprising that this formula does not indicate chaos, since it contains no feedbacks that could create chaos. To me it looks like the forcings are just assumed to be constant and not a function of temperature.

A fully functional GCM would have these feedbacks within it and would be able to produce valid outputs for multiple climate regimes. This method would seem to produce an approximation valid only within local regions of individual climate regimes. It would be more useful as a convenience for calculations and explanations within individual regimes. The correlations that it generates are only valid locally.

I believe that the modelers are unable to backcast the MWP, which is the other reason why it was so important to minimize it. Since the Roman period was an even larger excursion, they seriously cannot replicate earth conditions, not to mention having little idea about the ice ages or hothouse earth. Remember their statement that without the “anthropogenic” CO2 they cannot replicate recent temperatures? Allegedly there were no high CO2 levels back then.

Anyone thinking the models are anything more than somewhat useful experimental tools has been fooled by the hype. They do not know enough about the system to successfully model it.

After searching for an hour or so, the oddity is that T, the CCSM3 model predictions, doesn’t appear in the top five results … meaning that the sum of just three forcings did better at predicting the actual temperature than did the CCSM3 model results. Not a good sign for the model.

More to be reported. Next, I’ll give it all of the forcings, plus the previous actual temperatures, plus the CCSM3 model results, and see what Eureqa thinks is important. Oh, the winner of the simplest formula was

given the utter simplicity of what has to happen in a climate system ( watts OUT has to over time equal watts in)
it’s pretty obvious that there will be a low order relationship between forcings and the temperature.
To be sure on smaller spatial scales and shorter time scales we will find all sorts of complications and twists
and turns. But the underlying governing physics and the boundary conditions make it fairly obvious.

This kind of statement completely misses the point of the time scales. The dynamics of the Earth system is uniquely defined by the number of degrees of freedom. It is only for the degrees of freedom of the system that initial conditions matter, while everything else is a boundary condition. But it would be trivially wrong to think that the number of degrees of freedom doesn’t change with the time scales. In fact it does, and this is the fundamental complexity of the Earth’s dynamics.

If the system is studied at time scales of weeks, then the only degrees of freedom are the atmospheric velocity, temperature, and pressure fields. Only they need initial conditions, and everything else can be approximated as boundary conditions.

Now if the system is studied at decadal time scales, the number of degrees of freedom increases, because the oceanic velocity and temperature fields are added to the former. At these time scales the latter fields stop being boundary conditions and need initial conditions. It would clearly be absurd to study the dynamics at these scales and say that the oceans’ initial conditions (currents and temperatures) don’t matter.

At still larger time scales, polar ice, continental movements, and astronomical parameters stop being boundary conditions too. Even the solar output stops being constant at large time scales.

So with increasing time scales the number of degrees of freedom increases, and the initial conditions of the new degrees of freedom begin to matter. Then there is of course the poorly understood cloud dynamics, which is probably a degree of freedom at every time scale.

It is also obviously wrong that the system must realize “watts in = watts out” by integration over some time scale. This would mean that the internal energy is constant, which was not the case during the last 4.5 billion years. One can observe that at all time scales, from an hour to millions of years, the internal energy has always varied. An Ice Age Earth has a lower internal energy than a hot, humid Earth without ice caps; a cold year less than a hot year; etc.

Even though average temperature is a very bad proxy for internal energy (for the same average temperature there are very different internal energies, and for the same internal energy there are very different average temperatures), there is enough evidence to estimate the internal energy and see that it varies at all time scales. So in the Earth’s history there has never been, and will never be, a time scale at which “watts in = watts out”. One would wish that the REAL system’s dynamics were as trivially simple and linear as that, but it isn’t.

* a fundamental physics part, comprising the pressure gradient force, advection, and gravity. There are no tunable constants or functions.
* the remainder is parameterized physics (meaning that even where some equations from a physics formulation are used, tunable constants and functions are included that are based on observations and/or more detailed models). These parameterizations are almost always developed using just a subset of actual real-world conditions with one-dimensional (column) representations, yet are then applied in the climate models for all situations! The parameterized physics in the atmospheric model include long- and short-wave radiative fluxes; stratiform clouds and precipitation; deep cumulus cloud and associated precipitation; boundary layer turbulence; land-air interactions; and ocean-air interactions.

So this statement concerning CCSM:

“the only parameter settings that are usually allowed to change are the sea ice albedos and a single parameter in the atmosphere component. This is the relative humidity threshold above which low clouds are formed, and it is used to balance the coupled model at the TOA. A few 100 year coupled runs are required to find the best values for these parameters based on the Arctic sea ice thickness and a good TOA heat balance”

is that either the CCSM modelers are embellishing, or they’ve got virtually all the physics, including clouds, the least understood component, figured out. To say modelers do not “tune” their models to match observations is more than a bit of a stretch. Which models predicted the Arctic ice conditions of 2007, reportedly caused by unusual wind and ocean circulation patterns, which led modelers to predict ice-free conditions as early as 2008 or 2013? None. Which models predicted the North Atlantic OHC dropping since 1998? None. Which models predicted Arctic region OHC correctly? None. Which models predicted global OHC flattening after 2003? None. At least I couldn’t find any evidence of that. In fact, ocean heat in the tropics (20S/20N) should be gaining more than anywhere, but that isn’t happening either. Why? As the oceans appear ultimately to be the key player in global land SAT, I fail to understand how GCMs can be considered predictive tools for future climate when they cannot predict the behavior of the oceans.

That there are no untested ad hoc “tunable” assumptions built into these models is hard to swallow. I don’t buy it.

BTW, if the wind is blowing during winter, it is virtually guaranteed fuel usage will increase without an increase in outside temperature; one of those tunable “parameters”. Changing sunlight exposure will alter fuel needs as well without a change in outside temperature.

It beats me why no one else seems to think this is important, but I’ll risk repeating myself: all of Willis and Steve’s analysis is not on the global mean temperature, but on the anomaly or trend. If the absolute value of the temperature is wrong, and we know it is because the models all give different values, then none of the rest of the physics can be right. The only other possibility is that temperature is strictly an output, i.e., there is no temperature feedback whatsoever. In that case, of course, the temperature anomaly will be linear in the forcing.

Well, if you want to have lots of fun while wasting lots of time, go get a copy of Eureqa (see above).

Here are some results hot off the presses. When given a choice of all individual GISS forcings, the total GISS forcing, the GISSE hindcast, the change in total forcing, and the previous year’s temperature, Eureqa finds the best mix to determine the actual temperature to be a linear combination of volcanic, stratospheric H2O, black carbon, and last year’s temperature. However, it is worth noting that the out-of-sample r^2 is about 0.6, not what I’d call a good fit.

As with the CCSM3 data reported above, this combination of individual forcings does better than any combination using the actual GISSE hindcast. This does not say good things about the quality of the GISSE hindcast, when individual forcings do better out-of-sample than the hindcast does …

Interestingly, black carbon, which GISS lists as warming the planet, gets a negative value in the equations, indicating that it cools overall rather than warms … I’ve held for a while that the net effect of BC must be one of cooling and not warming. While this is very weak evidence in support of that, it goes in the right direction. (I say that because if you imagine the atmosphere being a shell of black carbon, there’d be no greenhouse warming at all …)

Not sure what kind of weight to put on any of this, not much so far, but it is great to have an automated black-box analyzer. More experiments to follow. The main conclusion I have so far is that no linear transformation of the variables does well in hindcasting the actual temperature … which is no surprise to me.
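The out-of-sample scoring described above can be illustrated with a toy version (synthetic series standing in for the forcings, not the GISS data): fit a linear combination on an early training segment, then compute r^2 only on the held-out years.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 110

# Toy stand-ins for candidate predictors (NOT the GISS forcing series):
volc = -np.abs(rng.normal(0, 1, n)) * (rng.random(n) < 0.1)  # occasional negative spikes
ghg = np.linspace(0, 2, n)                                    # slow ramp
temp = 0.4 * ghg + 0.3 * volc + rng.normal(0, 0.05, n)        # synthetic "observed" series

X = np.column_stack([np.ones(n), ghg, volc])
train, test = slice(0, 80), slice(80, n)

# Fit the linear combination on the training segment only:
beta, *_ = np.linalg.lstsq(X[train], temp[train], rcond=None)

# Score on the held-out segment:
pred = X[test] @ beta
ss_res = np.sum((temp[test] - pred) ** 2)
ss_tot = np.sum((temp[test] - temp[test].mean()) ** 2)
r2_oos = 1 - ss_res / ss_tot
print(round(r2_oos, 2))
```

The point of the split is exactly the one made above: an in-sample r^2 always flatters the fit, so the hindcast and the raw forcings should be compared on years the fit never saw.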

Snip this if it’s too vague, please.
To help follow the argument, I resort to a parallel model that I know better: the temperature of the human body. This interacts with various repetitive functions such as pulse rate, respiration rate, perspiration rate, day/night, and awake/asleep. Some of these can act in synchronism at some times, be fairly independent at others, or lag. An unspecified forcing can act on one or several, but it is not constrained to a linear effect. Feedbacks are obviously at work as the body hunts around trying to reach an equilibrium.

Now, if you had measures of all of these parameters for a person over a 10 year period at hourly intervals, would you be able to construct a predictive model? My guess is that it would be impossible. For example, just when you have all the knobs aligned, for a male, a smashingly nice perfumed woman walks by and proceeds to disrobe. c.f. unforecast volcanic eruption.

Surely, for climate temperature models, the unpredictable externalities can overwhelm, and some of the known ones are too hard to quantify. We speak about clouds upsetting the model, but will we ever know about other upsetting factors that have so far escaped inclusion? My take is that the problem is too hard, the uncertainties too great, and the value of a model that works “acceptably” is not offset by the huge cost. I go the complex way on Occam’s razor selection processes: the more variables that can be included, the better, so more black boxes seem better than fewer. Really, I would not even consider Occam’s razor as part of the logic; it’s a synthetic beastie like the precautionary principle.

Is the analogy useful? It might be easier to test than climate data. Simpler tests could perhaps be done on data from unfortunate people on life support.

If you look at my previous postings on this site, I derived a simple proof long ago that one can choose nonphysical forcings for any system of time dependent equations that will produce any desired solution. That is exactly what you have shown, i.e., that the climate model parameters have been tuned to reproduce the historical record using physically inaccurate forcings. That this is the case can also be seen in Sylvie Gravel’s manuscript on short term forecasts where supposedly better forecast accuracy is obtained using inaccurate parameterizations.

The incorrect type of dissipation (e.g., hyperviscosity) or size of dissipation (physically orders of magnitude too large) leads to an incorrect cascade of enstrophy compared to reality. Then one can only obtain an energy spectrum that appears reasonable by using inaccurate forcings.

Notice in Fig 5 of Willis’ post on WUWT the difference in smoothness and magnitude of the various forcings. The individual volcanic forcings are similar to a Dirac delta function, which is well approximated by exponential functions that decrease in width and increase in height, and they are much larger than the remaining forcings. Although there is spatial smoothing of the forcing due to the dissipation, and temporal smoothing due to the time integration, it is easy to see how the modelers have used the volcanic forcing to control the model output to match the spikes in the obs data. As Willis has shown, the remaining forcings have little to do with the agreement between the model results and the obs.

You’re right, Jerry. The wiggle-matching between T and volcanic forcings in Figure 2 is essentially perfect. The GHG forcings produce the overall line shape and the volcanoes produce the wiggles. Who could ask for more? 🙂 I tested the little linear equation in my Skeptic paper for my own interest back then and, given the exact forcings, it also gave good matches to ModelE outputs. So Willis’ result seems to be general.

Do you have time for a quick question, Gerry? Can you show that climate models use “physically inaccurate” forcings, rather than just asserting that they do? As I recall, Willis used the exact same forcings for his analysis (he says he did) that the model used; can you show that these forcings were “physically inaccurate”? I’d also like some clarification as to what you mean by “inaccurate” – if (say) an annual average of CO2 from Mauna Loa is 376 ppm, but the model uses a value of 375.95, is that too inaccurate?

I am always happy to answer any questions and cite references that I feel might be helpful.

In this case I might suggest that you peruse Sylvie Gravel’s manuscript on this site (if it is not still here, I can ask Steve to repost it). Here is a brief description of the results.

Sylvie ran the Canadian Met office global forecast model for a series of tests over many days (standard practice when any changes are made) and compared the forecasts with the obs, using mathematical norms to measure the relative accuracy. All parameterized physics (forcings) were removed and the accuracy compared to that of the full operational model. The accuracy of the full-physics and no-physics forecasts was essentially identical for the first few days, i.e., the physics parameterizations were of no help during this period.

Sylvie then determined what physical parameterization had the most impact on the forecast accuracy for the next few days. It turned out to be the boundary layer parameterization. Without that parameterization, the wind speeds near the lower boundary would increase too rapidly. However one could still see the errors in the solution increase near the surface and eventually destroy the solution at higher altitudes. A simpler form of drag produced just as good a result, and in both cases the parameterizations were clearly not physical, but just ad hoc gimmicks to simulate the real boundary layer.

The only reason the forecast models stay on track is the updating of the wind obs every 6 hours (I can cite a theoretical paper by Kreiss and me here if you want, or look at the experiments in Sylvie’s manuscript). Without the updating of the wind obs, the physics parameterizations would drive the forecast model off track in a matter of a few days.

Now consider the climate models. Their resolutions are much cruder than the forecast models’, so they are not approximating the dynamics as well as a forecast model. Because of the poor resolution, the climate model dissipation coefficients are necessarily much larger. This means that the cascade of enstrophy is much different than in a forecast model (even more inaccurate), and the physical parameterizations have to be much different than reality in order to artificially add energy to a model that is removing it too quickly.

As we have seen above, the parameterizations are already inaccurate in a forecast model, an ad hoc gimmick to overcome the model’s shortcomings; they are not physically accurate. The parameterizations in a climate model are even cruder, given that they are based on grid points hundreds of kilometers apart. Thus we have a model that is not numerically approximating the NS equations to any degree of accuracy, with forcings even more inaccurate than in a forecast model, where they are already inaccurate. And the long integration period is not mathematically or numerically justified by theory.

The simple emulator is an example of an LTI system. One issue here arises from the eigenfunctions of LTI systems, which are the complex exponentials (sinusoids, real exponentials, and products of the two).

Insofar as the forcings differ little from these eigenfunctions, the mapping is just a scale and a temporal shift.

Now the non-volcanic forcings are not that far from an exponential (nor is a linear slope, for that matter), and together they form a class that will map with just a single scale and lag. The volcanic forcings are not at all close to an eigenfunction, but they do tend to map by a scaling and lagging plus an additional dispersion (smearing).

Now for the exponentials one can trade the lag and scale parameters off against each other in an attempt to capture the form of the combined forcing. This may well lead to the lag being shorter (to match the volcanics) than the “true” value for the non-volcanics, which may give the false impression of a low lag combined with low sensitivity.

A more accurate response function, say one based on a diffusive ocean as opposed to a slab, will still be LTI and hence share the same eigenfunctions, but will account more accurately for the dispersion of forcings such as the volcanics, which can be considered a sum of Fourier components (each an eigenfunction with its own scale and lag values). If it does, then it is likely to produce a much longer lag for the non-volcanic part and a higher sensitivity. (Once a candidate response function has been chosen, the scale and lag values for the eigenfunctions are fully determined and can be calculated directly.)

That said, there may still remain an issue with the efficacy of the volcanic forcing. It is plausible that a simple model cannot scale the mapping of this forcing in the same way as the simulators do, e.g., if the spatial warming pattern is significant.
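A minimal illustration of the LTI point (illustrative time constant and feedback, not fitted values): convolving a volcanic-style spike with a decaying-exponential impulse response, i.e. a one-box slab model, returns the spike scaled and smeared into an exponential tail, exactly the mapping described above.

```python
import numpy as np

dt, tau, lam = 1.0, 5.0, 1.5              # yr; response time and feedback (illustrative)
t = np.arange(0.0, 60.0, dt)

# Decaying-exponential impulse response of a one-box (slab) model:
h = np.exp(-t / tau) * (dt / (lam * tau))

F = np.zeros_like(t)
F[10] = -3.0                              # a toy volcanic spike, W/m^2

# LTI output: the spike comes back scaled, with an exponential tail:
T = np.convolve(F, h)[: len(t)]

print(int(np.argmin(T)))                  # 10: peak cooling at the spike, decaying afterwards
```

A diffusive-ocean response would have a longer-tailed h, dispersing the volcanic Fourier components differently while leaving the exponential-like non-volcanic forcings almost unchanged, which is the trade-off described above.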

I mean, the problem is so obvious that there must be something wrong with it. It’s like when you see a late-night TV ad that promises to sell you a book on how to make gobs of cash in your spare time. It looks so convincing, yet you know something must be wrong.

This is the same way. This problem is so obvious that I’d think Hansen would throw up his hands and discard the model — only I know that he won’t. Why won’t he?

Robert, I felt the same way about other work by Willis. W.E.’s work seems clear. There doesn’t seem to be the same clarity in the responses to it. It leaves an ordinary engineer feeling very uncomfortable. It’s like when a highly respected judge was accused of falsifying paperwork to avoid a speeding fine: most ordinary people didn’t want to know about it. (Many people wished the accusers hadn’t raised the topic. The judge was jailed in the end.) When a simple proposition by Willis needs a long and complex refutation, it makes me suspicious.

I have to say that I found the exposition on Watts Up With That unconvincing. It appears to me that one is solving a linear first-order differential equation. The parameters of the equation are optimised to fit observed data, and an r^2 value is quoted.

The problems I have are:

1) To quote an r^2 is implicitly to use the data that was used to form the hypothesis to test that hypothesis.

2) The discussion appears to revolve around the linearity of the model. If one solves an ODE with a short step length and optimises the parameters to fit the data, this seems to be a consequence of Taylor’s theorem (or Euler’s method). This is definitely not a formal test of linearity in a physical system, as such a system must show:
a) Proportionality
b) Stationarity &
c) Superposition.

This exercise does not test whether a model is linear or not. All it shows is that, given a fitted set of parameters, one input gives a response that approximates the output.

If one is saying that a climate model can be reduced to a convolution, it is not clear whether this is an empirical curve-fitting exercise or represents a fundamental insight into the physics of climate as captured by a particular model. In the latter case, the linearity of the system has not been tested. One solution of a particular model may be represented as a 1st-order ODE, but without a proper analysis of the model itself, and an understanding of its behaviour under a potentially infinite range of perturbations, I cannot see what has been achieved.

If this is a curve-fitting exercise, it seems trivial; if it is claimed that because a model is linear under some circumstances, the model is linear in general, that strikes me as wrong.
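The distinction matters because the emulator being discussed is linear in the forcing by construction, so superposition and proportionality hold identically for it, whatever parameters are fitted. A minimal sketch (my own Python illustration with invented λ and τ; not anyone’s actual code) makes this concrete:

```python
import math

def one_box(forcings, lam=0.3, tau=2.9):
    """One-box recursion: dT(n+1) = lam*dF(n+1)/tau + dT(n)*exp(-1/tau).
    Returns the cumulative temperature series T(n)."""
    T, dT, prev_F = 0.0, 0.0, 0.0
    temps = []
    for F in forcings:
        dF = F - prev_F
        dT = lam * dF / tau + dT * math.exp(-1.0 / tau)
        T += dT
        prev_F = F
        temps.append(T)
    return temps

f1 = [1.0, 2.0, 3.0, 2.0, 1.0]
f2 = [0.0, 1.0, 0.0, 1.0, 0.0]
# Superposition: the response to f1+f2 equals the sum of the responses.
both = one_box([a + b for a, b in zip(f1, f2)])
summed = [a + b for a, b in zip(one_box(f1), one_box(f2))]
```

Because superposition is automatic here, a high r^2 from fitting this recursion says nothing about whether the GCM itself satisfies superposition; establishing that would require perturbation experiments on the GCM.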

Do not confuse an under-resolved numerical model of the Navier-Stokes equations (with gravitational and Coriolis forces), with orders of magnitude too much dissipation, with the real atmosphere. There is no mathematical or numerical theory that states that the numerical model is anywhere close to the NS equations with such unrealistically large dissipation or over such a long period of integration. Sylvie Gravel’s manuscript on this site already shows the problem in a short-term forecast, i.e. the deviation from reality and the artificial improvement of the forecast through unphysical forcings. Things do not get better over longer integration periods with much less resolution.

This isn’t a question of imagination – it is a statement of the meaning of linearity and how this is demonstrated. In most physical systems, Taylor’s approximation can be used to predict small changes. This is what has been applied here and I do not think that the result is terribly helpful because it confuses two effects: a linear operator for the solution of the ODE and true system linearity. The question of how small is small depends on the observability of the system.

There is good evidence that various periodic effects (JMA, NAO, PDO, etc.) influence one another in amplitude, frequency and phase. These are highly non-linear effects.

They find that external forcing can give Atlantic multidecadal fluctuations, but note that the model output in this post is completely lacking a PDO-type oscillation (or, to limit the claim, just the mid-century warming and 1950 to 1975 cooling), as do all the outputs shown by the IPCC. This analysis would suggest that they are thus lacking adequate forcing data (or misusing it).

I think you are attributing more skill to the GCM than they possess.
RR

I merely meant that this was an example of a non-linear interaction that can arise but appears to be linearised.

Having said that, the goal in the numerics of many problems is linearisation using Galerkin’s method. If this can be done, it generally makes life a lot easier. However, this is a specific mathematical technique that is used to aid solution; it does not stem from an assumption that the underlying system is linear.

Not the case. Time dependent nonlinear hyperbolic systems may well be approximated for a short period of time by a linear system. But the nonlinear interactions due to the nonlinear terms, e.g. convection, can cascade enstrophy down the scale very rapidly and the solution will quickly deviate from the solution of the linear system generated at an earlier time.

If I read Pat Franks and Gerry Brownings comments correctly are they saying:

1) The parameters have not been set properly
2) Natural systems move into a chaotic process quickly
3) The parameters don’t recognise a chaotic system
4) It was decided not to change the parameters but
5) To make the system non-chaotic, i.e. semi (or more than semi) linear

If that is the case, does this mean:
1) AGW modelling still hasn’t found the critical variables to model climate that would match observation, or
2) The sought-for signal output (temperature) is too small to be detected in the noise (chaotic variation)?

Pat and I are saying that by suitably adjusting the forcing, one can obtain any solution one wants. And although the solution is the one that is desired (e.g. close to obs), both the dynamics and physics can be completely incorrect. This has been shown in a simple proof on this site and in Sylvie Gravel’s manuscript on this site. Can you guess why the modelers wouldn’t publish her manuscript?

Willis’ clear and unequivocal language is a model for excellence, and like others commenting here, I am suspicious of criticisms of Willis’ work that are unclear and equivocate with each passing paragraph. It reminds me of getting caned (painfully) in high school maths 60 years ago for arriving at the correct solution by the ‘wrong’ method, a ‘shortcut’ my father had shown me.

Willis touches on the issue of climate sensitivity more than once, saying that when fitted to historical data the model indicates much lower climate sensitivity than is usually assumed in reported GCMs. My understanding is that this difference lies in the positive feedback effects assumed in the models, such as for water vapor. What then is the justification for continuing to assume high positive feedbacks?

This creates the file, which you can download by clicking “Download text file” (click on the “text file” portion) at the bottom of the resultant page. Save and read directly from your drive…

Since you can connect, there must be some sort of network error from where I am (Wakefield, UK). I’ve no idea why – other US sites are accessible as usual. Do you have an email address at GISS to which I could report this?

GISS responded very promptly and fully. They said it was probably a routing issue somewhere between them and me, because their servers were working fine. Likely it was (I got a 109 error: host unreachable). I never had a chance to run a trace route, as everything just started working again. Finally, following RomanM’s instructions, I have reproduced the chart. Happy!

The expression that was used to generate the response plotted above is:

T(n+1) = T(n) + λ ΔF(n+1) / τ + ΔT(n) exp( −1 / τ )

The factor λ is the climate sensitivity, i.e. a constant of proportionality that links forcing to temperature.

What strikes me about this is that there is no feedback expressed in this formula from temperature back onto the forcing. So, for example, the change of albedo as a result of changes in ice or cloud cover is not captured. The success of this formula in capturing the recent temperature history would then, to me, indicate that the effect of such non-linearities in the forcings must have been negligible in the period under study.

Is this interpretation reasonable?

If it is reasonable, I see it as evidence against

a) the idea of tipping points from runaway feedbacks

b) the idea that the climate is chaotic and that there are wide-scale natural variations
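For readers who want to experiment with the recursion quoted above, here is a minimal Python sketch (the λ and τ values are invented for illustration, not Willis’s fitted values). A constant step forcing relaxes geometrically to a fixed limit, which is what makes the emulator’s behaviour so transparent:

```python
import math

LAM, TAU = 0.3, 2.9  # illustrative values only, not a fit to GISS data

def emulate(forcings, lam=LAM, tau=TAU):
    """T(n+1) = T(n) + lam*dF(n+1)/tau + dT(n)*exp(-1/tau)."""
    T, dT, prev_F = 0.0, 0.0, 0.0
    out = []
    for F in forcings:
        dF = F - prev_F
        dT = lam * dF / tau + dT * math.exp(-1.0 / tau)
        T += dT
        prev_F = F
        out.append(T)
    return out

# Step forcing of 3.7 W/m^2 (roughly a CO2 doubling) held fixed:
resp = emulate([3.7] * 100)
# The increments form a geometric series with ratio exp(-1/tau), so the
# response converges to lam*dF / (tau*(1 - exp(-1/tau))).
limit = LAM * 3.7 / (TAU * (1.0 - math.exp(-1.0 / TAU)))
```

The closed-form limit shows why, once λ and τ are fitted, the whole hindcast is pinned down by the forcing series alone.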

I am always happy to answer any questions and cite references that I feel might be helpful.

In this case I might suggest that you peruse Sylvie’s Gravel’s manuscript on this site (if it is not here still I can ask Steve to repost it). Here is a brief description of the results.

Sylvie ran the Canadian Met Office global forecast model for a series of tests over many days (standard practice when any changes are made) and compared the forecasts with the obs using mathematical norms to measure relative accuracy. All parameterized physics (forcings) were removed and the accuracy compared to that of the full operational model. The accuracy of the full-physics and no-physics forecasts was essentially identical for the first few days, i.e. the physics parameterizations were of no help during this period.

Sylvie then determined what physical parameterization had the most impact on the forecast accuracy for the next few days. It turned out to be the boundary layer parameterization. Without that parameterization, the wind speeds near the lower boundary would increase too rapidly. However one could still see the errors in the solution increase near the surface and eventually destroy the solution at higher altitudes. A simpler form of drag produced just as good a result, and in both cases the parameterizations were clearly not physical, but just ad hoc gimmicks to simulate the real boundary layer.

The only reason the forecast models stay on track is the updating of the wind obs every 6 hours (I can cite a theoretical paper by Kreiss and me here if you want, or look at the experiments in Sylvie’s manuscript). Without the updating of the wind obs, the physics parameterizations would drive the forecast model off track in a matter of a few days.

Now consider the climate models. Their resolutions are much cruder than the forecast models’, so they are not approximating the dynamics as well as a forecast model. Because of the poor resolution, the climate model dissipation coefficients are necessarily much larger. This means that the cascade of enstrophy is much different than in a forecast model (even more inaccurate). And the physical parameterizations have to be much different than reality to add energy artificially to a model that is removing it too quickly.

As we have seen above, the parameterizations are already inaccurate in a forecast model and are an ad hoc gimmick to overcome its shortcomings; they are not physically accurate. The parameterizations in the climate model are even cruder, given that they are based on grid points hundreds of kilometers apart. Thus we have a model that is not numerically approximating the NS equations to any degree of accuracy, with forcings even more inaccurate than in a forecast model, where they are already inaccurate. And the long integration period is not mathematically or numerically justified by theory.

I have been trying to understand the issues that are being discussed in this posting and have come up with the description below. I would just like to ask if my understanding of this issue is at all reasonable.

===============

The expression that was used to generate the response plotted above is:

T(n+1) = T(n) + λ ΔF(n+1) / τ + ΔT(n) exp( −1 / τ )

Since the atmosphere is of constant mass, this could be considered to be an expression for the amount of heat energy in the atmosphere as a result of forcings.

Now the net forcing would also have to take into account the amount of energy that will move from the atmosphere to other sinks, such as the latent heat in the melting or freezing of ice or the sensible heat in heating the deep oceans. The forcing that maps to the heating of the ocean could be mapped to the difference between the atmospheric and ocean temperatures, adjusted with a constant of proportionality.

Here the constant describes the coupling of heat between the atmosphere and the ocean, and TO is an equivalent temperature describing the amount of heat in the ocean.

To me, the benefit of this expression is that it would indicate that the temperature of the atmosphere would continue to rise for a time even if the external forcings remained constant. As the temperature of the ocean rose, the amount of energy taken from the atmosphere by the new term would decrease and the temperature of the atmosphere would continue to rise. Thus the value of λ generated from the initial expression would have to be increased to take into account the new sink supplied in the revised expression.

In particular, Hansen’s comments about “warming in the pipeline” (paraphrasing) have puzzled me. This was the explanation that I came up with: the ocean is acting like a sink (with a different time constant than the atmosphere), and this would cause continued warming even with the onset of constant, or even reduced, forcings.
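That two-box intuition is easy to sketch. In the toy model below (all constants invented for illustration; this is not Hansen’s or GISS’s actual formulation), the atmosphere relaxes quickly toward λF but loses heat to a slow ocean box, so the air temperature keeps creeping up long after the forcing stops changing, i.e. warming stays “in the pipeline”:

```python
def two_box(forcings, lam=0.5, tau_a=3.0, k=0.1, eps=0.02):
    """Atmosphere box relaxes toward lam*F on timescale tau_a but is
    dragged down by coupling (strength k) to a slow ocean box, which
    relaxes toward the atmosphere on the much longer timescale 1/eps.
    All parameter values are invented for illustration."""
    Ta, To = 0.0, 0.0
    out = []
    for F in forcings:
        Ta += (lam * F - Ta) / tau_a - k * (Ta - To)
        To += eps * (Ta - To)
        out.append((Ta, To))
    return out

# Constant forcing held for 2000 steps: the atmosphere equilibrates in a
# few steps on its own timescale, then keeps warming slowly as the ocean
# catches up; both eventually converge to lam*F.
hist = two_box([3.7] * 2000)
```

The continued rise of Ta between step ~10 and step ~200 is the “pipeline”: the fast box has done its adjusting, and the residual warming is paced entirely by the slow ocean box.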

I’m trying to understand what these shock effects of volcanic forcings, and the seemingly mean reverting response of ‘avg temp’ that follows tells about the forces controlling climate.

Over the last three major volcanoes I see −.1, −.2, −.4 degrees in yr 2; all three rebound to almost exactly the pre-volcano temp in yr 3, and all three increase +.07, +.15, +.22 in yr 4.

First, does this pattern of temps after large volcanoes show up in the temperature records? Second, what is it about the 1-4 yr time frame that creates negative/no feedback, versus the long-time-frame positive feedback to constant forcing of, say, GHGs?
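Part of the rebound pattern falls out of the linear model itself: a volcanic forcing is a short negative pulse, and a linear one-box system responds with a dip followed by geometric recovery toward the pre-eruption level, with no long-term memory. A sketch (my own Python illustration, invented parameters):

```python
import math

def one_box(forcings, lam=0.3, tau=2.9):
    """dT(n+1) = lam*dF(n+1)/tau + dT(n)*exp(-1/tau); returns cumulative T."""
    T, dT, prev_F = 0.0, 0.0, 0.0
    out = []
    for F in forcings:
        dF = F - prev_F
        dT = lam * dF / tau + dT * math.exp(-1.0 / tau)
        T += dT
        prev_F = F
        out.append(T)
    return out

# A -3 W/m^2 pulse in "year" 1, zero forcing before and after:
pulse = [0.0, -3.0] + [0.0] * 98
temps = one_box(pulse)
# The minimum lands on the pulse year, the next year rebounds, and the
# series decays geometrically back to the pre-eruption level (zero).
```

So, under the linear emulator, the dip-then-rebound shape needs no special feedback; it is just the impulse response. Whether the observed rebound is faster or slower than this exponential is what would carry information.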

A climate model is close to a heat equation, whereas a forecast model approximates a hyperbolic system with very small dissipation. In the climate model, any discontinuities in space or time are smoothed out very quickly because of the large dissipation.

Darn! My first day with Excel 2010 and I got tricked into doing a stacked graph of the temperatures. McIntyre would be glad to see another idiot tricked by Excel. The above should read:

For the 1960s, 80s, and 90s, I see −.15, −.1, −.2, lagging two years behind the beginning of the eruption. In the 3rd year, temperatures return to approximately pre-eruption levels. In the 4th year all increase +.07. (There is less of a shock now than I was seeing earlier.)

I found a good comparison of modeled v. observed temps in Gent’s writeup on p. 17, as to how well they did on these last eruptions, and it’s mixed:
60s: nailed it – 1964 seems to be the only point in the history where v3, v4, and observed match.
80s: total miss, or strange lag effect – where the models predict a trough, observed goes up, when model temps recover, observed then dip.
90s: overstates the magnitude but gets the shape right.

Try adding a shock function to the linear convective diffusion equation with different sizes of diffusion. With sufficiently large diffusion the shock will be smoothed out in time very quickly. But I bet there will be oscillations just as you are seeing (Gibbs phenomenon).
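For anyone who wants to see the Gibbs phenomenon directly: truncate the Fourier series of a unit square wave and the partial sums overshoot the jump by roughly 9% of the total jump near the discontinuity, no matter how many terms are kept. A self-contained sketch:

```python
import math

def square_partial_sum(x, n_terms):
    """Partial Fourier series of a square wave alternating between -1 and +1:
    f(x) = (4/pi) * sum over k of sin((2k+1)x) / (2k+1)."""
    return (4.0 / math.pi) * sum(
        math.sin((2 * k + 1) * x) / (2 * k + 1) for k in range(n_terms)
    )

# Scan just to the right of the jump at x = 0. The overshoot peaks near
# 1.179 (about 9% of the total jump of 2), and adding more terms only
# narrows the spike -- it never removes it.
n = 100
peak = max(square_partial_sum(i * 1e-4, n) for i in range(1, 2000))
```

The persistence of that ~1.18 peak as n grows is exactly why pulse-type (step) forcings interact badly with truncated spectral representations.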

I think in terms of time-dependent quasilinear hyperbolic systems of equations with forcing, and their approximations by numerical methods.

Under these circumstances, pulse-type forcing in space or time has very bad effects on the numerical approximations, i.e. there is an immediate transfer of energy to the smallest scales, where the numerical dissipation must artificially remove that energy.

After my lengthy discussion about the difference between a model using air and one using molasses, you seem to still be confusing the two, i.e. does a volcano in the atmosphere (air) behave over time as one in molasses? I suspect that the forcing used in the molasses model is much bigger than reality in order to get the molasses model’s ground temps to look like ground temp obs. And how does one incorporate a volcano into any model – by tuning parameterizations (artificial forcing)?

The Gibbs phenomenon is an interesting point. I have been bitten in my lower body by this in an entirely unrelated field. As you imply (I think), one has to band-limit any forcing function before applying it so that it is not aliased. The question that arises, then, is whether the forcings need to be represented as impulsive (or, strictly, a step), given the low-frequency behaviour of the model in question. This also raises the question of sampling frequency.

Even in a purely hyperbolic system, there is some temporal smoothing of the forcing. But numerical methods require a number of spatial and temporal derivatives to exist in order that truncation errors (and subsequently total numerical errors) can be estimated. Of course, if one does not care whether the numerical solution approximates the PDE, then one can do anything one wants.

Let’s look at your question from a climate modeler’s point of view.
The CM wants to run the model for a period much longer than is possible for a forecast model at high resolution. What are his options? To reduce the computer time required for the long run, the CM must reduce the number of computations (grid points), say to being 100 km apart. But he knows that fine-scale structures that cannot be resolved by the coarser grid can and will appear if he uses air as the fluid. To prevent this from happening, i.e. the model blowing up, he uses molasses instead of air as the fluid (i.e. increases the viscosity of the fluid). Now any finer-scale structures that appear can be resolved by the coarse grid. But now what happens when a physical parameterization says that it will rain? Does the CM have it rain over the whole 100 km square or only over part of it? (What does it mean to rain in molasses?) The CM arbitrarily chooses a fraction and runs the model for 100 years to see what happens. If the model blows up, he reduces the fraction and tries again. Trial and error, with no scientific or physical basis for the tuning.

You only need to look at the documentation for the NCAR coupled climate model dissipation coefficient and compare that to the coefficient for a spectral forecast model with the same type of dissipation. Dave Williamson also wrote a manuscript that compared the sensitivity of the model to different coefficients. Note that the coefficient had to be tuned as the grid spacing changed, i.e. it is just another tunable parameter. Look under Google Scholar to find the manuscript.

Thanks for the reference. I was thinking there might be something more recent – that Williamson et al. paper is 16 years old. I believe there have been some changes in NCAR’s model, as well as in atmospheric models generally, since then. Last I looked, NCAR was up to something called “CAM5” and wasn’t using spectral methods any more.

Sometimes the old manuscripts are the most honest.
Although Dave is a climate modeler, he always reported the results in an honest manner even if they were adverse.

Many thought that parallel computing was going to provide enough computational power to solve all the resolution problems. Unfortunately the parallel computers ran into a snag, namely the communication problem between processors. Because of that issue, adding additional processors does not necessarily increase the overall power in a linear fashion.
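The diminishing return has a classic back-of-envelope form. Amdahl’s law is not the whole story (communication overhead can make the scaling even worse than this), but it already shows why piling on processors stalls. A sketch, with an invented 5% unparallelizable fraction:

```python
def amdahl_speedup(p, serial_frac):
    """Ideal speedup on p processors when a fraction of the work
    (serial or communication-bound) cannot be parallelized."""
    return 1.0 / (serial_frac + (1.0 - serial_frac) / p)

# With just 5% of the work unparallelizable, 1024 processors deliver
# under 20x speedup, and adding more processors barely helps: the
# speedup is capped at 1/serial_frac = 20 no matter how many are added.
s1024 = amdahl_speedup(1024, 0.05)
```

The cap at 1/serial_frac is the point: past a few hundred processors, almost all the added hardware is idle waiting on the unparallelizable part.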

For many years NCAR, ECMWF, and many other orgs pushed spectral models. Now they have evidently switched to other numerical methods because they reduce the communication times between processors. How amusing. Kreiss and his students saw the value in things like the cubed sphere years ago.

What is the physical distance between grid points in NCAR’s most recent climate model? The problem is that if the grid size is halved, the computational burden increases by a factor of 8.
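The factor of 8 follows from simple counting for a horizontal refinement: halving the spacing quadruples the number of grid points, and a CFL-limited timestep must also be halved. A sketch of the arithmetic (my own illustration):

```python
def cost_multiplier(refine, space_dims=2, cfl_limited=True):
    """Work multiplier when grid spacing is divided by `refine`:
    refine**space_dims more grid points, times refine more timesteps
    if the timestep is CFL-limited."""
    steps = refine if cfl_limited else 1
    return refine ** space_dims * steps

# Halving horizontal spacing: 2*2 more points * 2 more timesteps = 8x.
# Halving all three spatial dimensions as well would give 16x.
```

This is why each halving of grid spacing roughly costs an order of magnitude of computer time, and why climate runs are forced onto much coarser grids than forecast runs.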

CAM5 is moving towards the finite volume method, which is “a reasonable compromise between efficiency and accuracy”. In other words, they have dropped the order of accuracy so that the numerical method will parallelize more easily. Taking out of one pocket and putting in the other. It will be interesting to see how well the physics mess works, given that clouds are not over every grid point.

The model is still hydrostatic and thus ill posed. If the grid spacing is reduced along with the unphysically large dissipation, this will become obvious.

Dr. Browning:
More recently, convective adjustments have been developed that adjust to empirically based lapse rates, rather than adiabatic lapse rates, while still maintaining energy conservation… from the AMS glossary.

My question: in a model that otherwise got the physics and math correct, barring an unlikely cancellation of errors, this one parameterization would force error into the output. Correct? And worse, rather than just a tendency towards exponential growth of error, it would reduce the system to n equations with n+1 unknowns, the +1 unknown arising because its effect is conflated with the other responses within the system. Correct?

You are correct that even if the physics and math were correct (a big if), convective adjustment alone is enough to destroy any accurate numerical approximation of the Navier-Stokes equations (with gravitation and Coriolis forces). Convective adjustment was invented because the physical parameterizations would locally cause physical overturning in a column, i.e. the fluid would no longer be hydrostatic. To enforce that the fluid remain hydrostatic, the fluid in the column would be redistributed to ensure that it was still hydrostatic, and some additional ad hoc adjustments were made to compensate for the rearrangement. Just another gimmick among the many used to get the climate model to run longer than any theory states that it should.

Note that the enstrophy in a forecast model can cascade down to its smallest resolvable scale in a matter of a day. This is not a problem that is going to go away. Also I suggest you look up convective adjustment that is part and parcel of all climate models. This is an ad hoc transfer from the smallest scales to the global scale and destroys all the numerical accuracy of the approximation of the NS equations.

James G. on costs related to regional forecasts mentioned by Willis, who said, “We all know that climate models suck at regional forecasts and hindcasts. You ask why 1-D representations are what we discuss? Because the 2-D representations (aka regional forecasts) are so flawed …”

My comment: That doesn’t stop the Asian Development Bank from financing regional forecasts to promote lending for billions in infrastructure in Vietnam and other low income countries that can ill afford this flawed science. The cost of developing and running the models is a drop in the bucket compared to the cost of infrastructure that is not needed.

The main beneficiaries are not the academics who act as ADB consultants and advisers, but the corrupt politicians and civil servants in developing countries who pocket a percentage of contract costs while their governments repay 100% of the loans plus interest.

Excellent presentation from Scripps on the difficulties of modeling clouds in the global climate models. Clouds are the largest source of uncertainty in the models. Grid boxes are much too large to handle clouds; their behaviour operates on a much smaller scale. Models disagree on the effect of clouds; it is not even clear whether the effect is negative or positive. The confidence level in the understanding of several forcings is very low for solar, aerosols (indirect), biomass burning and fossil-fuel soot. Sulphate forcing and ozone are low. Surprisingly, CO2/CH4 and halocarbon forcing confidence is high.

Here is a review of the volcano parameterizations versus obs for the NCAR CCSM climate models by Judith Curry:

Figure 12 shows timeseries of globally-averaged surface temperature anomaly from observations and the ensemble mean from five 20th century runs of CCSM3 and CCSM4, plus the CCSM4 ensemble spread. The model results track the data quite well up to 1970, except for three instances. The first two are when the models have a large dip in temperature due to the Krakatoa eruption in 1883 and volcanic eruptions in 1902 that are not apparent in the data at all, and the third is when the models do not show a temperature decrease in the 1940s that is clearly evident in the HadCRUT3 data. These discrepancies have been present in all 20th Century runs done with CCSM. After 1970, CCSM4 surface temperature increases faster than the data, so that by 2005 the model anomaly is 0.4°C larger than the observed anomaly. [JC emphasis] This too large increase in surface temperature occurs in all CCSM4 20th century runs. It is interesting to note, however, that if CCSM4 20th century runs had ended in 2000 rather than 2005, then the comparison would have looked much better. Over the last 5 years of the run, the model temperature increased significantly whereas the earth’s temperature over that period did not change much at all.

…

My main critique of this is the experimental design and interpretation of model results. Given the substantial uncertainties in 20th century forcing from solar, volcanoes, anthropogenic aerosols, why aren’t sensitivity studies conducted to assess the impact of these uncertainties on the attribution?

For the entire review see

judithcurry.com/…/ncar-community-climate-system-model-version-4/

Now I am not as big a fan of Judith Curry as Steve is, but here she at least mentions the deficiencies of the climate models.

I find it disturbing that error plots of the difference between the obs and the model runs are never shown. Then it would be much easier to see the locations of the maximum errors and any strange error behavior.

This entire exercise is an excursion into conjectural curve-fitting. A truly physical model based on first principles would not require boundary-value specification. And the GST series produced by GISS is not the climate itself.

Add Delta T to the previous year’s GISSTemp, then use the two temperatures in the Stefan-Boltzmann equation to calculate the value of Delta F (W/m^2) to compare with the Model E forcing.

The funny thing is that until 1943/1944 the Model E forcing and the forcing as calculated above are almost identical. Then the Model E forcing diverges, so that by 2003 it is about 0.43 W/m^2 greater than the “S-B” result (2.75 vs. 2.32).

Another way is to use the equation from Myhre et al GRL 25:14 (1998) which Table 3 says is (for CO2) Delta F = 5.35 ln ( c / c0 ) which looks a lot like the Beer-Lambert formulation. Use CO2 from the above source, apply the formula and compare. The value in 2003 is 1.35 W/m^2 (cf 2.25 for GISS Model E).
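Both arithmetic checks described above are easy to reproduce (a sketch; the 288 K baseline temperature is my own illustrative choice, not a number from the comment):

```python
import math

SIGMA = 5.670374e-8  # Stefan-Boltzmann constant, W/m^2/K^4

def myhre_co2_forcing(c, c0=280.0):
    """Myhre et al. (1998), Table 3, for CO2: dF = 5.35 * ln(c/c0) W/m^2."""
    return 5.35 * math.log(c / c0)

def sb_flux_change(t_old, t_new):
    """Flux change implied by treating successive global mean temperatures
    as black-body emitters: sigma * (T_new^4 - T_old^4)."""
    return SIGMA * (t_new**4 - t_old**4)

doubling = myhre_co2_forcing(2 * 280.0)    # ~3.71 W/m^2 per CO2 doubling
per_kelvin = sb_flux_change(288.0, 289.0)  # ~5.4 W/m^2 per K near 288 K
```

Near 288 K the black-body sensitivity is about 5.4 W/m^2 per kelvin (4σT³), so one can translate any year-on-year Delta T into an equivalent flux and lay it alongside the Model E forcing series, which is the comparison the comment describes.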

No. The well-mixed GHG forcings are CO2 plus CH4, etc., but they don’t include all the other forcings: solar, BC, aerosols, volcanic, etc. My numbers are just for CO2. However, I simply cannot see that there has been such a change in the GHGs other than CO2 to account for the divergence since the mid-1940s. I also think the other GHGs had no discernible effect (at least in the GISS Model E) prior to that time, compared to CO2.

Also, up to 1943/1944, GISS didn’t even need a model to calculate the GHG forcings – just a TI calculator and the Myhre equation. That means they can fiddle the GHG forcings just as they fiddle the other forcings so as to fit the hindcast to the historical data. If that is what they did, why did they need to dramatically increase GHG forcing after WWII?

A few years back the thought struck me – since my background is economics – that the similarities to climate study are more striking, IMHO, than most people seem willing to admit. At the time I thought it might make sense for climate analysis to diverge, like economics did, into micro and macro.

Call me crazy… but we have an entire area of expertise and study that’s been through a lot of this already. Micro-phenomena, even when completely dependable and predictable, are often just noise in the grand (macro) scheme of things.

[…] I have stolen his pun for my title. More recently, Steve McIntyre wrote a piece in Climate Audit here where he summarizes some of the history (with the external blog references for those who want […]