Berkeley and the Long-Term Trend

On a recent thread which was not about the temperature trend, but about Judith Curry’s mischaracterization of it, “Dan H.” stated that what mattered was the long-term trend, which was a steady increase at a rate between about 0.006 and 0.0075 deg.C/yr, and that the Berkeley data reinforced this idea. He later said that it was a steady increase plus a cyclic variation with period about 60 years. Let’s examine those ideas closely, shall we?

Here’s the Berkeley data (minus the final two “don’t belong” data points), together with a lowess smooth on a 10-year time scale:

Here’s just the smooth, together with 10-year averages:

Visually, it certainly doesn’t seem to be a steady rise plus cyclic variation. But let’s try those models anyway. First let’s fit a long-term linear trend by least squares. We’ll compare that (plotted in blue) to the smoothed curve (plotted in red) and the raw data (in black):

Note the sizeable departure of the data from the linear trend over the last several decades. Let’s expand the scale for a direct comparison of the linear fit (in blue) to the smoothed curve (in red):

Indeed the long-term linear model isn’t very good. In fact it isn’t right, which is easily confirmed statistically. The biggest difference between reality and the Dan H. model is the rapid upward trend over the last 30 years. Could it be … global warming?
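For readers who want to reproduce this kind of fit, here is a minimal numpy sketch of a least-squares linear trend. The series below is synthetic (a hypothetical small trend with extra late warming, plus noise), not the Berkeley record:

```python
import numpy as np

# Synthetic stand-in for an annual temperature series: a small steady
# trend, extra warming after 1975, and white noise. Hypothetical numbers.
rng = np.random.default_rng(0)
years = np.arange(1850, 2011)
signal = 0.004 * (years - 1850) + 0.00015 * np.maximum(years - 1975, 0) ** 2
anoms = signal + rng.normal(0.0, 0.15, size=years.size)

# Long-term linear trend by ordinary least squares
slope, intercept = np.polyfit(years, anoms, 1)
fitted = slope * years + intercept
residuals = anoms - fitted
print(f"fitted trend: {slope * 100:.3f} deg C / century")
```

Plotting the residuals of such a fit is exactly how one sees the "sizeable departure" in recent decades: the late acceleration shows up as a systematic run of positive residuals at the end.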

The linear model doesn’t hold water. But what if we include a cyclic variation with period about 60 yr? It turns out that the best-fit linear-plus-cyclic model has a period of 70.9 yr. Here’s that model (plotted in blue), compared to the smoothed curve (in red) and the data (in black):

Note the sizeable departure of the data from the linear-plus-cyclic trend over the last several decades. Let’s expand the scale for a direct comparison of the model (in blue) to the smoothed curve (in red):

Indeed the long-term linear-plus-cyclic model isn’t very good. In fact it isn’t right, which is easily confirmed statistically. The most important difference between reality and the Dan H. model version 2 is the rapid upward trend over the last 30 years. Could it be … global warming?
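The linear-plus-cyclic fit can be sketched the same way: for any fixed trial period the model is linear in its remaining coefficients, so one can scan periods and keep the best least-squares fit. Everything below (the function name, the data, the period range) is my own illustrative construction, not the actual analysis:

```python
import numpy as np

def fit_linear_plus_cycle(t, y, periods):
    """For each trial period P, fit y ~ a + b*t + c*sin(wt) + d*cos(wt)
    by least squares (linear in a, b, c, d once P is fixed) and keep the
    period with the smallest residual sum of squares."""
    best = None
    for P in periods:
        w = 2.0 * np.pi / P
        X = np.column_stack([np.ones_like(t), t, np.sin(w * t), np.cos(w * t)])
        coef, res, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(res[0]) if res.size else float(((y - X @ coef) ** 2).sum())
        if best is None or rss < best[0]:
            best = (rss, float(P), coef)
    return best

# Hypothetical series: a pure linear trend plus noise -- no real cycle.
rng = np.random.default_rng(1)
t = np.arange(161, dtype=float)                  # years since 1850
y = 0.005 * t + rng.normal(0.0, 0.15, t.size)
rss, period, coef = fit_linear_plus_cycle(t, y, np.arange(40, 101))
print(f"best-fit period: {period:.0f} yr")       # some period always 'wins'
```

Note the moral of the sketch: the scan always returns *some* "best" period, even when the underlying series contains no cycle at all. A best-fit period is not evidence of a cycle.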

We could also test the long-term linear model, and the linear-plus-cyclic model, by fitting them only to part of the data — say, data up to 1975 — then extrapolating to see how well the model predicts what followed. Care to guess how that comparison looks?

If the linear model is correct, then the warming rate will be constant over time. If the linear-plus-cyclic model is correct, then the warming rate will be constant-plus-cyclic with the same period. Let’s compute the warming rate using each 30-year segment of the Berkeley data, together with the estimated uncertainty in that rate, using an ARMA(1,1) model for the noise just to feed the “uncertainty monster.” Here’s the result, with a horizontal dashed line in red indicating the mean rate of the linear-plus-cyclic model over the entire time span:
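The rolling 30-year trend calculation can be outlined as follows. For simplicity this sketch uses an AR(1) correction to each slope's standard error rather than the full ARMA(1,1) noise model used in the post, and the data are again synthetic:

```python
import numpy as np

def rolling_trends(y, window=30):
    """Slope of an OLS fit in each 30-year window, with a standard error
    inflated for lag-1 autocorrelation of the residuals (a simpler AR(1)
    stand-in for the ARMA(1,1) noise model used in the post)."""
    t = np.arange(window, dtype=float)
    sxx = np.sum((t - t.mean()) ** 2)
    out = []
    for i in range(len(y) - window + 1):
        seg = y[i:i + window]
        b, a = np.polyfit(t, seg, 1)
        resid = seg - (a + b * t)
        r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
        r1 = max(min(r1, 0.99), 0.0)
        n_eff = window * (1 - r1) / (1 + r1)     # effective sample size
        se = np.sqrt(np.sum(resid ** 2) / (window - 2) / sxx)
        out.append((b, se * np.sqrt(window / max(n_eff, 2.0))))
    return np.array(out)

rng = np.random.default_rng(2)
y = 0.005 * np.arange(161) + rng.normal(0.0, 0.15, 161)
trends = rolling_trends(y)
print(trends.shape)   # one (slope, inflated standard error) pair per window
```

If the linear model were right, these windowed slopes would scatter around a constant; if linear-plus-cyclic were right, they would oscillate around it with the same period. Neither happens in the real record.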

The long-term linear model is nonsense. The long-term linear-plus-cyclic model is nonsense.

Thanks, Tamino. This is a talking point I’ve been seeing quite a bit from denialists who like to pretend to be scientific. I think their idea is that if they can minimize the trend due to climate change, they can get away with saying warming won’t be significant. The thing is that we have about a dozen separate lines of evidence that all favor a sensitivity of 3 degrees per doubling. If this estimate were wrong, it is far more likely to be too low than too high. It is as if they think the only evidence is the temperature trend, and if they can come up with a Rube Goldberg model that explains that, the problem goes away.

If they look to the physics rather than astrological fits, then they have to attribute the changes to something. Mechanisms like land use, albedo, and 7 billion souls worth of karmic crystal pyramid energy would still blame the Anthro part of the denied AGW. What mechanisms are left if you start with CO2 sensitivity being a hoax? Dark matter cloud nucleation? Phlogistonial inertia?

Question: Is it possible that the apparent cyclicity in the early part of the record is due to the station distribution? Those big swings have bothered me since BEST came out. The BEST animation (http://berkeleyearth.org/movies.php) shows that the “global” temperature anomaly is anything but globally defined between 1800 and 1850. Most of the records appear to be eastern U.S. and western Europe, all of which would presumably be influenced to some degree by the NAO (for example).

With true global coverage later in the record the surface warming trend overwhelms any regional temperature cyclicity, as you’ve very convincingly demonstrated.

That sort of thing is an inevitable consequence of having higher uncertainties at those years (in addition to some real non-constant behavior, undoubtedly). The larger the uncertainties, the further you’ll occasionally deviate from the mean in a 10-year moving average and the longer it’ll take to get back to the mean, resulting in a time series with an apparent periodicity that’s related to the length of the moving average and the noise spectrum.

“Our analysis shows warming underway by 1800, large variations up and down throughout the 19th century, and that variability on the 3-15 year scale has been dramatically decreasing over the past two centuries.”

Apparently he believes that the variability in the early data is real. What can be rigorously concluded about 19th century variability from BEST data? Can one demonstrate that that variability is an artifact of smoothing choices, as indicated above by Zach?

[Response: Or perhaps, that the high level of variability is regional. The early data have far less geographical coverage, so they can exhibit high regional variability, while later data are more global and therefore reflect far less global variability.]

Tamino posits a change in the long-term linear trend around 1975. I did a Chow test on the Hadley CRU global annual dT data. Call b the linear trend, the coefficient of the year term, where I refer to linear regressions of dT on the calendar year.

The Chow statistic is therefore F ≈ 55.16 with 2 and 157 degrees of freedom; the significance is off the scale. There was indeed a break at this time, and the linear trend is significantly different in the two periods: 0.026 K per decade for 1850-1969 versus 0.17 K per decade for 1975-2010.

Doing single Chow tests is extremely risky. If a linear fit to the entire dataset is poor, a linear model with breakpoint will often fit better, whether it’s actually a reasonable model or not. And even if the data is described well by a linear model with breakpoint, testing only one year doesn’t tell you where the breakpoint really is. It can produce positive results even if the true breakpoint is a decent distance away. To get robust results, you need to test a range of potential breakpoints and look for a peak in significance. In doing so, you also need to account for the fact that you’re performing multiple tests, of course, and adjust your significance threshold accordingly.
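A breakpoint scan of the kind described above might look like this in outline. The Chow statistic is standard; the synthetic series, the break location, and the candidate range are all illustrative:

```python
import numpy as np

def chow_F(t, y, k):
    """Chow F statistic for a break after index k: pooled linear fit vs.
    separate linear fits on each side (2 restrictions, n - 4 dof)."""
    def rss(tt, yy):
        b, a = np.polyfit(tt, yy, 1)
        return float(np.sum((yy - (a + b * tt)) ** 2))
    n = len(y)
    rss_pooled = rss(t, y)
    rss_split = rss(t[:k], y[:k]) + rss(t[k:], y[k:])
    return ((rss_pooled - rss_split) / 2.0) / (rss_split / (n - 4))

# Hypothetical series with a genuine slope change at index 120.
rng = np.random.default_rng(3)
t = np.arange(161, dtype=float)
y = np.where(t < 120, 0.002 * t, 0.002 * 120 + 0.015 * (t - 120))
y = y + rng.normal(0.0, 0.1, t.size)

# Scan a range of candidate breakpoints instead of testing just one,
# and look for a peak in the statistic.
candidates = np.arange(20, 141)
F = np.array([chow_F(t, y, int(k)) for k in candidates])
best_k = int(candidates[np.argmax(F)])
print(f"peak F = {F.max():.1f} at index {best_k}")
```

A proper analysis would additionally adjust the significance threshold for the multiple tests performed across the scan, as noted above.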

Tamino, do you have a default lowess bandwidth that you use for most of your examples, or do you vary it to suit the data at hand? I tend to do the latter, which also means I need to specify bw(__) in the legend or somewhere so others can replicate a graph if they want.

[Response: I generally “default” to a 15-year time scale, but for this analysis I switched to 10.]

Tamino, just as the uncertainties of the last two data points are too large to be included in any meaningful analysis, I would have thought the uncertainties before about 1880, and certainly around 1850, are also too large. Antarctica is not the world, which is why the last two data points are no good. But neither is Europe, and hence the early data points are not meaningful either.

[Response: I quite agree. But the task at hand was to test Dan H.’s claim that the Berkeley data reinforce the characterization of temperature change as “long-term linear trend” or “linear-plus-cyclic.” One can safely eliminate the final two data points without impeding that, but removing early data would hamper testing his hypothesis.]

The positive slope of anomalies shows that the rate of warming is accelerating, not merely that warming is occurring; warming is proven to be occurring by all annual anomalies simply being positive for the last twenty years.

The anomaly slope could be flat or negative and warming would still be occurring, as long as the anomalies are positive.

Why is it that nobody realises this discussion is focused on acceleration of warming, not warming? From Watts through to Phil Jones through to Kevin Trenberth, everybody….everybody… has the wrong end of the vector.

The focus on anomalies has distracted from the most relevant metric, Global Annual Average Temperature, which has been increasing every year for the last 10 years and longer, meaning no ‘Plateau’.

criminogenic: No…The anomaly is simply the temperature minus the baseline temperature. An anomaly is not a derivative…It is simply a temperature but a temperature that is referenced to a baseline (such as a 30-year average) for that particular station.

There are important advantages of using anomalies over using temperatures: One is that the anomaly field has better behavior than the temperature field. For example, anomalies are correlated over fairly large distances: If it was warmer than average in NYC this year, it was also likely warmer than average in Boston and Montreal and Philadelphia even though the actual average values in these different cities might be fairly different. (In a particularly dramatic example: One would find a huge variation in temperature over just a few miles if one compared the temperature at the top of Mt. Washington to the temperature in a nearby valley; however, if you looked at temperature anomalies, the two stations would likely show similar behavior.)

Another advantage of anomalies over absolute temperature is with temperature, you have to be worried about how representative your stations are. For example, some “skeptics” who don’t understand anomalies are worried that the dropout of lots of cold weather stations in Siberia over the last few decades has biased the record warm. This could indeed be a problem if one was trying to measure average global temperatures; however, if you measure global temperature anomalies that is not an issue.
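Joel’s point about spatial correlation of anomalies is easy to demonstrate with made-up numbers: two hypothetical stations that share year-to-year weather but have very different absolute climates. Everything here is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
n_years = 30
regional = rng.normal(0.0, 0.5, n_years)       # shared year-to-year weather

# Two hypothetical stations with very different absolute climates (a
# valley and a nearby summit, say) but the same regional variability.
valley = 12.0 + regional + rng.normal(0.0, 0.1, n_years)
summit = -3.0 + regional + rng.normal(0.0, 0.1, n_years)

# Anomaly = temperature minus that station's own baseline (its mean here).
valley_anom = valley - valley.mean()
summit_anom = summit - summit.mean()

r = np.corrcoef(valley_anom, summit_anom)[0, 1]
print(f"absolute gap ~{valley.mean() - summit.mean():.0f} C, "
      f"anomaly correlation r = {r:.2f}")
```

The absolute temperatures differ by some 15 degrees, yet the anomalies track each other closely; that is exactly why anomaly fields can be interpolated over large distances while absolute fields cannot.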

The baseline is the average annual Global temperature, between 1951-1980 for GISS, etc.

My understanding is that an anomaly is the average temperature difference over a period relative to a baseline: if the temperature is greater than the baseline, the anomaly is positive; if it equals the baseline, the anomaly is zero; and if it is less than the baseline, the anomaly is negative.

It is better to say the anomaly is the average temperature DIFFERENCE, not the temperature variation.

If the temperature is less than the baseline, then the anomaly is negative. If the anomaly is reducing, then the temperature is reducing. Your original claim that a positive anomaly means increasing temperature was wrong. It only means that the temperature is above whatever baseline is chosen.

No, it’s as Joel Shore said: “The anomaly is simply the temperature minus the baseline temperature.” There’s no term in there for time.

So, for instance, it is meaningful to say that “The anomaly for Metro Atlanta at 3 PM today was ___.” (Or it would be if I knew what the actual value is.)

You might object that time is cached in there somehow, since in practice it’s the mean of 30 years’ absolute temperatures, normalized to zero. But in principle it could be anything convenient, say absolute zero (as in the Kelvin example) or the freezing point of water (which, however, varies inconveniently with salt content) or for that matter your birthday, suitably formatted–39.85, say, if you were born March 9 and are now 26. (Arguably convenient for you–but for me, not so much.)

So, no time in there at all. Just a difference from the baseline temperature, period.

“An anomaly describes the sum of differences over a year; when this sum is added to the baseline temperature, it gives the average annual global temperature for that year; when that figure is added to the population, the average increases, if the anomaly is positive.”

By the same logic, a group of sequential temperatures would, too, or a group of any suitable measurements which occur over time [assume light-hearted examples here–kids’ heights as they grow, or something.]

In which case, the nature of “anomaly” would seem to be irrelevant, and we could make dhogaza happy by letting this go–since (I think) the main question, that of ‘acceleration,’ has been adequately dealt with.

Why are you persisting in this misunderstanding? Why complicate something fairly simple? An anomaly is a simple difference from a baseline average and is not a rate of change. Using your analogy, an anomaly is a displacement, not a velocity.

I see how you are saying Anomaly is analogous to displacement: it is a measure of difference at a moment in time. Thus a series of Anomalies over time is analogous to a series of displacements over time, i.e. velocity.

The reason I am persisting with this is that I think there has been no plateau in global warming over the last 10 years: every year has been above average, so the average of the population is increasing, meaning Annual Global Average Temperature has been increasing. That means there’s something wrong with analyses of the Anomaly in which ‘flattening’ is used to say global warming has stopped.

If Anomaly is seen as Displacement, a series of Anomalies can be seen as Velocity and a change in Anomalies can be seen as Acceleration; thus the ‘Plateau’ describes no acceleration.

I agree with your intuition that “warming hasn’t stopped,” and have argued that it’s ridiculous to claim it has, just because the same sorts of (relatively high) temperatures have been coming up repeatedly for a few short years.

However, should we see the same anomaly numbers–off the top of my head, roughly .3-.6 C per NCDC algorithm–recycle quasi-randomly for another decade or more, it likely would approach statistical significance. Then we really would be justified in calling it a “pause” or something similar. “Displacement” would be remaining more or less constant, “velocity” would have reached zero within the bounds of measurement/variability, and (to paraphrase an old comedy bit) acceleration would be “right out.” We’d know something significant (and, given the political near-stasis on mitigation, very helpful) had happened.

But given other physical indicators–the state of the cryosphere being a leading one to my mind–that “ain’t gonna happen.” I’m not a betting man, but I would bet that the present decade will be significantly warmer than the last. (Not least because I expect to see ice-free Arctic summers within that span–“ice-free” defined as sub-1 million km2 extent–and I expect that there will be a warming feedback from that event. Indeed, I believe there’s some thought that we’re seeing feedback from the declining extent already.)

You seem to think that even a constant anomaly would indicate a steady increase. Not so, as shown by a fictional alternative to the above:
Year,Anomaly,Absolute
2001,0.50,14.50
2002,0.50,14.50
2003,0.50,14.50
2004,0.50,14.50
2005,0.50,14.50
2006,0.50,14.50
2007,0.50,14.50
2008,0.50,14.50
2009,0.50,14.50
2010,0.50,14.50

This is what that would look like (obviously I’ve tricked WFT because they aren’t real figures). Can you see that there is no increase for a constant anomaly?

They can still invent different cycles with periods such as 120 yr, 200 yr, 437 yr and so on. It seems to be tamino’s work to check those hypotheses for them. But they still have a large bag “full of cycles”.

Thus far unremarked, to my knowledge (and without going into the cyclic stuff very early), is the major divergence between CRUTEM3 and Berkeley in the common overlap period before the other two data records join the party in 1880. In this era many of the instruments were not yet standardized Stevenson-screen measurements; they were largely unshielded north- or south-facing wall exposures (north or south depending on the hemisphere, so mainly north). *If* Berkeley is doing better at identifying and accounting for the non-climatic influences, the implication is that there wasn’t some mid-to-late nineteenth-century cooling as implied by CRUTEM3, and that the long-term trend since pre-industrial times, by the imperfect metric of a linear trend since 1850, is therefore potentially (and I stress potentially) much greater than previously reported and understood: somewhere in the range of maybe 30% larger, because of the end-point influence of this potentially spurious cooling very early in an 1850-2010 trend calculation in CRUTEM3.

As an aside: as a science community and society, we are always far too asymmetric in our view of end-point effects, endlessly obsessing over the impacts of the recent data rather than the impact of the choice of start point.

Certainly there are papers from a couple of recent studies replicating the old instrumentation that suggest these early instruments were substantially warm biased which would lend a degree of credence towards this possibility. This would then have obvious implications for our understanding of things such as the climate sensitivity and natural climate variability.

More like this, which shows the OLS trends for the two datasets from 1850-2010, assuming that interface does it properly. Certainly the difference between the two slopes looks reasonable given the multi-decadal time-series differences.

Failing that, the OLS trend with the AR1 degrees-of-freedom correction (the Santer et al. approach) is 0.61+/-0.06 K/century for CRUTEM (not 3v; this will be a secondary effect) and 0.86+/-0.06 K/century for Berkeley over Jan 1850 to Dec 2010. That is a substantial difference, driven almost entirely by the mid-to-late 19th-century differences between them.
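For reference, the Santer et al.-style AR1 degrees-of-freedom correction mentioned here amounts to shrinking the sample size before computing the trend’s standard error. A sketch with synthetic red noise (the function and all the numbers are illustrative, not a reproduction of the actual calculation):

```python
import numpy as np

def ar1_corrected_trend(y):
    """OLS trend whose standard error uses the AR(1)-reduced effective
    sample size n_eff = n * (1 - r1) / (1 + r1), with r1 the lag-1
    autocorrelation of the residuals (the Santer et al.-style correction)."""
    n = len(y)
    t = np.arange(n, dtype=float)
    b, a = np.polyfit(t, y, 1)
    resid = y - (a + b * t)
    r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
    n_eff = n * (1 - r1) / (1 + r1)
    s2 = np.sum(resid ** 2) / (n_eff - 2)        # residual variance, fewer dof
    se = np.sqrt(s2 / np.sum((t - t.mean()) ** 2))
    return b, se, n_eff

# Hypothetical monthly series: AR(1) red noise plus a small trend.
rng = np.random.default_rng(5)
noise = np.zeros(1932)                           # 161 years of months
for i in range(1, noise.size):
    noise[i] = 0.6 * noise[i - 1] + rng.normal(0.0, 0.1)
y = 0.00005 * np.arange(noise.size) + noise
b, se, n_eff = ar1_corrected_trend(y)
print(f"n_eff = {n_eff:.0f} of {len(y)} months")
```

The autocorrelation typically cuts the effective sample size by a factor of several, which is why the quoted +/-0.06 K/century intervals are wider than a naive OLS error bar would suggest.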

“This would then have obvious implications for our understanding of things such as the climate sensitivity and natural climate variability.”

It would certainly help in terms of natural climate variability, but I don’t think it is going to help with climate sensitivity. According to the Keeling curve I can find on Wikipedia, CO2 levels were less than 320 ppm in 1960. I can’t seem to find out when CO2 levels went above 300 ppm [about 1900?].

I would have thought that the pre-1900 surface temperatures are not likely to be important for determining climate sensitivity, because CO2 levels would still not then have risen that far from pre-industrial values.

Looks to me like it’s between 1910 and 1917 on the above graph. I am curious where this is going, as Fred Moolten has said on Climate Etc. that Isaac Held thinks anthropogenic atmospheric CO2 may have had a more significant role earlier than current science indicates (hope I haven’t botched this).

PT: these early instruments were substantially warm biased which would lend a degree of credence towards this possibility. This would then have obvious implications for our understanding of things such as the climate sensitivity and natural climate variability.

BPL: That doesn’t sound right. Reading too warm is not the same as warming up faster.

If your early records were warm biased and the previous records did not adequately adjust for it then they would have been warm biased early in the record yielding an under-estimate of the *real* warming trend. In the same way as modern cold biased records would yield an under-estimate. Think of a see-saw …

Here, they show that global temperatures show a positive correlation with ENSO, and an even stronger correlation with the AMO, which they characterize as an ~70-year cycle. Of course, they are not the first to show this correlation. Others have shown it previously:

[Response: I guess you missed the part about them only showing correlation with short-term fluctuations — nothing at all to do with any “70-yr cycle” — in temperature over land only. You also seem to have missed my having posted about that paper. Twice.

All of which illustrates another problem with the global warming “debate.” When your nonsensical ramblings are shown to be nonsensical, rather than simply nurse your wounded arse you just throw up some more nonsense, this time on a paper which you appear not to understand.]

They also identified a 9.1-year period, which can easily be seen in this graph:

[Response: It shows no such thing. You’ve been fooled by chronic mathturbators — this is even more nonsensical than your “trend-plus-cyclic” model.]

BPL,
The trend was accelerating, which will occur repeatedly in any oscillating system. The maximum temperature slope (60-month linear regression) occurred in Jan 1996. Since then, the temperature rise has been decelerating, with a current 60-month slope of 0 (using the same GISS data).

Contrary to Tamino’s earlier statement, this in no way implies that global warming is not happening. Only that other forces have been overlooked, and in Muller’s own words, “the human component of global warming may be overestimated.”

[Response: It’s abundantly clear that your net contribution to reasoned discussion is less than zero. You do not discuss, you babble.]

Dan H., I have given this example before. Look at the periodicity of the following series of ordered pairs:

1,2
2,7
3,1
4,8
5,2
6,8
7,1
8,8
9,2
10,8

Now predict the next entry. If you said 4, you are correct, as that is the 11th digit in the base of Napierian logarithms, e. One needs to be very careful attributing periodicity to a time series unless one has many, many periods or a good reason for there being a periodicity at the specified period.

In the case of temperature time series, we do not have enough data to establish a period of ~70 years, and the physics says you are wrong. Me, I’ll go with the physics.

Another reason you can tell that “60 year cycles” are mathematical babble is by understanding some of the properties of the Fourier transform. In particular, in the presence of a linear trend, low frequency (long period) cyclical features are obscured.

Here I plotted Fourier transforms of GISTEMP (1880 to present) vs. a linear trend (0.57 K/century). The Fourier-space points corresponding to 130-, 65-, and 44-year cycles are equally well explained by a linear trend.
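The point about linear trends leaking into low frequencies is easy to verify: take the power spectrum of a pure ramp and see where the variance lands. A small sketch (the 0.57 K/century figure is taken from the comment above; everything else is illustrative):

```python
import numpy as np

n = 132                                   # e.g. 132 annual values, 1880-2011
t = np.arange(n, dtype=float)
ramp = 0.0057 * t                         # a pure 0.57 K/century linear trend

# Power spectrum of the mean-removed ramp: there are no cycles in it,
# yet the variance piles up in the lowest-frequency bins.
power = np.abs(np.fft.rfft(ramp - ramp.mean())) ** 2
freqs = np.fft.rfftfreq(n, d=1.0)         # cycles per year

low = power[1:4].sum()                    # bins with periods 132, 66, 44 yr
rest = power[4:].sum()
print("periods of the 3 lowest nonzero bins (yr):",
      [round(1 / f) for f in freqs[1:4]])
print(f"their share of the variance: {low / (low + rest):.0%}")
```

Most of a ramp’s variance lands in the handful of lowest-frequency bins, which is exactly why a trended series can masquerade as a "~60-70 year cycle" in a naive spectral analysis.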

You will see 1/f^2 at the low-frequency limit (e.g. red noise) and 1/f^0 at the upper limit (e.g. white noise); in between it goes from red -> pink -> white.

Also, this is very important, all corresponding phases are the same in incremental value.

The only way to construct a linear trend using FFT is to carry both the amplitude and the phase.

FFTs are also periodic, and they assume a stationary, ergodic time series, although that has never stopped anyone from misapplying that basic assumption.

In real-world harbor resonance data you first need to remove the tidal cycle from the raw data; a linear detrend is minimum standard practice for any wave measurements exceeding ~17 minutes (4096 samples @ 4 Hz) in the nearshore.

Also, using an FFT as a digital filter (by zeroing out coefficients) is very bad practice vs a proper time domain IIR filter (e. g. sine Butterworth).

Tamino, and others — if you’ll allow a brief aside from the serious statistics — I’ve found showing amateurs the woodfortrees site, including the site’s caution about fooling yourself with trends, does encourage people to poke at this themselves. Often people will cherrypick at first, but if willing they can convince themselves fairly quickly that cherrypicking _is_ deceptive.
The experience may open some eyes and ears.

Got a moment to criticize choosing this chart/set of intervals?
On the screen the result has some shock value.

I don’t know what statistical test could be done to check the appearance:

It’s very similar to the graph I posted here several weeks ago. At the time TrueSceptic taught me how to find raw data on Wood for Trees!

If you plot a 30-year trend starting in 1970, add another 30-year trend starting in 1980 (picking up “the pause” decade), and then add a 20-year trend starting in 1990 (almost half of which is “the pause”), then something surprising, to me anyway, happens.

I might also have been responsible (in an epic Deltoid thread) for introducing a certain Girma Orssengo to WFT, but the thing is that we can clearly see what anyone’s done: it’s all in the linked page.

(Not that WFT compares with what Tamino does, but quite often we can get similar results with limited knowledge of statistical techniques. It’s a shame that some who could benefit, Stephen Wilde being one recent example, seem unable to do so.)

Nice. If you click the ‘raw data’ link at the bottom of the graph, you will get a text file showing the datapoints. Since you only have linear trends, you get the slope of each. Put enough of these – perhaps a 50 year interval starting every 25 years? – into a spreadsheet and you ought to be able to get some statistics.

I agree whole-heartedly on the sentiment.
Pretty graph.
If I may add my own, this one tends to cut through some chatter; or rather, I find that conversations tend to stop or go off on another track after I post it.

If you are going to leave off the last two data points as their uncertainty is too large, you may also want to leave out some of the earlier plot points as well (especially in the early 1800’s). Some of the uncertainties are over 3! (best to be consistent in these things)

This is OT, but I’m hoping for an open thread topic on temperature time series (either here or perhaps over at Nick Stokes website).

Both surface and satellite (land only) time series, both as anomalies and as the actual absolute temperature time series (from which the anomalies are calculated). If not the actual time series, then a monthly anomaly array with the base period specified: e.g. 1901-2000 (NCDC), 1981-2010 (UAH), 1979-2004 (RSS, or whatever RSS is currently using), and 1950-1980 (BEST; I don’t know if the end years are inclusive, which would make the base period 31 years).

What with all the various discussions of the BEST temperature time series and analyses thereof, I’ve become interested in a comparison of NCDC, UAH, RSS, and BEST (all land only, given that’s all BEST provides to date).

Using a common time base (e.g. 1980-2009 or 1979-2008, i.e. 30 years).

So, for example, BEST states in Full_Database_Average_complete.txt that:

And in looking at the UAH data, the TLT absolute temperatures are way below the surface temperature time series because they are actually measuring a few km above the surface (but exactly how many km?), so those measurements are quite a bit colder in absolute terms.

This all eventually boils down to the differences in trend slope between UAH/RSS and BEST/NCDC, the unexplained ~0.1 C difference, and the totally unfounded denier conjecture that the difference is all due to a UHI bias in the surface temperature measurements; something I have argued strongly against, provided a proper area-weighting scheme is used. I’ve now done some urban-vs-rural population and population-density estimates (now getting rather serious from a rather simple start): urban area, as a proportion of total land surface area, is well below 1% (closer to 0.5% than to 1%).

But I think that the various anomaly time series, placed on a common time base with the absolute temperature added back in, would clearly expose the denier BIG LIE, since it has become quite obvious that the satellite and land-surface datasets, while interesting to compare (given we only ever see anomaly time-series comparisons), are in fact measuring two entirely different sets of temperatures (the surface vs. a few km above the surface).

Sorry for going OT; I will pursue this on my own regardless (it will just take a few emails to obtain the anomaly curves and/or absolute temperature time series).