William M. Briggs: Numerologist to the Stars

William M. Briggs, numerologist to the stars, has posted the claim that we don’t know whether temperature was cooler in the 1940s than in the 2000s. Apparently he objects to this graph:

Briggs’ reasoning boils down to this:

All you have to remember is these dots are estimates, results from statistical models. The dots are not raw data. That means the dots are uncertain. At the least, Plait should have shown us some “error bars” around those dots; some kind of measure of uncertainty.

Apparently Briggs thinks that computing an average means using a “statistical model,” and that entitles him to smear the result by association with the evil of “models.” That’s stretching the definition of “model” to the breaking point, for no other reason than to exploit the denialist tactic of denigrating anything and everything associated with the word “model.” Even, apparently, computing an average.

By the way, the plotted data are from the Berkeley project, and they included error estimates in their computation. They look like this:

Briggs goes on to say:

Now—here’s the real tricky part—we do not want the error bars from the estimates, but from the predictions. Remember, the models that gave these dots tried to predict what the global temperature was. When we do see error bars, researchers often make the mistake of showing us the uncertainty of the model parameters, about which we do not care, we cannot see, and are not verifiable. Since the models were supposed to predict temperature, show us the error of the predictions.

Notice that in the first paragraph Briggs said explicitly that the dots are estimates. Now he calls them “predictions.” Predictions? How is computing an area-weighted average a “prediction” rather than an “estimate”?

Briggs is just playing word games in a childish display of sophistry. He wants you to believe that since he has called them “predictions” and claimed they come from some evil “models,” they can’t be trusted. It’s not the “tricky” part, it’s the “tricksy” part.

Even if you leave out all adjustments (including unimpeachable ones like correction for time-of-observation bias which will reduce the uncertainty) so that the average is nothing more complicated than an area-weighted average of raw data, you still get the same, unambiguous warming pattern. And that warming pattern is statistically significant. Overwhelmingly so. Despite Briggs’ further sophistry.
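For the record, an area-weighted average is not mysterious. Here is a minimal sketch in Python, with made-up latitude bands and anomaly values (nothing like the full Berkeley code, which handles gridding, coverage, and much more):

```python
import math

def area_weighted_mean(anomalies_by_lat):
    """Average anomalies over latitude bands, weighting each band by
    cos(latitude), which is proportional to its surface area."""
    num = den = 0.0
    for lat, anom in anomalies_by_lat.items():
        w = math.cos(math.radians(lat))
        num += w * anom
        den += w
    return num / den

# Made-up values: a large polar anomaly, but the tropics dominate by area,
# so the weighted mean sits well below the unweighted mean of ~0.53.
print(area_weighted_mean({0.0: 0.2, 45.0: 0.4, 80.0: 1.0}))
```

The point is simply that each value is weighted by the area it represents; weighting by the cosine of latitude is the standard first approximation.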

Yes, folks, that’s it. Create a bogey-man made of straw, call it “uncertainty,” apply no analysis and no data, all just by waving of hands. That’s all he’s got.

But it’s enough for the numerologist to the stars! He uses his fiction about “prediction uncertainty” to conclude:

I don’t know what the prediction uncertainty is for Plait’s picture. Neither does he. I’d be willing to bet it’s large enough so that we can’t tell with certainty greater than 90% whether temperatures in the 1940s were cooler than in the 2000s.

What’s almost as impressive as the stupidity of Briggs’ post is its condescending tone. Perhaps he should be nominated for an award.

122 responses to “William M. Briggs: Numerologist to the Stars”

“Notice old Phil (his source, actually) starts, quite arbitrarily, with 1973, a point which is lower than the years preceding this date. If he would have read the post linked above, he would have known this is a common way that cheaters cheat. Not saying you cheated, Phil, old thing. But you didn’t do yourself any favors.”

Actually 1973 appears to be *higher*, not lower, than the years immediately preceding *and* succeeding it. That year was presumably chosen by John Cook (Skeptical Science) because it marked the beginning of the first of many “pauses” that one can identify if one cherrypicks the right subintervals. The choice of 1971 or 1972 would have had little effect on the overall trend to 2009, but it did raise the slope a bit.

For greater certainty, here are the BEST annual values for the 1970s, with trend slope to 2009:

Exactly right – 1973 was just a good start date to find one of those short-term cooling trends. Plus the current warming trend started in the 1970s (after the mid-century aerosol cooling effect had been overcome), so it was just a reasonable place to start.

Regarding my previous comment:
Note that 1973 is a local *maximum* not a local *minimum* as implied by Briggs. If one were cherrypicking the starting date to show maximum trend, it obviously would not have been 1973, since 1971, 1972, 1974, 1975 etc all give higher trend values and 1970 is barely lower. In fact, starting any time *after* 1973 the trend never gets that low again.
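The effect of the start year is easy to check directly. A sketch with synthetic anomalies (a steady trend plus a deliberate bump at 1973; these are illustrative numbers, not the BEST values):

```python
def ols_slope(xs, ys):
    # ordinary least squares slope: cov(x, y) / var(x)
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx

# Synthetic "anomalies": a steady 0.02/yr trend plus a bump at 1973.
# Illustrative numbers only, not the BEST data.
years = list(range(1970, 2010))
anoms = [0.02 * (y - 1970) + (0.15 if y == 1973 else 0.0) for y in years]

for start in (1970, 1971, 1972, 1973, 1974):
    xs = [y for y in years if y >= start]
    ys = [a for y, a in zip(years, anoms) if y >= start]
    print(start, round(ols_slope(xs, ys), 4))
```

Starting exactly at the local maximum gives the lowest slope of the five, which is the opposite of what a trend-inflating cherry-picker would choose.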

Important extra point: the SkS graphic is an animated GIF. Briggs converted it to a JPG to remove the animation, and removed the caption. So not only are Tamino’s points correct, Briggs is trying to make a statistical case from a graphic used to make a totally different point, about the tactics and psychology of the Deniers. He thus proceeds to demonstrate the very thing the graphic highlights: misrepresentation.
In football that is called an Own Goal!

Actually, of course, an average is a model—at least if you want to attach any meaning to it. It at least assumes the data that went into the model is measured without error.

[Response: Wrong. So wrong it’s astounding.]

I’ve also pointed out that the points represent different models (i.e. averages), and that the uncertainty in these are not accounted for. This is true.

[Response: Wrong.]

On my site (under the Start Here tab in the relevant climate posts) I go into great detail about how to treat temperature and time series, particularly how to speak of uncertainty. If you can trouble yourself to read these, you’ll understand why I use the word “prediction” (and the other terms: see my All of Statistics thread from this week). It’s just too much for me to re-type everything here.

[Response: We already know why you used the word “prediction.”]

Now, I’ve been asked by others to examine some of your other statistical work. It’s quite poor, so I suggest you spend some of your time boning up on the meaning behind the terms you use?

And my dear Tamino, condescending?

Rattus—she hasn’t, but boy the stuff she’d learn!

Deep Climate–try this from the 1940s.

Glenn—now, now. You know better. The graph is exactly the same as Plait presented.

That is not quite true either. If you click on Plait’s graphic it takes you to the original at SkepticalScience. When one clicks on William’s graph it goes, well, nowhere. Try it.

I’ve never heard someone manage to utter such nonsense about a simple OLS regression graphic as Briggs has. It is an epic incoherent word salad. The D-K is strong on this one, especially when they feel compelled to tell a professional statistician [referring to their work] “It’s quite poor, so I suggest you spend some of your time boning up on the meaning behind the terms you use?” Uh huh. I’m surprised he has not advised the Nobel committee that he is long overdue for his prize.

Briggs, “try this from the 1940s.”
Briggs continues to stubbornly miss the point, which is that “skeptics” can cherry-pick back-to-back short-period “cooling” trends (which have no statistical significance) even when the statistically significant long-term trend is UP. Someone who allegedly specializes in time series analysis would know that; Briggs does not.

Briggs, “Rattus—she hasn’t, but boy the stuff she’d learn!”
Given that Curry appears to know diddly squat about statistics, Briggs saying that does not count for much.

Briggs, “I go into great detail about how to treat fudge temperature and time series, particularly how to speak of inflate uncertainty.”
There fixed.

Not one of William’s silly assertions has passed muster. “Skeptics” assure us that they do not deny that the planet is warming, yet from his diatribe one can only conclude that Mr. Briggs is a denier of AGW, and by extension so is Anthony Watts for posting his drivel on his pseudo-science blog.

Unfortunately, we all know that it is nigh impossible for a person afflicted with D-K to admit error. So Mr. Briggs will no doubt be entertaining us all with more word salad, and Watts will post said word salad on his blog without thinking twice, feeding his fans some chum. The stuff stinks, but they seem to love lapping it up in feeding frenzies in the threads that follow.

Briggs said:
“Notice old Phil (his source, actually) starts, quite arbitrarily, with 1973, a point which is lower than the years preceding this date. If he would have read the post linked above, he would have known this is a common way that cheaters cheat. ” [Emphasis added]

When I showed that in fact 1973 (anomaly = 0.386) is a local maximum, Briggs attempted a rebuttal thus:

I assume Tamino meant “numerologist” flippantly, rather than literally, because Briggs doesn’t appear to adhere to any tenets of numerology. According to his credentials, he is an actual statistician. Whether you think his writings convey such knowledge is something else altogether…

It is clear that William M Briggs does not realise that the linear model is not being used to predict anything, but is intended to summarise a feature of the dataset (the long-term trend) that is of scientific interest. We are interested in a parameter (the slope of the regression) and we are also interested in its uncertainty (its error bars). We would quite like to know how fast the Earth has been warming, and we would like to know the plausible range for the rate of warming (although technically we would want a credible interval for that).
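For readers who want to see what “slope plus error bars” means in practice, here is a bare-bones sketch. The numbers are synthetic, and the standard error assumes white-noise residuals, so it understates the uncertainty you would get after accounting for autocorrelation:

```python
import math
import random

def trend_with_stderr(xs, ys):
    """OLS slope and its standard error, assuming white-noise residuals.
    (Real temperature residuals are autocorrelated, so the true
    uncertainty is somewhat larger than this.)"""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    s2 = sum(r * r for r in resid) / (n - 2)
    return slope, math.sqrt(s2 / sxx)

# Synthetic series: a 0.018/yr trend plus noise (made-up numbers)
random.seed(0)
years = list(range(1975, 2010))
temps = [0.018 * (y - 1975) + random.gauss(0, 0.1) for y in years]
slope, se = trend_with_stderr(years, temps)
print(f"slope = {slope:.4f} +/- {2 * se:.4f} per year (2-sigma)")
```

With a trend that large relative to the noise, the 2-sigma interval sits comfortably above zero, which is what “statistically significant warming” means.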

Now rather than give ambiguous verbiage about uncertainties, why not take a leaf out of Tamino’s book and get the data, perform the analysis that demonstrates that the issues you raise actually have some merit and publish the analysis (assumptions, equations, graphs, R code) on your blog.

Most of the statisticians that are interested in climate science are capable of writing an article with statistical terminology that will pull the wool over the eyes of the general public, and only the statisticians will notice that it is obfuscatory waffle. That would be easy, what is more difficult is stepping up to the plate and demonstrating that there is some substance to your position. Of course if you don’t want to step up to the plate, we will know why.

It is nice to see that William has stepped (almost) up to the plate: http://wmbriggs.com/blog/?p=4368
The next step is to apply the method to the data and see whether it makes a difference to the conclusion. I rather doubt it will; decadal trends are too short to be meaningful whether the analysis is frequentist or Bayesian.

Normally I can figure out where someone’s thinking has gone awry, but here the thinking is so muddled and contrary to established practice that I have no clue!

I also completely fail to understand the rationale behind his focus on uncertainty. The last thing one should expect from highly uncertain data is a highly statistically significant linear trend. The existence of said trend essentially proves that uncertainty in and of itself is not an issue here. Of course, the real issue is interannual variability.

This just seems to be one more bit of random bar noise to blot out the real substantive conversation happening over there in the corner.

Stephen’s comment gives me a — hopefully — useful idea, but implementing it is unfortunately beyond my expertise:
There should be a simulation tool available on the web so that amateur skeptics can fiddle with inputs to see the effects on the temperature record. If, for example, measurement error were quadrupled, what would happen? Let’s see how strong the urban heat island would have to be to account for the average observed warming. How many of the stations that show the most warming would have to be selectively dropped to make the warming trend disappear? I think it would be fun to let them fill their boots with ways to eliminate the warming, and maybe some of them would see how dishonest they’re being when they actually evaluate justifications for those ‘adjustments’.
Perhaps a little utility like this has already been built for linear regression? That would be a good start.
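A toy version of that tool takes only a few lines. This sketch (synthetic trend, made-up noise levels) quadruples the measurement error and watches what happens to the fitted trend:

```python
import random

def fit_slope(xs, ys):
    # ordinary least squares slope
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx

random.seed(42)
years = list(range(1970, 2010))
signal = [0.017 * (y - 1970) for y in years]  # assumed underlying warming

results = {}
for noise_sd in (0.1, 0.4):  # baseline vs quadrupled measurement error
    slopes = []
    for _ in range(1000):
        obs = [v + random.gauss(0, noise_sd) for v in signal]
        slopes.append(fit_slope(years, obs))
    mean_s = sum(slopes) / len(slopes)
    spread = (sum((s - mean_s) ** 2 for s in slopes) / len(slopes)) ** 0.5
    results[noise_sd] = (mean_s, spread)
    print(noise_sd, round(mean_s, 4), round(spread, 4))
```

Quadrupling the noise widens the spread of the recovered slopes but does not bias them; the warming trend does not go away.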

Well, I’m not sure you can fully appreciate Briggs’ position by reading this article alone. He abhors things like confidence intervals, p-values, and statistical significance because he believes that these don’t properly treat all forms of uncertainty and ultimately lead to overconfidence in a computed result.

Eli B. Were he a proper Bayesian of any sort he would provide a proper analysis, with model structures, priors on parameters and error structures explicitly defined. Then he could make a legitimate case. I still think it would be very hard to argue that that increasing trend is a random event – even with non-informative priors (which wouldn’t make sense given the preexisting knowledge of the physical system).

But the real point is that he doesn’t even try to conduct such an analysis. Instead he develops a verbal argument that barely makes sense and defends in comments above that make even less sense. My guess is that such an analysis is either beyond him, or would say exactly the opposite of what he would want it to.

Eli B.,
In perusing Brigg’s website, I really found very little that was useful. It’s all well and good to say there is more to uncertainty than confidence limits (e.g. systematic errors due to model choice). That’s known. There are ways to deal with it. They work. What he’s doing here is bullshitting–just like Judy, just like McI, just like Roy Spencer.

The emphasis of all these “skeptics” is on what we don’t know. That’s not science. Science is about using what we do know to find out more. Whining about “uncertainty monsters” is just an excuse for failure to make progress–in this case due to ideological blinders.

The bits quoted also fail to achieve basic syntactic coherency. For instance:

When we do see error bars, researchers often make the mistake of showing us the uncertainty of the model parameters, about which we do not care, we cannot see, and are not verifiable. Since the models were supposed to predict temperature, show us the error of the predictions.

I really don’t know what this is supposed to mean. He’s got parallel clauses that aren’t parallel, strange suppositions about who ‘cares’ about what, undefined terms (‘uncertainty of the model parameters,’ ‘error of the predictions’) and ‘logic’ that really doesn’t deserve the name. (If they are predictions, what ‘errors’ can possibly be identified, other than those due to ‘model parameters’ or other model characteristics? Or does he refer to hindcast ‘predictions’? I can’t tell what is intended here at all.)

It’s incredibly shoddy writing, and the George Orwell fan in me suggests that it’s a reflection of incredibly shoddy thought.

Note also Briggs’ sleight-of-hand towards the end of his post, where he says “…just as the WSJ‘s scientists claim, we can’t say with any certainty that the temperatures have been increasing this past decade.”

Of course, that’s not what the WSJ’s scientists claimed. They claimed that there was a demonstrable lack of warming, and that this was evidence against the IPCC’s projections. Briggs’ own (nonsensical) claim that the temperature record is hugely uncertain flatly contradicts the claim made in the WSJ piece. But admitting that would require him to concede that Phil was correct, of course.

@Briggs. This is so below moronic idiocy that I don’t have words for it. Average is a model!!!! so is counting too!!! Numbers!!!! Arrggh…a model too..let’s call every convention a model to get done…..And you say that with a straight face?? My god will your teachers tear their diplomas and certificates… So mind boggling.. If you actually believed anything you wrote… Most likely a deceiving lie…

Exactly! And my eye is projecting a “model” of the world in front of me in light against the back of my eyeball. The depths to which these people will redefine established principles to fit their predetermined outcome is fantastic. It’s going to be stuff future historians will study with great fascination.

I think that is the view Briggs took. It’s an understandable initial temporary argument (especially if he, as is the case with me, were not a statistician), but, as others have pointed out, eventually you have to either bury your head in the sand or get behind *some* model.

Once you start trying to quantify things, you realize that perhaps — just maybe — the models being used by others do in fact have merit and utility and are in fact suggesting something. Philosophical talk is fine for a bit, but you aren’t going to convince serious people if you can’t come up with a model to back up your doubts and hand-waving, one which can notch a few predictive wins to make it stand out somehow.

I have not really studied statistics, but my guess is that calling a 10-year trend insignificant may have something to do with the number of “similar” 10-year patterns that exist over the past 150 or so years of data, versus the much more stand-out 40-year period since the early 1970s.

It’s always important to look at the big picture now and again when you have spent some time (accurately or inaccurately) arguing about details. The big picture here is not statistics (for Briggs has presented no math or model) but the underlying climate change story. Here, scientists have found a remarkable association between rising CO2 levels and temperature trends, via trusted physics modeling of such CO2 effects. That reality should be taken into consideration in any statistical war of words over any study of climate change, if the person is honest about producing the best analysis and advice. [Is there a better alternative model being presented than what is currently used by the professional climate researchers? Not even close.]

Briggs is most entertainingly foolish. “Averaging is a model” is one of the funniest bits of nonsense I’ve heard in a while. Makes me wonder if Briggs has done any computation of averages or of errors, or if he’s just not dealing with reality. There are appropriate methods for computing an average plus error margins, given a series of measurements which have certain measurement errors. But the computation of an average is not the same as the computation of errors associated with the average.

Maybe reality is on the blink for Briggs, but the rest of us will remain in the real world, thanks.

Let’s see if I follow. Since all measurement in the real world has some degree of error, it is all dependent upon those evil statistical models, like taking an average. Therefore, we can happily ignore any measurement whatsoever if it happens to disagree with our prejudices.

I’m curious to know by what methodology one obtains a 0.5-1.0˚C error range (plus or minus!) for the temperature data spanning the satellite record (which is more or less close enough to what we see here, pardon me if I’m mistaken in that case though). The sheer size difference between the two is one of a whole standard deviation on either side from the smaller estimate; I recall from your own paper Tamino that you had found confidence intervals of ~<0.1˚C over the satellite record.

I myself was fooling around with a bootstrap Monte Carlo on Excel across the whole GISS temperature series (which is very inappropriate I know since the noise isn't white, but let an amateur have his fun), and I can't help but think that to have been off by a whole magnitude in scale – at LEAST – for the confidence is outrageous, as Briggs would suggest is the case.
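For what it’s worth, the white-noise problem has a standard fix: resample the residuals in contiguous blocks rather than one point at a time. A rough sketch, run here on illustrative AR(1) data rather than the GISS series:

```python
import random

def slope_of(ys):
    # OLS slope against the index 0..n-1
    n = len(ys)
    mx = (n - 1) / 2
    sxx = sum((x - mx) ** 2 for x in range(n))
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in enumerate(ys)) / sxx

def block_bootstrap_ci(ys, block=5, reps=2000, seed=1):
    """Moving-block bootstrap CI for a trend slope: resampling the
    residuals in blocks preserves the short-range autocorrelation
    that a plain (white-noise) bootstrap would destroy."""
    n = len(ys)
    b0 = slope_of(ys)
    my = sum(ys) / n
    mx = (n - 1) / 2
    fit = [my + b0 * (x - mx) for x in range(n)]
    resid = [y - f for y, f in zip(ys, fit)]
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        r = []
        while len(r) < n:
            start = rng.randrange(n - block + 1)
            r.extend(resid[start:start + block])
        slopes.append(slope_of([f + e for f, e in zip(fit, r)]))
    slopes.sort()
    return b0, slopes[int(0.025 * reps)], slopes[int(0.975 * reps) - 1]

# Illustrative series: a 0.02/step trend plus AR(1) ("red") noise
rng = random.Random(3)
e, series = 0.0, []
for t in range(40):
    e = 0.5 * e + rng.gauss(0, 0.1)
    series.append(0.02 * t + e)
b0, lo, hi = block_bootstrap_ci(series)
print(round(b0, 4), round(lo, 4), round(hi, 4))
```

The interval comes out wider than a white-noise bootstrap would give, but nowhere near wide enough to make a strong trend disappear.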

If you look at, for example, the HadCRUT3 data, they break the uncertainty estimate into a number of terms. The largest by far is sampling uncertainty: the global average is an estimate formed by averaging on the order of 1000 measurements of surface temperature.

If one compares this to taking an average with a measurement taken every 100m then it is easy to see that this will introduce an element of uncertainty – but the extent of this uncertainty can be estimated, and in general will decline when you take longer time averages.

In fact, one of the fundamental jobs of statistics is to compute averages of measurements, and to quantify the error in the resulting averages. This is what statistics was invented to do, more or less. This is the core statistical process behind just about every science experiment, every QA system, and every engineering measurement in the world.
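That core job takes only a few lines. A toy example with made-up repeated readings of the same quantity:

```python
import math
import statistics

# Hypothetical repeated measurements of the same quantity
readings = [14.9, 15.3, 15.1, 14.8, 15.2, 15.0, 15.1, 14.7, 15.3, 15.0]

mean = statistics.fmean(readings)
# standard error of the mean: sample sd divided by sqrt(n)
sem = statistics.stdev(readings) / math.sqrt(len(readings))
print(f"{mean:.3f} +/- {sem:.3f}")
```

The average comes with its own error estimate, and that estimate shrinks as more measurements go in. That is the whole business of statistics in two lines.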

Notice that in the first paragraph Briggs said explicitly that the dots are estimates. Now he calls them “predictions.” Predictions? How is computing an area-weighted average a “prediction” rather than an “estimate”?
——-
I interpreted the prediction thing as referring to the straight line fit. So I am not certain Tamino’s interpretation is correct.

However even that interpretation still has the smell of contrivance about it. My thoughts are a bit vague, but off hand the straight line fit is both a summary of the data and a confirmation of a physical model.

To say it’s a prediction means in effect the straight line fit’s purpose is extrapolation alone. That’s too weak a statement.

Alas, I fear Willie’s appearance here was probably a driveby. I rather doubt he has the cojones to put in a prolonged appearance, and this way he can brag to his acolytes about sticking his head up over the barricade.

The thing is that Willie’s assertions are just so flat stupid that I wonder whether the problem might not partially be one of communication. Certainly, the mean is NOT a model. It is simply the first moment of a distribution or of a numerical data set (which, itself, can be viewed as a discrete distribution). Perhaps what Briggs is trying to say is that whether the mean gives a more meaningful estimate of trend than any particular measurement depends on whether the errors on the measurements are random. If the measurements are plagued by systematic errors, these will also be reflected in the average. However, for this to be at all relevant to Briggs’ point, the temperature data would have to be subject to a time-dependent systematic error that explains or exaggerates the signal. There simply is zero evidence of this–indeed, there’s plenty of evidence against it.

I was not familiar with Briggs before this, so I spent some time perusing his site. I believe we are dealing with someone who suffers from severe delusions of adequacy.

Am I missing something here? Maybe it’s my geology background? Suddenly the numbers representing temperatures don’t seem to mean anything? Fine, then look at the abundant physical systems that also clearly reflect a warming climate: Start here; Waring, R. H. et al., Predicting satellite-derived patterns of large-scale disturbances in forests of the Pacific Northwest Region in response to recent climatic variation. Remote Sensing of Environment (2011), doi: 10.1016/j.rse.2011.08.017, or try this from the Bureau of Reclamation http://www.usbr.gov/climate/, or you could just ask a mountain climber since we see it first hand in the mountains.

I find it to be thoroughly amusing when clowns like you try SO hard to be high and mighty, as you are here Mr. Briggs. Of course, most people tend to feel bad for the feeble minded idiot who thumps his chest to feel important, but I am not most people. If you want to make an argument that real, thinking people will take seriously, drop the condescension, present your side, and move on. With your first “Old Phil” remark, the rest went right out the window.

Except that the red line isn’t an actual temperature, and neither are the dots.

You’re averaging likelihood and claiming that the average is somehow more likely! Rolling a 3 is no more likely a number to roll on a six sided die, despite it being the average value from a series of rolls!

You could plot many different red lines through that graph and they would all be equally valid. That’s kind of the problem when multiple x correspond to a single y.

Will, methinks you don’t know what an average is. Average is one way of defining central tendency–the first moment of a distribution. And I have no idea what you could mean by “averaging likelihood”. Look up the Central Limit Theorem.

For each dot in the temp graph we are reporting an average estimate of temperature. The true temperature could be higher or lower, and the graph with the error bars shows this. Yay for error bars.

The true temp is just as likely to be any point along the error bars. Taking the center, or average, doesn’t make a guess of true temperature any more likely. Picking a different spot along each error bar you could fit all kinds of red lines, each showing very different trends, and each being equally likely.

And yes an average is a model. You take a number of input parameters and get an output that is less precise than the original measurement . Sounds like a model to me.

[Response: Taking the average does yield a more likely estimate. Not every trend is equally likely. The output of an average is more precise than the raw data on which it’s based.]

Will,
Again, you don’t know what you are talking about. We’ve known since Gauss that errors in many problems tend to be normally distributed about the true value (hence the name “Gaussian”). We know that errors on the mean converge with sample size–so indeed, the mean is very likely to be closer to the actual value. Now THIS does assume a model for the errors, but it is a model that is often close to true in practice. Note that by assuming such a model, you can assign a probability to every possible curve you can draw through the error bars–the one through the means will be the most probable.
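Ray’s point about convergence is easy to demonstrate with a quick simulation (the true value and error size here are arbitrary):

```python
import random
import statistics

random.seed(7)
true_value = 1.0
spread = {}
for n in (10, 100, 1000):
    # sd of the sample mean across 500 repeated "experiments" of size n
    means = [statistics.fmean(random.gauss(true_value, 0.5) for _ in range(n))
             for _ in range(500)]
    spread[n] = statistics.stdev(means)
    print(n, round(spread[n], 4))
```

The spread of the sample mean shrinks roughly as 1/sqrt(n), exactly as the Central Limit Theorem says: averaging does yield a more precise estimate than any single measurement.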

Thank you for the response Ray. I do know what a Gaussian is, and get the basics behind the idea of random sampling. But that is not what this is. This is about using the mean of a set in conjunction with a linear regression.

There are likely other lines that would fit just as well (I do not know this, but suspect this) especially if you run the fit without doing an average beforehand.

If you ran a monster weighted least squares, regressing all of the individual station data on time, you should get basically the same trend line as you do regressing the stations’ weighted average (e.g. GISTEMP) on time.
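This is easy to verify on synthetic data. In fact, with a balanced design (every station reporting every year) and equal weights, the two slopes are algebraically identical; with area weights and missing data they would differ only slightly. A sketch:

```python
import random

def ols_slope(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx

random.seed(11)
years = list(range(30))
# 50 hypothetical stations sharing a 0.02/yr trend plus independent noise
stations = [[0.02 * t + random.gauss(0, 0.2) for t in years]
            for _ in range(50)]

# (a) regress the station-average series on time
avg = [sum(s[t] for s in stations) / len(stations) for t in years]
slope_avg = ols_slope(years, avg)

# (b) pool every station-year into one big regression
slope_pooled = ols_slope(years * len(stations),
                         [v for s in stations for v in s])
print(round(slope_avg, 5), round(slope_pooled, 5))
```

Both routes recover the same trend, which is why regressing the average is a perfectly legitimate shortcut.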

An n-sided die, with sides numbered 1 to n, has a long term average of (n+1)/2, so for n=6, the long term average is 3.5, not 3.

What do temperature readings and die-roll time series have in common? Well, they are both series of numbers; the first has spatial-temporal correlations (i.e., it is not pure white noise), while the other does not (i.e., it is pure white noise).
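The die average is a one-liner to check:

```python
import random

random.seed(5)
rolls = [random.randint(1, 6) for _ in range(100000)]
print(sum(rolls) / len(rolls))  # converges to (6 + 1) / 2 = 3.5, not 3
```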

A lot of people see graph after graph of climate temperature online, over the past 100 years, 10,000 years, 10 million years, and so on, and they see up and down, up and down, trend after trend. I think it’s understandable that the average person would be skeptical, especially since error bars are missing and there are competing graphs, some more inaccurate than others.

If you dig deeper, you might see evidence that scientists have models that rely on accepted physics and which can approximate these temperatures from the past and present. These models include CO2 levels. To get an idea of why CO2 was added to the models, see this http://en.wikipedia.org/wiki/File:Temp-sunspot-co2.svg

Briggs signed the Manhattan Declaration but I don’t know whether he joined Spencer and signed both declarations.

By these declarations’ contents, those who sign even one of them have made up their minds ahead of time with respect to what is true and so prove themselves to be fake skeptics.

And their being fake skeptics in the name of conservative political and/or religious ideology explains why, even though they are trained in the science in question, they still deny what that science says (whether it’s evolutionary science, climate science, or any other science; I don’t know whether Briggs joined Spencer in denying evolutionary science, but almost all climate science deniers have, since it’s almost exactly the same bunch).

Finally, the message below, Briggs’ reply last year to the BEST results, shows that, in combination with his recent post that is the present object of discussion, he is speaking out of both sides of his mouth. It’s the same reply given by so many other fake skeptics:

My understanding would be that 95% OLS prediction bands, superimposed on a linear trend, would include, on average, 19 of 20 data points.

N = 50 in both plots, so on average, 47.5 data points would be inside the 95% OLS prediction bands.

Now if you extend his Bayesian polygon box to include the entire dataset, you will find that 2 data points are outside of the box, or 48 of 50 data points inside, giving 96%, which, if you were to ask me, is no different than a proper display of 95% OLS prediction bands.

Further, if one draws a line from the upper left corner of the Bayesian box (at the lowest x-axis data point) to the lower right corner of the Bayesian box (at the highest x-axis data point), the minimum possible trend line is definitely positive, indicating statistical significance.
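The coverage claim is easy to check numerically. A sketch with made-up data, using z = 1.96 rather than the exact t quantile:

```python
import math
import random

random.seed(9)
n = 50
xs = list(range(n))
ys = [0.5 * x + random.gauss(0, 1.0) for x in xs]  # synthetic data

# ordinary least squares fit
mx = sum(xs) / n
my = sum(ys) / n
sxx = sum((x - mx) ** 2 for x in xs)
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
a = my - b * mx
resid = [y - (a + b * x) for x, y in zip(xs, ys)]
s = math.sqrt(sum(r * r for r in resid) / (n - 2))

# count points inside an approximate 95% prediction band
inside = 0
for x, y in zip(xs, ys):
    half = 1.96 * s * math.sqrt(1 + 1 / n + (x - mx) ** 2 / sxx)
    if abs(y - (a + b * x)) <= half:
        inside += 1
print(inside, "of", n, "points inside the 95% prediction band")
```

Roughly 19 of every 20 points land inside the band, as expected; note it is the prediction band, not the narrower confidence band for the fitted line, that has this property.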

“Actually, of course, an average is a model—at least if you want to attach any meaning to it. It at least assumes the data that went into the model is measured without error.”

In the context of a parametric statistical model, yes, an average can represent a model, as in the example you gave. But that is a highly restrictive idea of a statistical model; to qualify as one, it would also have to include the covariance structure of the distribution. Tamino addressed the general problem with Briggs’ “model as an average” concept with this incisive remark from the post:

“Apparently Briggs thinks that computing an average means using a “statistical model,” and that entitles him to smear the result by association with the evil of “models.” That’s stretching the definition of “model” to the breaking point, for no other reason than to exploit the denialist tactic of denigrating anything and everything associated with the word “model.” Even, apparently, computing an average.”

Stretching the definition of model!

The real complaint in Briggs’ comment comes from its second sentence:

” It at least assumes the data that went into the model is measured without error.”

On the surface that seems absolutely absurd. What is the connection between performing an average and the stated assumption of no measurement error in the data? It would seem he is using two different senses of “model” in the first and second sentences: in the first he means an average, in the second he means who knows what. In my opinion the context requires clarification to understand his intended meaning.

I can’t put words in Tamino’s mouth, but at least try to read and understand.

Briggs uses the word “model” as a dog whistle, as a means to provoke, to use the guilt by association informal logical fallacy.

To a fake skeptic, the only good model, is a dead model.

Briggs also uses the word “prediction” in a very similar fashion. Yet he started out with an intercept, a slope, and a standard deviation to generate a time series in the first place, but somehow loses that initial information, suggesting that he cannot reconstruct a new, statistically identical time series given only the trend line estimates (slope, intercept, and the confidence interval of the trend line slope).

Replacing the y-axis standard deviation (units of temperature) with the trend line error estimate (units of temperature/time);

Q, you have assumed a model–no such model is required for the concept of an average. Again, it is merely the first moment of the distribution. In some cases (e.g. unimodal distributions), this will be related to a location parameter. With a few additional assumptions, the Central Limit Theorem will hold, and then your model may be relevant. None of this is implied in the definition of a mean.

One wonders if Briggs dares to get out of bed, walk around, and find the coffee maker in the morning, as obviously one cannot trust the model inside his visual cortex that converts the light patterns on his retinas (upside down, no less!) into a perception of three-dimensional reality.

I couldn’t find a button to respond to a response from Tamino, so I am posting here.

In reference to an average; averaging a set of measurements does not provide a more accurate measurement. Each measurement on its own is as accurate as possible, given that it is _the_ data.

Averaging, no matter what field of science you specialize in, gives you less information than what you started with. You cannot reverse an average and get the original measurements out. In this sense, information is lost, and not gained.

The mean-as-average is a model. So is a weighted average, a running average, Euclidean distance (also an average), etc. The choices are almost arbitrary, as information is lost no matter which technique is employed.

I cannot speak for Briggs, but the fact remains that many many lines could be fit to the data and all would be equally likely to be “true”. As I said earlier, this is always the problem when multiple y correspond to a given x.

[Response: Unlike William Briggs, you might be a nice guy. But you’re every bit as wrong as he is.]

Will, who said people are fitting a trend line without understanding the uncertainty of the points? Maybe that is what armchair scientists are doing in Excel, but nothing prevents “real” scientists from accounting for it. One also ends up with an uncertainty estimate for the calculated trend. Funnily enough, I’ve seen many graphs just like that here on Tamino’s blog.

If you seriously think people are calculating global warming trends without considering uncertainty, please write up your concerns and publish them. You’d probably have to pick a specific paper to criticize, then calculate the trends “properly” with the resulting uncertainty and show that this differs significantly from the published results. If you could do that you would actually be contributing to the pool of human knowledge. Think you can do this?

BTW, have you thought about why people use temperature anomalies and not absolute temperatures to calculate trends? The reason is that the random errors in the temperature measurements are likely much, much smaller than the systematic errors. But if the systematic errors are constant at a given site, then anomalies naturally screen them out.
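To make the anomaly point concrete, here is a quick Python sketch. Everything in it is invented for illustration: two hypothetical sites with constant systematic offsets, a linear warming signal, and records that only partially overlap. Averaging absolute temperatures picks up a spurious step when the station mix changes; anomalies relative to each site's own baseline do not.

```python
import numpy as np

rng = np.random.default_rng(1)
years = np.arange(1950, 2011)
signal = 0.02 * (years - 1950)          # shared warming signal, deg C (invented)
biases = np.array([3.0, -2.0])          # constant systematic offset per site (invented)
temps = signal + biases[:, None] + rng.normal(0.0, 0.1, (2, years.size))

# site 0 reports only through 1985, site 1 only from 1975 on
mask = np.ones(temps.shape, dtype=bool)
mask[0, years > 1985] = False
mask[1, years < 1975] = False

# averaging absolute temperatures: the composition change injects a step
abs_mean = np.nanmean(np.where(mask, temps, np.nan), axis=0)

# anomalies relative to each site's own 1975-1985 overlap baseline screen out the offsets
base = (years >= 1975) & (years <= 1985)
anom = temps - temps[:, base].mean(axis=1, keepdims=True)
anom_mean = np.nanmean(np.where(mask, anom, np.nan), axis=0)

slope_abs = np.polyfit(years, abs_mean, 1)[0]    # badly biased by the step
slope_anom = np.polyfit(years, anom_mean, 1)[0]  # close to the true trend
```

The absolute-temperature slope comes out strongly negative despite the warming signal, while the anomaly slope recovers it.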

By taking multiple measurements, you do not increase the accuracy, you increase the precision. You are losing information in the individual measurements, but this can be easily and usefully summarized in a measure of central tendency. Most of what you are losing, by taking an average, is noise. This is because random noise in many measurements will tend to cancel out. I have no idea what Will is talking about, but statisticians know how to separate signal from noise in the data — it’s what the whole field is about — and it’s a worthwhile endeavor if you want to understand the world.

Will, pretend I am doing a counting experiment to measure the background gamma rate in my lab. I have a gamma counter set up to register every hit above a set threshold. It will count for 1 minute then report the counts in that minute, then reset. Etc. I start with the counter inside a set of lead bricks so my count rate is low. In the first minute I get 100 counts, the next minute 110 counts, the next 93 counts, the next 102 etc. After two hours I see values from a low of 90 to a high of 112. Should I say “Each data point is sacred, I cannot know my background rate to better than +/-10 counts/min”??? Or should I average my measurements and thus say my background is 100 counts/min +/- 0.9 counts?

By averaging I am losing any time varying signal, but I am increasing my precision by integrating for a longer time. For this measurement I don’t expect any time varying signal anyway.
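Gator's arithmetic is easy to check in a few lines of Python. The seed and true rate are of course made up; with Poisson counts averaging ~100/min, the standard deviation per minute is ~10, and averaging 120 one-minute intervals shrinks the uncertainty by a factor of sqrt(120), to roughly 0.9 counts/min.

```python
import numpy as np

rng = np.random.default_rng(7)
true_rate = 100.0                      # counts per minute (assumed)
counts = rng.poisson(true_rate, 120)   # two hours of 1-minute counts

mean = counts.mean()
sem = counts.std(ddof=1) / np.sqrt(counts.size)  # standard error of the mean
print(f"background: {mean:.1f} +/- {sem:.2f} counts/min")
```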

Of course averaging is a type of filtering and you “lose” information. (That information is not really “lost” of course — you still have that data and can go back and analyze it in a different way if you want.) Averaging, or filtering of any type, is simply a way to increase the signal-to-noise ratio for some bit of information that you are actually interested in.

The issue, as I see it, is when one fits a line to a series of estimates without accounting for the fact that the estimates contain uncertainty.

Let’s go back to your lab and run another, very similar, experiment.

Let’s say you measure gamma radiation on three consecutive days. Each day, starting at noon, you take 10 one-minute averages. This gives you a mean background radiation per day. Your numbers end up being 102, 97, 98. Is the background gamma radiation in your lab declining?

Without including the knowledge of uncertainty in your estimate (std error around the mean, for each day) we could easily say “it sure looks that way”. But you and I both know that is not the case. We know that those numbers are not actual measurements, and must be used cautiously.

In this toy example we are well aware of what +/- 10 means, and how important it is to include it when talking about the mean of anything.

Will,
Your example is a straw man bearing no resemblance to the question of what is done with global temperatures. First, a linear trend has 2 parameters–thus with only 3 points you couldn’t even do a proper fit to a linear trend, let alone compare it to a constant count rate. Second, counts are subject to Poisson errors–a completely different error model from that of temperature measurements. As long as the errors are random, the mean of several measurements will give you a better measure of the actual value than will any single measurement–that is simply a consequence of the central limit theorem.

Please try to state your thesis clearly, because right now you aren’t making sense.

Ray, the example was Gator’s, and there is absolutely nothing wrong with it.

10 years of temperature data from thousands of sites: umm, this is even less “fittable” than three points. Of course there are going to be multiple good fits! That’s my whole point. This is entirely independent of climate science and is true of all data sets.

This matter of what error _model_ you want to attribute to a parameter is irrelevant at this point. It doesn’t change anything I have said.

Will, consider extending your example. You measure gamma counts every day for a year. Counting data is subject to Poisson errors, so the errors on a particular day’s count may be large. However, if your counts average 100 in month 1 and average 10 in month 12, you can safely assume the half-life of your source is less than 1 year. You could even FIT the data using a GLM with Poisson errors to determine the best-fit half-life.
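That fit can be sketched in a few lines: a Poisson GLM with a log link, fit here by iteratively reweighted least squares rather than a library GLM routine, to keep it self-contained. The source strength, half-life, and seed are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(42)
days = np.arange(365.0)
mu = 100.0 * 0.5 ** (days / 180.0)     # true rate: half-life of 180 days (invented)
counts = rng.poisson(mu)               # one year of daily counts

# Poisson GLM, log link: log E[counts] = b0 + b1*day, fit by IRLS
X = np.column_stack([np.ones(days.size), days])
beta = np.array([np.log(counts.mean()), 0.0])
for _ in range(50):
    eta = X @ beta
    m = np.exp(eta)
    z = eta + (counts - m) / m         # working response
    beta = np.linalg.solve(X.T @ (m[:, None] * X), X.T @ (m * z))

fit_half_life = -np.log(2.0) / beta[1]  # recovered half-life, in days
```

With a year of daily counts the recovered half-life lands close to the true 180 days, noisy individual days and all.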

More data decreases uncertainty in the vast majority of cases. Crack a stats book and look up
Central Limit Theorem
Law of Large Numbers

Please make sure that you explain why the time series I presented (102, 97, 98), given the experiment Gator described, is in fact showing a negative trend.

It’s “showing” a negative trend in the same way that two points “show” a perfect correlation.

This latest digression recalls the chromium release work that I conducted years ago. I wish that I still had access to the raw (and replicated) counts – they would most excellently provide demonstration of central tendency, and of what biases do (and do not) shift from the central tendency actual results indicated by replication. Any cell biologists reading…?

If Will really is struggling with the relative benefits of what are essentially descriptive statistics, perhaps he might care to peruse the comments here:

Will,
I don’t get what you have against the mean of a set of measurements–it’s a measure of central tendency of the dataset, and it has very nice convergence properties. Again, state your thesis clearly, because you aren’t making sense. Your example illustrates the folly of fitting a trend with inadequate data, not anything profound about statistical analysis. It is a straw man.

Interesting tone. Why so hostile? You’ve misinterpreted how measurement error works and are assuming a whole bunch about filters and signal processing in general.

Average the pixel intensities from 10 different TVs each watching a different program and come tell me that the average pixel intensity is somehow more informative than the actual raw data.

You lose information when you use the average. This is a fact. If you model (regress) only the averages, then you need to accept that your model isn’t of the original data and contains a great deal of built-in uncertainty.

Averaging the temperature of 10 different points inside a refrigerator gives a valuable quantity that can probably be used in lieu of the 10 points to great success (depending on what the goal is). We can, for example, get a good idea of whether the compressor should be turned on or off in order to avoid spoilage of most of the food in there. We can also use the average value to tell whether the refrigerator is working or not.

Your example of the average pixel value of 10 TVs can be useful if you want to estimate how much power TVs are consuming in total or to know how many people likely watch any TV at certain times.

The year is 2100, and the collective governments of the world have pitched in a few pennies here and a few pennies there to buy about 500 trillion thermometers, each accurate to a thousandth of a degree, along with the associated supporting infrastructure, so that the temperature of every square meter of the planet’s surface is recorded for near-optimum coverage and accuracy in climate forecasting.

Is this what you are asking for? If we throw information out or fail to record every spot, is the endeavor destined to failure?

1. In Excel you can generate a series of x data, and then generate y data according to a linear trend between x and y.

2. Now offset these y values by random values above and below the line using NORMDIST or some such, assuming a variance s^2.

3. From each of these y values, you can then generate random data assuming each of these y estimates is the mean of a distribution with variance s^2. Call this variable z.

4. Then throw out those y estimates, and keep the z and x values.

5. Finally, calculate means for each subset of z values.

If you repeat this process 100 times, and then calculate slopes of regressions of z on x for each simulated data set, the average slope will approximate the slope originally used to generate the y data from the x data. Linear regression is actually good at recovering the underlying trend in an unbiased way.

If you then regress the mean z values on x, you will find that the average slope also approximates the original slope between x and y. Taking the averages of the z values has no impact on the ability of regression to estimate the trend.

The difference is that there will be less variability in the residuals around each regression of mean(z(i)) on x(i) than for the regressions of the raw z(i) values on x(i). That is because averaging produces a mean that is actually closer to the underlying average (y(i)) than each of the data points used to estimate the mean.
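The five-step recipe above translates directly into a short Python simulation. The slope, intercept, noise level, repeat counts, and seed below are all arbitrary choices for illustration; the point is that regressing the per-x means of z recovers the same underlying slope as regressing the raw z values.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, s = 5.0, 2.0, 3.0              # intercept, slope, noise sd (arbitrary)
x = np.arange(20.0)
n_rep, n_per = 100, 30               # 100 simulated data sets, 30 z-draws per x

slopes_raw, slopes_mean = [], []
for _ in range(n_rep):
    # steps 1-2: y values scattered around the true line
    y = a + b * x + rng.normal(0.0, s, x.size)
    # step 3: z-draws around each y, same variance s^2
    z = y[:, None] + rng.normal(0.0, s, (x.size, n_per))
    # step 5: per-x means of the z values
    zbar = z.mean(axis=1)
    slopes_raw.append(np.polyfit(np.repeat(x, n_per), z.ravel(), 1)[0])
    slopes_mean.append(np.polyfit(x, zbar, 1)[0])

print(np.mean(slopes_raw), np.mean(slopes_mean))  # both near the true slope b
```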

Everything we see is a model, of how we perceive the universe around us.

We came into this world knowing literally nothing beyond our own biological urges.

Now there are stochastic (statistical) models, deterministic (numerical) models, mathematical (analytical) models, and physical (laboratory) models. But even here, we could disagree on these categories and their definitions.

My informal career as a modeler began in 1953 (the year of my birth), my formal career as a modeler began in 1973, my career as an informal and formal modeler will end some day, as all things must end one day. But there will always be modelers, of that I am certain.

Some people do not believe in deterministic models at all, ofttimes those people who only wish to model things as stochastic processes.

Some people do not believe in stochastic models at all, ofttimes those people who only wish to model things as deterministic processes.

But in either of these two cases, a false dichotomy is presented, as either of these modeling choices excludes the other. One single modeling choice cannot work alone, given the availability of other equally compelling modeling choices.

Yet we may choose one model over all other models, to better fit our own worldview, when, in fact, all models taken together provide the best overall view of the world as we see it, from the past, to the present, and into the most probable futures yet to come.

Or, in the worst case, we could reject models altogether.

Everything we know, everything we think we know, everything we hope to know, rejected out of hand.

Denial.

To model, or not to model.

I am not afraid to model, have never been afraid to model, will never be afraid to model.

I would like to have the best modeling tools in my toolbox, and yes, that toolbox includes deterministic (numerical) models.

Now each of us develops our own modeling toolbox, most are filled with informal tools and some are filled with formal tools.

Will, unless there is a systematic error that affects all measurements, the mean of several measurements will be known more accurately than any single measurement. This is a consequence of the central limit theorem. If you do not understand this, you don’t understand Gaussian error analysis. What you are saying really makes no sense.

Any measurement will have an appropriate error model associated with it. Counts have Poisson errors. Continuous quantities tend to have errors that are normally distributed about the actual quantity. I strongly recommend the Jaynes text David recommended, by the way. You need to read it.

Is that a restrictive idea of a statistical model? I was hoping it would broaden people’s understanding of regression models.

Furthermore, in general, we are mostly trying to estimate the mean (average) of a dependent variable given all available independent variables in statistical regression. For example, y = a + b*x + e. The error term is assumed to have mean 0 (if not zero, it would be absorbed into the constant term a), so the mean is E(y|x) = a + b*x. We are interested in estimating the unknown parameters a and b and, of course, the variance/standard deviation of e. In other words, we’d like to estimate E(y|x) and then predict y accordingly.
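In code, with invented values for a, b, and the noise level, estimating E(y|x) = a + b*x and the standard deviation of e looks like this (a simple least-squares sketch, not any particular commenter's method):

```python
import numpy as np

rng = np.random.default_rng(3)
a_true, b_true, sigma = 1.0, 0.5, 2.0    # invented parameters
x = np.linspace(0.0, 10.0, 200)
y = a_true + b_true * x + rng.normal(0.0, sigma, x.size)  # e ~ N(0, sigma^2)

# least-squares estimates of the line E(y|x) = a + b*x
b_hat, a_hat = np.polyfit(x, y, 1)       # polyfit returns highest degree first
resid = y - (a_hat + b_hat * x)
sigma_hat = resid.std(ddof=2)            # estimate of the sd of e (2 fitted params)
```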

If everyone here is only keen on attacking Briggs’s intentions, sorry, I am in the wrong place. I shall snap my fingers and disappear. Q.

But now, for what I think is the first time, I find myself the target of an attack. And I have to admit, I welcome it: it’s a textbook case of denialist sleight of hand, of distraction, distortion, error, and misdirection.

What’s really amusing about all this is that the GHCN land-surface temperature record is so heavily oversampled that it’s almost ridiculous (at least in terms of computing global-average temperature anomalies). Anyone who claims that you can’t get decent estimates of global-average temperatures from the existing data really deserves to be laughed out of the room.

I’ve played around with the GHCN data a bit (haven’t done anything that approaches the sophistication of tamino’s work) and have found the global-warming temperature signal to be incredibly robust to processing/station-selection variations.

My latest “back of the envelope” exercise involved selecting a tiny number (less than 50) of rural GHCN stations, roughly uniformly distributed (*very* roughly) and seeing what kind of results I would get. Well, I ended up getting results surprisingly close to the NASA land-temperature index results. Total rural station count=45; however, the stations selected didn’t all report data every year. So the actual number of stations I selected that reported data for any given year was as low as 12 (pre-1900) and no higher than 44 (during the 1951-1980 “baseline” period).

Selection procedure? Just divvy up the planet into very large grid-cells (30×30 deg at the Equator, with appropriate adjustments as you go north/south) and select the one station with these qualities: I could compute a decent 1951-1980 baseline, and it had the longest temperature record of all the stations in the grid-cell. That’s it.

“… His biggest claim: that those points aren’t measurements at all, but estimates.
Here’s the thing: he’s wrong. Those points are in fact measurements, though they are not raw measurements right off the thermometers. They have been processed, averaged, in a scientifically rigorous way to make sure that the statistics derived from them are in fact solid. The Berkeley team describes in detail how that was done (PDF), and does actually call them estimates, but not because they are just guessing, or using some arcane computer model. They are technically estimates, in the sense that any measurement is an estimate, but they are really, really good ….”

If I look at the thermometer on the ‘stat in my living room, and it reads ’68F’ – that measurement of 68F is an estimate. And a measurement. It tells me that the temperature is 68+/- some unknown amount – given that I have no data upon which to make any kind of determination of the accuracy of that measurement.

Now, if I cast several dozen cheap thermometers around the room, and then read them all, I might get values ranging from, say, 65F up to 72F.

This immediately tells me that there is significant potential error in any one of those measurements – up to several degrees F.

But if I take the average of those several dozen values, subject only to the assumption that the errors are randomly distributed – that average is actually going to be a quite precise measurement of the temperature of the room. Even more, I can calculate the expected range of error of that more precise measurement – and it will be much, much less than the range of error of the raw measurements.
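A back-of-the-envelope version of that thermometer scatter, in Python (the room temperature, spread, thermometer count, and seed are all made up): with three dozen readings scattered with a standard deviation of ~2 F, the mean is pinned down to roughly 2/sqrt(36) ≈ 0.33 F.

```python
import numpy as np

rng = np.random.default_rng(5)
room_temp = 68.0                           # deg F, assumed true value
readings = rng.normal(room_temp, 2.0, 36)  # three dozen cheap thermometers

mean = readings.mean()
sem = readings.std(ddof=1) / np.sqrt(readings.size)  # error of the average
print(f"room temperature: {mean:.2f} +/- {sem:.2f} F")
```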

This is basic statistics – like, first week of Stat 101 basic. I’m having a hard time believing that it has to be explained to people who claim some kind of statistical authority.

There is one interesting question buried in this discussion – to what extent does the uncertainty of the individual measurements in a time series have an impact on the uncertainty of a trend fit to that time series? I suspect little to none, given an assumption that the error of each point is independent. But I’d be interested in seeing Tamino work this out for us, or point us to a good citation.

I agree with you. The uncertainty of individual measurements in a time series may affect the uncertainty of the slope of a linear regression. Furthermore, if measurements are not normally distributed, or not identically distributed with constant variance, one should be careful. This may not be Stat 101 material. Ironically, this was one of Briggs’ points (February 1, 2012 at 11:17 pm: “the uncertainty in these are not accounted for”), to which Tamino simply replied “Wrong”. Why?

[Response: Departure from normality will not invalidate the result from linear regression (ever hear of the Gauss-Markov theorem?). Heteroscedasticity will only do so when it’s extreme, which — if you study the data (as I have done) you’ll discover — it isn’t. Your objections are invalid.

As for my straightforward “Wrong!” reply to Briggs, it was prompted by his first paragraph including the truly idiotic and probably despicably dishonest statement that “It at least assumes the data that went into the model is measured without error.” Imbecilic mendacity like that deserves no better response, as it demonstrates that Briggs is not interested in exchanging information or insight — instead he’ll say anything, no matter how false, which he thinks will cloud the issue in the minds of the naive. Congratulations on having been suckered.]

Please cut me some slack—I did not intend to object to anything. You’re of course right about normality. However, in general I would say the assumption of homoscedasticity needs to be justified. If this assumption fails, simple regression analysis may under-estimate the uncertainty in the regression parameters. This was the one point that caught my interest.

If you say this is of little relevance to global temperature time series, I’ll take your word for it—I haven’t looked at the data. No need to be rude.

I’d have to agree on “congratulations on being suckered” being rude, not that it matters. I’d be rude too if I were doing Tamino’s “on the side” job, I’m sure.

I think it can get hard to separate less common honest inquiry from the onslaught of typical BS, which may unfortunately lead to rudeness toward some with honest inquiry.
