Trying (Unsuccessfully) to Replicate Lewandowsky

Is Lewandowsky et al 2012 (in press) replicable? Not easily and not so far. Both Roman M and I have been working on it without success so far. Here’s a progress report (on the first part only).

Lewandowsky carried out factor analysis on three groups of questions: free markets, CO2 and conspiracy, each of which will be discussed below. He used structural equation modeling (SEM) on the other science questions (cause and consensus).

Factor analysis is a standard and relatively simple technique, very closely related to principal components. A “factor”, like an eigenvector, is a set of weights; a “latent variable”, like a principal component, is nothing other than a linear combination of the underlying data using the factor weights (loadings). This needs to be kept in mind before over-reifying these things.
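The point can be made concrete with a small sketch. The post's calculations use R's factanal, but the arithmetic is the same in any language; here is a pure-Python illustration on a made-up 3-variable correlation matrix (all pairwise correlations 0.6), showing that a "factor" is just an eigenvector of weights and a "latent variable" is just a weighted sum of a respondent's (standardized) answers:

```python
# A "factor" is a set of weights (an eigenvector of the correlation matrix);
# a "latent variable" is the weighted combination of the data using those
# weights. Pure-Python power iteration on an illustrative correlation matrix.

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def norm(v):
    return sum(x * x for x in v) ** 0.5

def first_factor(corr, iters=200):
    """Leading eigenvector (weights) and eigenvalue of a correlation matrix."""
    v = [1.0] * len(corr)
    for _ in range(iters):
        w = matvec(corr, v)
        v = [x / norm(w) for x in w]
    lam = norm(matvec(corr, v))  # eigenvalue; lam / n_vars = share of variance
    return v, lam

# Illustrative 3-variable correlation matrix (all pairwise r = 0.6)
R = [[1.0, 0.6, 0.6],
     [0.6, 1.0, 0.6],
     [0.6, 0.6, 1.0]]
weights, lam = first_factor(R)

# The "latent variable" for one (standardized) respondent is just a
# weighted sum of their answers -- nothing more mysterious than that.
respondent = [1.2, 0.8, 1.0]
latent = sum(w * x for w, x in zip(weights, respondent))

print(weights)       # roughly equal weights, ~0.577 each
print(lam / len(R))  # proportion of variance explained, ~0.733
```

With this toy matrix the first factor weights all three variables equally, which is exactly the "relatively even weights" situation discussed below.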

Economic Questions
Lewandowsky carried out a factor analysis on the 6 economic questions in his survey, reporting:

For free-market items, a single factor comprising 5 items (all but FMNotEnvQual) accounted for 56.5% of the variance; the remaining item loaded on a second factor (17.7% variance) by itself and was therefore eliminated.

I tested this using the R function factanal, the loadings report from which is shown below. The explained variance on the first factor so calculated (48.7%) is somewhat lower than Lewandowsky's reported 56.5%. In addition, while the second factor is indeed dominated by the second question, there is also some loading from the third question onto this factor. Such a simple first step should be directly replicable; the calculation here is close, but only close.
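For reference, the "% of variance" figures in a factanal-style report come straight from the loadings table: the sum of squared loadings on a factor, divided by the number of variables. A minimal Python sketch, with made-up loadings standing in for the actual output:

```python
# "% of variance explained" by a factor = sum of squared loadings on that
# factor / number of variables. Loadings below are illustrative placeholders,
# not Lewandowsky's (or factanal's) actual values.

def proportion_variance(loadings, n_vars):
    """Proportion of total variance explained by one factor."""
    return sum(l * l for l in loadings) / n_vars

# Hypothetical loadings of the 6 free-market items on factor 1
factor1 = [0.80, 0.10, 0.75, 0.82, 0.78, 0.74]
print(proportion_variance(factor1, 6))  # ~0.51, i.e. ~51% of variance
```

Discrepancies of several percentage points, like 48.7% versus 56.5%, can arise simply from different estimation methods (maximum likelihood versus principal factors) or rotation settings, which is why an unreported choice of settings makes exact replication difficult.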

I’m not sure that this (or the other factor analysis calculations) matter in the end. My interpretation of Lewandowsky’s sketchy description is that the “latent variable” used downstream is a weighted average of the 5 retained series (with relatively even weights – see his Table 2.)

CO2 Variables
Questions 7-10 and 29 of the online data were grouped by Lewandowsky as a "climate science" variable. The questions themselves are a mix of science and WG2: for example, question 10 (the belief that CO2 has caused serious negative impacts on climate in the past 50 years) is certainly not one that can be answered through mastery of radiative physics. Further, it is one on which reasonable people can differ. Personally, I don't think that there has been a serious negative impact in the past 50 years, though I also credit this to good luck, rather than good management: it might have been a lot worse than it was. It seems highly plausible to me that one's attitude towards this "negative impact" is associated with one's attitude towards the public and private sectors. So while I don't think that this particular question is evidence of an attitude towards "science", I can see how answers to it might be associated with views on, say, whether public sector employees are overpaid or underpaid, overworked or underworked, though the questions were not framed this way.

Back to the factor analysis. Lewandowsky stated:

The 5 climate change items (including CauseCO2) loaded on a common factor that explained 86% of the variance; all were retained.

I wasn't able to replicate Lewandowsky's claim at all. I got explained variance of 43.5% in the first factor (versus Lewandowsky's 86%). I notice that the explained variance for two factors was 86%: maybe Lewandowsky got mixed up between one and two factors. If so, would such an error "matter"? In Team-world, we've seen that even using contaminated data upside down is held not to "matter"; perhaps the same holds in Lew-world, where we've already seen that use of fake and even fraudulent data is held not to "matter".

Again, I'm unclear as to whether the factor analysis is carried forward into the latent variables. The weights for the four CO2 question series shown in Table 2 are relatively even (0.941, 0.969, 0.982, 0.921); no weight is shown for the CauseCO2 variable (question 29), though the running text says that it was used. The weights are so even that it's hard to see the benefit of the weighted average relative to a simple average or count.
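How little the weighting matters can be shown directly: with weights this even, the weighted "latent variable" correlates almost perfectly with a plain average. A Python sketch using the Table 2 weights on synthetic 1-4 agreement scores (invented data, not the survey responses):

```python
import random

# With weights as even as (0.941, 0.969, 0.982, 0.921), the weighted
# "latent variable" is nearly indistinguishable from a simple average.
# Responses below are synthetic 1-4 agreement scores, not survey data.

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

random.seed(0)
weights = [0.941, 0.969, 0.982, 0.921]  # CO2 item weights from Table 2
responses = [[random.randint(1, 4) for _ in weights] for _ in range(500)]

weighted = [sum(w * x for w, x in zip(weights, row)) for row in responses]
simple = [sum(row) / len(row) for row in responses]

print(pearson(weighted, simple))  # extremely close to 1
```

Any downstream correlation or regression computed on the weighted score will be essentially identical to one computed on the simple average.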

Conspiracy Ideation
Lewandowsky’s questionnaire included 14 conspiracy questions, of which one related to exaggeration by climate scientists. Another question related to the Iraq invasion. In this case, the majority of warmist respondents did not accept the official story. Perhaps for this reason, Lewandowsky did not include this question in his results (nor in the data, a highly questionable bit of censorship as Tom Fuller, an experienced online surveyor, has pointed out).

The conspiracy questions ranged from wacko (The Moon Landing was faked) to ones on which fairly reasonable people might disagree (The JFK assassination). The warmist audiences at Deltoid and Tamino were sympathetic to the possibility of neo-Nazi involvement in the Oklahoma City bombing and surprisingly receptive to the 9/11 conspiracy that Lewandowsky had pooh-poohed in his editorial.

Lewandowsky did a factor analysis on 12 conspiracies (excluding Iraq for reasons not reported in the article and excluding CO2 exaggeration). He reported:

For conspiracist ideation, two factors were identified that accounted for 42.0 and 9.6% of the variance, respectively, with the items involving space aliens (CYArea51 and CYRoswell) loading on the second factor and the remaining 10 on the first one (CYAIDS and CYClimChange …). Items loading on each factor were summed to form two composite manifest variables. The two composites thus estimate a conspiracist construct without any conceptual relation to the scientific issues under investigation.

In my calculation, I got explained variance of 18.4% in the first factor (Lewandowsky – 42.0%); cumulative 33.1% through two factors (Lewandowsky – 51.6%). What accounts for the difference? I have no idea.

In Table 2, the corresponding latent variable is said to have been calculated with an even weight of 0.742 on all conspiracies except the alien ones (Area 51, Roswell), which received a higher weight of 0.891 (AIDS and CO2 exaggeration were excluded on the grounds that they were related to other propositions). Again, it's hard to see the benefit of this weighting relative to a simple count.

Other Science
Given the later reliance on "science" in the paper, the questionnaire was remarkably sketchy on "science", having only two questions (smoking and lung cancer; HIV and AIDS), neither of which has been an issue on the skeptic blogs. There was a vanishingly small number of respondents (5) who disagreed with both propositions; all probably fake responses. Only 17 from either camp disagreed with just one of the propositions, a group again probably liberally populated by fake responses.

Given the subsequent attention paid to this very small population of respondents, one would also like to be sure of the responses. Scott's WUWT questionnaire included a comment line: some respondents made the narrow technical observation that not all people who smoked proceeded to lung cancer, agreeing that smoking dramatically increased the risk of lung cancer but arguing that there was more to the epidemiology than smoking alone. This answer seems pointlessly pedantic for a public survey, but on the other hand, since the respondent agreed entirely when the proposition was re-phrased as risk, it doesn't seem to me that such an answer should be classified as a disagreement.

For each question, there was a related question in which respondents were asked to estimate the proportion of doctors or scientists who held the respective theory. Some of the responses here (e.g. 0% of scientists believing that HIV causes AIDS) appear to be intentional provocation in which the respondent is making fake responses in order to embarrass warmists or skeptics (the respondent having previously impersonated the opposing party elsewhere in the survey.)

Despite this very small population of disagreeing respondents (and consequent limited utility), Lewandowsky attempted a rather elaborate methodology, described as follows. (For reference, "manifest variables" are actual data, while "latent variables" are reified linear combinations of the data, somewhat like principal components.)

The acceptance-consensus item pairs (e.g., CauseHIV – ConsensHIV) were entered into an SEM with two correlated latent variables, one capturing the common variance of all "belief" questions, and the other representing all "consensus" questions (see Figure 1). Pairwise correlations between the residuals for each belief-consensus pair represented content-specific covariance. All SEMs [Structural Equation Models] were performed with M-PLUS using ordinal coding of the manifest variables, with the consensus responses binned into 9 categories with approximately equal numbers.

[SM – it is not clear how to implement this binning instruction, since the population of 100% responses much exceeds a 1/9 bin.]
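The binning problem is easy to demonstrate. If one response value holds far more than a ninth of the sample, the empirical 1/9 quantiles collapse onto that value and "9 categories with approximately equal numbers" cannot be formed. A Python sketch with invented consensus estimates, where 600 of 1000 respondents answer exactly 100(%):

```python
# Why "9 categories with approximately equal numbers" is hard to implement
# when one response value dominates. Synthetic consensus estimates: 600 of
# 1000 respondents answer exactly 100(%), far more than 1/9 of the sample.
# (Illustrative numbers, not the survey data.)

def equal_frequency_edges(values, n_bins):
    """Candidate bin edges at the empirical 1/n_bins quantiles."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]

responses = [40 + (i % 60) for i in range(400)] + [100] * 600
edges = equal_frequency_edges(responses, 9)

print(edges)            # the last several cut points are all 100
print(len(set(edges)))  # fewer than 8 distinct cut points survive
```

With these numbers, five of the eight candidate cut points land on 100, leaving only four distinct bins; some arbitrary further rule is needed, and the paper does not say what it was.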

Items loading on each factor were summed to form two composite manifest variables. [this sounds like making a principal component from eigenvectors]



The model fit reasonably well, χ²(5) = 53.71, p < .0001, CFI = .989, RMSEA = .092 (90% CI: .071 – .115). People's content-general inclination to accept science was associated with content-general perceived scientific consensus; r = .43, Z = 12.76, p < .0001, over and above the content-specific links represented by the pairwise correlations. The fact that acceptance of climate science (CauseCO2) and perceived consensus among climate scientists (ConsensCO2) loaded onto their respective latent variables together with other very different scientific propositions suggests that respondents did not gauge consensus among climate scientists, and evaluate climate science, independently of their views of other, unrelated domains of scientific inquiry. Rather, their perception of consensus and their endorsement of scientific findings about the climate reflected in part a content-independent disposition to perceive scientific consensus.

Figure 1. Latent variable model for the relationship between perceived consensus among scientists and acceptance of scientific propositions. All correlations and loadings are significant and standardized. Manifest variable labels are explained in Table 2. See text for further explanation.

It’s hard to say what meaning, if any, can be attached to these statements. Roman Mureika observes that Structural Equation Modeling has a lot of settings, making results not easy to interpret. Lewandowsky didn’t explain why factor analysis was an acceptable exploratory method for the other variables, but not in this case.

In addition, as I noted in connection with factor analysis, the construction of the latent variable used in the downstream analysis is only loosely related to the exploratory analysis and the ultimate latent variable appears to differ little from simple counts.

Solution of Environmental Problems

The fifth latent variable was solution of environmental problems, corresponding to two survey questions. I did not locate any discussion of this latent variable.

Pairwise Correlation
As noted above, the five "latent variables" are closely related to (lightly weighted) averages of agreement with propositions in five areas (conspiracy, free market economics, environmental problems solved so far, climate "science" and other science).

I presume that the tests described here arise from SEM (Structural Equation Modeling), a technique that is fashionable in psychology. There are only a limited number of ways that one can manipulate relatively simple datasets, but the vocabulary differs somewhat from field to field. Roman and I are spending a little time parsing this paragraph. For now, my understanding is that Table 1 is nothing other than a correlation matrix between the five "latent" variables. Simple pairwise correlations have the same robustness problems as OLS regression. In a situation fraught not merely with outliers, but with actual fake data, at a minimum, one would expect robust statistical methods to be applied ("robust" used here as a technical term in statistics as opposed to Mannian ventilation). However, even this minimum level of statistical sophistication appears to have eluded Lewandowsky.
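The fragility of ordinary correlation in the presence of fake responses is easy to demonstrate. A Python sketch with invented data, using rank (Spearman) correlation as one simple robust alternative (the robust literature offers far stronger tools; this is just the cheapest illustration):

```python
# A minimal illustration of why ordinary (Pearson) correlation is fragile
# in the presence of fake/outlier responses, while a rank-based measure
# (Spearman) -- one simple robust alternative -- is not. Data invented.

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def ranks(v):
    """Average ranks, handling ties."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the tied rank positions
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    return pearson(ranks(x), ranks(y))

# 20 genuine responses with a clean positive relationship...
x = list(range(1, 21))
y = list(range(1, 21))
# ...plus 3 extreme fake responses
x += [100, 100, 100]
y += [-500, -500, -500]

print(pearson(x, y))   # dragged negative by three fake points
print(spearman(x, y))  # still positive
```

Three planted fakes out of 23 observations are enough to flip the sign of the Pearson correlation while the rank correlation stays positive, which is exactly the hazard in a dataset known to contain fake responses.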
Linear Regression (?)
Lewandowsky then describes his final model as follows:

We next sought to predict acceptance of climate science and other sciences from the remaining three latent variables while simultaneously simplifying the model by removing non-significant correlations and regression weights. This final model fit very well, χ²(82) = 182.1, p < .10. Figure 2 shows the model at the level of latent variables, displaying only weights and correlations that were statistically significant (p < .05).

I think that this paragraph means that he did linear regressions of his "climate science" latent variable (and, respectively, his "other science" latent variable) on the other three – economics, environmental problems solved so far and conspiracy – though the operation is expressed in terms of Structural Equation Modeling.

Again, because of the severe problems with "outliers" and even fake data, the relevant technique is one which is resistant to outliers and fake data, i.e. robust regression, a technique which has seen many developments in the past 40 years or so (especially Huber, but going back to Tukey) and which should be mandatory for analysis of such defective data. My guess is that SEM either does not permit "robust" techniques or that Lewandowsky didn't know how to use them. Nor do I see precisely what purpose is served by using SEM methods in Lewandowsky's style as opposed to robust regression. (Needless to say, robust correlation and robust regression yield very different results, a point that I'll pick up in a forthcoming post.)
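For readers unfamiliar with the technique, here is a sketch of a Huber-type robust fit via iteratively reweighted least squares, in Python rather than R, on invented data with two planted gross outliers. It is a bare-bones illustration of the idea (R's MASS::rlm does this properly), not the analysis itself:

```python
import math

# A sketch of robust (Huber-type) regression via iteratively reweighted
# least squares, the kind of outlier-resistant fit argued for above.
# Simple 1-D case with invented data; not Lewandowsky's.

def wls(x, y, w):
    """Weighted least-squares line: returns (intercept, slope)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return my - b * mx, b

def huber_fit(x, y, c=1.345, iters=50):
    """IRLS with Huber weights: w = 1 for small residuals, c*s/|r| beyond."""
    w = [1.0] * len(x)  # start from ordinary least squares
    for _ in range(iters):
        a, b = wls(x, y, w)
        r = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        med = sorted(abs(ri) for ri in r)[len(r) // 2]
        s = 1.4826 * med or 1e-9  # MAD-based robust scale estimate
        w = [1.0 if abs(ri) <= c * s else c * s / abs(ri) for ri in r]
    return wls(x, y, w)

x = list(range(20))
y = [2 * xi + 1 + 0.1 * math.sin(xi) for xi in x]  # true slope 2
y[18], y[19] = -100.0, -100.0                      # two gross fake points

a_ols, b_ols = wls(x, y, [1.0] * 20)
a_rob, b_rob = huber_fit(x, y)
print(b_ols)  # badly wrong: two fakes drag the slope negative
print(b_rob)  # close to the true slope of 2
```

Two fake points out of twenty are enough to wreck the OLS slope, while the Huber fit progressively downweights them and recovers the underlying relationship, which is why robust methods matter for data of this quality.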

114 Comments

Our favourite statistician chimes in! http://wmbriggs.com/blog/?p=6164
“Everything that could have been done wrong, was done wrong. Every bias that could have been manifested, was manifested. Every fallacy pertinent to the matter at hand was made. The conclusions, regurgitated from unnecessarily complicated statistical procedures, did not follow from the evidence gathered, which itself was suspect. In its way, then, the paper is a jewel, a gift to the future, a fundamental text to how easy it is to fool oneself. “

Steve – I think it is a mistake to go down this route. The whole thing was so godawful tendentious from the start. You don’t want to start haggling about the numbers now. It will only give Professor Gleick, sorry, Lewandowsky, ammunition to dispute this and that calculation.

Yes, there are a great variety of settings available when running a factor analysis procedure. You can start by choosing a basic method of doing the calculation – maximum likelihood or principal components. When more than one factor is calculated, the answer is not unique, so the factors can be rotated using a choice of rotation methods. Furthermore, one can use different relative weights for the observed variables. When calculating the scores – the unobserved latent variables – there are several methods for doing that aspect as well. The default choices will vary depending on the software chosen for the purpose.

For the Lew data, the 1 to 4 questionnaire responses can be treated as numeric (in my opinion, usually not a great choice without some intermediate action) or as ordinal categorical variables so that involves changes in how the necessary correlations used in the factor analysis are calculated.

All of these choices involve implicit assumptions about the data. A proper analysis would involve checking the intermediate results along the way in order to evaluate whether those assumptions are reasonably satisfied and not just state the results without having done so.

That Steve's initial attempts do not exactly match the Lew results is not surprising under these circumstances. The onus is on Lewandowsky and his co-authors to provide proper information about what they did and the data that was collected. Anything else is not science.

Did you have any luck replicating his correlation matrix in Table 1? I tried for a little while back when I was trying to see the effect of removing nonsensical responses from the data set, but I wound up giving up.

Using my present best guess as to his calculation of latent variables, here is his Table 1 and my estimated correlation matrix. The two resemble one another except for conspiracy, where the sign is reversed. [SM – sign in transcription of Table corrected below.] OLS methods (of which a correlation matrix is an example) are VERY poor methods for this sort of data set. Lewandowsky may set a sort of incompetence landmark in this respect that will take many years to surpass.

I don’t think he has enough cases in each cell to permit statistical analysis. Steve, in your opinion am I correct in that impression? That this is a qualitative study at this point, not quantitative analysis?

I actually think that’s the crucial point to be made. Cells with 5 or 17 cases just can’t be dealt with quantitatively, IMO.

First, it appears he only had 78 items, and a basic SEM rule of thumb would be 200 items. Second, the Chi-squared test for a SEM is actually a "badness of fit" test, and he didn't pass. The "CFI" is probably the "GFI" (Goodness-of-Fit Index), and looks good, but his RMSEA is too high (values below about .06 are conventionally taken as good fit). So it really doesn't look like a good SEM fit.

I’m just learning this stuff, so could be wrong. An excellent R package for SEM’s is lavaan, which lets you make models with a very nice notation. A good reference is “A Beginner’s Guide to Structural Equation Modeling” by Schumacker and Lomax.

The bottom line is that SEMs require much more judgement and skill than Lew has. He’s way in over his head.

RomanM: Thanks, I’ll look at the sem package. I really do like lavaan’s notation and its tie-in to the qgraph package, which makes nice graphs of models. (Hadn’t looked at the psych package at all, thanks for the lead.) The John Fox document looks very helpful.

On the other hand, after experimenting for a while with SEM on a particular problem, I couldn’t find any advantage over other methods that I’d used previously. At first, I was intrigued by SEM since it contains several other techniques as special cases, but in the end it really didn’t seem to offer anything over what I already had, so I dropped it. On the positive side, it may hold some kind of record for the highest number of diagnostic measures of any technique I’ve ever seen. 😉

I don’t think it’s valid to say that answers you disagree with are fake. I know people who did undergrad and masters at a top engineering school who would take the positions on tobacco and AIDS. And their majors were in the science/engineering side.

Steve: one respondent to the test has already said that he faked the answers.

There are actually people out there believing everything and its opposite.

I wonder if this taking a look at the data and throwing some out, would qualify as post-hoc reasoning… Shouldn’t methods be defined before the data has been seen?

Besides, I think no amount of statistical torture will bring us much insight, as the problem is that the questions were not designed by someone with an interest in the subject he studies. Both the climate questions and the conspiracy questions lack profound insight into the topic.

E.g.: What would have been interesting is to find out how many people are "lukewarmers" (whatever the definition), how many are "no warmers", how many reject climate science completely ("it's all bonkers/a hoax"). On both sides it would have been interesting to see the actual level of knowledge of consensus climate data (e.g. how much has CO2 risen since industrialization, how much has the temperature risen, how much has the sea level risen, and so on).

And as for conspiracy theories: there are so many variations, I find it doubtful he caught them all. E.g.: 9/11 – LIHOP vs. MIHOP? Or JFK: shooter(s) on the grassy knoll vs. Oswald was alone but worked for the CIA? Or just recently I found a conspiracy theory about the moon landings: that NASA landed humans on the Moon, but supposedly hides something (I dunno, an alien outlet center on the Moon maybe?). And I'm sure he missed some "important" (as in widely held) conspiracy theories like HAARP/chemtrails/FEMA death camps. And what about fluoridation of the drinking water?

Honestly, one needs to have at least some interest in the subject, if one wants to study it – Lew doesn’t show any interest. That reflects in the survey and a chance was missed to actually learn something. (The conspiracy angle is a red herring anyway.) Using better statistics will not make the survey better.

And I had always seen the "so-and-so is an activist" talk on WUWT and dismissed it – now with Lew I understand it with great clarity. Lew is an activist, not a scientist.

Lewandowsky’s survey is constructed from a sample of convenience and so cannot be used to draw any useful conclusions about anything. To perform a survey properly you identify the population you want to study, carefully construct a sample of that population, and contact your sample members to elicit a response. Then you follow up aggressively with non-responders to get as many as you can since non-response bias is usually very large.

Even if there were no “fake” responses and the calculations were all done perfectly, it wouldn’t matter because of the defective sampling scheme. There is no sampling scheme! I can’t believe this paper was accepted to any serious journal.

This study is on the same lines as what they sometimes do on television news: invite viewers to call in and respond to a question, tabulate the responses and show them on the screen with a big "non-scientific" disclaimer. An interesting curiosity but no more. If the Lewandowsky paper had a "non-scientific" disclaimer I wouldn't have a problem with it.

I recommend this paper, “Statistical Models and Shoe Leather” by the late David Freedman

The paper shows just how much work is needed to validate that your inferences from data are correct. The point of the title is that successful statistical work is less about fancy models than the big investment of shoe leather needed to gather useful data.

This is along my line of thinking. I would extend the line to the very motivation of the questions asked. No matter how well you follow the rules of scientific procedures, if 1., you are not acting in a sincere manner, and 2., your hypothesis is not plausible, then you are going to produce nonsense. Science is far more than choosing the correct statistical test. The heart of science is in the understanding that goes into your hypothesis.

The hypothesis here is that climate skeptics are lunatics. This is ideological advocacy, not science. For a proper analysis, both ‘sides’ of this question should be studied. Do those who trust ‘the consensus’ believe that corporations are out to get them? That Monsanto is trying to poison them with GMOs? It’s perfectly reasonable to ask questions about correlations between climate change understanding and other ideological issues. That is not what was done here.

I see nothing wrong with tearing apart Lewandowsky’s statistics, but I find it the lesser failing in the paper.

Yes, you have said exactly what I have been thinking. All of Lew’s analysis and all of the support he claims assume that ACGW is fact–therefore, by definition, if you disbelieve, you are a conspiracy theorist and a nut job.

I work for big pharma. If we attempted to prove a drug worked by holding up a QOL survey done like this, we’d be shunned. Not only that, but we have folks claiming our data doesn’t show what we say it does, because of misuse of statistics–and those are data sets with hundreds or thousands of inputs.

This guy is on a mission to discredit anyone who disagrees with him about whether climate data say what he claims they do.

Not to mention, since when is psychology really science? It seems to me it would be very hard to account for human deviousness and desire to trick. Even if I were chosen as part of a carefully done, non-biased survey like this, I might be in the mood to mess with the guy. Surely, you would need large numbers to overwhelm the intentionally deviant.

Yep, I’ve said this several times. It’s just putting lipstick on a pig.

All open internet surveys, like the TV polls you mentioned, are junk. I’ve studied this stuff at post-graduate level and run surveys for a living. It’s rubbish, and there is no useful information whatsoever to be gained from analysing the results.

Yes. I think Steve and company appreciate this. It seems Steve's point in trying to replicate the analysis is to show that even if the data was good for anything (it's not) the analysis is badly done. From what I've seen this is probably true. The methods are more complex than what is needed for the problem and I think this is a case where simple crosstabs of the data should show you what is going on. The use of OLS regression in this context…oh dear. Here is a remark from Freedman's paper (page 298) on a simple contingency table used by Snow to establish that cholera is a waterborne infectious disease:

“As a piece of statistical technology, Table 1 is by no means
remarkable. But the story it tells is very persuasive. The force of the
argument results from the clarity of the prior reasoning, the bringing
together of many different lines of evidence, and the amount of shoe
leather Snow was willing to use to get the data.”

Anytime I see a really fancy model, my antennae go up. When there is powerful evidence for something it can usually be seen without the fancy model.

Yep… lipstick on a pig alright. I think Steve is interested in this because of its governance issues (i.e. failure of peer review). Also the possibility of forcing a retraction of an SkS-related paper must surely be an alluring prospect.

The key statement for all of this comes as the first paragraph in their “Method” section:

"Visitors to climate blogs voluntarily completed an online questionnaire between August and October 2010 (N = 1377). Links were posted on 8 blogs (with a pro-science stance but with a diverse audience); a further 5 "skeptic" (or "skeptic"-leaning) blogs were approached but none posted the link."

Exactly which of the “pro-science” blogs have a diverse audience in any sense, much less the specific sense that outsiders would join in a survey obviously intended to show that they wear tinfoil hats? Most of them have such draconian and arbitrary moderation that “skeptical” visitors are reduced to passive viewers rather than participants.

Back to your replication study:

1. What are D1 and D2 in the model of Figure 2?

2. Am I correct in understanding that Lew uses phrases like “the acceptance of other sciences”, and “rejection of other scientific propositions”, to describe answers to precisely two questions: “Does HIV cause AIDS?”, and “Does smoking cause lung cancer?” It’s remotely possible that someone might be rejecting *medical* science by rejecting those two facts, but rejecting “other sciences” (plural)? I could understand if he asked a bunch of questions from mathematics, chemistry, physics, astronomy, and medicine, but he greatly overreaches in this part of his analysis.

3. Given that they make a big deal of removing multiple surveys from the same IP address, why did they allow duplicate IP addresses (apparently overriding the survey site’s defaults) in the first place? Were they actually able to reliably tell the IP addresses after the fact?

4. If I am reading their Figure 2 correctly, it appears that Conspiracists were much more likely to reject “other science” (HIV/Aids and Smoking/Lung Cancer, in actuality) than “climate science”. Which means that climate science skeptics are significantly less likely to be conspiracists, right?

Any thoughts on my 4th question? Do the correlations in his paper, interpreted in a similar manner to his other interpretations, indicate that climate skeptics are less likely to be conspiracists, or that conspiracists are more likely to accept climate science? If I have the logic right, his paper in fact suggests the opposite of what he claims, which would be hilarious. If it's true, I think Steve should stop trying to reproduce and should instead make it a front-page item indefinitely.

Steve: his discussion of correlations is beyond bizarre. He has so little understanding that it’s hard to know where to begin.

Steve: Too true. Still, if he has “proven” that conspiracy lovers are MORE likely to accept “climate science”, that’s front-page-worthy. And that’s my take on it from a glance at his model diagram (Figure 2), though perhaps I have some things confused.

Oh, a useful quote from a guide to SEM by Victoria Savalei and Peter M. Bentler: "The logic of SEM goes like this: If we assume that the specified set of causal relationships generated the data at hand, then the covariance matrix of the data must have a certain structure. If we refute the hypothesis that it has a given structure, we refute the assumption that the specified set of causal relationships generated the data. Unfortunately, failing to reject the hypothesis of the specified structure (the desired outcome unless you are examining a rival theory) does not imply that the specified set of causal relationships holds. *In particular, in many models, the causal arrows can be reversed without affecting the fit* (see e.g. Hershberger, 1994; MacCallum et al., 1993). More globally, a completely different structure may exist that also explains the observed covariances."

I’m with ‘jfk’, above. No-one has the ability to extract meaningful information from the pile of garbage that is the Lewandowsky Survey. The flawed methodology renders any statistical analysis irrelevant.
Still, there is no reason why you cannot have a bit of fun with numbers …

Talking of fun, the fact jfk evidently survived Dallas gives a whole new spin to that area of conspiracism. Or is it too soon for humour in that area? (My more serious view is that the assassination and its aftermath acted as if designed to give rise to the growth of conspiracism – and more broadly, decline of trust in authority – in the US and West generally in the last fifty years.)

But I disagree with jfk in any case. Although the paper will fail because of its lack of rigour in the definition of both sceptic and conspiracy theory, it’s entirely right for Steve, Roman and co to go the whole hog with the replication and auditing of the stats. The fact it’s fun, as you say, is last but not least in my own read-eval-print loop.

Oh, it took me a minute to figure that out. Although I have the same initials, I am not the 35th President of the United States, who I definitely believe was shot and killed in Dallas in November 1963. Unless he’s being cryogenically preserved with the aliens at Area 51…

An analysis of the methodology that Lewandowsky (et al) employed to turn some crafty online questionnaires into a paper in Psychological Science (and some remunerative grant applications) will be extremely interesting.

I am looking forward to the self-contained but transparently open R script, in Steve’s great tradition.

And if the methodology can be suitably decoded, unemployment will effectively become a thing of the past. Everyone will be able to become a cognitive psychologist, turning questionnaires, dubious correlations, and topical causes into grant applications for the good of society, forever.

I think you’re missing a bit of code in that file. When generating your version of his Table 1, you have the line U=get_latent(lew). That seems like a call to a custom function, but there’s no corresponding function provided.
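For what it’s worth, a minimal stand-in is easy to write. This is only my guess at the intent of the missing function, based on the post’s reading that the “latent variable” is just a loading-weighted average of the retained items; it is a sketch in Python rather than Steve’s R, and the names `get_latent`, `responses` and `loadings` are mine, not his:

```python
import numpy as np

def get_latent(responses, loadings):
    """Hypothetical stand-in for the missing get_latent() function.

    If the 'latent variable' is just a loading-weighted average of the
    retained item columns, each subject's score is the dot product of
    their responses with the normalized weights.
    responses: (n_subjects, n_items); loadings: weights for those items.
    """
    w = np.asarray(loadings, dtype=float)
    w = w / w.sum()                                  # normalize weights
    return np.asarray(responses, dtype=float) @ w    # one score per subject

# Toy check: equal loadings reduce to a plain row average
scores = get_latent([[1, 2], [3, 4], [4, 4]], [0.8, 0.8])  # [1.5, 3.5, 4.0]
```

With the relatively even weights in Lewandowsky’s Table 2, this is close to a simple mean of the retained items, which is presumably why the downstream results are not very sensitive to the exact loadings.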

Those questions about the proportion of different professional groups that believe the different statements are fascinating in themselves, but it is very hard to understand what they actually mean. On the smoking-causes-lung-cancer question, I would *HOPE* that 100% of medical students think that, but even if they didn’t, it would still be irrelevant (i.e. truth is not determined by the size of the group, nor their status).

It seems to me that the aim of those questions was to trap people – look, medical students are the experts in their area and you accept their views, so for you not to accept the views of climate scientists in their area makes you irrational. The flaws in this argument are manifest. Somehow in Australia, this notion that we should defer to experts seems to have taken hold rampantly. It’s a very worrying phenomenon and utterly emasculating to the general population.

“On the smoking-causes-lung-cancer question, I would *HOPE* that 100% of medical students think that, but even if they didn’t, it would still be irrelevant (i.e. truth is not determined by the size of the group, nor their status).”

I would hope that medical students would recognize that smoking sometimes causes lung cancer, that smoking therefore increases the risk of developing lung cancer, and would therefore recognize that the statement “smoking causes lung cancer” is not always true.

Steve: An overly pedantic answer is not suited for a survey like this.

His paper specifically says he used Mplus for his SEM analysis. Given that R has three or four SEM packages, I’d say his use of R may be spotty. (Or perhaps Mplus is the SAS of his field and he uses it to add credibility to his papers.)

We additionally show that endorsement of a cluster of conspiracy theories (e.g., that the CIA killed Martin Luther King or that NASA faked the moon landing) predicts rejection of climate science as well as the rejection of other scientific findings, *above and beyond* endorsement of laissez-faire free markets.

(my emphasis)

This looked odd to me from the time I first looked at the data – if free market opinions show a stronger correlation with the CO2 questions, and the CY and CO2 correlations are only marginally significant when considered alone, would this latter correlation persist ‘above and beyond’ free market opinions? I assume L et al. meant by this that they have somehow corrected for free market opinions in calculating the CY – CO2 relationship.

Could one do a multivariate analysis ‘allowing for’ free market opinions? I was planning to try this but have not had time (or maybe expertise!).
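One standard way to do that “allowing for” is a partial correlation: regress both variables on the free-market score and correlate the residuals. A sketch of the idea (in Python for illustration; the variable names are mine, and this is not a claim about what the paper itself did):

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation of x and y after linearly removing control variable z.

    Regress x on z and y on z (with intercepts), then take the Pearson
    correlation of the two residual series. A nonzero result is roughly
    what a relationship 'above and beyond' z would have to mean.
    """
    x, y, z = (np.asarray(a, dtype=float) for a in (x, y, z))
    Z = np.column_stack([np.ones_like(z), z])            # design matrix
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]    # residual of x given z
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]    # residual of y given z
    return float(np.corrcoef(rx, ry)[0, 1])
```

With the survey columns in hand one would call something like partial_corr(cy, co2, fm), where cy, co2 and fm are the conspiracy, CO2 and free-market scores respectively (hypothetical names).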

Mplus is a very widely-used and IMO excellent statistical package for SEM and a whole host of other multivariate methods. That is the only positive aspect of the Lewandowsky research I’ve thus far been able to detect.

He also misrepresents how sceptics view global warming, with a graph produced by ‘Skeptical Science’. No sceptic, as far as I’m aware, has ever produced such a graph – i.e. a total misrepresentation of sceptic views (with zero evidence provided that this is a true view).

I have added the following comment; let’s see how long it survives at @STWORG:
—————————

1# Barry Woods at 20:06 PM on 17 September, 2012
I am unaware of any sceptic producing a graph that you show that describes how they see global warming.

In fact, I notice you reproduce a graph of how the ‘sceptic debunking’ blog ‘Skeptical Science’ ‘thinks’ sceptics view global warming. Perhaps you could ask a sceptic, rather than reproduce this graph, which provides no evidence that any sceptic thinks this.

A graph that might be more closely looked at is the IPCC graph that is frequently described as showing accelerated global warming, where the trend lines were inserted and the graph produced after the expert IPCC reviewers had seen a totally different trend line.

Profs Lewandowsky and Oberauer have a post up providing a statistics lesson for all the sceptic blog contrarians. These people are apparently not up with the statistics in their paper and haven’t progressed above a naive reliance on Excel.

This fact frequently becomes apparent in the climate arena, where the ability to use pivot tables in Excel or to do a simple linear regression is often over-interpreted as deep statistical competence.

As they point out, and you can almost hear the exasperation in their voices –

…no one who has toyed with our data has thus far exhibited any knowledge of the crucial notion of a latent construct or latent variable.

There is something so terribly depressing about the tone of Lew’s posts. I’m a scientist (a real one with lots of qualifications who does experiment, stats and stuff). I’d never address people who criticised my work in the way he deals with folk asking questions of his work.

The reason I wouldn’t behave that way is not through ethics or principles. Rather that behaving that way is unscientific and achieves nothing. Writing from the elevated position of his own heavily moderated forum he’s simply indulging his ego.

That said, were I in his position, I’d be worried now. Steve is tenacious and has an intuitive feel for data analysis and stats. Lew had better hope his work is reproducible. Then again maybe he doesn’t care. One reason for me becoming what is termed a “skeptic” wasn’t that Mann et al. screwed up their PCA with a biased algorithm (this type of thing happens); it’s that they refused to acknowledge it even when the evidence was irrefutable. A healthy scientific field should self-correct and evolve. This field encourages its members to man the barricades and shoot outwards.

Ok, much of the above is speculative thinking and a little off-topic. However, I think the general theme of this post i.e. that members of a field ought to behave in a mature way with critics, facilitate reproduction and acknowledge errors (should they occur) is a valid one for this thread.

Minor comment, James. It’s not his own forum if it’s paid by public funding. Do I also see the contradiction that the journal “Psychological Science” is a free enterprise effort run by Sage Publishing, involving a mate of Lew’s, when Lew’s survey is harsh on people who promote free enterprise thinking? It’s too jumbled for my mind. Must need psychoanalysis.

I agree. To clarify: there is a tendency to accept very low explanatory power (e.g. R2=.2) if the p value is “significant” (howls of outrage from Briggs here), to use vague and contradictory terms, and to get all into factor-loadings gobbledygook as if it had some meaning. This is particularly bad in the use of such survey data, less bad in experimental areas (but in experimental areas they tend to overgeneralize simplified setups to the world).

1. I think it’s quite obvious what was going on. The Lew-Cook SkS bunch got a cunning plan to produce a meme that could be used over and over again – that the skeptics are scientifically proven to be conspiracy loonies. The problem is – the “science” was not that easy to produce, so they were, let’s say, creative…

2. In the good spirit of the Climategate de Freitas case (claimed afterwards by the Team to have just been a case of a justified action against an editor accepting junk science) – should the same fate not be expected and demanded by the Team for the Lew et al publisher?

Steve, thanks so much for this work. When it comes to scientific replicability, you’re into climate science like a chainsaw is into wood.

It seems like a trivial exercise to come up with a survey methodology that would better test some honest hypotheses about science and the respect thereof than Lewandowsky has done with this tendentious smear. Lew, if you have any questions, give me a holler.

In breaking news from the WorldShaping meme factory, it now transpires that the title of the Lewpaper is in fact “a joke, intended to convey the paper’s finding that -some- people have a tendency to substitute fables for facts and that this fallibility can be found spanning different topics.”

JerryM (Sep 17 13:25): In breaking news from the WorldShaping meme factory, it now transpires that the title of the Lewpaper is in fact “a joke, intended to convey the paper’s finding that -some- people have a tendency to substitute fables for facts and that this fallibility can be found spanning different topics.”

I notice that Lewandowsky is essentially saying so himself in the “Drilling” post:

We put the “(climate)” in parentheses before “science” because the association between conspiracist ideation and rejection of science was greater for the other sciences than for climate science.

But it seems to me that if you take out the parenthetical “climate”, then “NASA faked the moon landing – Therefore Science is a Hoax” is an even more extraordinary claim 😉

This reminds me of Lewandowsky’s criticism about the ensuing “focus” on the mentioned, yet un-named, skeptic blogs in his paper.

One might therefore presume that attention would focus on those blogs that provided entry points to the survey, not those that did not, because it is entirely unclear how the latter might contribute to the results of the survey.

So you see that point is trivial now and we are silly to focus on it!;)

But Lewandowsky gets to have his fun tweaking the nose of his targets by making these jokes! Take this together with his prior stance and his article laying out his beliefs, and it seems to me there is ample material here for a claim of prior psychological bias not considered by his team – I maybe naively assumed that psychologists would be on tenterhooks worrying about this claim more than any other 😉

The SEM he has applied is predicated on prior assumptions and model choices, and, from my layman’s perspective, I can’t see anything in his paper that looks like a clear account of how he tested those assumptions for their reality, or guarded against bias in his choices.

“What about climate? The correlation between the Moon item and the “CauseCO2” item is smaller, around -.12, but also highly significant, p < .0001." Lewandowsky et al.

The authors are certainly well aware that the "significance" of this correlation is beside the point. With very large degrees of freedom virtually any correlation is likely to be "significant". The relevant question is whether the size of the correlation is at all interesting. Even accepting the doubtful premiss that this is a sensible way of computing the correlation in question, this one suggests that about 1.5 percent of the variance between the two variables is shared. Most sensible people would see this as not very much. Add to this the fact that it is a computation over a data set comprising strings of ordinal values with a very truncated range and you start to worry. Add to this the acknowledged extreme vulnerability of the data set to the effects of rogue data points (and there are many) and you start to worry more. Add to this the fact (I think "fact" is justified) that the data were collected in a sloppy way, and worry segues into despair. I am a professional psychologist. Psychology has contributed a great deal to the practical application of statistics and is justifiably proud of this fact. It is extremely unfortunate (I am tempering my words) to see my discipline made into a laughing stock by "green" activists. The more so, because the authors of this paper with their professional hats on are very well aware of all these points and chose to ignore them because of their somewhat warped perception of the greater good.

My experience of academic life in Australia was that fellow psychologists had a commendably low threshold for detecting nonsense and a robust way of showing it. Where are they now?
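The arithmetic behind that “significance is beside the point” observation is worth spelling out. The t statistic for a correlation grows with the square root of the sample size, so with a sample in the thousand range even tiny correlations clear p < .0001 while explaining almost nothing. A quick sketch (n = 1145 is the widely reported sample size for the survey; treat it as an assumption):

```python
import math

r, n = -0.12, 1145   # reported correlation; assumed sample size

# t statistic for testing r = 0: t = r * sqrt(n - 2) / sqrt(1 - r^2)
t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# proportion of variance shared between the two variables
shared_variance = r * r

# |t| comes out around 4.1, i.e. 'highly significant' -- yet the shared
# variance is only about 1.4%, which is exactly the commenter's point.
```

So “p < .0001” and “about 1.5 percent of the variance is shared” are not in tension; both follow mechanically from r = -.12 and a large n.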

Steve: Lewandowsky observed today that there was a negative correlation of -0.25 between CYMoon and CauseHIV, sneering at those who looked at contingency tables (in academic parlance, “Excel spreadsheet” – which, BTW, is not used by me or Roman or other main commenters for statistical analysis – is a term of contempt). But in this case, it’s Lewandowsky’s analysis that is contemptible, as the contingency tables show the problems with trying to represent the relationship between these two very sparse variables with a linear correlation.

I think there’s a far more damning criticism on this point. As I just commented over on that blog post (it will be interesting to see how moderators respond to it):

The signal turns out to be there and it is quite unambiguous: computing a Pearson correlation across all data points between the moon-landing item and HIV denial reveals a correlation of -.25.

This sounds much more meaningful until you realize you get a similar negative correlation even if nobody believes the moon landing was faked. That’s right. Remove the ten responses that agree or strongly agree the moon landing was faked, and you get a correlation of -.21. And if you only remove the two most conspiratorial responses (the ones many people say are fake), the correlation is -.23.

The same pattern happens with “the Moon item and the ‘CauseCO2’ item.” The full data set gives a correlation of -.12. That drops to -.08 if you remove the two most conspiratorial answers. It then only drops to -.07 if you remove every answer claiming to believe the moon landing was faked.

Keep this in mind when hearing about correlations in this post. You can find correlations between people’s belief in a conspiracy and other things even if nobody believes in the conspiracy.
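The effect is easy to reproduce on made-up numbers. In the sketch below (synthetic data, nothing to do with the actual survey responses), nobody in the sample answers above “disagree” on the conspiracy item, yet the Pearson correlation with a second item is strongly negative, driven entirely by variation among the disbelievers:

```python
import numpy as np

# Conspiracy item: 16 'strongly disagree' (1) and 4 'disagree' (2).
# Nobody in this synthetic sample agrees the moon landing was faked.
moon = np.array([1] * 16 + [2] * 4, dtype=float)

# Second item (1-4 acceptance scale): the respondents who merely
# 'disagree' on the conspiracy item also happen to score a bit lower
# here, as do a few of the strong disbelievers.
other = np.array([4] * 12 + [3] * 8, dtype=float)

r = np.corrcoef(moon, other)[0, 1]   # about -0.61: clearly negative
```

A correlation of about -0.61 with zero believers in the sample: the sign and size of such a coefficient say nothing about what conspiracy believers think if almost nobody is a believer.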

I’ve updated the blog Retraction Watch; it will be interesting to see if they start to probe the matter now. Lewandowsky has had his “science by PR” and news media already, but has the journal said anything yet?

The website of the Psychological Science journal http://pss.sagepub.com/ where this is or was presumably going to be published has a section called “Online First” where subscribers can download upcoming articles. It was last updated Sept 13 and Lewandowsky’s paper is not on that list, nor does it appear on the list of articles in the September issue.

I wonder if this paper was just intended as a prank, never submitted or even meant for publication?

An organization called Psychology for a Safe (sic) Climate, with Skeptical Science and a handbook for “Debunking myths”, and featuring an About Us of:

“We are based in Melbourne, Australia
“Our Co-ordinating Committee has the following members:
“Carol Ride (convenor) – a psychologist and couple psychotherapist, who has been involved in the climate movement in Melbourne for the last few years
“Libby Skeels (treasurer)
“Rosemary Crettenden – a psychologist and psychoanalytic psychotherapist in private practice in Melbourne
“Sue Pratt”

I do seriously wonder if they are putting this paper out there to distract Steve from a more important paper somewhere else.

And since, as far as I can tell, Lew’s paper proves that conspiracists are more likely to believe in climate science, the tie in to SkS makes a LOT of sense.

Are you guys paying attention to what Lew is doing on his “drilling myself a deeper hole” blog article?

In response to the criticism he is taking over using fake data from warmists posing as skeptics in his survey, he puts up this article which makes use of that fake graph purporting to show “what skeptics believe” that was created by his warmist buddy Cook. It’s fake turtles all the way down, apparently.

And get the icing on this cake: A commenter to that post (“Barry Woods”) points out that no Skeptic actually believes what that graph claims.

Another commenter (“GrantB”) disagrees, saying “Barry, you are mistaken. The graph comes from a true sceptical science blog,” and links to Cooks SKS website as evidence. There is a Moderator Response and editing notation on that post, but he lets the misidentification of Cook’s website as a “true sceptical science blog” remain uncorrected and without comment.

Barry, you are mistaken. The graph comes from a true sceptical science blog. One that has no connection with this site whatsoever. Although they were kind enough to email me my forgotten password for a login here (-:)

“One that has no connection with this site whatsoever.” was moderated out leaving the impression that I do believe SkS is a true sceptical science blog whereas SkS is a sceptical science blog in name only.

I was surprised however that they left the bit about the forgotten password for shapingtomorrowsworld (from Skeptical Science BTW)

This snipping of one sentence to change the meaning is arguably worse than disappearing Tom Fuller’s comments after others had responded to them. But in all this I detect (from the likes of Eli Wabbit) that there is a pretence that they are only doing what the CA or WUWT moderators do. Steve sometimes chops out part of a post here but what remains is a faithful record of that part of what was said. And he lets critics have their say. This is utterly different. What is being practised is pretence at every possible level.

Yes, the faked ‘skeptic graph’ represents a new low. It also seems to be a desperate attempt to change the subject.

The deletion and deceptive editing of comments is just what is to be expected from John Cook. Don’t forget this occasion where he completely re-wrote the blog post after comments had been added, then inserted responses under the comments to imply that the commenters hadn’t read the post properly.

I was reading about the severe down-ratchet of engineering standards in modern ship design the other day when I came across the quotable quote: “Without replication it isn’t science, it’s just advertising”.

“It is very difficult to believe that the title is anything other than a deliberate attempt to be offensive so as to draw attention to a paper of poor quality, but which is thought to be useful for “messaging” in the climate wars. Steve McIntyre has incorrectly attempted to infer a moral condemnation of Lewandowsky from certain of my comments (now corrected). Let me leave no-one in any doubt. In choosing the title of his paper, Lewandowsky not only acted unscientifically, but immorally as well. It was a despicable act.”

When I brought to attention Cook’s quote mining of Pat Michaels’ article to make him look stupid, one of their members said, in the private forum:

To underline the seriousness of the issue, the misquote that first raised my awareness of the problem is as serious, in my mind, as the Wegman plagiarism issue (excepting only the intended audience, in that the Wegman report was a report to Congress, while SkS is only a blog). The only reason I can continue to associate with SkS given the extent of misquoting involved is my conviction that the misquoting was done in error.

@Wayne2
The paper is also available at uwa. In both places, the cover page says: “In press, Psychological Science”

I don’t understand how a paper that’s “in press” at a supposedly major journal is already widely available online in other places. And I don’t understand how, if it’s already “in press” at the journal, it doesn’t appear in the “forthcoming articles” list that Psychological Science makes available online to its readers ahead of printing, here: http://pss.sagepub.com/content/early/recent

I am beginning to suspect that this paper was either flatly rejected by the journal, or that it was never even submitted.

On Sept. 14 on his blog (STW run by SkS) Lewandowsky stated that an ‘extended’ supplemental information was being prepared. He also referred to “typesetting” for the article which was not yet complete. One wonders about the process with the journal leading to the additional info being provided — did the journal editor(s) request or require it? Has the article’s publication in fact been delayed (since it does not appear as of the Sept. issue) or is it on the original schedule? Evidently the normal peer review of the journal did not require the added SI, else it would have been ready when the article was announced to various journalists as “in press”…. can anyone get info from the journal about the process of extended “peer review” (e.g. Climate Audit) which led to the SI “being extended” now at such a late moment? They may pretend it is all normal science, but how many times has the journal Psychological Science recognized any such concerns after an article was already “in press”? Why is it so difficult for Prof. Lewandowsky and his co-authors to honestly acknowledge that the article was not ready for prime time? He also mentions that he is unwilling to release other info that is subject to an FOI request, so presumably he does not want to “pre-empt” (his term) the FOI process in case he can get away with not releasing more?

[emphasis added]

64. Stephan Lewandowsky at 22:04 PM on 14 September, 2012

Questions continue to be raised for further information relating to this paper. My response is threefold:
1. I see little merit in treading over ground that is already clearly stated in the paper (e.g., the elimination of duplicate IP numbers).
2. Several questions concern material that is presently subject to an FOI request. I will let that process run to completion rather than pre-empt it.
3. The supplementary online material for the article is being extended to contain additional information (e.g., the outlier analysis from the preceding post). The online supplement will be released when the typesetting of the article is complete.
Time permitting, I may also write another post or two on topics relating to this paper that are of general interest.

The online supplement will be released when the typesetting of the article is complete.

Hmmm … let’s see … this purportedly “recently published … forthcoming” paper has been “in press” since July. Considering the prices they charge, one would think that Sage could afford to spring for some up-to-date typesetting technology in order that such ground-breaking “research” could be published in a more timely manner.

It does seem to be quite a mystery as to why the article should be so long “in press” — why it should still need “typesetting” now in mid-September 6-7 weeks after it was announced as “in press” — and why an “extended” SI should be needed now on an article already passed on by peer review. Why does it not appear in the Sept. 2012 issue or in the online list of forthcoming articles? The journal’s editor(s) ought to explain whether withdrawal or revisions are being required, whether there will be new peer review or not etc. When the Guardian’s Adam Corner and other journalists trumpeted this paper, was it or was it not in publication status as “in press”? Has that status changed in any way?

I find it hard to believe that, in a world of troff, TeX, and LaTeX (which is specifically for the publishing needs of the world of academia), not to mention Adobe’s various applications, Psychological Science needs more than a few days for “typesetting.”

sorry all, I forgot the blockquote to ensure it was clear what is being quoted from the STW@SkS (my way of referring to the Lewandowsky blog, that needs a (TM) trademark and a Twitter account now!):

[emphasis added]

64. Stephan Lewandowsky at 22:04 PM on 14 September, 2012

Questions continue to be raised for further information relating to this paper. My response is threefold:
1. I see little merit in treading over ground that is already clearly stated in the paper (e.g., the elimination of duplicate IP numbers).
2. Several questions concern material that is presently subject to an FOI request. I will let that process run to completion rather than pre-empt it.
3. The supplementary online material for the article is being extended to contain additional information (e.g., the outlier analysis from the preceding post). The online supplement will be released when the typesetting of the article is complete.
Time permitting, I may also write another post or two on topics relating to this paper that are of general interest.

Richard, the article being “In Press” means that it has been accepted for publication; some journals then allow the authors to trumpet the results in the press. However, publication itself can take time.

RayG, it’s not about the typesetting time. “In Press” means that it’s been lined up for publication, but busy journals often have several months of material slotted in already; so you get accepted for publication in, say, March but you won’t come out in publication until August because the journal already has months worth of material in the pipeline. Some journals (e.g. JAMA or BMJ) put a strict embargo on press work before that date; others (like AIDS or the Lancet) publish “on-line first” versions that are low-res but come out months ahead of the print version; some allow pre-publication press releases and preprints but don’t publish the article. Nothing that is happening here is evidence of anything sinister.

Do you understand how damaging it is for your argument that you insist on seeing a conspiracy in every single aspect of this paper? In this case you’re seeing a conspiracy theory in the business model of a private journal publisher. Is this really the best way for you to use your “skepticism”? With every one of these comments (and there are lots) you merely confirm the findings of the article.

Not only am I not qualified to make a comment about our hosts’ use of statistics here, I have to confess that I understand less than an iota of his arguments!
What I do understand, in trumps, and probably in line with the majority of readers of this site, is that Mr McIntyre positively confers an aura of total integrity with his findings.
In short, I just trust him. I know I’m not alone.
Thank you.

Not only is Lewandowsky incapable of constructing a sample, as jfk notes, he can’t even write a questionnaire which tests his hypothesis. His conspiracy questions do not discriminate between conspiracy and non-conspiracy, but between the official and one or other of many unofficial versions of events. Of course 9/11 was a conspiracy; the question is, who organised it, Bin Laden or Dick Cheney? Did the Royal Family kill Princess Diana or was it MI5? Was MLK perhaps killed by an FBI agent acting alone? and so on.
At the most, his research might have demonstrated that people who disbelieve the government on one thing tend to disbelieve their version of another thing. And he couldn’t even do that properly.

Note: Moderation on Lew’s blog has reached new heights of absurdity, with commenters being told effectively: “Disagreement with Stephan is off-topic. If you don’t want to be deleted, please change your mind”.

One might have hoped that such an improbably titled paper as “NASA faked the moon landing—therefore (climate) science is a hoax: An anatomy of the motivated rejection of science” would have deserved no more than a passing glance before being ignored as obvious junk. Yet this trivial piece of “research” has prompted no less than 9 posts and 845 comments on this blog. Are some of us losing the plot, perhaps?

In Australia, Lewandowsky has influence in what passes for a climate debate in the media, and is regularly quoted. To ignore this paper would have allowed the warmists and their media allies a free hand to denigrate sceptics as crackpots, with a ‘peer-reviewed’ paper as support.

The point of Lewandowsky et al (2012) is to create a media meme of “skeptics about climate alarmism belong in a loony bin not in public debate.”

Yes the research and analysis are garbage, but the issues at stake are of great significance to how the debates and policies proceed.

Allow this fake propaganda meme to be enacted widely and skeptical arguments and analysis can be ignored. Lewandowsky knows exactly what he is trying to do (but he did it so badly). Steve and other climate science bloggers understand what is going on with this, and it is important for setting the terms and direction of future debate.

Same when Tom Curtis went nuclear. A stunned silence (esp. from the mods), and then rally the troops (those who aren’t heading swiftly for the hills) for the usual dull & diversionary “Look it up, I’m not going to do your work for you” answers to perfectly polite questions.

As someone posted in response to one of Lew’s YouTube videos “He’s really not very bright”.

That encapsulates this whole episode for me. As my children might say: “Epic FAIL”.

I later asked them to embed this graphic, an impeccable source (but no).

My point being, I have no idea what is going to happen to temps in the next 8 yrs (by 2020)
And scientists and sceptics are very interested in the reason for the [insert acceptable phrase] in 21st century temps.

On a wider note, if, as I do, you think it likely that global temps in 8 years’ time have a very high chance of being in the range +/- 0.1C (or even 0.15C),

I think everyone will have moved on (whilst I imagine some at CRU are hoping the graphs shoot up). What will that CRU graph look like in 8 years – if only +0.1C (or less) – to a member of the public or a politician?

I would point out at @STWORG that amongst climate scientists it is perfectly acceptable to talk about/research reasons for the ‘acceptable phrase’ (stalled, paused, slowed, etc.), but they would just accuse me again of ‘posting falsehoods’.

from a Met Office bio (last updated in 2010):

“Part of [I snipped the name to spare his blushes] duties involves responding to queries about general climate issues, particularly so called “skeptical” or “contrarian” arguments and views.”

so he is not one of ‘us’ (so-called – not even allowed to be called a proper sceptic; John Cook’s approach)

the scientist at the Met Office continued:

“being part of a team examining the possible reasons for the lack of substantial warming for the 10 years after 1998 and being a contributing author to the IPCC 4th assessment report.”

Does anyone think that if I used ‘lack of substantial warming’ for 21stC observed temps, it would be an acceptable phrase for the people at Shaping Tomorrows World…….

Try Spearman correlations, or running through the SPSS/MLWiN manuals looking for every form of non-parametric correlation option and implementing it in R. The problem is not that his results are wrong, but that the methods in the document you are working from don’t describe what he did.
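For anyone who wants to try that suggestion: Spearman is just Pearson applied to (tie-averaged) ranks, so it needs no special package. A self-contained sketch (in Python for illustration rather than R or SPSS):

```python
import numpy as np

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks,
    with tied values given their average rank. Often better suited
    than raw Pearson to truncated ordinal scales like 1-4 survey items."""
    def ranks(a):
        a = np.asarray(a, dtype=float)
        order = np.argsort(a, kind="stable")
        r = np.empty(len(a))
        r[order] = np.arange(1, len(a) + 1)    # provisional ranks 1..n
        for v in np.unique(a):                 # average the ranks over ties
            r[a == v] = r[a == v].mean()
        return r
    return float(np.corrcoef(ranks(x), ranks(y))[0, 1])

# Monotone but nonlinear data: Pearson would be below 1, Spearman is 1
rho = spearman([1, 2, 3, 4], [1, 10, 100, 1000])
```

This still doesn’t cure the sparseness problem with the conspiracy items – ranks of a column that is almost all “strongly disagree” are almost all tied – but it at least drops the assumption of linearity.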

No, I’m saying that his methods are buried in an online supplement that McIntyre has no access to, so until the supplement gets published McIntyre is going to have to guess at them. McIntyre could ask nicely, but guess what? He accused Lewandowsky of fraud before he searched his inbox, which is commonly referred to as “burning your bridges”. So now he has to make up his audit as he goes, and I’m dropping him a hint as to how to do it.

If you have no professional standing you need to ask very nicely before you can get access to other people’s work. Accusing them of fraud before you have even checked your email is not the way to do it.

Steve: I did not accuse Lewandowsky of personally being complicit in submitting fake responses. I think that the methodology resulting in his use of fake data was beyond poor. But please do not put words into my mouth or over-charge the situation.

I would suggest that it is not a question of “fraud” as much as an indication of the shortage of statistical competency. Here is what Lewandowsky et al have to say about “faked” responses [bold mine]:

Another objection might raise the possibility that our respondents willfully accentuated their replies in order to subvert our presumed intentions. As in most behavioral research, this possibility cannot be ruled out. However, unless a substantial subset of the more than 1,000 respondents conspired to coordinate their responses, any individual accentuation or provocation would only have injected more noise into our data. This seems unlikely because subsets of our items have been used in previous laboratory research, and for those subsets, our data did not differ in a meaningful way from published precedent. For example, the online supplemental material shows that responses to the Satisfaction With Life Scale (SWLS; Diener, Emmons, Larsen, & Grisson, 1985) replicated previous research involving the population at large, and the model in Figure 1 exactly replicated the factor structure reported by Lewandowsky et al. (2012) using a sample of pedestrians in a large city.

Gaming the survey does not require a conspiracy of coordinated responses. As others have clearly indicated, a single self-selected individual can submit any number of “faked” responses. Choosing the “correct” answers for the purpose is so easy that even a CAGW supporter can do it without conspiratorial direction.

Furthermore, the statement “would only have injected more noise into our data” is both misleading and completely false. The unfortunate signal-noise meme blurs the difference between “noise” which is random and non-directed and “noise” which introduces uncorrectable bias into the sample. The “individual accentuation or provocation” in this case is strongly directed towards the extreme answers in the survey, given the nature of the blogs which posted links to it. Since such answers are chosen by a very small percentage of genuine subjects, the resultant bias becomes very large. Hence, even those “gamers” who submit just a single falsely answered survey will have an effect.
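The directed-bias point can be illustrated with a tiny simulation. The response weights below are invented for illustration (they are not estimated from the survey); the only feature borrowed from the data discussed in this thread is that genuine responses to the conspiracy item cluster near “strongly disagree”:

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical genuine responses to a conspiracy item on a 1-4 scale:
# the overwhelming majority strongly disagree. Weights are made up.
genuine = random.choices([1, 2, 3, 4], weights=[85, 10, 3, 2], k=1000)

# Gamed responses are not random noise: each one sits at the same
# extreme, so every fake pushes the estimate in the same direction.
fakes = [4] * 10  # ten uncoordinated individuals, one survey each

baseline = mean(genuine)
biased = mean(genuine + fakes)
# biased > baseline: the fakes shift the mean, they do not cancel out
```

Random noise would be symmetric and shrink with sample size; identically directed fakes produce a systematic shift that no amount of averaging removes, which is the distinction being drawn above.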

The link to their “previous laboratory research” is also a prize. Just what is the tolerance limit for “exactly replicated” – did they use the exact same data? 😉

Give it up, Steve. You posted an article on Sep 8th entitled “The anatomy of the Lewandowsky scam.” Accusing someone of running a scam is the same as accusing people of fraud – careful choice of words (“I never accused him of making up the data himself,” blah blah) doesn’t get you off the hook. If you want to audit people’s work you need to start off from a slightly less biased position than that. But it’s part of your shtick isn’t it? Offend scientists you’ve never met, then accuse them of having something to hide when they don’t cooperate with you.

RomanM, your points about noise are largely irrelevant in the context of a non-substantial subset. And Lewandowsky’s paragraph was written in the context of his knowing how many people posted from the same IP address. Or are you seriously suggesting that a single individual submitted responses from more than 2 IP addresses? You imagine some hunched up, bitter little warmist travelling from library to library and filling in the survey on 100+ different PCs?

This comment is the clearest example yet of how ridiculous the position here is: you are saying that this article linking skeptics and a willingness to believe dubious conspiracy theories cannot be right because of a conspiracy of warmists to game the results, with no evidence. When the article finally gets published, please oh please do write a response letter to the same journal in which you outline this evidence-free conspiracy theory. It’ll really help to support the findings.

Your comments about “exactly replicated” are disingenuous. You know that “exactly replicated” in this context means producing the same structure, not the same numbers. Are you even capable of uttering a single sentence that doesn’t either deliberately misrepresent someone, or contain a conspiracy theory?

So that’s the way to do it then. Keep your powder dry with respect to the analysis until you collect the data. Don’t even give a prior hint of how you might analyse the data. Then collect the data and attack it from all directions. Pearson, OLS whatever. If everything you try fails then as you say –

Try Spearman correlations, or running through the SPSS/MLWiN manuals looking for every form of non-parametric correlation option and implementing it in R

Sooner or later something will pop up.

BTW are you the same faustusnotes that was having a monumental whinge over at shapingtomorrowsworld@SkS about your comments never getting a guernsey at Climate Audit?

No, GrantB, the way to do it is you write a methods section for the paper, but if the journal has a word limit and you have a lot of results you have to use an online supplement for most of your methods. This is called “normal practice” in publishing. But you would know that, right, because you’ve published lots of articles in social sciences journals?

Incidentally, Pearson and OLS are the same thing – but you knew that too, right? And your reading ability is at such a high level that you understood immediately that I’m trying to help McIntyre with that comment about MLWiN manuals, not commenting on how Lewandowsky did his work. Right?

I’ve just responded to another two comments that show exactly the tendency Lewandowsky describes in his paper – leaping onto conspiracy theories before even trying to understand the science. Really, if you wanted to prove him right you couldn’t have done a better job.

Regarding the article by Lewandowsky et al, I examined the dataset of 1145 cases available online, and I presume actually used in the paper. My focus was narrow – the 5 Climate questions (CO2TempUp, CO2AtmosUp, CO2WillNegChange, CO2HasNegChange, CauseCO2) and how they relate to the moon landing hoax question (CYMoon). I removed 8 cases because the responses to the Climate questions were illogical, most likely a result of error. In my opinion the nature and quality of the data would suggest a simpler and more restrictive analysis than that undertaken by Lewandowsky et al. Accordingly I recoded the 5 climate questions into new variables where disagreement (1 or 2) and agreement (3 or 4) were rescored as 1 and 2 respectively. The 5 recoded climate variables were summed, and respondents scoring 5 (DG, n=113) and 10 (AG, n=876) were identified. The remainder, scoring between 6-9, were labelled UG (n=148).

When variable CYMoon is analysed with respect to the DG (mean 1.14 +- 0.048 [sem]), AG (1.08 +- 0.011) and UG (1.08 +- 0.23) groups, there is no statistical difference between the 3 groups based on either the Kruskal-Wallis (P=0.485) or the less appropriate one-way ANOVA (F=1.921, P=0.147) test. The lack of statistical significance between the groups’ climate change perceptions (no-DG, yes-AG, changing-UG) and their respective scores on the moon landing ‘hoax’ question is clear. Moreover, irrespective of statistical significance, the fact that all groups – DG, AG and UG – had mean scores at the ‘strongly disagree’ level (and all group median scores were 1) suggests that the title of Lewandowsky et al’s paper should be re-examined. A similar result was found when only respondents who consistently adopted the strongly disagree (n=42) or agree (n=472) positions for all 5 climate questions were analysed – that is, the individuals at the extremes of the debate.

I believe the latter analysis is more meaningful than reporting that a linear relationship exists between the Climate score and CYMoon, with Pearson r = -0.087 (P=0.003), which accounts for only 0.8% of the total variance; this reveals very little.
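The recoding and grouping scheme described in the comment above can be sketched as follows. The column names follow the survey; the three rows of data are invented for illustration and are not drawn from the real 1145-case dataset:

```python
# Made-up example rows on the survey's 1-4 agreement scale.
rows = [
    {"CO2TempUp": 4, "CO2AtmosUp": 4, "CO2WillNegChange": 3,
     "CO2HasNegChange": 4, "CauseCO2": 4, "CYMoon": 1},
    {"CO2TempUp": 1, "CO2AtmosUp": 2, "CO2WillNegChange": 1,
     "CO2HasNegChange": 1, "CauseCO2": 2, "CYMoon": 1},
    {"CO2TempUp": 3, "CO2AtmosUp": 2, "CO2WillNegChange": 3,
     "CO2HasNegChange": 2, "CauseCO2": 3, "CYMoon": 2},
]

CLIMATE = ["CO2TempUp", "CO2AtmosUp", "CO2WillNegChange",
           "CO2HasNegChange", "CauseCO2"]

def classify(row):
    # Recode each item: disagreement (1-2) -> 1, agreement (3-4) -> 2,
    # then sum over the 5 items. A total of 5 means consistent
    # disagreement (DG), 10 consistent agreement (AG), 6-9 mixed (UG).
    total = sum(1 if row[q] <= 2 else 2 for q in CLIMATE)
    return "DG" if total == 5 else "AG" if total == 10 else "UG"

groups = {g: [] for g in ("DG", "AG", "UG")}
for row in rows:
    groups[classify(row)].append(row["CYMoon"])
# From here, the CYMoon scores of the three groups would be compared
# with a Kruskal-Wallis test (e.g. scipy.stats.kruskal on real data).
```

Collapsing the 4-point items to binary agreement before grouping is exactly the “more restrictive” step the commenter describes; the group comparison itself then needs only a rank-based omnibus test.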

McIntyre, are you using some kind of “elbow” criterion for choosing the number of factors? Because Lewandowsky is not, and as a result you have really, really embarrassed yourself here.

By not stating your methods clearly in this post, you have also opened yourself to accusations of dishonesty.

By using the wrong assumptions in constructing your analysis, you have also shown that you know nothing about how social scientists work, and probably should desist from your feeble attempts to audit them.

Also, in future, when you’re doing these audit posts, I suggest that you consider adopting an Introduction/Methods/Results/Conclusion framework. That will help you order your thoughts better, more clearly describe your process (avoiding accusations that you are using arbitrary or unclear methods) and give people with less skill in stats the ability to skip the hard parts and get to the conclusions.

I would have thought this system for presenting research results would be clear to you, but as it becomes increasingly apparent that you’re out of your depth with this stuff, I guess it would help you to revisit some of the basic principles that you should have learnt in your undergraduate days.

The overly pedantic respondent was probably me..:-) Sorry. But it did seem that, with some exceptions (Moon Landing, HIV), the more knowledge you had (or thought you had) about one of the areas, the harder it was to give an honest answer to the question that didn’t involve making assumptions about what the questioner meant. On a 5-point slider you could at least claim not to care; on a 4-point scale that must skew things a bit.

Thanks very much for publishing the R scripts and providing access to the data. Statistics is not my field, but I have begun an academic study of R, and these are very helpful. Factor analysis has previously seemed a ‘black art’ to me, but you are making it comprehensible!

6 Trackbacks

[…] is drilling for noise, McIntyre has tried to get the same results as Lewandowsky’s paper by taking Lewandowsky’s noisy data and applying the same techniques listed in the paper. Replication doesn’t appear possible. It looks like the paper is a dry hole even though it is […]

[…] scientific results — haven’t been able to be replicated from the paper itself. As usual, Steve McIntyre’s careful approach to this is enlightening, and the code for his own attempts are published and […]

[…] components as explained variances from factor analysis, a very minor peccadillo in comparison. In a recent post, I observed inconsistencies resulting from this misdescription, but was then unable to diagnose […]