Since I became involved in the question of vaccines and autism, and more specifically the question of mercury in vaccines and autism, every week I’ve received a few identical e-mails from anti-vaccinationists that consist of a list of references. It’s always the same references, and I’ve come to think of it as “The List.” Always at the top of The List is DeSoto MC, Hitlan RT. Blood levels of mercury are related to diagnosis of autism: a reanalysis of an important data set. Journal of Child Neurology. 2007;22:1308-11. I read the DeSoto and Hitlan paper back in April and was skeptical about the reported results then. However, I heard from an epidemiologist friend of mine that Dr. Catherine DeSoto was extremely courteous and forthcoming in answering questions about the paper, so I decided to let my skepticism simmer for a while.

Then a few days ago an important paper was published in Science on Identifying Autism Loci and Genes by Tracing Recent Shared Ancestry. Naturally the Science paper was reported by hundreds of newspapers and other media outlets. One of the best newspaper stories was a Washington Post article entitled, “Mental Activity May Affect Autism-Linked Genes.” Unfortunately, the comment section after the Washington Post article was completely hijacked by anti-vaccinationists who insisted that vaccines cause autism and that genetic studies of autism are part of a cover-up of the truth. And once again, one of the commenters presented “The List” with the DeSoto and Hitlan paper at the top.

My simmering skepticism boiled over and I decided to take a closer look at the DeSoto & Hitlan paper. Obviously you need a little background here, especially since the history of the DeSoto & Hitlan paper actually involves at least three publications.
(1) In June 2004 the Journal of Child Neurology published an article by Patrick Ip, Virginia Wong, Marco Ho, Joseph Lee, and Wilfred Wong of the University of Hong Kong. Ip et al. performed a case-control study to compare the hair and blood mercury levels of 82 children with autistic spectrum disorder (ASD) and a control group of 55 normal children. (Important note: I am NOT going to discuss the analyses of hair mercury levels, per DeSoto & Hitlan’s statement, “The hair analysis data is, in fact, interesting. But it is of secondary importance.”) The ASD cases included all ASD children actively followed up from April to September 2000 in the Duchess of Kent Children’s Habilitation Hospital in Hong Kong. All ASD children were assessed by Virginia Wong. The diagnosis of ASD was made only if they fulfilled the DSM-IV diagnostic criteria for autism and had undergone a structured interview using the Autism Diagnostic Interview-Revised. The control group consisted of “normal children who had mild viral illness and were admitted to the pediatric ward of [Hong Kong’s] Queen Mary Hospital.” Ip et al. reported that there were no differences in mean mercury levels. The mean blood mercury levels of the ASD case and control groups were reported to be 19.53 and 17.68 nmol/L, respectively (P = 0.15), a difference of 1.85. Ip et al. concluded that “there is no causal relationship between mercury as an environmental neurotoxin and autism.” The authors also noted: the “blood mercury levels of both autistic and normal children in Hong Kong were elevated [compared to other populations around the world];” and “this study is limited by the sample size and culture because Hong Kong Chinese are famous for eating seafood.” (Ip P, Wong V, Ho M, et al. Mercury exposure in children with autistic spectrum disorder: case-control study. J Child Neurol. 2004;19:431-4. Erratum in: J Child Neurol. 2007;22:1324.)

(2) In May 2007 Dr. Catherine DeSoto wrote to the Editorial Office of the Journal of Child Neurology expressing concern about what appeared to be obvious inconsistencies in the data analysis of the results section of the Ip et al. article. Dr. DeSoto’s specific concern related to the statistical interpretation of the data. Dr. Roger Brumback, the Editor-in-Chief of the Journal of Child Neurology, contacted Virginia Wong, the corresponding author of the Ip et al. article, and requested the original data. Professor Wong provided a spreadsheet of the original data. So all of the original data can be found in two tables at: Brumback RA. Note From Editor-in-Chief About Erratum for Ip et al Article. J Child Neurol 2007;22:1321-1323. These are the data that I used for my analyses below.

(3) At the request of the Editor-in-Chief of the Journal of Child Neurology, Dr. Catherine DeSoto and Dr. Robert Hitlan performed an analysis of the original data, which was published as a special article in the November 2007 issue of the journal. According to the abstract, DeSoto & Hitlan “found that the original p value was in error and that a significant relation does exist between the blood levels of mercury and diagnosis of an autism spectrum disorder.” A few details about DeSoto & Hitlan’s analysis would be in order here, but there aren’t many details. (This was the first reason that I was skeptical about the paper.) The authors do mention that they excluded two outliers that were greater than 3 standard deviations above the mean. I have absolutely no problem with this; in fact, I agree with DeSoto & Hitlan that it was a good idea. What I find unusual is that the authors mention only one of the outlying values, a blood mercury level of 98 nmol/L in the ASD group. I had to go to the original data to figure out that the other outlier they excluded was a value of 74 in the control group.

Here is the total extent of the results section regarding blood mercury levels: “Logistic regression was performed using blood mercury level as the predictor and the autistic/control group as the criterion. Results of this reanalysis indicate that blood mercury can be used to predict autism diagnosis. Data included: r = .20, r2 = .04, F(1,133) =5.76, P = .017. This finding indicates that there is a statistically significant relationship between mercury levels in the blood and diagnosis of an autism spectrum disorder.” That’s it for results. I’m going to skip any discussion of the r and r2, since they’re not immediately relevant to this discussion and they’re just complex enough to confuse a lot of people (but see below). This leaves us with an F-test from a logistic regression and a highly significant P-value. The authors don’t say which logistic regression statistical package they used. The F-test seems to be a test of whether the mean blood mercury levels of the ASD case group and the control group are different — the same hypothesis Ip et al. were testing — but this is unclear. Again this seems most unusual to me, but DeSoto & Hitlan do not provide the reader with means for either of the two groups. Fortunately, my epidemiologist friend (mentioned above) e-mailed Dr. DeSoto and she responded almost immediately with the missing information (which I’ve confirmed in my own analyses): With two outliers removed:

I just don’t understand why DeSoto & Hitlan didn’t provide these data in their paper.

In any event, now we’ve learned a little bit more, but I was still skeptical of these analyses for another reason. Ip et al. state outright that they performed a Student’s t test to compare the means of the two groups. DeSoto & Hitlan never come right out and say that they’re interested in comparing means, but it’s certainly implied. However, a comparison of arithmetic means, and certainly the use of the t test, assumes that we’re comparing two normally distributed samples. Although I’d never analyzed blood mercury before, I have analyzed blood lead levels. In my experience, blood lead levels are never normally distributed. This is why we use geometric means and percentiles (not arithmetic means) when we report descriptive statistics on blood lead levels. So I was skeptical about whether blood mercury levels would be normally distributed in children from Hong Kong.

The first thing I always do (and I always told my students to do this) is to actually LOOK at the data. It’s tempting to start out by looking at the ASD cases, but my advice is that it’s wiser to check out the control group first. I’ve excluded an outlier, so there are 54 controls. Since there’s an unequal number of controls and cases, it’s easier to compare the two groups if we use percentages instead of raw numbers. So here’s the percentage distribution of the control group:
Some people don’t like these skinny little bars that PowerPoint provides in its histograms, so here’s the identical data shown in an “area under the curve” type chart:
If these data are normally distributed, or anything close to normally distributed, then I’m Bernadine Healy. In fact, trying to choose a “measure of central tendency” for these data is pretty much hopeless. The arithmetic mean of 13.6 is essentially meaningless. (No pun intended.) There were 54 controls. Ten of the controls have blood mercury values of 5.0 nmol/L, which means 5 is the mode, but that doesn’t help us much either. Six controls had a value of 8.0, but saying 8 is a second mode would be silly. The best thing to do is look at the data and describe what’s actually there: There’s a cluster of 36 controls with values between 5 and 14 nmol/L that’s very heavily skewed such that the mode of the cluster is 5. There’s a second cluster of 13 controls with values between 17 and 24. Then there are 5 controls scattered across higher values between 33 and 42. One useful aspect of looking at the controls first is that it gave me an opportunity to choose an unbiased cut-off point for my odds ratio analysis. Since the literature doesn’t provide definitive advice for a “high” blood mercury level for Hong Kong children, and these controls have this nice space with no values between 14 and 17, I decided to define greater than 16 nmol/L as a “high” mercury level for my odds ratio analysis. Now let’s look at the data for the ASD cases. Again this is a percentage distribution.
Once again we have a distribution of values that’s not even close to normally distributed. There were 81 ASD cases. 14 cases had a blood mercury value of 5.0. I suppose you could say there was a second mode at about 20 nmol/L, since there were 6 cases with a value of 20 and 4 cases with a value of 23. What does the arithmetic mean of 18.6 signify in a distribution like this? Very little, I think. Here’s a chart showing the percentage distributions of the ASD cases and the controls compared:
There were so many more blood mercury values between 5 and 10 than in any other interval that I’ve shown these as individual categories. Then I’ve categorized blood mercury levels in 5 nmol/L categories. So: Is there a difference between these two distributions? And how would we characterize the difference? It looks like the main difference is that the ASD cases have more blood mercury values at the upper end of the distribution than do the controls. By “the upper end of the distribution” I mean values greater than 25 nmol/L. In fact, that’s just what’s going on. Of the 54 controls, there were only 5 children with blood mercury levels greater than 25 (and the greatest value was 42). Of the 81 ASD cases, there were 21 children with values greater than 25, with 4 values between 41 and 45 and a high value of 59. So how do we go about carrying out a “formal” statistical comparison of these two groups? First, any analysis involving a comparison of arithmetic means, such as a t-test, or a logistic regression in the form that DeSoto & Hitlan used (with blood mercury entered simply as a “continuous” variable) would be wrong. Why? Because the blood mercury values of these two samples just don’t come from normally distributed populations or anything close. Second, it’s common to calculate geometric means for blood mercury levels. In these two samples, the geometric means were 11.1 for the control group and 14.4 for the ASD cases, a difference of 3.3. A formal statistical comparison of the geometric means would be a bit more complex, because it would involve a logarithmic transformation of the blood mercury values. But the purpose of the log transformation is to make the distributions normal and there’s no way you’re going to make these two distributions normal unless you get somebody to jump up and down on the bars at the value for 5.0 until they almost disappear. (Any nominees for jumpers out there? Certified data fudgers?)
So formal comparison of geometric means would also be wrong.
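For readers who haven’t met geometric means before, here is a minimal Python sketch of how one is computed through the log transformation. The values in the list are made up for illustration (a pile-up at 5 plus a long right tail); they are NOT the Ip et al. data, and the variable names are my own.

```python
import math
import statistics

# Hypothetical, made-up blood mercury values (nmol/L), NOT the Ip et al. data:
# a pile-up at the floor value of 5 plus a long right tail.
values = [5, 5, 5, 5, 6, 8, 8, 12, 19, 22, 35, 41]

arithmetic_mean = statistics.fmean(values)

# Geometric mean = exp(mean of the logs); this is the log transformation
# mentioned in the text.
log_values = [math.log(v) for v in values]
geometric_mean = math.exp(statistics.fmean(log_values))

# The standard library (Python 3.8+) computes the same quantity directly.
assert math.isclose(geometric_mean, statistics.geometric_mean(values))

print(f"arithmetic mean: {arithmetic_mean:.1f}")
print(f"geometric mean:  {geometric_mean:.1f}")
```

With right-skewed values like these the geometric mean sits below the arithmetic mean, but neither number tells you about the pile-up at 5. You still have to look at the distribution.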

This leaves us with an analytic method that makes no assumptions about the distributions of the cases and controls — the calculation of an odds ratio or odds ratios. Since this post is too long already, I’m not going to explain what an odds ratio is except to say (1) it’s the optimal measure of strength of association in a case-control study and (2) please don’t make the mistake of assuming that a prospective cohort study of blood mercury levels and autism would have found a relative risk, or risk ratio, or rate ratio similar to the odds ratios I’m about to show. To learn more about the odds ratio, read the article in the British Medical Journal series on medical statistics or Google odds ratio. The Wikipedia article on “Odds Ratio” is okay, but not great. For an explanation of confidence intervals, see “Statistical Criteria in the Interpretation of Epidemiologic Data” and “Beyond the Confidence Interval.” So here are the results of my analysis:

Odds Ratio
(with 95% Confidence Interval)

Using blood mercury cut-off point of 17 nmol/L
(above 16 considered high mercury level)

But wait. I felt a sudden disturbance in the Force, as though thousands of biostatisticians are writhing in agony because I used only two categories and “didn’t take advantage of all of the data.” So let’s do a trend analysis, using the value 5 nmol/L as the reference category (where OR = 1.00):

Blood Mercury (nmol/L) | Odds Ratio
5.00 | 1.00
6 to 10 | 0.63
11 to 15 | 0.98
16 to 20 | 1.07
21 to 25 | 1.00

This isn’t a complete trend analysis, obviously. When I stop at 25 nmol/L, the chi-square for linear trend is 0.378 and the p-value is 0.54. One of the great things about entering data by hand and actually LOOKING at the data while you do it is that you can stop and notice certain things. Like, for example: in these data there’s no significant difference between the two distributions under 25 nmol/L. So any difference between the blood mercury distributions of the cases and controls is being “driven” by an excess of ASD cases with values above 25.

In order to do a proper chi-square analysis for trend, one really needs at least 5 individuals in each cell. So I had to group all the higher values together in one category at 26 nmol/L and greater:

Blood Mercury (nmol/L) | Odds Ratio
5.00 | 1.00
6 to 10 | 0.63
11 to 15 | 0.98
16 to 20 | 1.07
21 to 25 | 1.00
26 and greater | 3.00

Chi-square for linear trend = 5.897
p-value = 0.015
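
If you’d like to check the arithmetic, here is a small Python sketch. It does two things: it converts a chi-square statistic with 1 degree of freedom into a p-value, and it recomputes the odds ratio for the top category from counts given earlier in this post (14 cases and 10 controls at exactly 5.0; 21 cases and 5 controls above 25). The function and variable names are my own. It does not recompute the trend chi-square itself, because that needs the counts in every category.

```python
import math

def p_value_chi2_1df(chi2):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    # A 1-df chi-square is a squared standard normal, so
    # P(X > chi2) = P(|Z| > sqrt(chi2)) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))

# Chi-square values for linear trend as reported in the post.
p_truncated = p_value_chi2_1df(0.378)  # categories up to 25 nmol/L only
p_full = p_value_chi2_1df(5.897)       # including "26 and greater"

# Odds ratio for "26 and greater" against the reference category (5.00):
# (cases high / cases at 5) divided by (controls high / controls at 5).
or_top = (21 / 14) / (5 / 10)

print(f"p (stopping at 25): {p_truncated:.2f}")
print(f"p (all categories): {p_full:.3f}")
print(f"OR, 26+ vs 5.00:    {or_top:.2f}")
```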

So the linear trend is statistically significant, but it’s completely “driven” by the 21 ASD cases with blood mercury levels of greater than 25 nmol/L. At this point there’s a real temptation to analyze the data using a cut-off point of 25. This is post-hoc analysis based on what we’ve seen in the data, so it’s questionable, but I’ll go ahead with it anyway:

Odds Ratio
(with 95% Confidence Interval)

Post-hoc analysis
Using blood mercury cut-off point of 25 nmol/L
(above 25 considered high mercury level)

Odds Ratio = 3.4

95% Exact CI: 1.1 - 12.4
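
Here is a hedged Python sketch of this post-hoc odds ratio, using the counts stated earlier (21 of 81 cases and 5 of 54 controls above 25 nmol/L). One caveat: the confidence interval in the sketch is Woolf’s log-based approximation, which I’ve swapped in because it needs only the standard library. It is not the exact method behind the interval reported above, so the limits will not match it (the approximate interval should come out somewhat narrower). The function name is my own.

```python
import math

def odds_ratio_woolf(a, b, c, d, z=1.96):
    """Odds ratio with Woolf (log-based) approximate confidence limits.

    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Counts from the post: "high" means blood mercury above 25 nmol/L.
cases_high, cases_low = 21, 81 - 21
controls_high, controls_low = 5, 54 - 5

or_25, lo_25, hi_25 = odds_ratio_woolf(
    cases_high, cases_low, controls_high, controls_low
)
print(f"OR = {or_25:.2f}, approximate 95% CI: {lo_25:.2f} to {hi_25:.2f}")
```

With only 5 controls in the “high” cell, the approximation is on shaky ground, which is a good reason to prefer an exact interval for a table like this.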

Logistic regression analysis

Now that we have a much better picture of differences between the cases and controls, I think it’s okay to run a logistic regression analysis. These are the results:

Chi Square= 5.9955; df=1; p= 0.014

Odds Ratio = 1.04
95% Confidence Interval: 1.005 to 1.075

The odds ratio can be interpreted as follows: For every 1 nmol/L increase in blood mercury, the odds of being an ASD case rather than a control are multiplied by about 1.04, i.e., they increase by about 4%. Note that this effect size is “on average.” There’s obviously no way of knowing simply from this effect size estimate (OR = 1.04) that all of the differences between ASD cases and controls occur at greater than 25 nmol/L.
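
A per-unit odds ratio is easier to grasp when rescaled to a bigger step. The sketch below assumes the usual logistic model, in which the log odds are linear in blood mercury, so an odds ratio for a k-unit increase is the per-unit odds ratio raised to the power k. The 10 and 20 nmol/L step sizes are my own choice for illustration.

```python
import math

# Per-unit (1 nmol/L) odds ratio from the logistic regression in the post.
or_per_unit = 1.04

def rescale_odds_ratio(or_value, units):
    """Odds ratio for an increase of `units`, assuming linear log odds."""
    return math.exp(units * math.log(or_value))

or_per_10 = rescale_odds_ratio(or_per_unit, 10)
or_per_20 = rescale_odds_ratio(or_per_unit, 20)

print(f"OR per 10 nmol/L: {or_per_10:.2f}")
print(f"OR per 20 nmol/L: {or_per_20:.2f}")
```

The rescaled figures inherit the model’s assumption of a smooth, steady increase across the whole range, which is exactly what the category-by-category odds ratios call into question.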

Conclusions

1. I want to emphasize that this post is in no way meant as an ad hominem attack on Dr. DeSoto or Dr. Hitlan or the Editor of the Journal of Child Neurology. I ask commenters to refrain from such attacks in the discussion.

2. Indeed the main point of this post is that data analysts should “look before they jump.” Look at the data carefully using visual methods like the charts above, or carry out detailed cross-tabulations, before you jump in and start running logistic regressions, etc.

3. I’m not making any assumptions about what DeSoto & Hitlan did or did not do in exploratory or preliminary analyses. But all I have to work with is what’s in the published paper. The paper is four pages long, yet only one 8-line paragraph is devoted to the main result. On the other hand, three relatively long paragraphs are devoted to lecturing Ip and colleagues on why they (Ip et al.) should have used a one-tailed test.

4. This is a relatively small data set with weird and unstable distributions of blood mercury. Unfortunately, there are very few data sets with information on blood mercury that include both autism cases and a control group. Unfortunately, we therefore must consider it an “important data set.”

5. The analysis of Ip et al. (2004) and the analysis of DeSoto and Hitlan (2007) in which the mean blood mercury levels of ASD cases and controls were compared were statistically inappropriate. Any argument that the statistically significant p-value found by DeSoto & Hitlan just goes to show the “robustness” of the t-test is absurd.

6. DeSoto and Hitlan (2007) concluded that “a significant relation does exist between the blood levels of mercury and diagnosis of an autism spectrum disorder.” I disagree. In my opinion, this statement is too strong.

7. What is my conclusion about what this data set tells us about the association between blood mercury and autistic spectrum disorder? Not much. I don’t think it shows a significant relationship. On the other hand (and this is important), I don’t think that it shows that there is not a relationship either.

In my pre-planned dichotomous analysis above, I found an odds ratio of 1.86, with a lower 95% confidence limit of 0.86. An odds ratio of 1.86 is of moderate strength, but this is clearly not statistically significant. The trend analysis shows that odds ratios are stable (i.e., consistently close to 1.00) until we reach blood levels higher than 25 nmol/L, when the odds ratio is 3.00. In a post-hoc analysis using 25 nmol/L, I found an odds ratio of 3.4, with a 95% confidence interval of 1.1 to 12.4. You can see the logistic regression findings above, but my opinion is that these are the least important findings of the entire series of analyses. We did find a “statistically significant” odds ratio of 1.04 (95% CI: 1.005 to 1.075; p = 0.014), but this tells us much less than the graphical analysis and the trend analysis of odds ratios.

Given these results from a case-control study with such a small sample size, these are really of the “more research is needed” variety. Again, my opinion: I don’t think there’s a significant relationship. Nor do I think there’s definitively not a relationship.

8. DeSoto and Hitlan (2007) report an r of .20 and an r2 of .04. They then devote part of the last paragraph of their paper discussing why an “effect size” of .04 is important. This would have to be a subject for a whole other post, but like most epidemiologists (and sociologists and econometricians), I consider correlational statistics like r’s and R2’s essentially useless as measures of effect. Class: for tomorrow, read the classic paper, “The fallacy of employing standardized regression coefficients and correlations as measures of effect.” I’ll probably do a post on the subject anyway, but be ready for a pop quiz.

9. We can conclude absolutely nothing about the association of ethylmercury in vaccines to autism from these data.

10. As usual, your questions and comments are welcome. Agree, disagree, or whatever, but be civilized.

Important note and apologies to Drs. Desoto and Hitlan, Ken, efrique, and my readers: The original article that I posted on Wednesday, July 16th, has been revised on the afternoon of Saturday, July 19th. I was somewhat puzzled by Ken and efrique’s comments. Then I realized that I had not published the final version of my post on July 16th, but an earlier draft. In other words, I screwed up. That’s what happens when you blog at 4:00 in the morning. Thank you, Ken and efrique for your comments.

Essentially the changes are these: I have performed my own logistic regression analysis, but I have NOT changed any of my conclusions. There are also a few changes in the paragraph in which I describe DeSoto and Hitlan’s Results section.

I only understood half of that on first reading, but I get the reason for your conclusion about Ip and DeSoto/Hitlan being too keen to rush to make bold statements. Hopefully with a few more re-reads I’ll get it all!

I would however like to say what an utter pleasure it is to see someone doing what they do best.

Comments:
1. Logistic regression and F-test. How? There is no error variance in a logistic regression.
2. r^2 with logistic regression is generally regarded as a waste of time (it is also difficult to define), so even more useless than r^2 usually is.
3. Confounding. Did I miss something, because I didn’t see a mention? Anyway, mercury is related to seafood consumption. Now, does seafood consumption relate to socioeconomic or ethnic factors, and do these correlate with probability of autism diagnosis?
4. For logistic and any other regression the distribution of covariates doesn’t matter. What does matter is the scale or need to transform that is defined by the relationship of the response to the covariate. Y = Ax + b is making a statement about the relationship between y and x that is independent of the distribution of x. I can change the distribution of the covariates simply by changing the selection criteria, but this won’t change the relationships. In practice right-skewed data often needs a log transform, but that is only a coincidence.

For visualization, I suggest you make a cumulative distribution plot showing how many autistic and control children have a specified mercury level or higher. This would better show how the bump for the high mercury levels is only a fraction of the total autism cases.

Interesting. Just speculating here, but the difference in the distributions could easily be indicative of a dietary confound. A good fraction of autistic children (the 21 there) might consume more fish than normal. It’s not hard to imagine that’s the case in Asia.

I understand there are studies underway where, after controlling for diet, no difference is found. The fact that DeSoto & Hitlan - and Ip et al. for that matter - don’t control for diet, limits the value of their data.

In answer to your questions:
1. Correct me if I’m wrong about this, but I think we have to assume that DeSoto & Hitlan used a logistic regression program that had an (unweighted) least squares option. The F test, etc. would be similar to what you would have gotten in the (very) old days before “logistic regression” packages even existed and we used “discriminant analysis” when we had a “continuous” independent variable and a dichotomous outcome. In other words, the calculations were not maximum likelihood. As I say in my post, a problem here is that they don’t even come close to saying precisely what they did or what program they used.

2. I agree with you about the r-squared most emphatically. Should be avoided especially when one (or both) variables is dichotomous (two categories).

3. Confounding: Since several commenters have asked about this, see my separate comment on this.

4. Yes and no. Yes, technically you’re correct. For example, one really CAN do a robot-like analysis like DeSoto & Hitlan, not describe the intricacies of the data at all, dump the data into a logistic regression program, and the p = 0.017 is technically correct. No, because there’s the danger of missing a lot, which DeSoto and Hitlan did. For example, I’ve been the journal editor on more than one submitted manuscript in which a significant logistic regression result (from a multi-category exposure variable) was completely due to a handful of extreme outliers. (Obviously, the authors had provided tables where I could see this.) DeSoto & Hitlan come close to this (but we didn’t know it until somebody else showed the raw data).
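
On point 1 above (the least-squares question), here is a rough consistency check in Python. It assumes, and this is only an assumption, that the published F(1,133) came from an ordinary least-squares fit with a single predictor, in which case R-squared equals F / (F + error df). The numbers are the ones printed in DeSoto & Hitlan’s results paragraph; the variable names are my own.

```python
import math

# Values as printed in the DeSoto & Hitlan results paragraph.
f_stat = 5.76
df_error = 133

# For a least-squares fit with one predictor:
#   F = R^2 / ((1 - R^2) / df_error)  =>  R^2 = F / (F + df_error)
r_squared = f_stat / (f_stat + df_error)
r = math.sqrt(r_squared)

# With one predictor, F(1, df) is the square of a t statistic on df
# degrees of freedom.
t_equivalent = math.sqrt(f_stat)

# Sample size implied by the error degrees of freedom (n - 2).
n_children = df_error + 2

print(f"implied r^2 = {r_squared:.3f}, r = {r:.3f}")
print(f"equivalent t = {t_equivalent:.2f} on {df_error} df, n = {n_children}")
```

The implied values round to the published r = .20 and r2 = .04, and the implied sample size equals 81 cases plus 54 controls. That is consistent with a least-squares calculation, but it doesn’t prove which program or option was actually used.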

You are absolutely right that cumulative distribution plots would have shown a lot! I should have done that. In fact, I’ll try to post a chart with cumulative distribution plots in this comment section today or tomorrow. Thanks.

AGE & GENDER: It seems that the only data collected besides mercury levels was gender and age. The gender distributions of the ASD cases and controls are very similar: Excluding one outlier from each group, the 81 ASD cases had 73 boys and 8 girls; the 54 controls had 45 boys and 9 girls. The age distribution was similar, too. Mean age was 7.16 in the ASD cases and 7.79 in the controls. Age actually did act as a very minor confounder — the odds ratios I report above are reduced a very tiny bit when I adjusted for age, but it wouldn’t be worth reporting in a journal article. Even though Ip et al. don’t say anything about it, it occurs to me that they must have made some attempt to partially match the two groups on gender and age. Would you expect “normal children who had mild viral illness and were admitted to the pediatric ward” of a major teaching hospital to be about 85% boys?

DIET (SEAFOOD): This is the reason I put in the quote from Ip et al.: “This study is limited by the sample size and culture because Hong Kong Chinese are famous for eating seafood.” I agree with all of you that this should be taken into account in any discussion of these data. As a scientist, it upsets me that the DeSoto & Hitlan paper made it almost straight from the pages of the journal in November 2007 to the mouth of Congressman Dan Burton in December as firm evidence that ethylmercury in vaccines is causing autism in “our” kids.

Did Catherine DeSoto make any attempt to contact Congressman Burton or his staff to tell them that these were data from Hong Kong children and correct any of the misinformation? I’ve dealt with Congressional staff during my career; this can be done.

What about some boxplots? I am really not an expert on statistics, but I am quite used to looking at them and made some for my own data a while ago. From what I see there shouldn’t be a big difference, and for such a small dataset I would certainly dismiss any conclusion.

Yes, box plots (also sometimes called “box-and-whisker plots”) are a great way to do Exploratory Data Analysis, especially of relatively small data sets like the Ip et al. data. If you go to the Wikipedia article on “Box Plot” (http://en.wikipedia.org/wiki/Box_plot), the References are great and the External Links are absolutely superb. Three of the external links are about how to use Microsoft Excel to create Box Plots.

Thanks, Daniel. If I have the time I may create some Box Plots here in the next couple of days and show them here.

Had they established statistical significance, wouldn’t Ip et al. and/or DeSoto simply have a strong case that children with the flu do not eat much seafood in the 48 hours leading up to a hospital visit (as reflected in slightly lower-than-average Hg blood levels)?

I laughed when I first read your comment. But, you know, this could be a real possibility. As in: upset stomach -> don’t eat; or vomiting -> nothing gets digested. Does anyone out there know enough about the digestion and metabolism of mercury from food to be able to say whether this could be a partial explanation for these results? I realize it sounds a bit ridiculous, but…

The protein that has the highest known binding coefficient for mercury is metallothionein. That is normally used to sequester zinc and copper. Metals are coordinated to multiple thiol groups, so it is unlikely there is anything else with a higher binding coefficient. This is the likely form of mercury when it is in tissues such as liver, kidney and perhaps brain. Methyl mercury doesn’t seem to bind to metallothionein, but methyl mercury does cause expression of metallothionein.

Most circulating mercury in the body is present as a glutathione-mercury complex (methyl mercury does form a GSH complex). The liver excretes glutathione (and other thiols) in the bile, and the organic anion transporter that exports organic anions into the bile is powered by a glutathione gradient. Excretion in the bile is a generic excretion pathway for many things that are coordinated to glutathione for that very purpose. Inside cells the GSH level is in the mM/L range. Plasma GSH is normally in the micromolar range.

Once in the gut many of those coordination complexes are broken and metabolized by gut bacteria. That is likely the mechanism by which mercury becomes de-complexed and bound to something that is excreted in the stool (the major mercury excretion pathway).

How metals bind to metallothionein is affected by redox, ATP, GSH and NO. NO is necessary for release of Cu and Zn from metallothionein. An acute infection is a high NO state (due to NO from iNOS). An immune system activation state is a state of oxidative stress. Metallothionein is upregulated under conditions of oxidative stress. Mercury and other heavy metals tend to cause oxidative stress. What that would do to mercury physiology is not well described. I think that a couple of days of being sick wouldn’t change the mercury level by that much. Mercury is used in some Chinese medicines. That might be another confounding factor.

Methyl mercury and elemental mercury tend to pass through the placenta, inorganic mercury tends not to. Typically cord blood has higher levels of mercury than does maternal blood, suggesting that the more reducing conditions in utero increase the partitioning of mercury from the mother to the fetus.

In the Faroe Islands studies of mercury, Grandjean P et al followed a ~1000 child consecutive birth cohort (1986-1987) with 996 cord blood mercury levels. The interquartile cord blood mercury levels were ~65 nM/L and 200 nM/L. That age cohort was tested for autism, and the 1404 children in that age group had 5 cases of ASDs, two of autism and 3 of Asperger’s.

The levels in the Ip et al study and the DeSoto reanalysis are so much smaller than what Grandjean found with no apparent excess of autism, it seems completely preposterous (to me) that levels in the 20 nM/L range could be important.

Upon rereading I find that you did mention the possibility of least squares for the F-test although I wonder why it is in any program. GLIM is over 30 years old and did logistic regression properly so maximum likelihood isn’t some new discovery. Usual suspicion when someone uses an inferior method is that it gave a better p-value.

There is a good explanation for the distribution. It looks like 5 was the lower limit of their assay, which explains the excess at 5. It also has a strong appearance of a mixture. A 2-sample Wilcoxon seems as good a way as any of testing the difference between cases and controls, although some of the assumptions aren’t met.

You are correct. If your main interest is in statistical significance, a 2-sample Wilcoxon Rank Sum Test (AKA the Mann-Whitney Rank Sum Test) would be the correct test for analyzing these data. Excluding the two high outliers (a value of 97 nmol/L in the ASD group and a value of 74 in the controls), and assuming I didn’t make any errors in my formulas in Excel, I get the following results from a Mann-Whitney test:

Mann-Whitney’s Statistic = 1730.0

Z statistic = 2.06

2 tailed p = 0.0395

Median difference = 3.00
95% Confidence Interval: 0.00 to 7.00

Note, however, that just doing this Mann-Whitney test doesn’t give effect measures like the odds ratios in the trend analysis. And obviously you don’t get the feeling for the data that you get from the above graphical presentations.
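For readers who want to check a Mann-Whitney result without a spreadsheet, here is a minimal sketch in plain Python. It is not the Excel workbook used above. It assumes midranks for ties and a tie-corrected normal approximation with no continuity correction, so its Z and p may differ slightly from other software. The function name `mann_whitney` and the example values are hypothetical and are not the Ip et al. data.

```python
import math

def mann_whitney(x, y):
    """Return (U for sample x, z, two-sided p).

    Uses midranks for tied values and a tie-corrected normal
    approximation without continuity correction.
    """
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j; give each the average rank.
        midrank = (i + 1 + j) / 2.0
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_x - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return u_x, z, p

# Hypothetical values for illustration only (several ties at 5,
# mimicking an assay floor); NOT the study data.
cases = [5, 5, 12, 20, 31]
controls = [5, 8, 9, 15]
u, z, p = mann_whitney(cases, controls)
print(u, z, p)
```

With samples this small an exact test would be preferable to the normal approximation; the sketch is only meant to show where U, Z and the two-tailed p come from.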

D: If you read this, could you check my calculations with the two outliers excluded? Thanks.

For visualization, I suggest you make a cumulative distribution plot showing how many autistic and control children have a specified mercury level or higher. This would better show how the bump for the high mercury levels is only a fraction of the total autism cases.

Bsci, you are absolutely correct.

This chart needs to be read from RIGHT TO LEFT. As you move your eyes from right to left, you’re seeing the separate percentages of the ASD cases (red) and the controls (yellow) cumulatively added up until they reach 100% at the blood mercury level of 5.00 nmol/L.
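The numbers behind a chart like this are easy to compute. The sketch below is a minimal illustration, not the code used to draw the chart. For each threshold it returns the percentage of a group at or above that blood mercury level. The function name `percent_at_or_above` and the example values are hypothetical.

```python
def percent_at_or_above(levels, thresholds):
    """For each threshold, return the percentage of values >= threshold."""
    n = len(levels)
    return [100.0 * sum(1 for v in levels if v >= t) / n for t in thresholds]

# Hypothetical blood mercury values in nmol/L; NOT the study data.
example_group = [5, 5, 10, 20]
thresholds = [5, 10, 20, 25]
for t, pct in zip(thresholds, percent_at_or_above(example_group, thresholds)):
    print(t, pct)
```

Running the function once for the cases and once for the controls, over the same thresholds, gives the two curves to plot against each other.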

One does not have to do an extensive evaluation of the DeSoto and Hitlan analysis to show that they did everything possible to show correlation between mercury and autism. The first problem is that they used statistics that require near-Gaussian distributions of data, but it is obvious that the data is non-Gaussian. I would wager my pension that parametric analysis of log-transformed data or non-parametric analysis of the raw data would show no correlation between mercury and autism.

The journal editor should have given the raw data to unbiased biostatisticians, not to this pair.

The gender distributions of the ASD cases and controls are very similar:

I had noticed previously that the blood mercury difference of autistic vs. non-autistic females in this data set was greater than that of males. I’m not sure if there’s statistical significance on that.

@efrique: I’ve probably done logistic regression hundreds of times in my career and published numerous papers using logistic regression analyses. Sorry, but because part of the purpose of this blog is for teaching, and your comment will just confuse readers, I’ve blocked it.

@efrique: See my “Important Note and Apology” at the bottom of my post above. Technically, you’re correct, and you made me realize that I’d posted an earlier draft of my article without my own logistic regression results. I say technically you’re correct that logistic regression “has no distributional requirement.” In practice, for this particular data set, my opinion is that the simple logistic regression results (which DeSoto and Hitlan report without even a regression coefficient or odds ratio) are misleading.

But again — thank you very much for your comment — it woke me up to my error.

Thank you also to Ken; I should have realized my mistake when he made similar comments.

efrique: Here’s a question that still has Ken and me partially stumped: Where does the F-test come from in DeSoto & Hitlan’s logistic regression results, which I quote in my article? They must have done some sort of least squares comparison of means to get an F-test.

Even in the case of ordinary regression, the X-variables don’t have a specific distributional assumption; the model is conditioned on them. Only the Y-variable does, and then only its conditional distribution. The fact remains that, given the hypothesis that mercury causes autism, the response is not mercury level but whether or not you were diagnosed with autism, and so the distribution of mercury levels is not an issue unless you are trying to do some kind of inverse regression.

[Actually, I have some further issues with some of your criticisms of their paper, but I think I should stick to the main points.]

I’m not certain what the source of the “F-test” figure was, but I do see one possibility. Or rather, I see two different possibilities, but they’re just the same thing looked at two different ways; I will discuss one of them.

Be aware that this is mere conjecture - you’d have to ask the authors to be sure.

In GLMs, some people call the ratio of a coefficient to its standard error a “t-statistic”. In ordinary regression, the square of this t-statistic has an F-distribution.

Note that in GLMs this “t-statistic” doesn’t actually have a t-distribution (it’s still “asymptotically normal from the CLT + Slutsky’s theorem”). With large d.f. it doesn’t matter though, since we’re just using the normal tables anyway, so you end up in the right place whatever you call it, and calling it a t-statistic has the advantage of helping us understand what thing we’re looking at. With large d.f., if you happen to look up t-tables it won’t impact your conclusions, so it’s more a matter of poor terminology than bad practice.

[Some people argue that the numerator is asymptotically normal and the denominator is asymptotically the square root of a chi-squared on its df and “hence it’s asymptotically t”. This argument has two different flaws. It may or may not be asymptotically t but that argument is not sufficient to establish that it is.]

Anyway, if you square that standardized coefficient, you might call it an F-statistic by analogy. Both the squared statistic and the F-table asymptotically approach chi-square (in the same way that the original statistic and the t-table both approach normality), so again, the conclusions should be correct.

If that’s what they did, their terminology may be a little sloppy but the correct impression is generated; if it was done with the aim of presenting information more familiar to users of ordinary regression, I wouldn’t have a big problem with it.
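The conjecture above rests on one identity: squaring a standard normal statistic gives a chi-square with 1 degree of freedom, so the two-sided normal p-value and the chi-square p-value of the squared statistic coincide. Here is a minimal numerical check in plain Python. The function names and the coefficient and standard error are hypothetical, chosen only to illustrate the point, and nothing here is taken from DeSoto & Hitlan’s output.

```python
import math

def two_sided_normal_p(z):
    """Two-sided tail area of a standard normal beyond |z|."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def chi2_1df_p(x):
    """Upper tail area of a chi-square with 1 degree of freedom beyond x."""
    return math.erfc(math.sqrt(x / 2.0))

# Hypothetical coefficient and standard error, for illustration only.
coef, se = 0.049, 0.025
z = coef / se              # the ratio some packages label "t"
squared = z * z            # what might be labelled "F" by analogy
print(two_sided_normal_p(z), chi2_1df_p(squared))
```

The two printed values agree up to floating-point rounding, which is why the label attached to the statistic does not change the conclusion when the degrees of freedom are large.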

Epiwonk: My apologies. The PLUS character is not showing up when my replies go up with the “Your comment is awaiting moderation” message for some reason. I don’t know if it’s a problem with just the way it’s displaying at that time or if it’s being lost altogether. The second message should have said:

That should say “asymptotically normal from the CLT PLUS Slutsky’s theorem” in the 6th paragraph. I don’t know why the “PLUS” went away (it’s in the original copy of the reply that I wrote in notepad and pasted into the window).

(but with an actual plus symbol where I just typed PLUS)

If you wish, I’m happy if you just make sure the PLUS is present in the original message and delete these two followups.

I now have the data and the original paper, and it gets worse. Hopefully I haven’t made any errors but it is the weekend.

1. The original problem DeSoto had with the data was that, using the standard deviations in Ip, the t-test should have been significant. The problem is that the blood Hg std values in Table 1 are wrong: they have lost the first digit and should be 15.65 and 12.49. On the same topic, the std values for age are obviously ridiculous and should be 2.8 and 3.5 instead of 0.2 and 0.4. The text claims a range of 4-11 years, which is clearly incompatible with the published std. In fact the age range is 3-16. I didn’t check the rest of the table.

2. For the t-test I get a p-value of 0.057 compared to Ip 0.15. Taking logs (which seems correct based on residuals) reduces this to 0.047. Removing the two outliers gives a p-value of 0.018 for unlogged data.

3. Using logistic I get 0.06 for regression on blood Hg, 0.02 with the outliers removed (close enough to DeSoto) and 0.05 with log blood Hg. Removing the outliers doesn’t seem very conservative as DeSoto claims and this should have been reported. After log transformation they don’t look like outliers any more.

4. Ip based the decision not to include age and gender in the model on the fact that they weren’t significantly different between the groups. Not a valid reason, but they don’t make much difference anyway. Logistic regression with log blood Hg, age and sex has a p value of 0.03 for log blood Hg, but age and sex are not significant.

5. Still leaves the problem that blood Hg is obviously censored at 5 and the analysis should take this into account, although it probably won’t make much difference. A topic for Advanced Data Analysis.

6. Ignoring the censoring a statistician could sensibly fit a logistic with log blood Hg, age and sex and the resulting p value for log blood Hg would be 0.03. Significant but not hugely. So while the statistical analysis isn’t optimal it doesn’t really change the conclusions. Still leaves the problem of the bias which is probably much more important. In addition to those already mentioned is the assumption that current blood mercury is an indication of blood mercury prior to development of autism.
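To make the kind of model in points 3 to 6 concrete, here is a minimal sketch of a maximum likelihood logistic regression with a single predictor, fitted by Newton-Raphson in plain Python. It is not the code behind the p-values quoted above, it leaves out age and sex, and it ignores the censoring at 5. The function name `fit_logistic` and the example data are hypothetical; a real analysis would use a statistics package.

```python
import math

def fit_logistic(x, y, iterations=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson.

    Returns (b0, b1, se_b1). The standard error comes from the
    information matrix evaluated in the final iteration.
    """
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # score for the intercept
            g1 += (yi - p) * xi     # score for the slope
            h00 += w                # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix)^-1 times the score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    se_b1 = math.sqrt(h00 / det)
    return b0, b1, se_b1

# Hypothetical data, NOT the study data: x could be log blood Hg,
# y is 1 for a case and 0 for a control.
x = [1.6, 1.6, 2.1, 2.5, 2.9, 3.0, 3.3, 3.4, 3.8, 4.1]
y = [0, 0, 1, 0, 0, 1, 0, 1, 1, 1]
b0, b1, se = fit_logistic(x, y)
z = b1 / se
p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided Wald p-value
print(b1, se, math.exp(b1), p)          # exp(b1) is the odds ratio per unit of x
```

Reporting the coefficient, its standard error and the odds ratio alongside the p-value is the extra information a bare significance test leaves out.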
