Attempts to Get Study 329 Retracted

Beginning soon after its 2001 publication, a number of journalists and researchers spotted the anomalies in Study 329’s data classification and interpretation and raised concerns with the authors, their institutions, and the JAACAP. Despite this, the 329 trial continued to be presented as a “landmark” study demonstrating the drug’s efficacy and safety.

The 2010 article Rules of Retraction notes that retractions of research studies are rare when compared to the number of articles published each year: “In 1990 five out of 690 000 journal articles produced were retracted, compared with 95 retractions out of 1.4 million papers published in 2008.” It appears that retraction is reserved for clear cases of scientific fraud, and errors so serious that they undermine the entire premise of the research.
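To put the quoted figures on a common scale, the short sketch below converts them to retractions per 100,000 published articles. It uses only the four numbers quoted above; the function name per_100k is a hypothetical helper introduced here for illustration, not something taken from the BMJ article.

```python
# Illustrative arithmetic only: the counts are the ones quoted from
# "Rules of Retraction"; per_100k is a hypothetical helper name.

def per_100k(retracted: int, published: int) -> float:
    """Return the number of retractions per 100,000 published articles."""
    return retracted / published * 100_000

rate_1990 = per_100k(5, 690_000)       # 5 retractions out of 690 000 articles
rate_2008 = per_100k(95, 1_400_000)    # 95 retractions out of 1.4 million papers
fold_increase = rate_2008 / rate_1990  # how many times higher the 2008 rate is

print(f"1990: {rate_1990:.2f} retractions per 100,000 articles")
print(f"2008: {rate_2008:.2f} retractions per 100,000 articles")
print(f"Increase: about {fold_increase:.1f}-fold")
```

On these figures the rate rose roughly ninefold between 1990 and 2008, yet in both years it stayed below 7 retractions per 100,000 articles, which is the sense in which retraction remains rare.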

Many would argue that Study 329 would have qualified for retraction based on these criteria. Yet, despite many requests from concerned scientists and practitioners, the original Study 329 has never been retracted.

RETRACTION ATTEMPTS TIMELINE:

“In view of practitioners’ concerns regarding safety and, even more so, parents’ concerns about side effects and adverse effects of medications, I would like to have seen a more detailed description of the adverse effects found in the 11 patients receiving paroxetine who withdrew prematurely. In particular, I am interested in the five patients who withdrew because of “emotional lability (e.g., suicidal ideation/gestures).”

As I previously reported in a Letter to the Editor (Weintrob, 2001) and as has been confirmed anecdotally by some psychopharmacologists, there have been instances of adolescents on selective serotonin reuptake inhibitors who have cut themselves. (Whether this is related to induction of a manic state is unclear.) This effect appears to have been causal. A more detailed description would thus be appreciated, particularly whether the “suicidal gestures” included self-mutilation.” Page 363

“I was concerned… by the report that 11 patients in the paroxetine group suffered serious adverse effects. This was in comparison with five in the imipramine group and two in the placebo group. This finding would appear to be statistically significant, though this was not specifically addressed in the study.

I took particular note of the statement that “Of the 11 patients, only headache…was considered by the treating investigator to be related to paroxetine treatment.” I would like to know on what basis the investigator dismissed the possibility that emotional lability, worsening depression, suicidal ideation or gestures, conduct problems, or behavioral disturbance could be due to the paroxetine.

In the past decade I have treated hundreds of adolescent patients with selective serotonin reuptake inhibitors (SSRIs), and in my view all of these mentioned adverse effects have been temporally associated with the prescription of SSRIs…

I certainly believe that paroxetine and the other SSRIs are useful medications, and as stated I am pleased to have a reasonably encouraging study that supports their use. I would value future studies, however, that look specifically at the issue of behavioral or cognitive side effects. Reports of these side effects have circulated since the advent of SSRIs and continue to be controversial. I also suggest that the reviewers of this article should have questioned more closely the dismissal of these symptoms as being unrelated to medication. This is particularly true in light of the fact that this study was funded by Glaxo-Smith-Kline, the makers of Paxil.” Page 363

In answer to Dr Parsons’ question about why only one side effect, headache, was attributed to paroxetine while none of the negative mood and behavioural effects were, the Keller team responded:

“The psychiatric symptoms were chronologically related to a variety of situational factors,” such as arguments with boyfriend and parents, torment by peers, medication non-compliance, and/or untreated comorbid disorders.

The only comment that the Keller team had on the subject of safety related to cardiovascular events with tricyclics:

“We agree that further studies are needed to understand more completely the role of antidepressants, including the selective serotonin reuptake inhibitors, in the treatment of adolescents with major depression.”

Keller et al simply did not address Dr Weintrob’s issue about self harm, and dismissed without any clear reason the possibility that suicidality and other aberrations could be attributable to paroxetine. Page 364

“… two primary outcome measures as a HAM-D score of ≤8, instead of the commonly used ≤7. Moreover, the most widely accepted criterion of a 50% reduction in baseline HAM-D score was not reported separately, but collapsed with the HAM-D score of ≤8 (p = .11).”

“While neither paroxetine nor imipramine differed significantly from placebo on either self-rating scales (parent and patient) or nonsymptom measures (functioning, health, and behavior), this negative finding is not detailed in the Results section and the clinical relevance of rating score reductions is not discussed…

Although serious adverse effects occurred with paroxetine (n = 11) more often than with imipramine (n = 5) and placebo (n = 2), only one case of severe headache was considered to be related to paroxetine. However, a potential selective serotonin reuptake inhibitor (SSRI)-induced mood disorder (King et al., 1991) is of concern in those 8 cases (4 requiring hospitalization) with “emotional lability” (n = 5), “conduct problems or hostility” (n = 2), and “euphoria/expansive mood” (n = 1), particularly if subjects did not have comorbid externalizing conditions before paroxetine treatment.”

The lead author and the GSK Executive most involved, who is also an author, respond without actually addressing the issues raised:

“The criteria for therapeutic response were defined in the report as a final HAM-D score that was 8 or less or a reduction from baseline of 50% or more. Dual criteria were selected for this study because the scores at entry could range from a minimum of 12 (set by protocol) to a maximum of 53 (highest scores for the 17-item HAM-D). Limiting response to either a 50% reduction or a specified cut point would impede patients at the lower end of the ranges from meeting the criterion… The potential for SSRIs to induce mood disorders is not clear. In the present study, there was a history of behavior problems in several of the subjects that recurred during treatment.”

“We read the report by Keller et al. (2001) with great interest. We wish, however, to raise some concerns. Although randomized controlled trials are the gold standard for proving efficacy of treatments, studies such as this have some limitations and may have hidden biases. The paper would have been more useful to clinicians seeking to apply the results in their practice if the CONSORT diagram (Begg et al., 1996) had been used to report the numbers of potential subjects at each stage of recruitment, treatment, and evaluation. Ability to generalize to clinical populations would be enhanced by a broader range of severity, less restrictive inclusion criteria, and fewer exclusion criteria. More details on the method of randomization to paroxetine, imipramine, and placebo would be appreciated. Are data available on whether the evaluators who completed the HAMD remained blind to the subject’s treatment? The side effects of imipramine, especially at higher doses, may threaten the blindness and introduce a source of bias in clinician ratings.”

Child psychiatrist Jon Jureidini, M.D., and Prof Anne Tonkin of the University of Adelaide, alarmed by what they see as potentially serious issues with Study 329, have written a letter to the Editor of JAACAP. They suggest that it might be a good idea to sort out their differences immediately, and “off-line”:

“We write to request that you reconsider your decision not to give us access to Keller et al’s reply to our letter prior to its publication in the middle of 2003. Our letter raises significant concerns and impact on the standing of the journal and profession that we believe cannot wait 6 months for our further attention.

We would not wish to pursue these concerns if Keller et al have successfully refuted our criticisms. We think it is important and urgent to know whether we are correct in raising concerns about journal standards. We therefore request that you forward us a copy of Keller et al’s response as soon as possible.”

“It is not the policy of the Journal to share responses to letters before publication. We do not have a backlog, and your letter and the Keller response will be published as soon as our production process allows. This is not a debate. We are under no obligation to publish whatever rejoinders you wish to make (or your original letter, either, which was quite rude and accusatory). Your approach to this process leads me to believe that whatever Dr. Keller says, you will disagree. Our readers will see both letters and make their own judgments. Frankly, your haste seems odd, since the article was originally published in July of 2001. I find your adversarial tone and urgency tedious.

You have not been appointed as the guardian of the Journal, or of the profession of child and adolescent psychiatry. Many highly expert child and adolescent psychiatrists were participants in that study, and others that were similar, and others equally contributed to the review process. In addition, the prescription of SSRIs for youth, not only in this country but around the world, far predated any research data on their effects with youth, so this article could hardly be blamed or praised for that trend.

I noticed that you are members of Healthy Skepticism. Is this organization backed by anyone we should know about, for potential conflict of interest?”

Letter: The Jon Jureidini and Anne Tonkin letter to the Editor of JAACAP is published:

To the Editor:

The article by Keller et al. (2001) is one of only two to date to show a positive response to selective serotonin reuptake inhibitors (SSRIs) in child or adolescent depression. We believe that the Keller et al. study shows evidence of distorted and unbalanced reporting that seems to have evaded the scrutiny of your editorial process. The study authors designated two primary outcome measures: change from baseline in the Hamilton Rating Scale for Depression (HAM-D) and response (set as fall in HAM-D below 8 or by 50%). On neither of these measures did paroxetine differ significantly from placebo. Table 2 of the Keller article demonstrates that all three groups had similar changes in HAM-D total score and that the clinical significance of any differences between them would be questionable. Nowhere is this acknowledged. Instead:

1. The definition of response is changed…

2. In reporting efficacy results, only “response” is indicated as a primary outcome measure, and it could be misunderstood that response was the primary outcome measure…

Given that the research was paid for by GlaxoSmithKline, the makers of paroxetine, it is tempting to explain the mode of reporting as an attempt to show the drug in the most favorable light.

Given the frequency with which it is cited,… this article may have contributed to the increased prescribing of SSRI medication to children and adolescents. We believe it is a matter of importance to public health that you acknowledge the failings of this article, so that its findings can be more realistically appraised in decision-making about the use of SSRIs in children.

Without fully addressing the points raised by Jureidini and Tonkin, Keller et al. imply in their reply that turning to secondary efficacy measures is a technicality, saying:

“as scientists and clinicians we must adjudge whether or not the study overall found evidence of efficacy”, the implication being that their study did show this. Their closing paragraph implies that Jureidini and Tonkin have made an unfounded personal attack:

“Drs. Jureidini and Tonkin argue that the reviewers failed to understand and appropriately critique the article (and by extension that the editor was not up to the task) and that the authors of the original article swerved from their moral and scientific duty under the influence of the pharmaceutical industry. By extension, of course, they covertly argue that the reader who agrees with them is intellectually and morally superior while a reader who does not agree with their position shares the cognitive and/or moral failing of the rest of us. We say that this article and body of scientific work is a matter for thoughtful and collegial discussion and say, in addition, that their emperor has no clothes.”

In late 2006, Shelley Jofre of the BBC’s Panorama programme is preparing Secrets of the Drug Trials, the last of a 4-part series on problems with paroxetine (known as Seroxat in the U.K.).

She conducts an in-depth interview with Mina Dulcan, Editor in Chief of JAACAP. The entire interview is 17 pages, including the following exchanges:

SJ: You don’t think the actual study as it was published overstated the effectiveness and underplayed the side-effects?

MD: Well, all of that is a matter of opinion to some extent how much is over and how much is under…

SJ: But, what’s your opinion?

MD: I mean it certainly listed the side-effects.

SJ: It didn’t list them very clearly did it?

MD: It depends on what you mean by clearly…

SJ: …It took the Medicines Regulator in this country about 3 years to work out exactly what was contained within the data. When they did their own line by line analysis of the data they discovered that the adolescents on Seroxat were 6 times more likely to suffer a suicide-related event than the kids on Placebo. That wasn’t in the published study.

MD: That was a whole different kind of analysis…

xxxxx

SJ: I have got the peer reviewers comments on study 329…

MD: From the journal?

SJ: It says the relatively high rate of serious adverse effects of the drug was not addressed in the discussion. Given the high Placebo response rate are these drugs an acceptable first line therapy for depressed teenagers. The results do not clearly indicate efficacy for the drug. I mean, these are pretty damning comments aren’t they?

MD: First of all I don’t know how you would have gotten that and second we often have several series of reviews and on virtually any paper if you read the reviews that came in on the first version they might have very little to do with the actual published version so I really can’t comment on that.

SJ: Don’t you think they sound pretty damning?

MD: I am not going to comment on how they sound because they could easily be out of context.

xxxxx

SJ: Surely the whole point of randomised control trials is to try and work out quite clearly what the drug is doing and what the drug is not doing.

MD: That is the concept, but it’s not that simple.

SJ: Not trying to complicate things, but what is quite clear is that the kids on the drug were having more psychiatric side-effects than the kids who weren’t taking the drug.

MD: That was reported.

SJ: It wasn’t clearly reported and it wasn’t accurately reported and the conclusion was that this was a drug that was generally well tolerated. It looks like 10% of the children who took Seroxat self-harmed – started to feel suicidal…

MD: I think unless you understand the clinical condition sometimes as people are getting better they appear to be suffering more. That’s how the phenomenon works.

SJ: That’s an argument that certainly the drug companies have put forward for a very long time.

xxxxx

SJ: Are you aware that 329 was ghost written?

MD: I have no way of knowing that. It doesn’t surprise me to know it happens…

SJ: Does it worry you, do you think it matters?

MD: Well, certainly if I were an author I would not put my name on anything that I didn’t feel was accurate. I can’t speak to what those authors, to the extent, how much they saw the data [sic].

SJ: But, ultimately the person who is listed as the principal investigator really ought to have seen all the data don’t you agree?

MD: Well, again, science is complicated…

xxxxx

SJ: Well, now that you know that there were more serious psychiatric effects for the children who were taking Seroxat compared to Placebo it means that the study published wasn’t accurate. Have you got any plans to publish a correction or even pull it, because it’s in your archives?

MD: …we certainly have no plans to either pull it, you can’t actually pull it. You could issue a retraction…

SJ: Why not issue a retraction, because it’s not accurate? What is reported in that study is not accurate.

MD: I think if we found something that was fraudulent, that data were invented for example, that would be something. This is a difference in interpretation…

According to the article Rules of Retraction, published in BMJ in 2010:

“Both academics [Jureidini and McHenry] called for the article’s retraction in December 2009, arguing that the conflation of primary and secondary outcomes represented falsification of data and accusing GSK of intending to deceive by concealing negative data.”

Andres Martin, MD, is now JAACAP Editor in Chief. He took over the role from Mina Dulcan in 2007.

The article Rules of Retraction describes the unsuccessful efforts of Jon Jureidini and Leemon McHenry to get Study 329 retracted, in the context of accepted practice.

“The efficacy claim was based on just 15% of the trial’s outcomes, they argue. The academics’ stance is supported by internal GSK documents released during personal injury lawsuits against the company. The documents show that company employees and public relations advisers also saw the trial data as having failed to prove that the drug worked in adolescents.” Page 1246

The paper notes that:

“The International Committee of Medical Journal Editors (ICMJE) advises retraction in cases of scientific fraud or where an error is “so serious as to vitiate the entire body of work,” implying that this approach should not be used in cases of debate as to whether data have been interpreted correctly.” Page 1247

It appears that the experts in ethics bent over backward to assume good faith, and to err on the side of not retracting without overwhelming evidence of deliberate fraud. Despite the proof that GSK knew that paroxetine was not effective for adolescents, everybody decided to view Study 329 as at worst a case of undue optimism resulting in exaggeration. These experts are so focused on scientific process that those ultimately affected, the young people and their families, are not even mentioned.

“Liz Wager, chair of COPE [the Committee on Publication Ethics], declines to comment on the paper but warns that cases should be judged on the transparency standards of the day. “Things have changed in the last few years.”

The US requirement, in place since 2008, for all trials to be registered, including their pre-specified outcome measures, will make cherry picking harder, she says.”

“For the editor who is trying to decide whether, in hindsight, acceptable highlighting of positive results tipped over into unacceptable misrepresentation, there is no authoritative guidance at hand.” Pages 1247 – 1248

Drs Jureidini and McHenry ask Dr Keller to request that JAACAP retract the 329 article, in the interests of scientific integrity. They note that the conclusions have misled clinicians into believing that Paxil (paroxetine) is safe and effective for adolescents when it is not.

Dr Nardo, a psychiatrist retired from Emory, reviews the reasons given for Dr Martin’s initial decision not to retract Study 329 and rebuts each one. He points to the article written by students at Brown, and their concern with the integrity of their school and their education.

Dr Wing replies to the Oct 4 Jureidini and McHenry request that Brown’s President ask for retraction as follows: “With regard to your request for memos associated with an internal review, any reviews the University chooses to conduct in response to substantive concerns are undertaken on a confidential basis. Memoranda, letters, messages, policy reviews, or other internal documents associated with a review are not available to the public. I would caution you not to confuse the University’s policy of confidentiality with inactivity.”

Jureidini and McHenry respond to Dr Wing’s Nov 11 letter, clarifying that they are not seeking to discover details of Brown’s internal process or confidential information, but rather are seeking to enlist Brown executives in getting the 329 article retracted.

“In 2001, the JAACAP published an article on the use of Paroxetine [Paxil] in adolescent depression, now known as Study 329…

It has been widely seen as the paradigm for a period in psychiatry when we were at our worst – an industry-financed, ghost-written article that made claims of both efficacy and safety that were unsupported by its data.”

The letter states: “I write to you as the CEO of GlaxoSmithKline in regard to an on-going complaint about a fraudulent journal article under the lead authorship of Martin Keller…

In light of a recent $3 billion settlement in which your corporation pleaded guilty to misbranding of paroxetine (Paxil), we request that you write to Dr. Andrés Martin, the editor of Journal of the American Academy of Child & Adolescent Psychiatry to request retraction of the Keller et al. article.”

Dr John (Mickey) Nardo blogs about the failure to retract Study 329 in a post entitled Hide-and-go-seek. It starts as follows:

It’s always funny when small children try to play hide-and-go-seek by covering their eyes, but when grown-ups do it, it loses its charm. That’s what Dr. Andres Martin has done in his response to Dr. Jureidini’s request that the Journal of the American Academy of Child and Adolescent Psychiatry retract the 2001 Study 329 article. They had a perfect chance to do the right thing. They declined to take it:

“Thank you for your Letter to the Editor, submitted July 20, 2012, regarding Keller et al., 2001. Following the June 27, 2012 settlement between GlaxoSmithKline and the U.S. Department of Justice, the Journal’s editorial team undertook a thorough evaluation of the article, the legal settlement, and related materials. The authors of the article were contacted and asked to respond to the questions and concerns raised by the settlement. After a comprehensive and extensive review, the Journal editors found no basis for retraction or other editorial action.

Due to the nature of the concerns and serious consideration given to the situation, the evaluation process was quite lengthy, and we appreciate your patience while the editorial team conducted its review. The inquiry is considered complete, and as such, your letter will not be published in the Journal.”

I had also contacted the American Academy of Child and Adolescent Psychiatry prior to their recent yearly meeting. Rather than write the Journal, I wrote the outgoing president, the incoming President, and contacted the Ethics Committee. All responded cordially and assured me convincingly that the matter was under review. So I think I was more hopeful than most about what they would do. My logic was that the American Academy of Child and Adolescent Psychiatry itself was responsible for its official journal, but as we now see, it remained in the hands of the journal.

“The authors of the article were contacted and asked to respond to the questions and concerns raised by the settlement.” It goes without saying that contacting the authors seems an odd way to go about an investigation, particularly these authors. For example, there’s a deposition of Martin Keller about this study available on the Internet. It’s 125 pages long, but easy to summarize: “If you think I did something wrong, you’re wrong because I’ve never done any wrong things, and I don’t specifically remember anything I’ve ever done.” That may sound facetious, but if you have the stomach to read it through, you’ll agree with my assessment. It’s maddening. Neal Ryan, the second author is heading up the Back to the Future Project which maps the road ahead for the AACAP [see this comment]. Karen Dineen Wagner, Boris Birmaher, and Graham Emslie continue to grind out articles about psychopharmacology in children and are prominent in AACAP affairs. I doubt that anyone on the author list was excited about the embarrassment of a retraction, but we already knew that. So did the Editor, Dr. Andres Martin.

“He [Dr Andres Martin] reviewed the timeline of the lengthy process used to vet the allegations raised numerous times over many years and to decide whether or not to retract the article, which included consultation with the authors, experts in publications and publication ethics (the Committee on Publication Ethics (COPE)), experts in the field (psychology, child and adolescent psychiatry, clinical trialologists, etc), a whole range of attorneys, and more. By July 2010, Dr. Martin finished his independent assessment. He felt the process had been done correctly. Letters to the editor to retract the article had no supporting information and the letters were rejected.”

Also included are “Notes from the afternoon discussion of the motion to refer the question of Study 329 to the ethics committee”