David Tuller’s three-installment investigation of the PACE trial for chronic fatigue syndrome, “Trial By Error,” has received enormous attention. Although the PACE investigators declined David’s efforts to interview them, they have now requested the right to reply. Today, virology blog posts their response to David’s story, and below, his response to their response.
According to the communications department of Queen Mary University, the PACE investigators have been receiving abuse on social media as a result of David Tuller’s posts. When I published Mr. Tuller’s articles, my intent was to provide a forum for discussion of the controversial PACE results. Abuse of any kind should not have been, and must not be, part of that discourse. -vrr
Last December, I offered to fly to London to meet with the main PACE investigators to discuss my many concerns. They declined the offer. Dr. White cited my previous coverage of the issue as the reason and noted that “we think our work speaks for itself.” Efforts to reach out to them for interviews two weeks ago also proved unsuccessful.
After my story ran on virology blog last week, a public relations manager for medicine and dentistry in the marketing and communications department of Queen Mary University e-mailed Dr. Racaniello. He requested, on behalf of the PACE authors, the right to respond. (Queen Mary University is Dr. White’s home base.)
That response arrived Wednesday. My first reaction, when I read it, was that I had already rebutted most of their criticisms in my 14,000-word piece, so it seemed like a waste of time to engage in further extended debate.
Later in the day, however, the public relations manager for medicine and dentistry from the marketing and communications department of Queen Mary University e-mailed Dr. Racaniello again, with an urgent request to publish the response as soon as possible. The PACE investigators, he said, were receiving “a lot of abuse” on social media as a result of my posts, so they wanted to correct the “misinformation” as soon as possible.
Because I needed a day or two to prepare a careful response to the PACE team’s rebuttal, Dr. Racaniello agreed to post them together on Friday morning.
On Thursday, Dr. Racaniello received yet another appeal from the public relations manager for medicine and dentistry from the marketing and communications department of Queen Mary University. Dissatisfied with the Friday publishing timeline, he again urged expedited publication because “David’s blog posts contain a number of inaccuracies, may cause a considerable amount of reputational damage, and he did not seek comment from any of the study authors before the virology blog was published.”
The charge that I did not seek comment from the authors was at odds with the facts, as Dr. Racaniello knew. (It is always possible to argue about accuracy and reputational damage.) Given that much of the argument for expedited posting rested on the public relations manager’s obviously “dysfunctional cognition” that I had unfairly neglected to provide the PACE authors with an opportunity to respond, Dr. Racaniello decided to stick with his pre-planned posting schedule.
Before addressing the PACE investigators’ specific criticisms, I want to apologize sincerely to Dr. White, Dr. Chalder, Dr. Sharpe and their colleagues on behalf of anyone who might have interpreted my account of what went wrong with the PACE trial as license to target the investigators for “abuse.” That was obviously not my intention in examining their work, and I urge anyone engaging in such behavior to stop immediately. No one should have to suffer abuse, whether online or in the analog world, and all victims of abuse deserve enormous sympathy and compassion.
However, in this case, it seems I myself am being accused of having incited a campaign of social media “abuse” and potentially causing “reputational damage” through purportedly inaccurate and misinformed reporting. Because of the seriousness of these accusations, and because such accusations have a way of surfacing in news reports, I feel it is prudent to rebut the PACE authors’ criticisms in far more detail than I otherwise would. (I apologize in advance to the obsessives and others who feel they need to slog through this rebuttal; I urge you to take care not to over-exert yourself!)
In their effort to correct the “misinformation” and “inaccuracies” in my story about the PACE trial, the authors make claims and offer accounts similar to those they have previously presented in published comments and papers. In the past, astonishingly, journal editors, peer reviewers, reporters, public health officials, and the British medical and academic establishments have accepted these sorts of non-responsive responses as adequate explanations for some of the study’s fundamental flaws. I do not.
None of what they have written in their response actually addresses or resolves the core issues that I wrote about last week. They have ignored many of the questions raised in the article. In their response, they have also not mentioned the devastating criticisms of the trial from top researchers from Columbia, Stanford, University College London, and elsewhere. They have not addressed why major reports this year from the Institute of Medicine and the National Institutes of Health have presented portraits of the disease starkly at odds with the PACE framework and approach.
I will ignore their overview of the findings and will focus on the specific criticisms of my work. (I will, however, mention here that my piece discussed why their claims of cost-effectiveness for cognitive behavior therapy and graded exercise therapy are based on inaccurate statements in a paper published in PLoS One in 2012).
13% of patients had already “recovered” on entry into the trial
I did not write that 13% of the participants were “recovered” at baseline, as the PACE authors state. I wrote that they were “recovered” or already at the “recovery” thresholds for two specific indicators, physical function and fatigue, at baseline—a different statement, and an accurate one.
The authors acknowledge, in any event, that 13% of the sample was “within normal range” at baseline. For the 2013 paper in Psychological Medicine, these “normal range” thresholds were re-purposed as two of the four required “recovery” criteria.
And that raises the question: Why, at baseline, was 13% of the sample “within normal range” or “recovered” on any indicator in the first place? Why did entry criteria for disability overlap with outcome scores for being “within the normal range” or “recovered”? The PACE authors have never provided an explanation of this anomaly.
In their response, the authors state that they outlined other criteria that needed to be met for someone to be called “recovered.” This is true; as I wrote last week, participants needed to meet “recovery” criteria on four different indicators to be considered “recovered.” The PACE authors did not provide data for two of the indicators in the 2011 Lancet paper, so in that paper they could not report results for “recovery.”
However, at the press conference presenting the 2011 Lancet paper, Trudie Chalder referred to people who met the overlapping disability/“normal range” thresholds as having gotten “back to normal”—an explicit “recovery” claim. In a Lancet comment published along with the PACE study itself, colleagues of the PACE team referred to these bizarre “normal range” thresholds for physical function and fatigue as a “strict criterion for recovery.” As I documented, the Lancet comment was discussed with the PACE authors before publication; the phrase “strict criterion for recovery” obviously survived that discussion.
Much of the coverage of the 2011 paper reported that patients got “back to normal” or “recovered,” based on Dr. Chalder’s statement and the Lancet comment. The PACE authors made no public attempt to correct the record in the months after this apparently inaccurate news coverage, until they published a letter in the Lancet. In the response to Virology Blog, they say that they were discussing “normal ranges” in the Lancet paper, and not “recovery.” Yet they have not explained why Dr. Chalder spoke about participants getting “back to normal” and why their colleagues wrote that the nonsensical “normal range” thresholds represented a “strict criterion for recovery.”
Moreover, they still have not responded to the essential questions: How does this analysis make sense? What are the implications for the findings if 13% are already “within normal range” or “recovered” on one of the two primary outcome measures? How can they be “disabled” enough on the two primary measures to qualify for the study if they’re already “within normal range” or “recovered”? And why did the PACE team use the wrong statistical method for calculating their “normal ranges” when they knew that method was wrong for the data sources they had?
Bias was caused by a newsletter for patients giving quotes from patients and mentioning UK government guidance on management. A key investigator was on the guideline committee.
The PACE authors apparently believe it is appropriate to disseminate positive testimonials during a trial as long as the therapies or interventions are not mentioned. (James Coyne dissected this unusual position yesterday.)
This is their argument: “It seems very unlikely that this newsletter could have biased participants as any influence on their ratings would affect all treatment arms equally.” Apparently, the PACE investigators believe that if you bias all the arms of your study in a positive direction, you are not introducing bias into your study. It is hard to know what to say about this argument.
Furthermore, the PACE authors argue that the U.K. government’s new treatment guidelines had been widely reported. Therefore, they contend, it didn’t matter that, in the middle of a trial to test the efficacy of cognitive behavior therapy and graded exercise therapy, they had informed participants that the government had already approved cognitive behavior therapy and graded exercise therapy “based on the best available evidence.”
They are wrong. They introduced an uncontrolled, unpredictable co-intervention into their study, and they have no idea what the impact might have been on any of the four arms.
In their response, the PACE authors note that the participants’ newsletter article, in addition to cognitive behavior therapy and graded exercise therapy, included a third intervention, Activity Management. As they correctly note, I did not mention this third intervention in my Virology Blog story. The PACE authors now write: “These three (not two as David Tuller states) therapies were the ones being tested in the trial, so it is hard to see how this might lead to bias in the direction of one or other of these therapies.”
This statement is nonsense. Their third intervention was called “Adaptive Pacing Therapy,” and they developed it specifically for testing in the PACE trial. It is unclear why they now state that their third intervention was Activity Management, or why they think participants would know that Activity Management was synonymous with Adaptive Pacing Therapy. After all, cognitive behavior therapy and graded exercise therapy also involve some form of “activity management.” Precision in language matters in science.
Finally, the investigators say that Jessica Bavington, a co-author of the 2011 paper, had already left the PACE team before she served on the government committee that endorsed the PACE therapies. That might be, but it is irrelevant to the question that I raised in my piece: whether her dual role presented a conflict of interest that should have been disclosed to participants in the newsletter article about the U.K. treatment guidelines. The PACE newsletter article presented the U.K. guideline committee’s work as if it were independent of the PACE trial itself, when it was not.
Bias was caused by changing the two primary outcomes and how they were analyzed
The PACE authors seem to think it is acceptable to change methods of assessing primary outcome measures during a trial as long as they get committee approval, announce it in the paper, and provide some sort of reasonable-sounding explanation as to why they made the change. They are wrong.
They need as well to justify the changes with references or citations that support their new interpretations of their indicators, and they need to conduct sensitivity analyses to assess the impact of the changes on their findings. Then they need to explain why their preferred findings are more robust than the initial, per-protocol findings. They did not take these steps for any of the many changes they made from their protocol.
The PACE authors mention the change from bimodal to Likert-style scoring on the Chalder Fatigue Scale. They repeat their previous explanation of why they made this change. But they have ignored what I wrote in my story—that the year before PACE was published, its “sister” study, called the FINE trial, had no significant findings on the physical function and fatigue scales at the end of the trial and only found modest benefits in a post-hoc analysis after making the same change in scoring that PACE later made. The FINE study was not mentioned in PACE. The PACE authors have not explained why they left out this significant information about their “sister” study.
Regarding the abandonment of the original method of assessing the physical function scores, this is what they say in their response: “We decided this composite method [their protocol method] would be hard to interpret clinically, and would not answer our main question of comparing effectiveness between treatment arms. We therefore chose to compare mean scores of each outcome measure between treatment arms instead.” They mention that they received committee approval, and that the changes were made before examining the outcome data.
The authors have presented these arguments previously. However, they have not responded to the questions I raised in my story. Why did they not report any sensitivity analyses for the changes in methods of assessing the primary outcome measures? (Sensitivity analyses can assess how changes in assumptions or variables impact outcomes.) What prompted them to reconsider their assessment methods in the middle of the trial? Were they concerned that a mean-based measure, unlike their original protocol measure, did not provide any information about proportions of participants who improved or got worse? Any information about proportions of participants who got better or worse came from post-hoc analyses—one of which was the perplexing “normal range” analysis.
Moreover, this was an unblinded trial, and researchers generally have an idea of outcome trends before examining outcome data. When the PACE authors made the changes, did they already have an idea of outcome trends? They have not answered that question.
Our interpretation was misleading after changing the criteria for determining recovery
The PACE authors relaxed all four of their criteria for “recovery” in their 2013 paper and cited no committees who approved this overall redefinition of this critical concept. Three of these relaxations involved expanded thresholds; the fourth involved splitting one category into two sub-categories—one less restrictive and one more restrictive. The authors gave the full results for the less restrictive category of “recovery.”
The PACE authors now say that they changed the “recovery” thresholds on three of the variables “since we believed that the revised thresholds better reflected recovery.” Again, they apparently think that simply stating their belief that the revisions were better justifies making the changes.
Let’s review for a second. The physical function threshold for “recovery” fell from 85 out of 100 in the protocol, to a score of 60 in the 2013 paper. And that “recovery” score of 60 was lower than the entry score of 65 to qualify for the study. The PACE authors have not explained how the lower score of 60 “better reflected recovery”—especially since the entry score of 65 already represented serious disability. Similar problems afflicted the fatigue scale “recovery” threshold.
The PACE authors also report that “we included those who felt “much” (and “very much”) better in their overall health” as one of the criteria for “recovery.” This is true. They are referring to the Clinical Global Impression scale. In the protocol, participants needed to score a 1 (“very much better”) on this scale to be considered “recovered” on that indicator. In the 2013 paper, participants could score a 1 (“very much better”) or a 2 (“much better”). The PACE authors provided no citations to support this expanded interpretation of the scale. They simply explained in the paper that they now thought “much better” reflected the process of recovery and so those who gave a score of 2 should also be considered to have achieved the scale’s “recovery” threshold.
With the fourth criterion—not meeting any of the three case definitions used to define the illness in the study—the PACE authors gave themselves another option. Those who did not meet the study’s main case definition but still met one or both of the other two were now eligible for a new category called “trial recovery.” They did not explain why or when they made this change.
The PACE authors provided no sensitivity analyses to measure the impact of the significant changes in the four separate criteria for “recovery,” as well as in the overall re-definition. And remember, participants at baseline could already have achieved the “recovery” requirements for one or two of the four criteria—the physical function and fatigue scales. And 13% of them already had.
Requests for data under the freedom of information act were rejected as vexatious
The PACE authors have rejected requests for the results per the protocol and many other requests for documents and data as well—at least two for being “vexatious,” as they now report. In my story, I incorrectly stated that requests for per-protocol data were rejected as “vexatious” [see clarification below]. In fact, earlier requests for per-protocol data were rejected for other reasons.
One recent request rejected as “vexatious” involved the PACE investigators’ 2015 paper in The Lancet Psychiatry. In this paper, they published their last “objective” outcome measure (except for wages, which they still have not published)—a measure of fitness called a “step-test.” But they only published a tiny graph on a page with many other tiny graphs, not the actual numbers from which the graph was drawn.
The graph was too small to extract any data, but it appeared that the cognitive behavior therapy and graded exercise therapy groups did worse than the other two. A request for the step-test data from which they created the graph was rejected as “vexatious.”
However, I apologize to the PACE authors that I made it appear they were using the term “vexatious” more extensively in rejecting requests for information than they actually have been. I also apologize for stating incorrectly that requests for per protocol data specifically had been rejected as “vexatious” [see clarification below].
This is probably a good time to address the PACE authors’ repeated refrain that concerns about patient confidentiality prevent them from releasing raw data and other information from the trial. They state: “The safe-guarding of personal medical data was an undertaking enshrined in the consent procedure and therefore is ethically binding; so we cannot publicly release these data. It is important to remember that simple methods of anonymization does [sic] not always protect the identity of a person, as they may be recognized from personal and medical information.”
This argument against the release of data doesn’t really hold up, given that researchers share data all the time without compromising confidentiality. Really, it’s not that difficult to do!
(It also bears noting that the PACE authors’ dedication to participant protection did not extend to fulfilling their protocol promise to inform participants of their “possible conflicts of interest”—see below.)
Subjective and objective outcomes
The PACE authors included multiple objective measures in their protocol. All of them failed to demonstrate real treatment success or “recovery.” The extremely modest improvements in the exercise therapy arm in the walking test still left participants more severely disabled than people with pacemakers, cystic fibrosis patients, and relatively healthy women in their 70s.
The authors now write: “We interpreted these data in the light of their context and validity.”
What the PACE team actually did was to dismiss their own objective data as irrelevant or not actually objective after all. In doing so, they cited various reasons they should have considered before including these measures in the study as “objective” outcomes. They provide one example in their response. They selected employment data as an objective measure of function, and then, as they explain in their response and have explained previously, they decided afterwards that it wasn’t an objective measure of function after all, for this and that reason.
The PACE authors consider this interpreting data “in light of their context and validity.” To me, it looks like tossing data they don’t like.
What they should do, but have not, is to ask whether the failure of all their objective measures might mean they should start questioning the meaning, reliability and validity of their reported subjective results.
There was a bias caused by many investigators’ involvement with insurance companies and a failure not to declare links with insurance companies in information regarding consent
The PACE authors here seriously misstate the concerns I raised in my piece. I did not assert that bias was caused by their involvement with insurance companies. I asserted that they violated an international research ethics document and broke a commitment they made in their protocol to inform participants of “any possible conflicts of interest.” Whether bias actually occurred is not the point.
In their approved protocol, the authors promised to adhere to the Declaration of Helsinki, a foundational human rights document that is explicit on what constitutes legitimate informed consent: Prospective participants must be “adequately informed” of “any possible conflicts of interest.” The PACE authors now suggest this disclosure was unnecessary because 1) the conflicts weren’t really conflicts after all; 2) they disclosed these “non-conflicts” as potential conflicts of interest in the Lancet and other publications; 3) they had a lot of investigators but only three had links with insurers; and 4) they informed participants about who funded the research.
These responses are not serious. They do nothing to explain why the PACE authors broke their own commitment to inform participants about “any possible conflicts of interest.” It is not acceptable to promise to follow a human rights declaration, receive approvals for a study, and then ignore inconvenient provisions. No one is much concerned about PACE investigator #19; people are concerned because the three main PACE investigators have advised disability insurers that cognitive behavior therapy and graded exercise therapy can get claimants off benefits and back to work.
That the PACE authors made the appropriate disclosures to journal editors is irrelevant; it is unclear why they are raising this as a defense. The Declaration of Helsinki is about protecting human research subjects, not about protecting journal editors and journal readers. And providing information to participants about funding sources, however ethical that might be, is not the same as disclosing information about “any possible conflicts of interest.” The PACE authors know this.
Moreover, the PACE authors appear to define “conflict of interest” quite narrowly. Just because the insurers were not involved in the study itself does not mean there is no conflict of interest and does not absolve the PACE authors of the promise they made to inform trial participants of these affiliations. No one required them to cite the Declaration of Helsinki in their protocol as part of the process of gaining approvals for their trial.
As it stands, the PACE study appears to have no legitimate informed consent for any of the 641 participants, per the commitments the investigators themselves made in their protocol. This is a serious ethical breach.
I raised other concerns in my story that the authors have not addressed. I will save everyone much grief and not go over them again here.
I want to acknowledge two additional minor errors. In the last section of the piece, I referred to the drug rituximab as an “anti-inflammatory.” While it does have anti-inflammatory effects, rituximab should more properly be referred to as an “immunomodulatory” drug.
Also, in the first section of the story, I wrote that Dr. Chalder and Dr. Sharpe did not return e-mails I sent them last December, seeking interviews. However, during a recent review of e-mails from last December, I found a return e-mail from Dr. Sharpe that I had forgotten about. In the e-mail, Dr. Sharpe declined my request for an interview.
I apologize to Dr. Sharpe for suggesting he hadn’t responded to my e-mail last December.
Clarification: In a decision on a data request, the UK Information Commissioner’s Office noted last year that Queen Mary University of London “has advised that the effect of these requests [for PACE-related material] has been that the team involved in the PACE trial, and in particular the professor involved, now feel harassed and believe that the requests are vexatious in nature.” In other words, whatever the stated reason for denying requests, White and his colleagues regarded them all as “vexatious” by definition. Therefore, the statement that the investigators rejected the requests for data as being “vexatious” is accurate, and I retract my previous apology.