Trial by Error, Continued: Is PACE a Case of Research Misconduct?

by David Tuller, DrPH

[June 25, 2017: The last section of this post, about the PLoS One study, has been revised and corrected.]

I have tip-toed around the question of research misconduct since I started my PACE investigation. In my long Virology Blog series in October 2015, I decided to document the trial’s extensive list of flaws (or as many as I could fit into 15,000 words, which wasn’t all of them) without arguing that this constituted research misconduct. My goal was simply to make the strongest possible case that this was very bad science and that the evidence did not support the claims that cognitive behavior therapy and graded exercise therapy were effective treatments for the illness.

Since then, I have referred to PACE as utter nonsense, complete bullshit, a piece of crap, and this f**king trial. My colleague and the host of Virology Blog, Professor Racaniello, has called it a sham. Indeed, subsequent events have only strengthened the argument against PACE, despite the unconvincing attempts of the investigators and Sir Simon Wessely to counter what they most likely view as my disrespectful and vexatious behavior.

Virology Blog’s open letters to The Lancet and Psychological Medicine have demonstrated that well-regarded experts from the U.S., U.K. and many other countries find the methodological lapses in PACE to be such egregious violations of standard scientific practice that the reported results cannot be taken seriously. In the last few months, more than a dozen peer-reviewed commentaries in the Journal of Health Psychology, a respected U.K.-based academic publication, have further highlighted the international dismay at the study’s self-evident and indisputable lapses in judgement, logic and common sense.

And here’s a key piece of evidence that the trial has lost all credibility among those outside the CBT/GET ideological brigades: The U.S. Centers for Disease Control still recommends the therapies but now insists that they are only generic management strategies for the disease. In fact, the agency explicitly denies that the recommendations are related to PACE. As far as I can tell, since last year the agency no longer cites the PACE trial as evidence anywhere on its current pages devoted to the illness. (If there is a reference tucked away in there somewhere, I’m sure a sharp-eyed sleuth will soon let me know.)

It must be said that the CDC’s history with this illness is awful, another bad science saga that I documented on Virology Blog in 2011. In past years, the agency cited PACE prominently and has collaborated closely with British members of the biopsychosocial school of thought. So it is ridiculous and, let’s be frank, blatantly dishonest for U.S. public health officials to now insist that the PACE-branded treatments they recommend have nothing to do with PACE and are simply generic management strategies. Nevertheless, it is significant that the agency has decided to disappear PACE from its site, presumably in response to the widespread condemnation of the trial.

Many of the PACE study’s myriad flaws represent bad science but clearly do not rise to the level of research misconduct. Other fields of medicine, for example, have abandoned the use of open-label trials with subjective outcomes because they invite biased results; Jonathan Edwards, an emeritus professor of medicine from University College London, has made this point repeatedly. But clearly large segments of the psychological and psychiatric fields do not share this perspective and believe such trials can provide reliable and authoritative evidence.

Moreover, the decision to use the very broad Oxford criteria to identify patients is bad science because it conflates the symptom of chronic fatigue with the specific disease entity known often as chronic fatigue syndrome but more appropriately called myalgic encephalomyelitis. This case definition generates heterogeneous samples that render it virtually impossible for such studies to identify accurate information about causes, diagnostic tests and treatments. Although a 2015 report from the National Institutes of Health recommended that it should be retired from use, the Oxford definition remains in the published literature. Studies relying on it should be discredited and their findings ignored or dismissed. But that’s probably as far as it goes.

Many definitions of research misconduct exist, but they generally share common elements. In Britain, the Medical Research Council, the main funder of PACE, endorses the definition from Research Councils U.K., an organization which outlines its principles in a statement called Policy and Guidelines on Governance of Good Research Conduct. In exploring this question, I will focus here on just two of the planks of the definition cited by the MRC: misrepresentation of interests and misrepresentation of data.

Let me be clear: I am not trained as a bioethicist. I have never been involved in determining if any particular study involves research misconduct. And I am not making any such claim here. However, when a clinical trial includes so many documented flaws that more than 100 experts from around the world are willing and even eager to sign a letter demanding immediate retraction of key findings, the question of whether there has been research misconduct will inevitably arise. Although people with different perspectives could clearly disagree on the answer, the final and authoritative determination will likely not emerge until the PACE study and the details involved in its conduct and the publication of the results are subjected to a fully independent investigation.

In the meantime, let’s look at how research misconduct is defined and examine some of the possible evidence that might be reviewed. For starters, the cited definition of misrepresentation of interests includes the failure to declare material interests either of the researcher or of the funders of the research.

I have repeatedly pointed out that the investigators misled participants about their material interests in the trial reaching certain conclusions, namely, that CBT and GET are effective treatments. The three main investigators have had longstanding links with insurance companies, advising them that rehabilitative approaches such as the interventions under study could get ME/CFS claimants off benefits and back to work. No reliable evidence actually supports this claim; certainly the PACE results failed to confirm it. And yet the investigators did not disclose these consulting and/or financial ties in the information leaflets and consent forms provided to participants.

Why is that a problem? Well, the investigators promised in their protocol to adhere to the Declaration of Helsinki, among other ethical guidelines. The declaration, an international human rights document enacted after WWII to protect human research subjects, is very specific about what researchers must do in order to obtain informed consent: They must tell prospective participants of any possible conflicts of interest and institutional affiliations.

Without such disclosures, in fact, any consent obtained is not informed but, per Helsinki’s guidelines, uninformed. Investigators cannot simply pick and choose from among their protocol promises and decide which ones they will implement and which ones they won’t. They cannot decide not to disclose any possible conflicts of interest, once they have promised to do so, even if it is inconvenient or uncomfortable or might make people reluctant to enter a trial. I have interviewed four PACE participants. Two said they would likely or definitely not have agreed to be in the study had they been told of these conflicts of interest; in fact, one withdrew her consent for her data to be used after she had already completed all the trial assessments because she found out about these insurance affiliations later on and was outraged at not having been told from the start.

The PACE investigators have responded to this concern, but their answers do not actually address the criticism, as I have previously pointed out. It is irrelevant that they made the appropriate disclosures in the journals that published their work; the Declaration of Helsinki does not concern itself with protecting journal editors and journal readers but with protecting human research subjects. The investigators have also argued that insurance companies were not directly involved in the study, thereby implying that no conflict of interest in fact existed. This is also a specious argument, relying as it does on an extremely narrow interpretation of what constitutes a conflict of interest.

Shockingly, the PACE trial’s ethical review board approved the consent forms, even without the disclosures clearly mandated by the Declaration of Helsinki. The Lancet and Psychological Medicine have been made aware of the issue but have no apparent problem with this breach of research ethics. Notwithstanding such moral obtuseness, the fact remains that the PACE investigators made a promise to disclose any possible conflicts of interest to trial participants, and failed to honor it. Case closed. In the absence of legitimate informed consent, they should not have been allowed to publish any of the data they collected from their 641 participants.

Does this constitute misrepresentation of material interests within the context of the applicable definition of research misconduct? I will leave it to others to make that determination. Certainly the PACE authors and their cheerleaders, including Sir Simon, Esther Crawley, Lancet editor Richard Horton and Psychological Medicine editors Robin Murray and Kenneth Kendler, would reject any such interpretation.

Turning to the category of misrepresentation of data, the MRC/RCUK definition cites the suppression of relevant findings and/or data, or knowingly, recklessly or by gross negligence, presenting a flawed interpretation of data. One of the PACE trial’s most glaring problems, of course, is the odd fact that 13 percent of participants met the physical function outcome threshold at baseline. (A smaller number, slightly more than one percent, met the fatigue outcome threshold at baseline.) In the Lancet study, participants who met these very poor outcome thresholds were referred to as being within normal ranges for these indicators. In the Psychological Medicine paper, these same participants were referred to as being recovered for these indicators.

Of course, it was obvious from the papers themselves that some participants could have met these thresholds at baseline. But the number of participants who actually did meet these thresholds at baseline became public only after the information was released pursuant to a freedom-of-information request. (This was an earlier data request than the one that eventually led to the release of all the raw trial data for some of the main results.)

The decision-making behind this earlier release remains a mystery to me, since the data make clear that the study is bogus. While the bizarre nature of the overlap in entry and outcome thresholds already raised serious questions about the trial’s credibility, the fact that a significant minority of participants actually met both the disability and normal range/recovery thresholds for physical function at baseline certainly adds salient and critical information. Any interpretation of the study made without the benefit of that key information is by definition incomplete and deficient.

Given how absurd it is for an outcome threshold to be attainable at baseline, it is understandable that the authors made no mention of the fact that so many participants were simultaneously found to be disabled and within normal range/recovered for physical function. Any paper on breast cancer or multiple sclerosis or any other illness recognized as a medical disease would clearly have been rejected if it featured such an anomaly.
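To illustrate the arithmetic behind this overlap, here is a minimal sketch in Python. The cutoff numbers are hypothetical, chosen only to show how a single physical function score can sit at or below an entry (disability) cutoff while also sitting at or above an outcome (normal range/recovery) cutoff; the trial’s actual thresholds are reported in the published papers.

# A minimal sketch with hypothetical cutoffs (not the trial's actual values),
# showing how overlapping entry and outcome thresholds classify one score two ways.

ENTRY_CUTOFF = 65     # hypothetical: at or below this, a person is disabled enough to enroll
NORMAL_CUTOFF = 60    # hypothetical: at or above this, a person counts as within "normal range"

def classify(physical_function_score):
    disabled_at_entry = physical_function_score <= ENTRY_CUTOFF
    within_normal_range = physical_function_score >= NORMAL_CUTOFF
    return disabled_at_entry, within_normal_range

# Any score from 60 through 65 satisfies both definitions at once:
print(classify(62))   # prints (True, True): disabled at entry AND within "normal range"

Whenever the two cutoffs overlap in this way, some participants can enter the trial already counting as within normal range on the outcome measure, which is exactly the anomaly the released data confirmed.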

The PACE team compounded this error by highlighting these findings as evidence of the study’s success. At the press conference promoting the Lancet paper, Trudie Chalder, one of the three principal investigators, touted these normal range results by declaring that twice as many people in the CBT and GET groups as in the other groups “got back to normal,” even though some of these back-to-normal participants still qualified as disabled under the study’s entry criteria. Moreover, the PACE authors themselves were allowed a pre-publication review of an accompanying Lancet commentary about the trial by two Dutch colleagues. The commentary argued that the normal range analyses represented a strict criterion for recovery and declared that 30 percent of the participants had met this recovery standard.

Yet this statement is clearly preposterous, given that participants who met this strict criterion could have had scores indicating worse health than the scores required to demonstrate disability at trial entry. The ensuing headlines and news stories highlighted both Professor Chalder’s statement that CBT and GET were effective in getting people back to normal and the claim that 30 percent had recovered according to a strict definition. This misinformation has since influenced treatment guidelines around the world.

I have previously criticized the authors’ attempts to explain away this problem. They have essentially stated that it makes no difference that some participants met one of the recovery thresholds at baseline, because the study included other recovery criteria as well. Moreover, they point out that the normal range analyses in The Lancet were not the main findings; instead, they have argued, the comparison of averages between the groups, the revised primary outcome of the study, was the definitive evidence that the treatments work.

Sorry. Those excuses simply do not wash. The inclusion of these overlapping entry and outcome thresholds, and the failure to mention or explain in the papers themselves how anyone could be within normal range or recovered while simultaneously being sick enough to enter the study, cast doubt on the entire enterprise. No study including such a bogus analysis should ever have passed peer review and been published, much less in journals presumed to subject papers to rigorous scientific scrutiny. That The Lancet and Psychological Medicine have rejected the calls of international experts to address the issue is a disgrace.

But does this constitute misrepresentation of data within the context of the applicable definition of research misconduct? Again, I leave it to others to make that determination. I know some people, in particular, the powerful cohort of PACE supporters, have reviewed the same set of facts and have expressed little or no concern about this unusual aspect of the trial.

[This section about the PLoS One study has been revised and corrected. At the end of the post, I have explained the changes. For full transparency, I have also re-posted the original paragraphs for anyone who wants to track the changes.]

Now let’s turn to the PLoS One paper published in 2012, which has been the subject of much dispute over data access. And yet that dispute is a distraction. We don’t need the data to determine that the paper included an apparently false statement that has allowed the investigators to claim that CBT and GET are the most cost-effective treatments from the societal perspective, a concept that factors in other costs along with direct health-care costs. PLoS One, like the other journals, has failed to address this concern. (The journal did post an expression of concern recently over the authors’ refusal to share data from the trial in accordance with the journal’s policies.)

The PACE statistical analysis plan included three separate assumptions for how to measure the costs of what they called informal care (the care provided by family and friends) in assessing cost-effectiveness from the societal perspective. The investigators promised to analyze the data based on valuing this informal care at: 1) the cost of a home-care worker; 2) the minimum wage; and 3) zero cost. The latter, of course, is what happens in the real world: families care for loved ones without getting paid anything by anyone.

In PLoS One, the main analysis for assessing informal care presented only the results under a fourth assumption not mentioned in the statistical analysis plan, valuing this care at the mean national wage. The paper did not explain the reasons for this switch. Under this new assumption, the authors reported, CBT and GET proved more cost-effective than the two other PACE treatment arms. The paper did not include the results based on any of the three ways of measuring informal care promised in the statistical analysis plan. But the authors noted that sensitivity analyses using alternative approaches did not make a substantial difference to the results and that the findings were robust under other assumptions for informal care.

Sensitivity analyses are statistical tests used to determine whether, and to what extent, different assumptions lead to changes in results. The alternative approaches mentioned in the study as being included in the sensitivity analyses were the first two approaches cited in the statistical analysis plan, valuing informal care at the cost of a home-care worker and at minimum wage. The paper did not explain why it had dropped any mention of the third promised method of valuing informal care, the zero-cost assumption.
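Before turning to the comments, a minimal sketch, with entirely hypothetical figures rather than anything drawn from PACE, shows why the hourly rate chosen for informal care can change, and even reverse, a societal-cost comparison of this kind.

# Hypothetical numbers only; nothing here comes from the PACE data.
def societal_cost(health_care_cost, informal_care_hours, hourly_rate):
    # Societal cost = direct health/social care costs plus informal care valued at a chosen rate.
    return health_care_cost + informal_care_hours * hourly_rate

# Suppose, purely for illustration, that CBT costs more to deliver but is assumed
# to reduce the hours of care provided by family and friends.
cbt     = {"health_care_cost": 2000, "informal_care_hours": 100}
control = {"health_care_cost": 1000, "informal_care_hours": 200}

for label, rate in [("mean national wage", 15.0), ("minimum wage", 6.0), ("zero cost", 0.0)]:
    diff = societal_cost(**cbt, hourly_rate=rate) - societal_cost(**control, hourly_rate=rate)
    print(f"{label}: CBT minus control = {diff:+.0f}")

# With these made-up figures the difference is -500 at the mean wage (CBT looks
# cheaper), +400 at the minimum wage and +1000 at zero cost (CBT looks more
# expensive). The direction of the comparison depends entirely on how informal
# care is valued, which is why the promised sensitivity analyses matter.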

In the comments, a patient-researcher, Simon McGrath, pointed out that this claim of robust results under other assumptions could not possibly be accurate, given that the minimum wage was much lower than the mean national wage and would therefore substantially alter the results of the sensitivity analyses. In response, Paul McCrone, the King’s College London expert in health economics who served as the study’s lead author, conceded the point.

“You are quite correct that valuing informal care at a lower rate will reduce the savings quite substantially, and could even result in higher societal costs for CBT and GET,” wrote Professor McCrone. So much for the paper’s claim that, per the sensitivity analyses, alternative assumptions did not make a substantial difference to the results and that the findings were robust no matter how informal care was valued.

Surprisingly, given this acknowledgement, Professor McCrone did not explain why the paper included a contradictory statement about the sensitivity analyses under alternative assumptions. Nor did he offer to correct the paper to conform to the revised interpretation he presented in his comments. Instead, he presented a new rationale for highlighting the results based on the assumption that unpaid informal care was being valued at the mean national wage, rather than using the other assumptions outlined in the statistical analysis plan.

“In our opinion, the time spent by families caring for people with CFS/ME has a real value and so to give it a zero cost is controversial,” Professor McCrone wrote. “Likewise, to assume it only has the value of the minimum wage is also very restrictive. In other studies we have costed informal care at the high rate of a home care worker. If we do this then this would show increased savings shown [sic] for CBT and GET.”

This concern for patients’ families is certainly touching and, in a general sense, laudable. But it must be pointed out that what they did in earlier studies is irrelevant to PACE, given that they had already specified in their statistical analysis plan the assumptions they planned to use. Moreover, it does not explain why Professor McCrone and his colleagues then decided to include an apparently false statement about the sensitivity analyses in the paper.

Another patient-researcher, Tom Kindlon, pointed out in a subsequent comment that the investigators themselves chose the alternative assumptions, which they were now dismissing as unfair to caregivers. “If it’s ‘controversial’ now to value informal care at zero value, it was similarly ‘controversial’ when they decided before the data was looked at, to analyse the data in this way,” wrote Kindlon. “There is not much point in publishing a statistical plan if inconvenient results are not reported on and/or findings for them misrepresented.”

Whatever their reasons, the PACE investigators’ inclusion in the paper of the apparently false statement about the sensitivity analyses represents a serious lapse in professional ethics and judgement. So does the unwillingness to correct the paper itself, given the exchanges in the comments. Does this constitute misrepresentation of data within the context of the MRC/RCUK definition of research misconduct?

As I have said, I will leave it to others to make that determination. I look forward to the day when an international group of experts finally pursues a thorough investigation of how and why everything went so terribly wrong with this highly influential five-million-pound trial.

A post-script: I did not contact the PACE authors prior to posting this blog. After my initial series ran in October 2015, Virology Blog posted their full response to my concerns. Since then, I have repeatedly tried to solicit their comments for subsequent blog posts, and they have repeatedly declined to respond. I saw no point in repeating that exercise this time around. I also did not try to solicit a response from Professor McCrone, since he has not responded to multiple earlier requests seeking an explanation for why the PLoS One paper contains the apparently false statement about sensitivity analyses.

However, I would be happy to post on Virology Blog a response of any length from any of the investigators, should they decide to send one. I would of course also correct any documented factual errors in what I have written, which is something I have done whenever necessary throughout my journalism career. (June 25, 2017: Of course, I have now made such corrections, per my professional obligations.)

**********

Next post: The Lancet’s awful new GET trial

**********

*Explanation for the changes: In the original version, I should have made clear that my concerns involved an analysis of what the investigators called cost-effectiveness from the societal perspective, which included not only the direct health-care costs but other considerations as well, including the costs of informal care. I also mistakenly wrote that the paper only presented the results under the assumption that informal care was valued at the cost of a home-care worker. In fact, for unexplained reasons, the paper’s main analysis was based on none of the three assumptions mentioned in the statistical analysis plan but on a fourth assumption, valuing informal care at the mean national wage.

In addition, I mistakenly assumed, based on the statistical analysis plan, that the sensitivity analyses conducted for assessing the impact of different approaches included both the minimum wage and zero-cost assumptions. In fact, the sensitivity analyses cited in the paper focused on the assumptions that informal care was valued at the cost of a home-care worker and at the minimum wage. The zero-cost assumption, also promised in the statistical analysis plan, was not included at all. I apologize to Professor McCrone and his colleagues for the errors and am happy to correct them.

However, this does not change the fact that Professor McCrone’s subsequent comments contradicted the paper’s claim that, per the sensitivity analyses, changes in how informal care was valued did not make a substantial difference to the results and that the findings were robust for the alternative assumptions. This apparently false claim in the paper itself still needs to be explained or corrected. The paper also does not explain why the investigators included the zero-cost assumption in the detailed statistical analysis plan and then decided to drop it entirely in the paper itself.

**********

Here is the original version of the section on the PLoS One paper, for anyone who wants to compare the two and track the changes:

Now let’s turn to the PLoS One paper published in 2012, which has been the subject of much dispute over data access. And yet that dispute is a distraction, we don’t need the data to determine that the paper included an apparently false statement that has allowed the investigators to claim that CBT and GET are the most cost-effective treatments. PLoS One, like the other journals, has failed to address this concern, despite an open letter about it posted on Virology Blog last year. (The journal did post an expression of concern recently over the authors’ refusal to share data from the trial in accordance with the journal’s policies.)

The PACE statistical analysis plan included three separate assumptions for how to measure the costs of informal care–the care provided by family and friends. The investigators promised to provide results based on valuing this informal care at: 1) the average wage paid to health-care workers; 2) the minimum wage; and 3) at zero pay. The latter, of course, is what happens in the real world, families care for loved ones without getting paid anything by anyone.

In PLoS One, the main analysis only presented the results under the first assumption, costing the informal care at the average wage of a health-care worker. Under that assumption, the authors reported, CBT and GET proved more cost-effective than the two other PACE treatment arms. The paper did not include the results based on the other two ways of measuring informal care but declared that alternative approaches were used in the sensitivity analyses and these did not make a substantial difference to the results. (Sensitivity analyses are statistical tests used to determine whether, and to what extent, different assumptions lead to changes in results.)

Yet in the comments, two patient researchers contradicted this statement, pointing out that the claim that all three assumptions would essentially yield the same results could not possibly be accurate. In response, Paul McCrone, the King’s College London expert in health economics who served as the study’s lead author, conceded the point. Let me repeat that: Professor McCrone agreed that the cost savings would indeed be lower under the minimum wage assumption, and that under the third assumption any cost advantages for CBT and GET would disappear.

“If a smaller unit cost for informal care is used, such as the minimum wage rate, then there would remain a saving in informal care costs in favour of CBT and GET but this would clearly be less than in the base case used in the paper,” wrote Professor McCrone. “If a zero value for informal care is used then the costs are based entirely on health/social care (which were highest for CBT, GET and APT) and lost employment which was not much different between arms.” So much for the paper’s claim that sensitivity analyses showed that alternative assumptions did not make a substantial difference to the results.

Surprisingly, given these acknowledged facts, Professor McCrone did not explain why the paper included a completely contradictory statement. Nor did he offer to correct the paper itself to conform to his revised interpretation of the results of the sensitivity analyses. Instead, he presented a new rationale for highlighting only the results based on the assumption that unpaid informal care was being reimbursed at the average salary of a health-care worker.

“In our opinion, the time spent by families caring for people with CFS/ME has a real value and so to give it a zero cost is controversial,” Professor McCrone wrote. “Likewise, to assume it only has the value of the minimum wage is also very restrictive. In other studies we have costed informal care at the high rate of a home care worker. If we do this then this would show increased savings shown [sic] for CBT and GET.”

This concern for patients’ families is certainly touching and, in a general sense, laudable. But it must be pointed out that what they did in earlier studies is irrelevant to PACE, given that they had included the alternative assumptions in their own statistical analysis plan. Moreover, it does not explain why Professor McCrone and his colleagues then decided to include an apparently false statement about the sensitivity analyses in the paper.

One of the commenters, patient-researcher Tom Kindlon from Dublin, pointed out in a subsequent comment that the investigators themselves chose the alternative assumptions that they were now dismissing as unfair to caregivers. “If it’s ‘controversial’ now to value informal care at zero value, it was similarly ‘controversial’ when they decided before the data was looked at, to analyse the data in this way,” he wrote. “There is not much point in publishing a statistical plan if inconvenient results are not reported on and/or findings for them misrepresented.”

Whatever their reasons, the PACE investigators’ inclusion in the paper of the apparently false statement about the sensitivity analyses represents a serious lapse in professional ethics and judgement. So does the unwillingness to correct the paper itself to reflect Professor McCrone’s belated acknowledgement of the actual results from the sensitivity analyses, rather than the inaccurate results reported in the paper. Does this constitute misrepresentation of data within the context of the MRC/RCUK definition of research misconduct?

**********

If you appreciate my work, please consider supporting my crowdfunding effort, which ends June 30th.

