By David Tuller, DrPH
(*Thanks to the very informed discussion–and discussants–on the Science For ME forum for alerting me to this study and its many problems!)
In 2011, Professor Trudie Chalder declared at a press conference for the high-profile PACE trial that twice as many chronic fatigue syndrome patients who received cognitive behavior therapy and graded exercise therapy got “back to normal” compared to those in the two comparison arms. Although the statement was a dramatic misrepresentation of the findings just reported in The Lancet, Professor Chalder’s comments received international media attention and helped her and her co-investigators position the trial as a success.
Her longtime colleague at King’s College London, Professor Sir Simon Wessely, has made comparably questionable assertions. For one, he called the PACE trial “a thing of beauty”—even though it violated core principles of scientific inquiry. Despite Sir Simon’s bountiful appreciation of PACE’s aesthetic qualities, much of the international scientific community has rejected the study’s findings.
It bears repeating that the US Centers for Disease Control and Prevention has eliminated references to PACE and dropped the CBT and GET recommendations. More than 100 experts from Columbia, Stanford, University College London, Queen Mary University of London, Berkeley and other leading institutions signed Virology Blog’s open letter to The Lancet denouncing the trial’s “unacceptable methodological lapses” and demanding an independent investigation.
Given this history, it should not be surprising that when these two like-minded investigators join forces, the result is an unconvincing mishmash like “Cognitive Behavioural Therapy for chronic fatigue and CFS: outcomes from a specialist clinic in the UK.” This study has been accepted for publication in the Journal of the Royal Society of Medicine. While it is not yet officially published, King’s College London has posted a copy of the accepted draft. (Sir Simon is the outgoing president of the society.)
The study purports to demonstrate the effectiveness of CBT as a real-world treatment for what Sir Simon, Professor Chalder and their three co-authors still prefer to call CFS. As with much of the research from leading lights of the CBT/GET ideological brigades, close scrutiny of the paper reveals how the investigators have gussied up their disappointing results with pretty ribbons and a bow.
In fact, the paper reads like a possible effort to impact the ongoing deliberations over new guidelines for ME/CFS from the UK’s National Institute for Health and Care Excellence. The current guidelines, published in 2007, recommend CBT and GET for what was then being called CFS/ME. The revision was supposed to be done this year but the pandemic has pushed the process into 2021. The NICE decision to revisit the guidelines was one of a number of signs over the last few years that the hegemony of the CBT/GET paradigm for treatment of ME/CFS was starting to crumble under the weight of its own deficiencies and contradictions.
After the pandemic hit, NICE issued a statement that GET should not be presumed to be a treatment for post-Covid fatigue based on the 2007 guidelines. Meanwhile, Professor Chalder advised post-Covid patients in a video interview not to rest too much and to get back to their activities as quickly as possible after the acute phase of illness had passed. With potent evidence from the ME/CFS field of harm stemming from GET, along with the emerging Covid-related concerns that this exercise advice might be misapplied in the real world, it seems increasingly possible that the NICE panel will dump the recommendations for GET altogether.
Addressing CBT is likely to be trickier for the NICE panel, for multiple reasons. CBT has been an established therapeutic intervention for decades. It is routinely offered to people with major illnesses like cancer and multiple sclerosis who are also experiencing depression or have other mental health needs, so some find it hard to understand why ME/CFS patients would object to the recommendation.
The reason is that the PACE-style version of CBT is very specific to ME/CFS and was promoted as a curative treatment for the illness itself, not as adjunct support while the patient receives medical care for an underlying problem. The intervention is premised on the unproven theory that “unhelpful beliefs” about illness drive the behaviors that perpetuate the terrible symptoms. And it is designed specifically to help patients overcome their purportedly irrational fears of being disabled from an organic disease. An article like this new one–with the impressive imprimatur of the Royal Society of Medicine and its president, Sir Simon himself–could be seen as an argument in support of preserving a role in ME/CFS treatment for CBT, even if NICE ends up deciding to dump GET.
Less Than Meets the Eye
The new study investigates the outcomes for 995 patients who passed through a specialist CFS service and received a course of CBT between 2002 and 2016. At the beginning of this period, the CBT/GET paradigm was already the prevailing treatment approach. In 2003, government agencies approved funding for the PACE trial, which the investigators themselves hailed as the “definitive” trial of the interventions.
In setting out the rationale for the intervention, the investigators write: “CBT treatment is based on a model which assumes that certain triggers such as a virus and/or stress trigger symptoms of fatigue. Subsequently symptoms are perpetuated inadvertently by unhelpful cognitive and behavioural responses.” The treatment involves, among other elements, “addressing unhelpful beliefs which may be interfering with helpful changes.”
This theory is essentially the one laid out in a 1989 paper by a team that also included Sir Simon and Professor Chalder. In more than 30 years, the notion that pathophysiological processes and not just “unhelpful cognitive and behavioural responses” could be involved in perpetuation of the symptoms has not penetrated this static formulation.
This wouldn’t matter if the results of the research justified the hype. But they don’t, no matter what the ideological brigadiers continue to argue. In the new paper, Sir Simon, Professor Chalder and their colleagues simply assert their longstanding position, cite various flawed papers to back their case, and fail to acknowledge that it has come under well-grounded and robust criticism in recent years. For example, they favorably cite the reported PACE results but do not cite the peer-reviewed papers that document the study’s flaws.
The investigators might be unhappy that their ideations about patients’ “unhelpful beliefs” have lost much credibility, and that an entire issue of an academic journal–the Journal of Health Psychology–has been devoted to the PACE-gate controversy. But the broad challenge to the CBT/GET paradigm is part of the current clinical landscape as well as the medical literature. Like President Trump, Sir Simon and Professor Chalder appear to prefer ignoring bad news and making happy talk–even when their chatter is so easy to pick apart.
Who Were the Participants?
First, the investigators seem confused about whether they are investigating patients with chronic fatigue or patients with CFS. The title suggests the answer is both. But the paper itself refers to CFS throughout and to the participants as having met CFS criteria. The conflation of these two constructs makes some sense in light of the investigators’ apparent belief that fatigue exists on a continuum, with CFS “at the more severe end of the spectrum.” Many experts do not view CFS as just an extreme case of fatigue but rather as a clinical entity on its own, albeit one that has been challenging to define in the absence of a biomarker.
In the retrospective study, all 995 participants met the criteria outlined in the 2007 NICE guidance for what it called CFS/ME. Yet only 76% met the Oxford case definition, which requires six months of fatigue and no other symptoms, and 52% met the CDC criteria, which require six months of fatigue plus four of eight other symptoms. Hm. That’s odd. The 2007 NICE guidelines advised that a diagnosis of CFS/ME could be considered if a patient suffered fatigue for four months rather than the six required by both the Oxford and CDC criteria.
So did 24% of the sample only have fatigue for between four and six months? That seems hard to understand, given that participants had been ill for a mean duration of 6.64 years. Perhaps the numbers add up in some other way I haven’t figured out. Did peer reviewers assigned by the Journal of the Royal Society of Medicine notice or ask about these apparent discrepancies? Did they actually scrutinize the paper, unlike a recent BMJ peer reviewer who acknowledged in his review that he had not read “beyond the abstract” of the assigned study?
Nor is it clear if many or any of these participants experienced post-exertional malaise, which is considered a core symptom of the illness. Neither the Oxford nor CDC definitions require a version of this symptom—more recent and better case definitions do. NICE is ambiguous on the matter, including versions of it as part of the description of the fatigue but also as one of multiple optional symptoms. Without more specifics about the sample in this paper, it is hard to determine how many people with CFS were in the sample—as opposed to idiopathic chronic fatigue, for example, which might respond to some form of CBT.
As described in the paper, the course of CBT included up to 20 sessions on a twice-monthly basis. Patients were asked to fill out several questionnaires at the start of treatment, at the fourth and seventh sessions, at discharge, and at three months after discharge. The measures included the same questionnaires for physical function and fatigue as in the PACE trial—the SF-36 and the Chalder Fatigue Questionnaire. They also included more generic scales, such as those for work and social adjustment, depression and anxiety, and overall health.
It is important to note that all of these measures are subjective. The study includes no objective indicators—how far people could walk, whether they returned to work, whether they got off social benefits, and so on. And everyone knew they were receiving an intervention designed to help them. In fact, as described in PACE, the CBT approach includes informing participants that the intervention has already been proven to work. It should not be surprising that some people receiving such an intervention would report short-term but ephemeral benefits well within what might be expected from a placebo response. Without any objective measures, such responses are fraught with potential bias and inherently unreliable.
Showing Poor Results in the Most Flattering Light
Even when presented in the best light, the main results do not support the argument that the treatments overall are effective. For physical function, the mean score rose from 47.6 at baseline to 57.5 at discharge and 58.5 at three-month follow-up. (The SF-36 scale runs from 0 to 100, with higher scores representing better physical function.) In the PACE trial, a score of 65 or below was considered disabled enough for trial entry, so the mean scores at discharge and follow-up in this study represent serious disability. Likewise, the mean CFQ score at discharge and follow-up, while modestly improved since baseline, still represents high levels of fatigue.
On closer inspection, things look even worse. As it turns out, those highlighted results do not seem to take into account a lot of missing data. Of the 995 participants in the study, the investigators define 31% as lost-to-follow-up—that is, they provided no data at either the end of treatment or the follow-up assessment three months later.
So we have no idea at all what happened to almost a third of the participants. Maybe some got worse and became bed-bound or even killed themselves. Maybe some got bored with the psychotherapy. Maybe others felt it was a waste of time and found they got more from smoking weed or going fishing. It’s not a good thing when almost a third of your patients, for whatever reason, don’t bother to let you know what happened to them. This lost-to-follow-up rate is not mentioned in the abstract–a disturbing omission that could be inadvertent or could be an attempt to underplay information that reflects poorly on the reported findings.
Interestingly, the drop-outs appeared to be in worse condition at baseline than those who stayed in. They reported more depression, poorer work and social adjustment, and significantly worse physical function—their mean score was 7.38 points lower on the SF-36 scale. Perhaps they were more likely to have actual ME/CFS and not idiopathic fatigue. For some of these patients, the CBT intervention–with its message that their symptoms were being perpetuated by their “unhelpful beliefs” and irrational behavior–might have led to deterioration in their health through both organic and psychological pathways.
To their credit, the investigators acknowledge this limitation. The poorer health condition of the drop-outs, they write, “suggests that there may have been some bias in the data, in that those who completed treatment may not represent all patients who access CBT treatment for CFS.” Notwithstanding this warning, they deploy their biased data to boost the impression that the intervention is effective.
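To see why this kind of bias matters, here is a minimal sketch in Python. Only the 7.38-point baseline gap comes from the paper; the group sizes and the score levels are invented for illustration and are not the study's data. The sketch assumes that no individual's score changes at all, and shows that a mean computed only from those who stay in is still higher than the baseline mean for the whole group.

```python
def pooled_mean(groups):
    """Weighted mean over (count, mean_score) pairs."""
    total_n = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_n


# The 7.38-point gap is the figure reported in the paper.
BASELINE_GAP = 7.38

# Hypothetical group sizes and score levels, chosen only for illustration.
dropouts = (300, 42.0)                   # (count, mean SF-36 at baseline)
completers = (700, 42.0 + BASELINE_GAP)  # completers start out healthier

# Baseline mean includes everyone.
baseline_mean_all = pooled_mean([dropouts, completers])

# Assumption: nobody improves or worsens. The discharge mean is computed
# only from completers, because drop-outs provide no discharge data.
discharge_mean_observed = pooled_mean([completers])

apparent_gain = discharge_mean_observed - baseline_mean_all

print(f"Baseline mean, all participants: {baseline_mean_all:.2f}")
print(f"Discharge mean, completers only: {discharge_mean_observed:.2f}")
print(f"Apparent gain with zero real change: {apparent_gain:.2f}")
```

Under these made-up numbers, the apparent gain is entirely an artifact of who is left in the sample. The sketch says nothing about how large the effect was in the actual study; it only shows the direction of the distortion.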
And even the 31% drop-out figure isn’t a true reflection of the low data collection rates in the study. Of the 995 participants, only 581 answered the CFQ at end of treatment and only 503 at follow-up—58% and 51%, respectively. For the SF-36, only 441 responded at discharge and 404 at follow-up—44% and 41%, respectively. (For unexplained reasons, only 768 of the 995 participants provided information on the SF-36 at baseline.)
That means the loss-to-follow-up rates at discharge on the CFQ and the SF-36 were, respectively, 42% and a whopping 56%–and even worse on follow-up. When close to or more than half a sample does not provide data for a key outcome, investigators should be cautious in interpreting findings from those who managed to stick out the intervention. If participants with lower scores at baseline dropped out, as with the SF-36, that alone should raise the mean among those who remained. It seems silly to position modestly improved mean scores from a half-depleted sample as an indicator of treatment success when little or nothing is known about those who disappeared.
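For readers who want to check the arithmetic, the percentages above follow directly from the response counts reported in the paper. The short Python sketch below recomputes them. The counts and the total of 995 are from the study as described here; the variable and function names are my own, and the rounding to whole percentages is an assumption about how the figures were presented.

```python
TOTAL_PARTICIPANTS = 995

# Number of participants who answered each questionnaire at each time point.
respondents = {
    ("CFQ", "discharge"): 581,
    ("CFQ", "follow-up"): 503,
    ("SF-36", "discharge"): 441,
    ("SF-36", "follow-up"): 404,
}


def percent(count, total=TOTAL_PARTICIPANTS):
    """Share of the full sample, rounded to a whole percentage."""
    return round(100 * count / total)


# Response rate per measure and time point, and the complement (missing data).
response_rate = {key: percent(n) for key, n in respondents.items()}
missing_rate = {key: 100 - rate for key, rate in response_rate.items()}

for (measure, timepoint), rate in response_rate.items():
    missing = missing_rate[(measure, timepoint)]
    print(f"{measure} at {timepoint}: {rate}% responded, {missing}% missing")
```

The follow-up figures work out to 49% missing on the CFQ and 59% missing on the SF-36, which is what "even worse on follow-up" refers to.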
Claims of Causality
In the discussion section, the investigators write the following: “The CBT intervention led to significant improvements in patients [sic] self-reported fatigue, physical functioning and social adjustment.” From my understanding of the King’s English*, I would interpret that sentence as a statement of causality—and an unwarranted one at that. The investigators have not reported evidence that the CBT intervention “led to” anything. They have provided evidence only that their CBT intervention was chronologically followed by reported changes in mean results among a shrinking pool of participants.
*[I was informed this phrase should have been rendered as the Queen’s English, since there is a current queen and not a current king. But my American dictionary defines the King’s English as “standard, pure, or correct English speech or usage,” and does not mention a queen. In other words, the King’s English is a generic, to be used even when the monarch is female or non-binary. So I stand by my American usage.]
They make a similar extravagant slip when they write the following in their conclusions: “The lack of a control condition limits us from drawing any causal inferences as we can not be certain that the improvements seen are due to CBT alone and not any other extraneous variables.” This statement is self-contradictory. To state that the improvements might not be “due to CBT alone” is to posit as fact that they are due to CBT at least in part but that other factors might have contributed. In one sentence, the investigators are drawing a causal inference while denying the possibility of being able to do just that.
Let’s be clear. Given the study design, there is no evidence that the CBT intervention played any role whatsoever. Perhaps it did; perhaps not. It is unfortunate, but not surprising, that Sir Simon, Professor Chalder and the peer reviewers selected by the Journal of the Royal Society of Medicine* did not notice or care about these impactful misstatements of causality–a continuation of a time-honored tradition of sloppy argumentation and inadequate peer-reviewing in this domain of science. *[I initially wrote that the peer reviewers were selected by the Royal Society of Medicine, when I intended to write the Journal of the Royal Society of Medicine. I apologize for the error.]
Oh, one last point. In the abstract, the investigators highlight that 90% of patients “were satisfied with their treatment.” Presumably that impressive-looking figure does not include responses from the 31% who were lost-to-follow-up. Does it include the many others who showed up at discharge and follow-up but failed to provide key information on other questionnaires? Who knows? The study does not mention how many responded to this question, as far as I can see. It is not surprising that this squishy but deceptively presented data point found its way into the abstract’s conclusions.
My epidemiology colleagues at Berkeley have used the PACE trial in seminars as a case study of how not to conduct research. If their students turned in something as inadequate as this new Wessely-Chalder collaboration, they’d get slapped down pretty quickly.