By David Tuller, DrPH
[*In the last paragraph, I mistakenly referred to the CODES protocol rather than the CODES statistical analysis plan. I apologize for the error.]
I have recently written about CODES, the high-profile clinical trial investigating whether cognitive behavior therapy (CBT) could reduce the frequency of dissociative seizures, also known as psychogenic non-epileptic seizures. The trial, published by Lancet Psychiatry, was an open-label study relying on self-reported outcomes–a design highly vulnerable to bias. The 368 participants were randomized into two arms: a group that received standardized medical care (SMC) along with a form of CBT designed to impart seizure-reduction strategies, and a group that received SMC alone.
CODES reported null results for the primary outcome of seizure frequency per month at 12 months after randomization. This poor showing must have been a serious disappointment for the investigators, some of whom have been promoting CBT as a treatment for these unexplained seizures for more than a decade.
Two of the three lead investigators are with King’s College London’s Institute of Psychiatry, Psychology & Neuroscience. As I have already reported, the university issued a press release that inexplicably presented the trial as a success because of findings on some secondary outcomes while not mentioning that the primary outcome had null results. (The null results for seizure frequency at 12 months were mentioned in passing, but not the essential fact that this was the outcome deemed most important by the investigators.) One of the lead investigators even asserted, per the press release, that the study provided “evidence for the effectiveness” of the treatment. That rosy blurb about the study findings is not credible.
Earlier this month, I wrote to King’s College London about my concerns regarding this perplexing press release, but have not yet heard back.
Here is an assessment of the study from my colleague Philip Stark, a professor of statistics at Berkeley: “The trial did not support the primary clinical outcome, only secondary outcomes that involve subjective ratings by the subjects and their physicians, who knew their treatment status. This is a situation in which the placebo effect is especially likely to be confounded with treatment efficacy. The design of the trial evidently made no attempt to reduce confounding from the placebo effect. As a result, it is not clear whether CBT per se is responsible for any of the observed improvements in secondary outcomes.”
More on the Primary Outcome
I wanted to look a bit more at both the primary outcome and the smorgasbord of 16 secondary outcomes–all self-reported–that allowed the investigators to maximize their chances of obtaining positive results. Given their susceptibility to bias, self-reported outcomes in open-label studies are not viewed as acceptable evidence to support regulatory approval of pharmaceuticals.
And “self-reported” applies to outcomes involving the number of seizures as well. In this study, participants were asked to keep a daily seizure diary, and the data from these diaries were collected every two weeks. But just as people can forget to take daily pills on some mornings, they can forget to record one or more seizures they might experience on any given day. Anything self-reported–even what is or is supposed to be reported daily–can be subject to some form of recall or recording bias. It is certainly possible that participants who know they are receiving a treatment expected to help them and those who know they are not receiving that treatment would fill out daily seizure diaries differently. So while the number of seizures could be called an “objective-like” outcome, when measured in this way it remains subject to external influences unrelated to actual treatment effect.
In any event, the primary outcome yielded unexpected results–at least for the investigators. In the intervention group, the median number of seizures per month at baseline was 12.5, compared to 19 in the SMC group. (That big difference after randomization is odd.) At the 12-month assessment, those in the SMC group had a median of 7 seizures per month–a reduction of 12 seizures. In comparison, the CBT group had a median of 4 seizures a month–a reduction of 8.5 seizures.
Thus, while the proportional drop in the medians was similar in the two groups (about 63% for SMC versus 68% for CBT), the group that did not receive the intervention experienced a bigger drop in absolute numbers of seizures (12 versus 8.5). According to the study’s analysis, these data revealed a trend or tendency in that direction, although the findings did not reach the threshold of statistical significance. Let me repeat that: In the largest-ever study of the long-standing belief that CBT leads to a reduction in the number of seizures, the non-CBT group reported better results.
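For readers who want to check the arithmetic on the medians quoted above, here is a minimal Python sketch. The figures are the ones cited in this post; the variable and function names are purely illustrative. One caveat: subtracting one group median from another is only a rough summary and is not the same thing as the study's own statistical model.

```python
# Median seizures per month as quoted in the post.
# The dictionary keys and function name are illustrative assumptions.
medians = {
    "CBT + SMC": {"baseline": 12.5, "month_12": 4.0},
    "SMC alone": {"baseline": 19.0, "month_12": 7.0},
}


def summarize(baseline, month_12):
    """Return (absolute drop, proportional drop) between two medians."""
    absolute = baseline - month_12
    proportional = absolute / baseline
    return absolute, proportional


results = {arm: summarize(**values) for arm, values in medians.items()}

for arm, (absolute, proportional) in results.items():
    print(f"{arm}: drop of {absolute:g} seizures/month ({proportional:.0%})")
```

On these figures, the absolute drop in the median is larger in the SMC group (12 versus 8.5 seizures per month), while the proportional drop works out to 68% for the CBT group and roughly 63% for the SMC group.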
So this wasn’t even a near-miss. These definitive findings from CODES indisputably undermine claims stretching back more than a decade that CBT can reduce the number of seizures in patients with this disorder.
More On All Those Secondary Outcomes
The investigators included 16 secondary outcomes in the study, measured either through questionnaires or the seizure diaries, and reported statistically significant findings for nine of them: seizure bothersomeness, longest period of seizure-free days in the last six months, health-related quality of life, psychological distress, work and social adjustment, number of somatic symptoms, self-rated overall improvement, clinician-rated overall improvement, and satisfaction with treatment. Although many of these findings were modest, the array appeared impressive.
Yet the seven outcomes that failed to achieve statistically significant effects also constituted an impressive array: seizure severity, freedom from seizures in the last three months, reduction in seizure frequency of more than 50% relative to baseline, anxiety, depression, and both mental and physical scales on a different instrument assessing health-related quality of life than the one that yielded positive results.
So parsing these findings, CBT participants reported that the seizures were less bothersome than in the SMC group, but not less severe. They reported benefits on one health-related quality-of-life instrument, but not on two separate scales on another health-related quality-of-life instrument. They reported less psychological distress, but not less anxiety and depression. When viewed from that perspective, the results seem somewhat arbitrary, with findings perhaps dependent on how a particular instrument framed this or that construct.
If investigators throw 16 packages of spaghetti at the wall, some of them are likely to stick. The greater the number of secondary outcomes included in a study, the more likely it is that one or more will generate positive results, if only by chance. Given that, it would make sense for investigators to throw as many packages of spaghetti at the wall as feasible–unless they have to pay a statistical penalty for having boosted their odds of apparent success.
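To put a rough number on the spaghetti problem, here is a small Python sketch. It rests on two simplifying assumptions that are mine, not the trial's: that the treatment has no real effect on any outcome, and that the outcomes are statistically independent of one another. Real questionnaire outcomes are usually correlated, so this should be read as an illustration of the principle rather than an estimate for CODES.

```python
def chance_of_any_false_positive(n_tests, alpha=0.05):
    """Probability that at least one of n independent tests crosses
    the significance threshold by chance alone (no true effects)."""
    return 1 - (1 - alpha) ** n_tests


for n in (1, 5, 16):
    print(f"{n:>2} outcomes: {chance_of_any_false_positive(n):.0%}")
```

Under those assumptions, 16 outcomes tested at the conventional 0.05 threshold give a better-than-even chance (about 56%) of at least one "positive" finding arising purely by chance.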
The standard statistical penalty involves accounting for the expanded number of outcomes with a procedure called correcting (or adjusting) for multiple comparisons (or analyses). In such circumstances, statistical formulae can be used to tighten the criteria for what should be considered statistically significant results–that is, results that are very unlikely to have occurred by chance.
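As a sketch of what such a correction does, the Python below applies two widely used procedures, Bonferroni and Holm, to a list of p-values. The p-values are invented for illustration and are not taken from CODES, and nothing here should be read as the method the investigators used; the point is only to show how the bar for significance rises as the number of comparisons grows.

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni: compare every p-value with alpha / number of tests."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


def holm(p_values, alpha=0.05):
    """Holm step-down: test p-values from smallest to largest against
    alpha/m, alpha/(m-1), ... and stop at the first failure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break
    return reject


# Hypothetical p-values, NOT data from the trial.
hypothetical_p = [0.001, 0.004, 0.012, 0.03, 0.04, 0.2]

unadjusted = [p <= 0.05 for p in hypothetical_p]
print("unadjusted:", sum(unadjusted), "significant")
print("Bonferroni:", sum(bonferroni(hypothetical_p)), "significant")
print("Holm:      ", sum(holm(hypothetical_p)), "significant")
```

With these made-up numbers, five results clear the unadjusted 0.05 threshold, three survive Holm, and two survive Bonferroni. Findings that look impressive in isolation can shrink considerably once the number of comparisons is taken into account.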
The CODES protocol made no mention of correcting for this large number of analyses, or comparisons. The CODES statistical analysis plan included the following, under the heading of “method for handling multiple comparisons”: “There is only a single primary outcome, and no formal adjustment of p values for multiple testing will be applied. However, care should be taken when interpreting the numerous secondary outcomes.”
In other words, the investigators decided not to perform a routine statistical adjustment despite their broad range of secondary outcomes. It is fair to call this a questionable choice, or at least one that departs from the approach advocated by many trial design experts and statisticians, such as Professor Stark, my Berkeley colleague. A self-admonition to take care “when interpreting the numerous secondary outcomes” is not an appropriate substitute for an acceptable statistical strategy to address the potpourri of included measures.
Despite this lapse, it appears that someone–perhaps a peer-reviewer?–questioned the decision to completely omit this statistical step. A paragraph buried deep in the paper mentions the results after correcting for multiple comparisons, with no further comment on the implications. Of the nine secondary outcomes initially found to be statistically significant, only five survived this more stringent analysis: longest period of seizure-free days in the last six months, work and social adjustment, self-rated overall improvement, clinician-rated overall improvement, and treatment satisfaction.
Let’s be clear: These are pretty meager findings, especially since they are self-reported measures in an open-label trial. For example, it is understandable and even expected that those who received CBT would report more “treatment satisfaction” than those who did not receive it. It is also understandable that a participant who received a treatment, and the clinician who treated that participant, would be more likely to rate the participant’s health as improved than would their counterparts in the SMC group. And a course of CBT could well help individuals with medical problems adjust to their troubling condition in work and social situations.
None of this means that the core condition itself has been treated–especially since those who did not receive CBT had better results for the primary outcome of seizure reduction at 12 months.
Dismissing the Primary Outcome After the Fact
When a trial fails to produce the expected results, it should lead smart investigators to reexamine their hypotheses. In this case, that would mean questioning the framework for the intervention, a specialized form of CBT that included training aimed at seizure reduction. Since the treatment did not reduce the number of seizures, perhaps the assumptions underlying the intervention were misguided. Yet the investigators do not appear to seriously entertain this possibility. Instead, they challenge the appropriateness of their own primary outcome.
Here’s what they write: “Increasing evidence suggests that seizure frequency alone is not the most important determinant of quality of life in patients with dissociative seizures. A 2017 study found that mood, anxiety, and illness perceptions were the factors most closely linked with quality of life in patients with dissociative seizures, emphasising the importance of the secondary outcome effects in this trial.”
A quick review of the 2017 paper suggests the investigators are engaging in some over-interpretation as they seek to squeeze meaning from their weak findings beyond the negation of their prior assertions. The 2017 paper reported on a cross-sectional study of a group of patients, including those with dissociative disorders, about associations between various indicators and quality of life. It is true that, in the analysis, some psychosocial factors were linked to quality of life, while number of seizures was not. But it is unwarranted to take these findings to mean seizure reduction is of secondary importance to patients.
Let’s imagine two people, one with 20 unexplained seizures a month and one with 15. After multiple consultations, neither can find answers to their medical questions or treatment beyond the suggestion of psychotherapy. In both cases, work associates, friends and even family members express doubts about the reality of their illness. All of this leads to poor quality of life–whether they have 20 or 15 unexplained seizures a month. In other words, the salient operative factor impacting quality of life might be the presence or absence of seizures for which no one can provide an adequate medical explanation, not the absolute number of such events.
Under the circumstances, the interpretation offered by the investigators–that seizure reduction is less essential to quality of life than other indicators–is unconvincing. In any event, none of the three outcomes designated in CODES as measuring some aspect of health-related quality of life survived the correction for multiple comparisons. That suggests the intervention had no measurable impact on health-related quality of life, so perhaps the issue is moot.
The CBT intervention relied on the notion that unexplained seizures were amenable to reduction and perhaps elimination through cognitive restructuring. The approach was grounded in the assertion that these events were “lacking organic etiology”–as expressed in the 2010 pilot that road-tested the intervention for this purportedly “psychogenic” condition. Attitudes toward dissociative seizures have shifted in the last ten years, as evidenced by the name change and the more cautious language used in the medical literature when describing the presumed causes of this phenomenon.
Despite these developments, the therapeutic approach itself does not appear to have changed accordingly. The description of the intervention in the study’s supplemental material noted that it was “based on models of fear-avoidance and views DS as states of altered awareness and responsiveness initially occurring in the context of heightened arousal which leads to emotional and behavioural avoidance.”
To support this theoretical foundation, the investigators cite a 2006 study, co-authored by one of them, that defines these seizures as having “no epileptic or other organic basis”–the same argument advanced in the 2010 pilot study. The 2006 paper concluded that “the anxiety related symptoms and avoidance behaviour prevalent in DS are a potential focus for a cognitive behavioural approach analogous to that used in the treatment of other anxiety disorders.”
If the investigators now profess to have a more nuanced understanding of the disorder they are investigating, it is not clear how these advances have been reflected in the intervention itself. Of course, this distinction wouldn’t matter much if the seizure-reduction-oriented CBT had actually been effective in reducing seizures.
Violating an Admonition for Care in Interpreting Outcomes
If CBT helps people to cope with challenges, including major illness and disease, that’s excellent. In the case of CODES, perhaps some people gained from wise therapeutic counsel and experienced reduced levels of stress. Such benefits should be welcomed.
But no one should confuse these benefits with treatment of the core condition itself. It would be silly to suggest that CBT can cure cancer or Parkinson’s, even though people with cancer or Parkinson’s might benefit from the adjunctive support offered by CBT. Given the trial’s size and reach, the findings from CODES should be viewed as the best answer we are likely to have as to whether CBT leads to seizure reduction. The answer is no.
Post-CODES, it should no longer be possible to assert that CBT is a treatment for dissociative seizures. At best, it appears to be an intervention that can provide some patients with the kinds of psychosocial relief that would be expected from a course of CBT–and no more. The CODES statistical analysis plan* [I initially wrote “protocol” here and corrected it the day after publication] promised that care would be taken “when interpreting the numerous secondary outcomes.” Given the efforts to promote the trial as a success based on a handful of modest findings in some secondary outcomes, that promise seems to have been breached.
Alicia Butcher Ehrhardt, PhD says
Is there an equivalent to medical or legal malpractice – for scientists?
Sounds like there definitely needs to be, especially when people are being harmed.
There’d be some possibility researchers would be careful if they could be sued for malpractice and negligence.
After all, they are advocating for treatment – treatment that is a fraud. And dangerous.
Thanks for another excellent rundown of yet another dreadful CBT-related paper. One thing strikes me though – if the patients who received CBT had no significant reduction in the number of seizures but the time interval between seizures was increased, doesn’t that mean that they had more seizures together at one time? That doesn’t sound to me like a good outcome, it sounds exhausting.
Peter Trewhitt says
I wish we could know whether the authors are still happy with their paper and its hype and are just ignoring you David as a trivial nuisance in the hope you will go away, or if they at any level recognise the validity of your critique and are desperately messaging each other as to which is better damage control, responding or ignoring you in the vain hope you will go away.
On the larger stage, there is an interesting research question here, where a group of academics are so attached to their theory that they repeatedly use an extremely weak methodology over and over again for confirmatory evidence and fail to address any critiques of their work. How do they justify this to themselves? Is this a failure of their understanding, a basic lack of insight into formulations of science taught to first year undergraduates for well over sixty years or is this a cynical manipulation of a flawed system for self promotion and potential financial gain, milking the CBT cash cow as it were.
In the UK in the Blair years the commoditisation of CBT and an obsession with process fitted the political zeitgeist. Developing marketable care packages that could be delivered ‘cost effectively’ by technicians with little or no understanding of the broader issues was seen as a desirable goal. This seems to have been replicated in the science misused to support it, with a clinical research package delivered again and again by research bots in relation to new patient populations with no understanding of research method or of the patient groups it is inflicted on.