Trial By Error: Did the IBS Trial Really Show that Web-Based CBT Offered Significant Clinical Effectiveness?

By David Tuller, DrPH

I wrote some posts last year about the ACTIB trial–a major study of telephone-delivered and web-delivered cognitive behavior therapy (TCBT and WCBT) for irritable bowel syndrome (IBS). Contrary to how the results have been framed by those with reputational and financial interests in promoting them, the study demonstrated that WCBT did not provide clinically significant benefits over treatment as usual on its two primary outcomes. At least that’s how I interpret the data.

Both of these primary outcome measures were self-reported: a symptom severity scale and a measure of impact on work and social functioning. (More on these below.) As no less an authority than the Journal of Psychosomatic Research has recently noted, these sorts of outcomes are at particular risk of bias when blinding is not rigorous. In this case, with blinding not possible given the nature of the interventions, the results should already be assumed to be infused with an unknown amount of bias. In other words, marginal or modest reported benefits would be well within the range expected given the bias inherent in an unblinded study relying on subjective outcomes.

In January of 2020, Mahana Therapeutics, a San Francisco start-up, announced that it had licensed the rights to the web-based program. In the trial, the program was called Regul8; Mahana rechristened it as Parallel. (Does anyone get the new name? Parallel to what?) I noted in blog posts and in letters to Mahana officials and advisers that the company was making inflated claims about the purported benefits of the program.

Since then, the US Food and Drug Administration has approved it for marketing–although a product like this is not assessed by anything approaching the (purportedly) rigorous standards used for pharmaceuticals. It has also received a CE mark in the UK, which apparently allows it to be sold there as well. (Given Brexit, the implications of the CE mark for European sales are not fully clear to me–or, I presume, to anyone.) This is a very minimal regulatory bar. According to the relevant UK government agency, “the CE mark is not a quality mark, nor a guarantee that the product meets all of the requirements of relevant EU product safety law.”

More recently, the same research team has published cost-effectiveness data from the IBS study. My friend and colleague Keith Geraghty, a research fellow at the University of Manchester, has tweeted about this new paper and some of its assumptions. I haven’t looked at it yet in depth, but what I don’t get is how an intervention can be cost-effective if it isn’t really clinically effective. Cost-effective at what?

The abstract of the new cost-effectiveness study starts with the following sentence: “Telephone therapist delivered CBT (TCBT) and web-based CBT (WCBT) have been shown to be significantly more clinically effective than treatment as usual (TAU) at reducing IBS symptom severity and impact at 12 months in adults with refractory IBS.”

It is important to note that the results for TCBT and WCBT in the IBS study were not the same. TCBT scored better overall on primary and secondary measures. That is not surprising. The greater personal contact could potentially lead to greater therapeutic effects; it could also lead to greater bias in participant responses. Teasing the two phenomena apart would be challenging.

But Mahana is marketing WCBT, not TCBT, and key academic investigators have financial interests in the product. The statement about the interventions being “significantly more clinically effective than treatment as usual” in the new cost-effectiveness paper prompted me to take another look at ACTIB’s 12-month results for the two co-primary outcomes. (Secondary outcomes are secondary outcomes; they should not be advanced as proving success when primary outcomes generate null results.) And when it comes to “reducing IBS symptom severity and impact at 12 months,” the above sentence from the cost-effectiveness paper does not seem to be consistent with the specific WCBT data for the primary outcomes.

**********

Two co-primary outcomes for the IBS study

The two co-primary outcomes were the 12-month results for the IBS-Symptom Severity Scale (IBS-SSS) and the Work and Social Adjustment Scale (WSAS). In previous posts, I focused on the results for the first–a 500-point scale in which higher scores represent more severe symptoms. A reduction of at least 50 points is considered a clinically significant reduction in symptom severity for an individual. In this trial, at 12 months, the WCBT group had a mean score 35.2 points lower than the TAU arm. That is quite a bit less than the 50-point difference considered to be clinically significant for an individual.

Despite this 50-point threshold, the investigators seem to have created something of a statistical out. Here’s what they wrote: “IBS-SSS is widely used in IBS studies and a 50-point within-participant change from baseline regarded as clinically significant. We used a 35-point between-group change as our minimal clinically important difference (MCID) assuming some improvement in TAU arm (15 points placebo response) as in the MIBS trial.”

Given that the 35.2-point finding for the IBS-SSS in the WCBT arm squeaks past the 35-point threshold designated by the investigators, they might consider these results to represent clinical significance. But it is unclear why they should reduce the MCID target for their intervention by 15 points because of a presumed placebo response–even if a previous trial reported such a finding.

The TAU condition is meant to replicate, more or less, the usual care that patients receive. To be considered effective, the intervention should demonstrate a clinically significant benefit when compared with TAU, not when compared with baseline. In fact, the first sentence of the abstract of the new cost-effectiveness paper specifically references the comparison with TAU, not with baseline numbers. After all, clinical trials are designed to assess the effects of interventions over and above changes in those who don’t receive them.

To be fair, the investigators also reported that a greater percentage of those who answered the IBS-SSS at 12 months in the WCBT arm reported an improvement of at least 50 points, compared to the percentage in the TAU arm. But 30% of the participants did not provide 12-month responses for the IBS-SSS, so their responses are unknown. That means interpreting these data can be challenging. Perhaps those who received the intervention were more inspired to report positive results than those who received TAU alone, even if the latter also felt they’d improved to the same extent. We simply don’t know.

In other words, these reported findings do not take into account the large amount of missing data. In contrast, the designated primary outcomes, the differences in the mean scores of the arms at 12 months, were specifically adjusted to address that issue.

The other co-primary outcome measure, the WSAS, is a 40-point scale to assess how an illness or medical condition impacts work and social domains. As with the IBS-SSS, lower scores represent better results. In the IBS study, the differences in mean scores at 12 months between the intervention arms and the TAU arm were modest: a 3.5-point difference for TCBT and a 3-point difference for WCBT. The paper did not include an MCID for that measure, unlike for the IBS-SSS.

However, members of the same research team have recently published a study of CBT as a treatment for so-called “persistent physical symptoms” (PPS), a category that includes IBS; in fact, the trial recruited a substantial number of participants from gastroenterology. In this instance, the investigators designated the WSAS as the sole primary outcome and further identified a difference of 3.6 points as the MCID for that measure. As reported, the benefits for the intervention fell well below the MCID for the measure.

If the MCID of a 3.6-point difference on the WSAS is applied to the IBS study findings, and I see no reason why it shouldn’t be, especially given the overlap in research teams and patient populations, the conclusion is clear. Neither the TCBT nor the WCBT provided clinically significant benefits over TAU on that measure.
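For readers who want to check the arithmetic, here is a minimal sketch in Python that lines up the between-group differences quoted in this post against the thresholds discussed above. The numbers are the ones cited in the post; the class and variable names are my own illustrative choices, the differences are entered as positive magnitudes of improvement over TAU, and applying the 3.6-point WSAS threshold from the PPS trial to the IBS data is my assumption rather than anything the ACTIB investigators specified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comparison:
    """One between-group result set against one threshold (illustrative only)."""
    outcome: str
    arm: str
    difference: float      # improvement over TAU at 12 months, in scale points
    threshold: float       # threshold the difference is compared against
    threshold_source: str  # where that threshold comes from

    def meets_threshold(self) -> bool:
        return self.difference >= self.threshold


# The investigators' stated reasoning for the IBS-SSS between-group MCID:
# 50-point within-participant change minus a presumed 15-point placebo response.
WITHIN_PARTICIPANT_MCID = 50.0
ASSUMED_PLACEBO_RESPONSE = 15.0
between_group_mcid = WITHIN_PARTICIPANT_MCID - ASSUMED_PLACEBO_RESPONSE

comparisons = [
    Comparison("IBS-SSS", "WCBT", 35.2, WITHIN_PARTICIPANT_MCID,
               "50-point within-participant threshold"),
    Comparison("IBS-SSS", "WCBT", 35.2, between_group_mcid,
               "investigators' 35-point between-group MCID"),
    # Assumption: the 3.6-point MCID is borrowed from the PPS trial.
    Comparison("WSAS", "TCBT", 3.5, 3.6, "MCID from the PPS trial"),
    Comparison("WSAS", "WCBT", 3.0, 3.6, "MCID from the PPS trial"),
]

for c in comparisons:
    verdict = "meets" if c.meets_threshold() else "falls short of"
    print(f"{c.outcome} {c.arm}: {c.difference} {verdict} "
          f"{c.threshold} ({c.threshold_source})")
```

On these figures, the only comparison that clears its threshold is the IBS-SSS result measured against the investigators' own reduced 35-point MCID, and it does so by 0.2 points.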

Yet the investigators still claim in the first sentence of the abstract of their new cost-effectiveness paper that both TCBT and WCBT were shown to have a clinically significant effect on reducing symptom severity and impact. And the WCBT program is being marketed with the same argument. Given that the co-primary outcomes of the major study of WCBT found no clinically significant reduction of symptom severity and no clinically significant impact on improving work and social adjustment at 12 months, these claims are certainly open to question.
