Trial By Error: Columbia Experts Urge BMJ to Retract Problem-Plagued Study

By David Tuller, DrPH

On Thursday, Professors Vincent Racaniello and Mady Hornig, both from Columbia University, wrote to BMJ’s research integrity coordinator. I have been corresponding with BMJ, and specifically the research integrity coordinator, about the Norwegian study of cognitive behavior therapy combined with music therapy as a treatment for chronic fatigue in adolescents after acute EBV infection (known in the US as mononucleosis and elsewhere as glandular fever). BMJ Paediatrics Open published the paper a few months ago.

Besides many issues with the paper itself (see below), the peer review process seems to have broken down. BMJ Paediatrics Open has open peer reviews, and the reviews are posted on the journal’s website. In this case, one of the two peer reviewers wrote in his review that he did not read “beyond the abstract.” (He provided some notes on the abstract.)

For reasons that haven’t been explained, the reviewer’s admission that he didn’t review the actual paper was not an obstacle to publication for BMJ Paediatrics Open. It is not clear if editors read the peer review and decided it didn’t matter that the reviewer didn’t read the paper, if they read the review and didn’t notice the relevant statement, or if they didn’t read the review at all. Whatever the explanation, the lapse represents a major peer review failure for BMJ.

Along with colleagues from Columbia, Berkeley and University College London, I sent a letter to BMJ alerting editors to this problem as well as multiple methodological and ethical concerns involving the paper itself. That was almost two months ago. Yet BMJ Paediatrics Open has still not warned readers that this paper did not pass BMJ’s own strict standards for peer review. Nor has it offered a deadline for resolving the self-evident problems with the paper. What is BMJ waiting for?

This casual approach to addressing critical matters of scientific integrity seems to belie the point of having a research integrity coordinator. It also suggests a perplexing indifference to the health of children suffering from a serious post-viral condition–at a time when the planet is engulfed in a viral pandemic with potentially major long-term consequences.

Besides the broken peer review process, the study suffered from multiple flaws, including but not limited to the following:

1). The trial protocol, registration and statistical analysis plan all described the research as a fully powered trial. But recruitment proved difficult, the intervention group experienced high attrition, and the results were disappointing. The published paper presented the research as a feasibility study without mentioning that it was, in fact, designed as a fully powered trial. The investigators suggested in their conclusions that the results “might justify a full-scale clinical trial”—even though they had just conducted such a trial and it had failed to generate the findings they wanted.

2). “Post-exertional malaise” was highlighted as an outcome in the published paper but was not mentioned in the trial registration, protocol or statistical analysis plan.

3). The primary outcome was average steps per day, objectively measured. Both groups performed worse on this measure after the trial than they did at baseline, with the decline even greater among those who received the intervention. In arguing in their conclusion that the findings “might justify a full-scale clinical trial,” the investigators omitted reference to these poor results for their pre-designated primary outcome.

4). The investigators constructed a definition of “recovery” that ignored the objectively measured primary outcome and relied solely on a subjective secondary outcome. Then they presented these questionable “recovery” data only in a per-protocol analysis rather than an intention-to-treat analysis. The per-protocol analysis by definition failed to account for the high attrition rate in the intervention group, leading to an inflated reported “recovery” rate.
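
To see why a per-protocol analysis can inflate a “recovery” rate when attrition is high, here is a minimal arithmetic sketch. The numbers are hypothetical and are not taken from the study. The function name `recovery_rates` is made up for illustration, and counting every dropout as “not recovered” is just one conservative convention for an intention-to-treat-style calculation, not necessarily what a formal analysis plan would specify.

```python
def recovery_rates(randomized, completed, recovered):
    """Return (per_protocol, intention_to_treat) recovery proportions.

    Assumption for this sketch: in the intention-to-treat figure,
    participants who dropped out are counted as not recovered.
    """
    if not (0 <= recovered <= completed <= randomized) or completed == 0:
        raise ValueError("need 0 <= recovered <= completed <= randomized and completed > 0")
    per_protocol = recovered / completed         # only those who finished treatment
    intention_to_treat = recovered / randomized  # everyone originally randomized
    return per_protocol, intention_to_treat


# Hypothetical arm: 30 randomized, 10 drop out, 10 of the 20 completers "recover".
pp, itt = recovery_rates(randomized=30, completed=20, recovered=10)
print(f"per-protocol: {pp:.0%}, intention-to-treat: {itt:.0%}")
```

With these made-up figures, the same ten “recovered” participants are half of the completers but only a third of everyone randomized, so the more people drop out, the wider the gap between the two rates becomes.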

In their letter to the BMJ’s research integrity coordinator, Professors Racaniello and Hornig note that continued delay in this matter is not acceptable. As they write, “It seems clear that BMJ has an obligation to inform readers immediately that the paper did not pass a proper peer review, and, assuming the methodological lapses documented by Dr. Tuller are confirmed, to retract the paper.”

The letter was cc’d to the other co-signers of the initial letter (Professor Jonathan Edwards of University College London, Professor John Swartzberg of UC Berkeley, and me). Also cc’d were Professor Imti Choonara, BMJ Paediatrics Open’s editor-in-chief; Dr Fiona Godlee, BMJ’s editorial director; and Ingrid Spilde, a Norwegian journalist who has written about the study.


Dear Ms. Ragavooloo–

We hope you are managing well through these complicated times.

With increasing reports of persistent post-COVID-19 symptoms and concerns about the potential for these post-viral syndromes to develop into ME (concerns acknowledged by NIAID Director, Dr. Anthony Fauci, 9 July 2020), physicians are increasingly looking to authoritative sources to learn about post-viral fatigue as well as ME and how to best manage them.

Ensuring that the publications to which readers will turn for clinical guidance have passed proper peer review and be free of study should certainly be a matter of vital importance at the BMJ.

It is in this context that we wish to underscore the importance of rapid resolution of the matters to which Dr. Tuller, along with us and other colleagues, first called your attention on 31 May 2020.

It seems clear that BMJ has an obligation to inform readers immediately that the paper did not pass a proper peer review, and, assuming the methodological lapses documented by Dr. Tuller are confirmed, to retract the paper.

We thank you in advance for your considered attention and prompt action.


Vincent R. Racaniello, PhD
Professor of Microbiology and Immunology
Columbia University
New York, New York, USA

Mady Hornig, MA, MD
Associate Professor of Epidemiology
Columbia University Mailman School of Public Health
New York, New York, USA


By the way, if you noticed a glitch in the third paragraph of the letter, good for you! I am told that the phrase “free of study” should have read “free of protocol deviations, ethical anomalies and other issues.” Oops! I assume BMJ got the point in any event.

Comments
  • Ellen Goudsmit 25 July 2020, 5:14 pm

    This is consistent with the study of the BMJ by Dr Stouten and myself, which is available online. It is also covered in an article with Sandra Howes in J Health Psych. But it just gets worse.

  • Anon. 26 July 2020, 1:49 pm

    1. Post-exertional malaise was *not* measured – they substituted post-exertional *fatigue* for it AND limited it to the day after – PEM can be delayed by up to 3 days and can include ANY combination of symptoms.
    This can be seen in the graph.

    The Canadian working definition, International Consensus Criteria and every other definition of post-exertional malaise I’ve seen do not limit it to fatigue or insist it must begin by the following day.

    2. I do not understand how a drop out rate of 7/21 (a third) equates to “acceptable” treatment. This *was* in the abstract. The time keeping of those remaining in the study is likely to be the result of their parents driving them, rather than an indication of acceptance.

    3. Reviewer Maria Loades is well known for her insistence that CBT “works” for CFS and should not ever have been asked to review a paper on a treatment that she has time and again published her own dubious results for – I notice though that she suggests dropping the mention of “statistically significant”.

    Because the outcome was not within statistically significant parameters perhaps?

    The reviewer from “Google”, no academic institution or location given, points out that a p-value of 0.12 is not adequate if you are claiming the treatment works, and that the figures on acceptability are also misleading. Perhaps he stopped reading due to the paper being unfit to publish?

    That is what his comments suggest – but clearly this BMJ journal needs it spelled out in triplicate.

    4. P.S. Do reviewers get paid much? I quite fancy a job that takes moments and requires no academic institution or other affiliation.

    5. A 7-day measurement of steps is wholly inadequate as a measure during a 3-month trial. Patients were aware of the measurement and may have wanted to do “well” or simply had a better week that week – and could well have deteriorated the following week, a pattern that often happens with the illness. I am also unclear on whether there were visible readings that patients could see – anyone who has had a step counter will know they are quite a motivator at first – especially during a treatment program when the children and teenagers were hoping for a positive result.

    Such a disappointment to see more research waste.

    I recommend EM Goudsmit’s article on bias in the BMJ, as she mentioned above:

    2004 Goudsmit and Stouten – Chronic Fatigue Syndrome: Editorial Bias in the British Medical Journal
    doi: 10.1300/J092v12n04_05

    More recently, also by
    Goudsmit, EM and Howes, S. (2017) –
    Bias, misleading information and lack of respect for alternative views have distorted perceptions of myalgic encephalomyelitis/chronic fatigue syndrome and its treatment

  • SusanC 26 July 2020, 3:19 pm

    From reading the paper, it appears that the treatment group actually did worse than the control group on the primary outcome measure defined in the registration of the trial (i.e. mean steps per day). They don’t make it at all clear if this was statistically significant or not.

    If it was statistically significant, they should have made this negative result (i.e. the treatment actually makes patients worse) much more prominent in the paper.

    Even if it wasn’t statistically significant (that is, the effect of the treatment is too small to detect with the given sample size), some more discussion of the power of the test and what this says about the clinical significance of the effect would be in order (e.g. can we conclude that the benefit of the treatment, if there is any at all, is too small to be worthwhile?).

    So OK, these things happen all the time in research: you don’t get the result you hoped for, or the effect is smaller than you hoped for when you picked the sample size, or the drop-out rate is higher than expected etc. But still, better reporting in the paper of the negative result is called for.

    (I’ve certainly had to stand up in project review meetings with government sponsors and report findings along the line of “the fancy new thing works, but it works less well than this competing older thing we’re comparing it against”. And said something along the lines of “we had to actually do this experiment to find that out” when the sponsors raise an eyebrow at the negative result.)

  • CT 26 July 2020, 5:27 pm

    Come on BMJ (company), do the necessary, this isn’t a good look.

  • SusanC 27 July 2020, 4:49 am

    Figure 3 in the paper doesn’t have error bars; it really should.

    In general, when you report this kind of statistical data you should also report the uncertainty in the measurement due to the sample size. This is particularly the case here, where we’re told that the study is statistically underpowered and they had difficulty recruiting participants. From the figure, you have no idea whether the effect is real or just random noise due to the small sample size. And this figure is the really important one, because it’s where they report the outcome measure they declared up front in the registration of the clinical trial.

  • Jen 27 July 2020, 8:48 am

    Is it ironic that CBT & GET made me so ill that I could not even listen to music for 5 years? I am a music lover so the loss was significant. I still can’t listen to music every day like I used to, but I can every other week or so for short periods. I find music therapeutic and have benefitted throughout my life in lots of different ways, but thinking of it as some sort of physical cure for a distinct metabolic energy impairment of PENE… instead of actually investigating the metabolic impairment, I mean, jeez these people are ridiculous. How do people like this get funding, but Karl Morten at Oxford can’t get funded to actually study the metabolics and is fundraising from patients! What kind of crazy world do we live in that research funders think this is a good use of resources in the long term? I despair at modern science. It’s become fanciful and scientific principles are lost in hubris & woolgathering.

  • jimells 27 July 2020, 9:11 am

    Dear Academics,

    Please stop moaning about how awful all these publishers are. They exist to generate a return on investment. They do so by publishing marketing materials, not scientific research. No amount of letter-writing will change the stripes on these skunks.

    Exposing fraud is always good, but even better would be to do your own publishing. I recall that Columbia already has a publishing arm called “Columbia University Press”. I see on their website that they are “a leading publisher of scholarly and trade books in the humanities, social sciences, and sciences.” They probably even have staff that know about research journals.

    Instead of Columbia’s library sending huge fees to Elsevier and the other profiteers, how about spending the money in house?

  • SusanC 2 August 2020, 3:36 pm

    Some of the non-academics here are wondering if this is a sign of a more general problem with peer review.

    As an academic, I’ll say that the phenomenon of the junk journal that will print just about anything is well known. You can get just about anything – no matter how bad – published somewhere. But for your publication to count when you are applying for an academic job or looking for a promotion (or just for your contract to be renewed) the publication needs to be in some venue that the department considering hiring you believes is non-junk.

    Part of the reason why you can’t solve the peer review problem by having a university department start a new journal is that if you do that, everyone (especially the people who get to decide whether academics get hired) will default to thinking that the new journal is another one of those junk journals that print any rubbish they get sent; a serious effort is needed to convince the community that a journal is not junk, and doing this is quite hard.
