Trial By Error: My Exchange With Professor Bishop

By David Tuller, DrPH

I recently wrote to Oxford University neuropsychologist Dorothy Bishop, who had provided a statement to the Science Media Centre about the Lightning Process study. Although she had expressed concerns about the pseudo-scientific nature of the intervention, she found the study to be generally well conducted and noted that the findings appeared to be solid. In my e-mail, which I posted on Virology Blog, I noted the study’s methodological anomalies and asked her to review the documentation and consider whether she wanted to revise her remarks.

Professor Bishop wrote back to me that she would review the documentation and respond within a week or two. I heard from her again over the weekend. I appreciate that Professor Bishop took the time to reconsider the issues. Unfortunately, she reaffirmed her support for the findings–but she based that support on a clear misreading of the documentation, as I noted in my response to her.

Specifically, Professor Bishop referred to a sensitivity analysis conducted to compare responses provided before and after 2011, when the investigators received permission to obtain answers by telephone. She mistakenly maintained that this sensitivity analysis addressed the question of bias introduced by the outcome-swapping that occurred in 2012. Since the sensitivity analysis had nothing to do with the outcome-swapping, it is unclear why she has made this assertion.

Professor Bishop also mistakenly declared that the outcome-swapping made no difference in the results. She based this on the fact that what she referred to as the original primary outcome–school attendance at 12 months–provided positive results. The problem is that the original primary outcome was school attendance at six months, not at 12 months; the latter was always a secondary outcome. And school attendance at six months had null results.

Moreover, the investigators promised in both the feasibility and trial protocols to obtain official school records in order to double-check the reliability of self-reported school attendance. These data were not provided in the trial report, for unexplained reasons. Because of this disappointing omission, we have no way of knowing whether even the positive self-reports for school attendance at 12 months were accurate. One possible reason the investigators chose not to include this information is that it did not in fact support the claim of improved school attendance at 12 months.

Therefore, both of Professor Bishop’s reasons for upholding her opinion that the findings were “solid” turned out to be fallacious, as I wrote in my response. I have not heard anything further from her.

**********

Dear Dr Tuller,

I am writing in response to your email concerning the study by Esther Crawley and colleagues evaluating the Lightning Process as a treatment for ME/CFS.

You are already familiar with my views on the Lightning Process, which I stated when asked for an opinion of the study by the Science Media Centre, and subsequently in an interview with Tom Chivers for a piece he wrote on this subject. I have myself been in the position of having enthusiasts trying to persuade me to run a trial of an intervention that appears pseudoscientific, and I can confirm that it places a researcher in a difficult position: if you refuse to engage, you can be accused of having a closed mind; but if you do go ahead, you run the risk of appearing to endorse an unscientific approach, which is all the more worrisome if there are commercial implications. Having said that, I would not be happy myself if I had a child with ME/CFS who was offered a treatment with such spurious credentials, and I question the wisdom of conducting the trial.

Given that the trial was run, the question of whether it was properly conducted and whether it demonstrated effectiveness of LP are separate matters. In terms of trial conduct, the inclusion of a feasibility study is good practice. The report of this indicates it was useful in establishing what patients found acceptable in terms of procedures and measures. It led to more use of telephone interviews to reduce the burden of paper questionnaires, and it raised questions about using school attendance as a primary outcome, given that it was deemed particularly inappropriate for those doing A-levels. These changes were approved and were included in the next phase when new patients were recruited after the feasibility study. Was it reasonable to incorporate the patients from the feasibility study into the new study? This would be efficient in that the data from the feasibility patients would not be wasted, and it would reduce the length of the study overall, but, as you have noted, it introduces important risk of bias, even if a reasonable rationale is given. So the answer to that question is yes only if the risk of bias introduced by this move could be managed. It seems the ethics committee scrutinising the study felt that was the case. We need to turn to the study report to see how this was done.

One way to avoid bias is to restrict the analysis to cases recruited after the feasibility study was complete. In the 2017 paper the authors mention a sensitivity analysis which included only those cases recruited after 2011 – who would not be biased – and it is stated that the results were similar. This sensitivity analysis ‘restricted to participants recruited after the protocol changed to collect primary outcome data by telephone’ is again mentioned in the discussion. Table 2 shows the relevant data, which support the claim that the results on SF-36 hold for the subset of children recruited after the feasibility study.

A second way to give confidence against biased findings is to analyse the data from the original primary outcome measure, even though this is thought less optimal. In your email, you stated that the original outcome measures, school attendance, did not show any benefit of LP; I think that is wrong. My understanding of Table 3 is that there was a benefit at 12 months. Thus if the original primary outcome measure had been used, the benefit of LP would still be evident.

I agree it is problematic that neither the 2013 paper describing the study protocol nor the 2017 report of the study highlighted that the study involved combining an existing feasibility study with newly recruited cases, with a change of primary outcome between the two phases. I agree that while this information can be extracted from the various papers, it is not at all obvious.

I find it surprising that this issue was not given any prominence in the write-up, given its important implications for the analysis. However, I do not agree that the study is invalidated by the design. The report contains two comparisons that would not be affected by bias from change of outcome, viz the analysis of SF-36 in Table 2, row 3, and the analysis of school attendance in last row of Table 3, and both support a benefit for LP.

What may be driving this effect is a good question; I agree that it is important to be cautious where outcome measures are based on subjective report, as apparent improvement may not be mirrored by objective measures. This is illustrated by this example in my own field of study: Bull, L. (2007). Sunflower therapy for children with specific learning difficulties (dyslexia): A randomised, controlled trial. Complement Ther Clin Pract, 13, 15-24. In this study the benefits were not seen on direct assessment of the child, but parents were nevertheless enthusiastic about the programme. This emphasises the complexity of evaluating interventions where subjective and objective measures of outcome may diverge.

In sum, it is rather extraordinary for me to find myself in the position of defending a study that is not in my area and that claims to find a benefit from a pseudoscientific intervention. However, since you ask for my judgement on the issues you raised, my reading is that the data from the 2017 paper support the conclusion of superior outcome for the LP, both for the original primary outcome, and for the revised primary outcome; crucially, the latter results appear to hold when analysis is restricted to those recruited after the feasibility study.

I realise that you and others will be disappointed not to have my support for all your criticisms, but I think that claims that the results are invalidated by outcome switching are not supported by the published data. My opinion is based on my reading of the 2 papers from 2013, the paper from 2017 and the trial registration. I have taken some hours to look over the published papers because you made a reasonable request for me to do so, and I have given my best judgement in good faith. Given the history of this trial, I anticipate that many people will want to debate this further, but it is not feasible for me to devote any more time to this issue, as my priority has to be conducting my own research. This has, therefore, to be my final word on this topic. I do not mind you sharing this response, but would ask that if you do so you quote the entire email.

Yours sincerely

**********

Dear Dr. Bishop–

I might respond at greater length in future, but in any event I wanted to point out a couple of errors in your analysis right now. Both of the arguments you have advanced about why the study findings can be trusted are simply incorrect and based on a misreading of the documents.

1) The decision to get answers by telephone was made in 2011. The decision to extend the feasibility trial and swap outcomes was made in 2012. The sensitivity analysis had to do with the first event, and not with the second. So it is unclear why you would make the claim that the sensitivity analysis conducted for those recruited after the change in methodology in 2011 would address the bias introduced a year later by outcome swapping, after 56 out of 100 had already provided data. The sensitivity analysis is irrelevant here. The only way to do what you have suggested should be done would be a separate analysis involving the 44 participants recruited after the outcome swapping. That analysis has not been provided.

2) You are mistaken about the results for the original primary outcome of school attendance. The original outcome was school attendance at six months, not at 12 months. For six months, the results were null. Moreover, both the feasibility trial and the full trial protocols promised that self-reported school attendance would be checked with official school records. This was not done, apparently, or at least not mentioned in the paper. So the self-reported school attendance at 12 months is not the original primary outcome, contrary to your claim in your response. Had they reported their original primary outcome, they would have had to report null results, as I have already documented.

3) Your conclusion that the results can be trusted is based on your erroneous statements or assumptions about points 1) and 2) mentioned above. Given that neither of your responses to the first two points was based on accurate information about the study, do you still believe that the results as reported can be trusted? In other words, since there was no sensitivity analysis that addressed the issue and since school attendance at six months had null results, do you want to revise your conclusion?

4) You have not addressed the simple fact that BMJ has a strict policy against publishing trials that have recruited participants before registration–with good reason. BMJ’s policy does not seem to carry an exception for cases in which feasibility trials are extended into full trials. I would think it certainly doesn’t carry an exception for cases in which outcomes were swapped more than half-way through. Do you agree that this study violates BMJ’s own policy?

5) You mention that it is “problematic” and “surprising” that these various aspects were not mentioned in the trial paper. Wouldn’t “astonishing” and “disgraceful” be more appropriate words to describe this omission?

6) At this point, what recommendations do you have for BMJ about addressing the obvious issues with the study?

Thanks again for your response.

Best–David


  • Trish Davis 26 June 2018, 5:21 am

    Thank you David for pursuing this. I accept that Prof. Bishop has qualms about LP, and was reluctant to endorse the study. Surely that places an extra responsibility on her to be scrupulously careful in her analysis of the situation with this trial. She says herself from her own experience with another trial that subjective measures are not valid, yet she accepts the switch to a subjective primary outcome measure on this trial as valid.

    I can understand that she did not spot all the flaws when she initially made a public comment on the paper, but to state after this second look: “This has, therefore, to be my final word on this topic” seems to me to be completely unacceptable, even irresponsible. Either she should publicly withdraw her original endorsement of the study on the grounds that she does not have time to examine it carefully, or she should be prepared to examine and respond further to any flaws in her analysis that are pointed out to her.

    This is not just an academic exercise; it affects the health and lives of vulnerable adults as well as children, who may take her endorsement of the trial as evidence to support them wasting large amounts of money and endangering their health with a crackpot treatment.