By David Tuller, DrPH
David Tuller is academic coordinator of the concurrent master's degree program in public health and journalism at the University of California, Berkeley.
First, some comments: When Virology Blog posted my very, very, very long investigation of the PACE trial two weeks ago, I hoped that the information would gradually leak out beyond the ME/CFS world. So I've been overwhelmed by the response, to say the least, and technologically unprepared for my viral moment. I didn't even have a photo on my Twitter profile until yesterday.
Given the speed at which events are unfolding, I thought it made sense to share a few thoughts, prompted by some of the reactions and comments and subsequent developments.
I approached this story as a journalist, not an academic. I read as much as I could and talked to a lot of people. I did not set out to write the definitive story about the PACE trial, document every single one of its many oddities, or credit everyone involved in bringing these problems to light. My goal was to explain what I recognized as some truly indefensible flaws in a clear, readable way that would resonate with scientists, public health and medical professionals, and others not necessarily immersed in the complicated history of this terrible disease.
To do that most effectively and maximize the impact, I had to find a story arc, some sort of narrative, to carry readers through 14,000 words and many dense explanations of statistical and epidemiologic concepts. After a couple of false starts, I settled on a patient and advocate, Tom Kindlon, as my "protagonist": someone readers could understand and empathize with. Tom is smart, articulate, and passionate about good science, and he knows the PACE saga inside out. He was a terrific choice whose presence in the story, I think, made reading it a lot more bearable.
That decision in no way implied that Tom was the only possible choice or even the best possible choice. I built my work on the work of others, including many that James Coyne recently referred to as "citizen-scientists." Tom's dedication to tracking and critiquing the research has been heroic, given his health struggles. But the same could be said, and should be said, of many others who have fought to raise awareness about the problems with PACE since the trial was announced in 2003.
The PACE study has generated many peer-reviewed publications and a healthy paper trail. My account of the story, notwithstanding its length, has significant gaps. I haven't finished writing about PACE, and I hope to fill in some of them myself, as with today's story on the 2011 Lancet commentary written by colleagues of Peter White, the lead PACE investigator. But I have no monopoly on this story, nor would I want one; the stakes are too high and too many years have already been wasted. Given the trial's wealth of problems and its enormous influence and ramifications, there are plenty of PACE-related stories left for everyone to tackle.
I am, obviously, indebted to Tom: for his good humor, his willingness to trust me given so many unfair media portrayals of ME/CFS, and his patience when I peppered him with question after question via Facebook, Twitter, and e-mail.
I am also indebted to my friend Valerie Eliot Smith. We met when I began research on this project in July 2014; since then, she has become an indispensable resource, offering transatlantic support across multiple domains. Valerie has given me invaluable legal counsel, making sure that what I was writing was verifiable and, just as important, defensible, especially in the U.K. (I don't want to know how many billable hours she has invested!) She has provided keen strategic advice. She has been a terrific editor, whose input greatly improved the story's flow and readability. She has done all this, I realize, at some risk to her own health. I am lucky she decided to join me on this unexpected journey.
I would like to thank, as well, Dr. Malcolm Hooper, Margaret Williams, Dr. Nigel Speight, Dr. William Weir, Natalie Boulton, Lois Addy, and the Countess of Mar for their help and hospitality while I was in England researching the story last year. I will always cherish the House of Lords plastic bag that I received from the Countess. (The bag was stuffed with PACE-related reports and documents.)
So far, Richard Horton, the editor of The Lancet, has not responded to the criticisms documented in my story. As for the PACE investigators, they provided their own response last Friday on Virology Blog, followed by my rebuttal.
A public relations representative from Queen Mary University of London, or QMUL, had approached Virology Blog seeking that opportunity for the PACE investigators to respond. In e-mails to Dr. Racaniello, the public relations representative had suggested that "misinformation" and "inaccuracies" in my article had triggered social media "abuse" and could cause "reputational damage."
These are serious charges, not to be taken lightly. Last Friday's exchange has hopefully put an end to such claims. It seems unlikely that calling rituximab an "anti-inflammatory" rather than an "immunomodulatory" drug would trigger social media abuse or cause reputational damage.
Last week, in an effort to expedite Virology Blog's publication of the PACE investigators' response, the QMUL public relations representative further charged that I had not sought their input before the article was posted. This accusation goes to the heart of my professional integrity as a journalist. It is also untrue, as the public relations representative would have known had he read my piece or talked to the PACE investigators themselves. (Whether earlier publication of their response would have helped their case is another question.)
Disseminating false information to achieve goals is not usually an effective PR strategy. I have asked the QMUL public relations representative for an explanation as to why he conveyed false information to Dr. Racaniello in his attempt to advance the interests of the PACE investigators. I have also asked for an apology.
Since 2011, the PACE investigators have released several papers, repeatedly generating enthusiastic news coverage about the possibility of "recovery," coverage that has often drawn conclusions beyond what the publications themselves have reported.
The PACE researchers can't control the media and don't write headlines. But in at least one case, their actions appeared to stimulate inaccurate media accounts, and they made no apparent effort immediately afterwards to correct the resulting international coverage. The misinformation spread to medical and public health journals as well.
(I mentioned this episode, regarding the Lancet "comment" that accompanied the first PACE results in 2011, in my excruciatingly long series two weeks ago on Virology Blog. However, that series focused on the PACE study, and the comment itself raised additional issues that I did not have the chance to explore. Because the Lancet comment had such an impact on media coverage, and ultimately most likely on patient care, I felt it was important to return to it.)
The Lancet comment, written by Gijs Bleijenberg and Hans Knoop from the Expert Centre for Chronic Fatigue at Radboud University Nijmegen in the Netherlands, was called "Chronic fatigue syndrome: where to PACE from here?" It reported that 30 percent of those receiving the two rehabilitative interventions favored by the PACE investigators, cognitive behavior therapy and graded exercise therapy, had "recovered." Moreover, these participants had "recovered" according to what the comment stated was the "strict criterion" used by the PACE study itself.
Yet the PACE investigators themselves did not make this claim in their paper. Rather, they reported that participants in the two rehabilitative arms were more likely to improve and to be within what they referred to as "the normal range" for physical function and fatigue, the study's two primary outcome measures. ("Normal range" is a statistical concept that has no inherent connection to "normal functioning" or "recovery." More on that below.)
In addition, the comment did not mention that 15 percent of those receiving only the baseline condition of "specialist medical care" also "recovered" according to the same criterion. Thus, only half of this 30 percent "recovery" rate could actually be attributed to the interventions.
The PACE investigators themselves reviewed the comment before publication.
Thanks to this inaccurate account of the PACE study's reported findings, the claim of a 30 percent "recovery" rate dominated much of the news coverage. Trudie Chalder, one of the key PACE investigators, reinforced the message of the Lancet comment when she declared at the press conference announcing the PACE results that participants in the two rehabilitative interventions got "back to normal."
Just as the PACE paper did not report that anyone had "recovered," it also did not report that anyone got "back to normal."
Three months later, the PACE authors acknowledged in correspondence in The Lancet that the paper did not discuss "recovery" at all and that they would be presenting "recovery" data in a subsequent paper. They did not explain, however, why they had not taken earlier steps to correct the apparently inaccurate news coverage about how patients in the trial had "recovered" and gotten "back to normal."
It is not unusual for journals, when they publish studies of significance, to also commission commentaries or editorials that discuss the implications of the findings. It is also not unusual for colleagues of a study's authors to be asked to write such commentaries. In this case, Bleijenberg and Knoop were colleagues of Peter White, the lead PACE investigator. In 2007, the three had published, along with two other colleagues, a paper called "Is a full recovery possible after cognitive behavior therapy for chronic fatigue syndrome?" in the journal Psychotherapy and Psychosomatics.
(In their response last Friday to my Virology Blog story, the PACE investigators noted that they had published a "correction" to clarify that the 2011 Lancet paper was not about "recovery"; presumably, they were referring to the Lancet correspondence three months later. They blamed the misconception on an "editorial…written by others." But they did not mention that those "others" were White's colleagues. Nor did they explain why they did not "correct" this "recovery" claim during their pre-publication review of the comment, or why Chalder spoke at the press conference of participants getting "back to normal.")
In the Lancet comment, Bleijenberg and Knoop hailed the PACE team for its work. And here's what they wrote about the trial's primary outcome measures for physical function and fatigue: "PACE used a strict criterion for recovery: a score on both fatigue and physical function within the range of the mean plus (or minus) one standard deviation of a healthy person's score."
This statement was problematic for a number of reasons. Given that the PACE paper itself made no claims for "recovery," Bleijenberg and Knoop's assertion that it "used" any criterion for "recovery" at all was false. The PACE study protocol had outlined four specific criteria that constituted what the investigators referred to as "recovery." Two of them were thresholds on the physical function and fatigue measures, but the Lancet paper did not present data for the other criteria and so could not report "recovery" rates.
Instead, the Lancet paper reported the rates of participants in all the groups who finished the study within what the researchers referred to as "the normal ranges" for physical function and fatigue. But as noted immediately by some in the patient community, these "normal ranges" featured a bizarre paradox: the thresholds for being "within the normal range" on both the physical function and fatigue scales indicated worse health than the entry thresholds required to demonstrate enough disability to qualify for the trial in the first place.
To many patients and other readers, for the Lancet comment to refer to "normal range" scales in which entry and outcome criteria overlapped as a "strict criterion for recovery" defied logic and common sense. (According to data not included in the Lancet paper but obtained later by a patient through a freedom-of-information request, 13 percent of the total sample was already "within normal range" for physical function, fatigue, or both at baseline, before any treatment began.)
In the Lancet comment, Bleijenberg and Knoop also noted that these "normal ranges" were based on "a healthy person's score." In other words, the "normal ranges" were purportedly derived from responses to the physical function and fatigue questionnaires by population-based samples of healthy people.
But this statement was also at odds with the facts. The source for the fatigue scale was a population of attendees at a medical practice, a population that could easily have had more health issues than a sample from the general population. And as the PACE authors themselves acknowledged in the Lancet correspondence several months after the initial publication, the SF-36 population-based scores they used to determine the physical function "normal range" were from an "adult" population, not the healthier, working-age population they had inaccurately referred to in The Lancet. (An "adult" population includes the elderly.)
The Lancet has never corrected this factual mistake in the PACE paper itself. The authors had described, inaccurately, how they derived a key outcome for one of their two primary measures. This error indisputably made the results appear better than they were, but only those who scrutinized the correspondence were aware of this discrepancy.
The Lancet comment, like the Lancet paper itself, has also never been corrected to indicate that the source population for the SF-36 responses was not a "healthy" population after all, but an "adult" one that included many elderly. The comment's parallel claim that the source population for the fatigue scale "normal range" was "healthy" as well has also not been corrected.
Richard Horton, the editor of The Lancet, did not respond to a request for an interview to discuss whether he agreed that the "normal range" thresholds represented "a strict criterion for recovery." Peter White, Trudie Chalder, and Michael Sharpe, the lead PACE investigators, and Gijs Bleijenberg, the lead author of the Lancet comment, also did not respond to requests for interviews for this story.
How did the PACE study end up with "normal ranges" in which participants could get worse and still be counted as having achieved the designated thresholds?
Here's how: The investigators committed a major statistical error in determining the PACE "normal ranges." They used a standard statistical formula designed for normally distributed populations, that is, populations in which most people score somewhere in the middle, with the rest falling off evenly on each side. When normally distributed populations are graphed, they form the classic bell curve. In PACE, however, the data being analyzed were far from normally distributed. The population-based responses to the physical function and fatigue questionnaires were skewed, that is, clustered toward the healthy end rather than symmetrically spread around a mean value.
With a normally distributed set of data, a "normal range" derived from the formula used in PACE, the mean plus or minus one standard deviation, contains 68 percent of the values. But when the values are clustered toward one end, as in the source populations for physical function and fatigue, a larger percentage ends up being included in a "normal range" calculated with this same formula. Other statistical methods can be used to capture 68 percent of the values when a dataset does not form a normal distribution.
If the standard formula is used on a population-based survey with scores clustered toward the healthier end, the result is an expanded "normal range" that pushes the lower threshold even lower, as happened with the PACE physical function scale. And in PACE, the threshold wasn't just low; it was lower than the score required for entry into the trial. That entry score, of course, already represented severe disability, not "recovery" or being "back to normal," and certainly not a "strict criterion" for anything.
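To make the distortion concrete, here is a minimal sketch in Python using invented numbers, not the actual PACE or survey data: a hypothetical healthy population clustered near the ceiling of a 0-100 scale (as SF-36 physical function scores are), run through the same mean-minus-one-standard-deviation formula.

```python
# Illustrative sketch with made-up numbers (NOT the actual PACE data):
# why the mean plus/minus one SD formula misbehaves when healthy scores
# cluster at the ceiling of a 0-100 scale.

# Hypothetical "healthy population": most score at or near 100, a small tail below.
population = [100] * 600 + [95] * 200 + [85] * 100 + [60] * 60 + [30] * 40

n = len(population)
mean = sum(population) / n
sd = (sum((x - mean) ** 2 for x in population) / n) ** 0.5

# The PACE-style "normal range": mean plus or minus one standard deviation.
lower, upper = mean - sd, mean + sd

share_in_range = sum(1 for x in population if lower <= x <= upper) / n

print(f"mean = {mean:.1f}, sd = {sd:.1f}")
print(f"'normal range' floor = {lower:.1f}")  # far below typical healthy scores
print(f"share of values inside the range = {share_in_range:.0%}")  # well over 68%
```

On this skewed sample the "normal range" captures about 90 percent of the values rather than the 68 percent the bell-curve formula assumes, and its floor lands far below where nearly everyone in the population actually scores. That is the mechanism by which a lower threshold can sink beneath a trial's own entry criterion.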
Bleijenberg and Knoop, the comment authors, were themselves aware of the challenges involved in calculating accurate "normal ranges," since the issue was addressed in the 2007 paper they co-wrote with Peter White. In that paper, White, Bleijenberg, and Knoop discussed the concerns related to determining a "normal range" from population data heavily clustered toward the healthy end of the scale. The paper noted that using the standard formula "assumed a normal distribution of scores" and generated different results under the "violation of the assumptions of normality."
Despite the caveats the three scientists included in this 2007 paper, Bleijenberg and Knoop's 2011 Lancet comment did not mention these concerns about distortion arising from applying the standard statistical formula to values that were not normally distributed. (White and his colleagues also did not mention this problem in the PACE study itself.)
Moreover, the 2007 paper from White, Bleijenberg, and Knoop had identified a score of 80 on the SF-36 as representing "recovery," a much higher "recovery" threshold than the SF-36 score of 60 that Bleijenberg and Knoop now declared to be a "strict criterion." In the Lancet comment, the authors did not mention this major discrepancy, nor did they explain how and when they had changed their minds about whether an SF-36 score of 60 or 80 best represented "recovery." (In 2011, White and his colleagues also did not mention this discrepancy between the score for "recovery" in the 2007 paper and the much lower "normal range" threshold in the PACE paper.)
Along with the PACE paper, the Lancet comment caused an uproar in the patient and advocacy communities, especially since the claim that 30 percent of participants in the rehabilitative arms had "recovered" per a "strict criterion" was widely disseminated.
The comment apparently caused some internal consternation at The Lancet as well. In an e-mail to Margaret Williams, the pseudonym of a longtime clinical manager in the National Health Service who had complained about the Lancet comment, an editor at the journal, Zoe Mullan, agreed that the reference to "recovery" was problematic.
"Yes I do think we should correct the Bleijenberg and Knoop Comment, since White et al explicitly state that recovery will be reported in a separate report," wrote Mullan in the e-mail. "I will let you know when we have done this."
No correction was made, however.
In 2012, to press the issue, the Countess of Mar pursued a complaint about the comment's claim of "recovery" with the (now-defunct) Press Complaints Commission, a regulatory body established by the media industry that was authorized to investigate the conduct of news organizations. The countess, who frequently championed the cause of the ME/CFS patient community in Parliament's House of Lords, had long questioned the scientific basis of support for cognitive behavior therapy and graded exercise therapy, and she believed the Lancet comment's claims of "recovery" contradicted the study itself.
In defending itself to the Press Complaints Commission, The Lancet acknowledged the earlier suggestion by a journal editor that the comment should be corrected.
"I can confirm that our editor of our Correspondence section, Zoe Mullan, did offer her personal opinion at the time, in which she said that she thought that we should correct the Comment," wrote Lancet deputy editor Astrid James to the Press Complaints Commission, in an e-mail.
"Zoe made a mistake in not discussing this approach with a more senior member of our editorial team," continued James in the e-mail. "Now, however, we have discussed this case at length with all members of The Lancet's senior editorial team, and with Zoe, and we do not agree that there is a need to publish a correction."
The Lancet now rejected the notion that the comment was inaccurate. Despite the explicit language in the comment identifying the "normal range" thresholds as the PACE trial's own "strict criterion for recovery," The Lancet argued in its response to the Press Complaints Commission that the authors were only expressing their personal opinion about what constituted "recovery."
In other words, according to The Lancet, Bleijenberg and Knoop were not describing, wrongly, the conclusions of the PACE paper itself. They were describing their own interpretation of the findings. Therefore, the comment was not inaccurate and did not need to be corrected.
(In its response to the Press Complaints Commission, The Lancet did not explain why thresholds that purportedly represented a "strict criterion for recovery" overlapped with the entry criteria for disability.)
The Press Complaints Commission issued its findings in early 2013. The commission agreed with the Countess of Mar that the statement about "recovery" in the Lancet comment was inaccurate. But the commission gave a slightly different reason. It accepted The Lancet's argument that Bleijenberg and Knoop were trying to express their own opinion. The problem, the commission ruled, was that the comment itself didn't make that point clear.
"The authors of the comment piece were clearly entitled to take a view on how 'recovery' should be defined among the patients in the trial," wrote the commission. However, continued the decision: "The authors of the comment had failed to make clear that the 30 per cent figure for 'recovery' reflected their view that function within 'normal range' was an appropriate way of 'operationalising' recovery, rather than statistical analysis by the researchers based on the definition for recovery provided. This was a distinction of significance, particularly in the context of a comment on a clinical trial published in a medical journal. The comment was misleading on this point and raised a breach of Clause 1 (Accuracy) of the Code."
However, this determination seemed based on a misreading of what Bleijenberg and Knoop had actually written: "PACE used a strict criterion for recovery." That phrasing did not suggest that the authors were expressing their own opinion about "recovery." Rather, it was a statement about how the PACE study itself purportedly defined "recovery." And the statement was demonstrably untrue.
Compounding the confusion, the Press Complaints Commission decision noted that the Lancet comment had been discussed with the PACE investigators prior to publication. Since the phrase "strict criterion for recovery" had thus apparently been vetted by the PACE team itself, it remained unclear why the commission determined that Bleijenberg and Knoop were only expressing their own opinion.
The commission's response left other questions unanswered. The commission noted that the Countess had pointed out that the "recovery" score for physical function cited by the comment's authors was lower than the score required for entry. Despite this obvious anomaly, the commission did not indicate whether it had asked The Lancet or Bleijenberg and Knoop to explain how such a nonsensical scale could be used to assess "recovery."
Notwithstanding the inaccuracy of the Lancet comment's "recovery" claim, the commission also found that the journal had already taken "sufficient remedial action" to rectify the problem. The commission noted that the correspondence published after the trial had provided a prominent forum to debate concerns over the definition of "recovery." The decision also noted that the PACE authors themselves had clarified in the correspondence that the actual "recovery" findings would be published in a subsequent paper.
In ruling that "sufficient remedial action" had already been taken, however, the commission did not mention the potential damage that already might have been caused by this inaccurate "recovery" claim. Given the comment's declaration that 30 percent of participants in the cognitive behavior and graded exercise therapy arms had "recovered" according to a "strict criterion," the message received worldwide dissemination, even though the PACE paper itself made no such claim.
Medical and public health journals, conflating the Lancet comment and the PACE study itself, also transmitted the 30 percent "recovery" rate directly to clinicians and others who treat or otherwise deal with ME/CFS patients.
The BMJ referred to the approximately 30 percent of patients who met the "normal range" thresholds as "cured." A study in BMC Health Services Research cited PACE as having demonstrated "a recovery rate of 30-40%," months after the PACE authors had issued their "correction" clarifying that their paper did not report on "recovery" at all. (Another mystery about the BMC Health Services Research report is the source of the 40 percent figure for "recovery.") A 2013 paper in PLoS One similarly cited the PACE study, not the Lancet comment, and noted that 30 percent achieved a "full recovery."
Given that relapsing after too much exertion is a core symptom of the illness, it is impossible to calculate the possible harms that could have arisen from this widespread dissemination of misinformation to health care professionals, all based on the flawed claim from the comment that 30 percent of participants had recovered according to the PACE study's "strict criterion for recovery."
And that "strict criterion," it should be remembered, allowed participants to get worse and still be counted as better.