TWiV 397: Trial by error

Journalism professor David Tuller returns to TWiV for a discussion of the PACE trial for ME/CFS: the many flaws in the trial, why its conclusions are useless, and why the data must be released and re-examined.

You can find TWiV #397 at, or listen below.

Download TWiV 397 (67 MB .mp3, 93 min)
Subscribe (free): iTunes, RSS, email, Google Play Music

Become a patron of TWiV!

An open letter to PLoS One

PLoS One
1160 Battery Street
Koshland Building East, Suite 100
San Francisco, CA 94111

Dear PLoS One Editors:

In 2012, PLoS One published “Adaptive Pacing, Cognitive Behaviour Therapy, Graded Exercise, and Specialist Medical Care for Chronic Fatigue Syndrome: A Cost-Effectiveness Analysis.” This was one in a series of papers highlighting results from the PACE study—the largest trial of treatments for the illness, also known as ME/CFS. Psychologist James Coyne has been seeking data from the study based on PLoS’ open-access policies, an effort we support.

However, as David Tuller from the University of California, Berkeley, documented in an investigation of PACE published last October on Virology Blog, the trial suffered from many indefensible flaws, as patients and advocates have argued for years. Among Dr. Tuller’s findings: the main claim of the PLoS One paper–that cognitive behavior therapy and graded exercise therapy are cost-effective treatments–is wrong, since it is based on an erroneous characterization of the study’s sensitivity analyses. The PACE authors have repeatedly cited this inaccurate claim of cost-effectiveness to justify their continued promotion of these interventions.

Yet the claim is not supported by the evidence, and it is not necessary to obtain the study data to draw this conclusion. The claim is based solely on the decision to value the free care provided by family and friends as if it were compensated at the level of a well-paid health care worker. Here is what Dr. Tuller wrote last October about the PLoS One paper and its findings:

        The PLoS One paper argued that the graded exercise and cognitive behavior therapies were the most cost-effective treatments from a societal perspective. In reaching this conclusion, the investigators valued so-called “informal” care—unpaid care provided by family and friends—at the replacement cost of a homecare worker. The PACE statistical analysis plan (approved in 2010 but not published until 2013) had included two additional, lower-cost assumptions. The first valued informal care at minimum wage, the second at zero compensation.

        The PLoS One paper itself did not provide these additional findings, noting only that “sensitivity analyses revealed that the results were robust for alternative assumptions.”

        Commenters on the PLoS One website, including [patient] Tom Kindlon, challenged the claim that the findings would be “robust” under the alternative assumptions for informal care. In fact, they pointed out, the lower-cost conditions would reduce or fully eliminate the reported societal cost-benefit advantages of the cognitive behavior and graded exercise therapies.

        In a posted response, the paper’s lead author, Paul McCrone, conceded that the commenters were right about the impact that the lower-cost, alternative assumptions would have on the findings. However, McCrone did not explain or even mention the apparently erroneous sensitivity analyses he had cited in the paper, which had found the societal cost-benefit advantages for graded exercise therapy and cognitive behavior therapy to be “robust” under all assumptions. Instead, he argued that the two lower-cost approaches were unfair to caregivers because families deserved more economic consideration for their labor.

        “In our opinion, the time spent by families caring for people with CFS/ME has a real value and so to give it a zero cost is controversial,” McCrone wrote. “Likewise, to assume it only has the value of the minimum wage is also very restrictive.”

        In a subsequent comment, Kindlon chided McCrone, pointing out that he had still not explained the paper’s claim that the sensitivity analyses showed the findings were “robust” for all assumptions. Kindlon also noted that the alternative, lower-cost assumptions were included in PACE’s own statistical plan.

        “Remember it was the investigators themselves that chose the alternative assumptions,” wrote Kindlon. “If it’s ‘controversial’ now to value informal care at zero value, it was similarly ‘controversial’ when they decided before the data was looked at, to analyse the data in this way. There is not much point in publishing a statistical plan if inconvenient results are not reported on and/or findings for them misrepresented.”

Given that Dr. McCrone, the lead author, directly contradicted in his comments what the paper itself claimed about sensitivity analyses having confirmed the “robustness” of the findings under other assumptions, it is clearly not necessary to scrutinize the study data to confirm that this central finding cannot be supported. Dr. McCrone has not responded to e-mail requests from Dr. Tuller to explain the discrepancy. And PLoS One, although alerted to this problem last fall by Dr. Tuller, has apparently not yet taken steps to rectify the misinformation about the sensitivity analyses contained in the paper.

PLoS One has an obligation to question Dr. McCrone about the contradiction between the text of the paper and his subsequent comments, so he can either provide a reasonable explanation, produce the actual sensitivity analyses demonstrating “robustness” under all three assumptions outlined in the statistical analysis plan, or correct the paper’s core finding that CBT and GET are “cost-effective” no matter how informal care is valued. Should he fail to do so, PLoS One has an obligation itself to correct the paper, independent of the disposition of the issue of access to trial data.

We would appreciate your quick response to these concerns.


Ronald W. Davis, PhD
Professor of Biochemistry and Genetics
Stanford University

Rebecca Goldin, Ph.D.
Professor of Mathematical Sciences
George Mason University

Bruce Levin, PhD
Professor of Biostatistics
Columbia University

Vincent R. Racaniello, PhD
Professor of Microbiology and Immunology
Columbia University

Arthur L. Reingold, MD
Professor of Epidemiology
University of California, Berkeley

An open letter to The Lancet, again

On November 13th, five colleagues and I released an open letter to The Lancet and editor Richard Horton about the PACE trial, which the journal published in 2011. The study’s reported findings–that cognitive behavior therapy and graded exercise therapy are effective treatments for chronic fatigue syndrome–have had enormous influence on clinical guidelines for the illness. Last October, Virology Blog published David Tuller’s investigative report on the PACE study’s indefensible methodological lapses. Citing these problems, we noted in the letter that “such flaws have no place in published research” and urged Dr. Horton to commission a fully independent review.

Although Dr. Horton’s office e-mailed that he would respond to our letter when he returned from “traveling,” it has now been almost three months. Dr. Horton has remained silent on the issue. Today, therefore, we are reposting the open letter and resending it to The Lancet and Dr. Horton, with the names of three dozen more leading scientists and clinicians, most of them well-known experts in the ME/CFS field.

We still hope and expect that Dr. Horton will address–rather than continue to ignore–these critical concerns about the PACE study.


Dr. Richard Horton

The Lancet
125 London Wall
London, EC2Y 5AS, UK

Dear Dr. Horton:

In February, 2011, The Lancet published an article called “Comparison of adaptive pacing therapy, cognitive behaviour therapy, graded exercise therapy, and specialist medical care for chronic fatigue syndrome (PACE): a randomised trial.” The article reported that two “rehabilitative” approaches, cognitive behavior therapy and graded exercise therapy, were effective in treating chronic fatigue syndrome, also known as myalgic encephalomyelitis, ME/CFS and CFS/ME. The study received international attention and has had widespread influence on research, treatment options and public attitudes.

The PACE study was an unblinded clinical trial with subjective primary outcomes, a design that requires strict vigilance in order to prevent the possibility of bias. Yet the study suffered from major flaws that have raised serious concerns about the validity, reliability and integrity of the findings. The patient and advocacy communities have known this for years, but a recent in-depth report on this site, which included statements from five of us, has brought the extent of the problems to the attention of a broader public. The PACE investigators have replied to many of the criticisms, but their responses have not addressed or answered key concerns.

The major flaws documented at length in the recent report include, but are not limited to, the following:

*The Lancet paper included an analysis in which the outcome thresholds for being “within the normal range” on the two primary measures of fatigue and physical function represented worse health than the entry criteria, which already indicated serious disability. In fact, 13 percent of the study participants were already “within the normal range” on one or both outcome measures at baseline, but the investigators did not disclose this salient fact in the Lancet paper. In an accompanying Lancet commentary, colleagues of the PACE team defined participants who met these expansive “normal ranges” as having achieved a “strict criterion for recovery.” The PACE authors reviewed this commentary before publication.

*During the trial, the authors published a newsletter for participants that included positive testimonials from earlier participants about the benefits of the “therapy” and “treatment.” The same newsletter included an article that cited the two rehabilitative interventions pioneered by the researchers and being tested in the PACE trial as having been recommended by a U.K. clinical guidelines committee “based on the best available evidence.” The newsletter did not mention that a key PACE investigator also served on the clinical guidelines committee. At the time of the newsletter, two hundred or more participants—about a third of the total sample–were still undergoing assessments.

*Mid-trial, the PACE investigators changed their protocol methods of assessing their primary outcome measures of fatigue and physical function. This is of particular concern in an unblinded trial like PACE, in which outcome trends are often apparent long before outcome data are seen. The investigators provided no sensitivity analyses to assess the impact of the changes and have refused requests to provide the results per the methods outlined in their protocol.

*The PACE investigators based their claims of treatment success solely on their subjective outcomes. In the Lancet paper, the results of a six-minute walking test—described in the protocol as “an objective measure of physical capacity”–did not support such claims, notwithstanding the minimal gains in one arm. In subsequent comments in another journal, the investigators dismissed the walking-test results as irrelevant, non-objective and fraught with limitations. All the other objective measures in PACE, presented in other journals, also failed. The results of one objective measure, the fitness step-test, were provided in a 2015 paper in The Lancet Psychiatry, but only in the form of a tiny graph. A request for the step-test data used to create the graph was rejected as “vexatious.”

*The investigators violated their promise in the PACE protocol to adhere to the Declaration of Helsinki, which mandates that prospective participants be “adequately informed” about researchers’ “possible conflicts of interest.” The main investigators have had financial and consulting relationships with disability insurance companies, advising them that rehabilitative therapies like those tested in PACE could help ME/CFS claimants get off benefits and back to work. They disclosed these insurance industry links in The Lancet but did not inform trial participants, contrary to their protocol commitment. This serious ethical breach raises concerns about whether the consent obtained from the 641 trial participants is legitimate.

Such flaws have no place in published research. This is of particular concern in the case of the PACE trial because of its significant impact on government policy, public health practice, clinical care, and decisions about disability insurance and other social benefits. Under the circumstances, it is incumbent upon The Lancet to address this matter as soon as possible.

We therefore urge The Lancet to seek an independent re-analysis of the individual-level PACE trial data, with appropriate sensitivity analyses, from highly respected reviewers with extensive expertise in statistics and study design. The reviewers should be from outside the U.K. and outside the domains of psychiatry and psychological medicine. They should also be completely independent of, and have no conflicts of interests involving, the PACE investigators and the funders of the trial.

Thank you very much for your quick attention to this matter.


Ronald W. Davis, PhD
Professor of Biochemistry and Genetics
Stanford University

Jonathan C.W. Edwards, MD
Emeritus Professor of Medicine
University College London

Leonard A. Jason, PhD
Professor of Psychology
DePaul University

Bruce Levin, PhD
Professor of Biostatistics
Columbia University

Vincent R. Racaniello, PhD
Professor of Microbiology and Immunology
Columbia University

Arthur L. Reingold, MD
Professor of Epidemiology
University of California, Berkeley


Dharam V. Ablashi, DVM, MS, Dip Bact
Scientific Director, HHV-6 Foundation
Former Senior Investigator
National Cancer Institute, NIH
Bethesda, Maryland

James N. Baraniuk, MD
Professor, Department of Medicine,
Georgetown University
Washington, D.C.

Lisa F. Barcellos, PhD, MPH
Professor of Epidemiology
School of Public Health
California Institute for Quantitative Biosciences
University of California
Berkeley, California

Lucinda Bateman, MD
Medical Director, Bateman Horne Center
Salt Lake City, Utah

David S. Bell, MD
Clinical Associate Professor of Pediatrics
State University of New York at Buffalo
Buffalo, New York

Alison C. Bested MD FRCPC
Clinical Associate Professor of Hematology
University of British Columbia
Vancouver, British Columbia, Canada

Gordon Broderick, PhD
Director, Clinical Systems Biology Group
Institute for Neuro Immune Medicine
Professor, Dept of Psychology and Neuroscience
College of Psychology
Nova Southeastern University
Miami, Florida

John Chia, MD
EV Med Research
Lomita, California

Lily Chu, MD, MSHS
Independent Researcher
San Francisco, California

Derek Enlander, MD, MRCS, LRCP
Attending Physician
Mount Sinai Medical Center, New York
ME CFS Center, Mount Sinai School of Medicine
New York, New York

Mary Ann Fletcher, PhD
Schemel Professor of Neuroimmune Medicine
College of Osteopathic Medicine
Nova Southeastern University
Professor Emeritus, University of Miami School of Medicine
Fort Lauderdale, Florida

Kenneth Friedman, PhD
Associate Professor of Pharmacology and Physiology (retired)
New Jersey Medical School
University of Medicine and Dentistry of NJ
Newark, New Jersey

David L. Kaufman, MD,
Medical Director
Open Medicine Institute
Mountain View, California

Nancy Klimas, MD
Professor and Chair, Department of Clinical Immunology
Director, Institute for Neuro-Immune Medicine
Nova Southeastern University
Director, GWI and ME/CFS Research, Miami VA Medical Center
Miami, Florida

Charles W. Lapp, MD
Director, Hunter-Hopkins Center
Assistant Consulting Professor at Duke University Medical Center
Charlotte, North Carolina

Susan Levine, MD
Clinician, Private Practice
New York, New York
Visiting Fellow, Cornell University
Ithaca, New York

Alan R. Light, PhD
Professor, Department of Anesthesiology and Department of Neurobiology and Anatomy
University of Utah
Salt Lake City, Utah

Sonya Marshall-Gradisnik, PhD
Professor and Co-Director
National Centre for Neuroimmunology and Emerging Diseases
Griffith University
Queensland, Australia

Peter G. Medveczky, MD
Professor, Department of Molecular Medicine, MDC 7
College of Medicine
University of South Florida
Tampa, Florida

Zaher Nahle, PhD, MPA
Vice President for Research and Scientific Programs
Solve ME/CFS Initiative
Los Angeles, California

James M. Oleske, MD, MPH
Francois-Xavier Bagnoud Professor of Pediatrics
Senator of RBHS Research Centers, Bureaus, and Institutes
Director, Division of Pediatrics Allergy, Immunology & Infectious Diseases
Department of Pediatrics
Rutgers – New Jersey Medical School
Newark, New Jersey

Richard N. Podell, M.D., MPH
Clinical Professor
Rutgers Robert Wood Johnson Medical School
New Brunswick, New Jersey

Charles Shepherd, MB, BS
Honorary Medical Adviser to the ME Association
London, United Kingdom

Christopher R. Snell, PhD
Scientific Director
WorkWell Foundation
Ripon, California

Nigel Speight, MA, MB, BChir, FRCP, FRCPCH, DCH
County Durham, United Kingdom

Philip B. Stark, PhD
Professor of Statistics
University of California, Berkeley
Berkeley, California

Eleanor Stein, MD FRCP(C)
Assistant Clinical Professor
University of Calgary
Calgary, Alberta, Canada

John Swartzberg, MD
Clinical Professor Emeritus
School of Public Health
University of California, Berkeley
Berkeley, California

Ronald G. Tompkins, MD, ScD
Summer M Redstone Professor of Surgery
Harvard University
Boston, Massachusetts

Rosemary Underhill, MB BS.
Physician, Independent Researcher
Palm Coast, Florida

Dr Rosamund Vallings MNZM, MB BS
General Practitioner
Auckland, New Zealand

Michael VanElzakker, PhD
Research Fellow, Psychiatric Neuroscience Division
Harvard Medical School & Massachusetts General Hospital
Boston, Massachusetts

William Weir, FRCP
Infectious Disease Consultant
London, England

Marcie Zinn, PhD
Research Consultant in Experimental Neuropsychology, qEEG/LORETA, Medical/Psychological Statistics
NeuroCognitive Research Institute, Chicago
Center for Community Research
DePaul University
Chicago, Illinois

Mark Zinn, MM
Research consultant in experimental electrophysiology
Center for Community Research
DePaul University
Chicago, Illinois

Trial By Error, Continued: A Few Words About “Harassment”

By David Tuller, DrPH

David Tuller is academic coordinator of the concurrent master’s degree program in public health and journalism at the University of California, Berkeley.


Last week, a commentary in Nature about the debate over data-sharing in science made some excellent points. Unfortunately, the authors lumped “hard-line opponents” of research into chronic fatigue syndrome with those who question climate change and the health effects of tobacco, among others—accusing them of engaging in “endless information requests, complaints to researchers’ universities, online harassment, distortion of scientific findings and even threats of violence.”

Whatever the merits of the overall argument, this charge—clearly a reference to the angry response of patients and advocates to the indefensible claims made by the PACE trial–unleashed a wave of online commentary and protest on ME/CFS forums. Psychologist James Coyne posted a fierce response, linking the issue specifically to the PACE authors’ efforts to block access to their data and citing the pivotal role of the Science Media Centre in the battle.

The Nature commentary demonstrated the degree to which this narrative—that the PACE authors have been subjected to a wave of threats and unfair campaigning against their work and reputations—has been accepted as fact by the UK medical and academic establishment. Despite the study’s unacceptable methodological lapses and the lack of any corroborating public evidence from law enforcement about such threats, the authors have wielded these claims to great effect. Wrapping themselves in victimhood, they have even managed to extend their definition of harassment to include any questioning of their science and the filing of requests for data—a tactic that has shielded their work from legitimate and much-needed scrutiny.

Until recently, complaining about harassment worked remarkably well for the PACE team. Maybe that’s why they tried claiming victimhood again last October, when Virology Blog ran “Trial By Error,” my in-depth investigation of PACE. The series was the first major critique of the trial’s many indefensible flaws from outside the ME/CFS patient and advocacy community. Afterwards, the investigators complained that “misinformation” and “inaccuracies” in my stories had subjected them to “abuse” on social media and could cause them “a considerable amount of reputational damage.”

These claims were ridiculous—an attempt to deploy their standard strategy for dismissing valid criticisms. The PACE authors amplified this error in December, when they rejected Dr. Coyne’s request for data from a PACE paper published in PLoS One as “vexatious.” They had called previous requests from patients “vexatious” without attracting negative comment or attention—except from other patients. But applying the term to a respected researcher backfired, drawing howls from others in the scientific community with no knowledge of ME/CFS—the PACE team’s action was “unforgivable,” according to Columbia statistics professor Andrew Gelman, and “absurd,” according to Retraction Watch.

(In fact, the PLoS One data, when ultimately released, will show that the paper’s main claim—that the PACE-endorsed treatments are cost-effective—is based on a false statement about sensitivity analyses, as I reported on Virology Blog.)

How did this theme of harassment and “vexatiousness” become part of the conversation in the first place? Starting in 2011, a few months after The Lancet published the first PACE results, top news organizations began reporting on an alarming phenomenon: Possibly dangerous chronic fatigue syndrome patients were threatening prominent psychiatrists and psychologists who were researching the illness. These reports appeared in, among other outlets, the BMJ, the Guardian, and The Sunday Times of London. The Sunday Times headline, on a profile of Sir Simon Wessely, a longtime colleague of the PACE authors, was typical: “This man faced death threats and abuse. His crime? He suggested that ME was a mental illness.”

One patient had supposedly appeared at a PACE author’s lecture with a knife. Other CFS researchers had received death threats. Sir Simon famously said that he felt safer in Afghanistan and Iraq than in the UK doing research into the disease—a preposterous statement that the press appeared to take at face value. News accounts compared the patients to animal rights extremists.

According to the news reports, the patients objected to the involvement of these mental health experts because they were anti-psychiatry and resented being perceived as suffering from a psychological disorder. Editorials in medical journals and other publications followed the news accounts, all of them defending “science” against these unwarranted and frightening attacks.

In fact, the Science Media Centre orchestrated the story in the first place—not surprising, given its longtime association with the PACE team and its uncritical promotion of the various PACE papers. According to a 2013 SMC report reviewing the accomplishments of the first three years of its “mental health research function”: “Tom Feilden, science correspondent for BBC Radio 4’s Today programme, won the UK Press Gazette’s first ever specialist science writing award for breaking the story the SMC gave him about the harassment and intimidation of researchers working on CFS/ME. The SMC had nominated him for the award.”

It’s great that the SMC not only spoon-fed Feilden the story but was so pleased with the reporter’s hard work that it nominated him for a prestigious award. In a brochure prepared for SMC’s anniversary, Feilden himself thanked the centre for its help in organizing the scoop about the “vitriolic abuse” and the “campaign of intimidation.”

Of course, patients were attacking the PACE study not because they were anti-science or anti-psychiatry but because the study itself was so terrible, as I reported last October. Luckily, a growing number of scientists outside the field, like Dr. Coyne and the top researchers from Columbia, Stanford and Berkeley who signed an open letter to The Lancet demanding an independent review, have now recognized this. How are patients supposed to react when a study so completely ignores scientific norms, and no one else seems to notice or care, no matter how many times it is pointed out?

The PACE study’s missteps rendered the results meaningless. Let’s recap briefly. The investigators changed their primary outcomes in ways that made it easier to report success, included outcome measures for improvement that were lower than the entry criteria for disability, and published a newsletter in which they promoted the therapies under investigation. They rejected as irrelevant their own pre-selected objective outcomes when the results failed to uphold their claims, and used an overly broad definition for the illness that identified people without it. Finally, despite an explicit promise in their protocol to inform participants of “any possible conflicts of interest,” they did not tell them of their work advising disability insurers on how to handle claimants with ME/CFS.

Patients and advocates have raised these and other legitimate concerns, in every possible academic, scientific and popular forum. This effort has been framed by the investigators, The Lancet and the Science Media Centre as a vicious and anti-scientific “campaign” against PACE. The news reports adopted this viewpoint and utterly failed to examine the scientific mistakes at the root of patients’ complaints.

Moreover, the reports did not present any independent evidence of the purported threats, other than claims made by the researchers. There were no statements from law enforcement authorities confirming the claims. No mention of any arrests made or charges having been filed. And little information from actual patients, much less these extremist, dangerous patients who supposedly hated psychiatry [see correction below]. In short, these news reports failed to pass any reasonable test of independent judgment and editorial skepticism.

Despite their questionable scientific methods and unreliable results, the PACE authors have widespread support among the UK medical and academic establishment. So does the Science Media Centre. Media reports, including last week’s Nature commentary, have presented without question the PACE authors’ perspective on patient response to the study. The reality is that patients have been protesting a study they know to be deeply flawed. Sometimes they have protested very, very loudly. That’s what people do when they are desperate for help, and no one is listening. To call it harassment is disgraceful.

Update 2/3/16: After reading some of the comments, I thought it was important to make clear that I don’t doubt the PACE investigators and some of their colleagues might have received very raw and nasty e-mails or phone calls. Perhaps some of these felt threatening, and perhaps they called in the police. (I’ve worked as a reporter for many years and have also received many, many raw and nasty e-mails, so I know it’s not enjoyable—but pissing people off is also part of the job.) The news accounts, however, provided no independent verification of the investigators’ charges. And the point is that, whether or not they have been the recipient of some unpleasant communications, the investigators have repeatedly used these claims to justify blocking legitimate inquiry into the PACE trial.

Correction: I reviewed the three major articles I linked to, not every single article about the issue, so my description of the coverage applies to those three. I originally wrote that the articles contained “no” interviews with actual patients. However, the Sunday Times article did include a short interview with one ME/CFS patient–a convicted child-molester who blamed his crime on fall-out from his illness. I apologize for the mistake, although I leave it to readers to decide if interviewing this person represented a sincere effort on the reporter’s part to present patients’ legitimate concerns.

I also wrote that the articles included no statements from law enforcement confirming the claims of threats. The Guardian article contained this sentence: “According to the police, the militants are now considered to be as dangerous and uncompromising as animal rights extremists.” This statement is vague, anonymous and impossible to verify with anyone in particular, so I don’t view it as an authoritative statement from law enforcement.

At least we’re not vexatious

On 17 December 2015, Ron Davis, Bruce Levin, David Tuller and I requested trial data from the PACE study of treatments for ME/CFS published in The Lancet in 2011. Below is the response to our request from the Records & Compliance Manager of Queen Mary University of London. The bolded portion of our request, noted in the letter, is the following: “we would like the raw data for all four arms of the trial for the following measures: the two primary outcomes of physical function and fatigue (both bimodal and Likert-style scoring), and the multiple criteria for ‘recovery’ as defined in the protocol published in 2007 in BMC Neurology, not as defined in the 2013 paper published in Psychological Medicine. The anonymized, individual-level data for ‘recovery’ should be linked across the four criteria so it is possible to determine how many people achieved ‘recovery’ according to the protocol definition.”

Dear Prof. Racaniello

Thank you for your email of 17th December 2015. I have bolded your request below, made under the Freedom of Information Act 2000.

You have requested raw data, linked at an individual level, from the PACE trial. I can confirm that QMUL holds this data but I am afraid that I cannot supply it. Over the last five years QMUL has received a number of similar requests for data relating to the PACE trial. One of the resultant refusals, relating to Decision Notice FS50565190, is due to be tested at the First-tier Tribunal (Information Rights) during 2016. We believe that the information requested is similarly exempt from release in to the public domain. At this time, we are not in a position to speculate when this ongoing legal action will be concluded.

Any release of information under FOIA is a release to the world at large without limits. The data consists of (sensitive) personal data which was disclosed in the context of a confidential relationship, under a clear obligation of confidence. This is not only in the form of explicit guarantees to participants but also since this is data provided in the context of medical treatment, under the traditional obligation of confidence imposed on medical practitioners. See generally, General Medical Council, ‘Confidentiality’ (2009) available at The information has the necessary quality of confidence and release to the public would lead to an actionable breach.

As such, we believe it is exempt from disclosure under s.41 of FOIA. This is an absolute exemption.

The primary outcomes requested are also exempt under s.22A of FOIA in that these data form part of an ongoing programme of research.

This exemption is subject to the public interest test. While there is a public interest in public authorities being transparent generally and we acknowledge that there is ongoing debate around PACE and research in to CFS/ME, which might favour disclosure, this is outweighed at this time by the prejudice to the programme of research and the interests of participants. This is because participants may be less willing to participate in a planned feasibility follow up study, since we have promised to keep their data confidential and planned papers from PACE, whether from QMUL or other collaborators, may be affected.

On balance we believe that the public interest in withholding this information outweighs the public interest in disclosing it.

In accordance with s.17, please accept this as a refusal notice.

For your information, the PACE PIs and their associated organisations are currently reviewing a data sharing policy.

If you are dissatisfied with this response, you may ask QMUL to conduct a review of this decision.  To do this, please contact the College in writing (including by fax, letter or email), describe the original request, explain your grounds for dissatisfaction, and include an address for correspondence.  You have 40 working days from receipt of this communication to submit a review request.  When the review process has been completed, if you are still dissatisfied, you may ask the Information Commissioner to intervene. Please see for details.

Yours sincerely

Paul Smallcombe
Records & Information Compliance Manager

Trial By Error, Continued: More Nonsense from The Lancet Psychiatry

By David Tuller, DrPH

David Tuller is academic coordinator of the concurrent masters degree program in public health and journalism at the University of California, Berkeley.


The PACE authors have long demonstrated great facility in evading questions they don’t want to answer. They did this in their response to correspondence about the original 2011 Lancet paper. They did it again in the correspondence about the 2013 recovery paper, and in their response to my Virology Blog series. Now they have done it in their answer to critics of their most recent paper on follow-up data, published last October in The Lancet Psychiatry.

(They published the paper just a week after my investigation ran. Wasn’t that a lucky coincidence?)

The Lancet Psychiatry follow-up had null findings: Two years or more after randomization,  there were no differences in reported levels of fatigue and physical function between those assigned to any of the groups. The results showed that cognitive behavior therapy and graded exercise therapy provided no long-term benefits because those in the other two groups reported improvement during the year or more after the trial was over. Yet the authors, once again, attempted to spin this mess as a success.

In their letters, James Coyne, Keith Laws, Frank Twist, and Charles Shepherd all provide sharp and effective critiques of the follow-up study. I’ll let others tackle the PACE team’s counter-claims about study design and statistical analysis. I want to focus once more on the issue of the PACE participant newsletter, which they again defend in their Lancet Psychiatry response.

Here’s what they write: “One of these newsletters included positive quotes from participants. Since these participants were from all four treatment arms (which were not named) these quotes were [not]…a source of bias.”

Let’s recap what I wrote about this newsletter in my investigation. The newsletter was published in December 2008, with at least a third of the study’s sample still undergoing assessment. The newsletter included six glowing testimonials from participants about their positive experiences with the trial, as well as a seventh statement from one participant’s primary care doctor. None of the seven statements recounted any negative outcomes, presumably conveying to remaining participants that the trial was producing a 100% satisfaction rate. The authors argue that the absence of the specific names of the study arms means that these quotes could not be “a source of bias.”

This is a preposterous claim. The PACE authors apparently believe that it is not a problem to influence all of your participants in a positive direction, and that this does not constitute bias. They have repeated this argument multiple times. I find it hard to believe they take it seriously, but perhaps they actually do. In any case, no one else should. As I have written before, they have no idea how the testimonials might have affected anyone in any of the four groups—so they have no basis for claiming that this uncontrolled co-intervention did not alter their results.

Moreover, the authors now ignore the other significant effort in that newsletter to influence participant opinion: publication of an article noting that a federal clinical guidelines committee had selected cognitive behavior therapy and graded exercise therapy as effective treatments “based on the best available evidence.” Given that the trial itself was supposed to be assessing the efficacy of these treatments, informing participants that they have already been deemed to be effective would appear likely to impact participants’ responses. The PACE authors apparently disagree.

It is worth remembering what top experts have said about the publication of this newsletter and its impact on the trial results. “To let participants know that interventions have been selected by a government committee ‘based on the best available evidence’ strikes me as the height of clinical trial amateurism,” Bruce Levin, a biostatistician at Columbia University, told me.

My Berkeley colleague, epidemiologist Arthur Reingold, said he was flabbergasted to see that the researchers had distributed material promoting the interventions being investigated, whether they were named or not. This fact alone, he noted, made him wonder if other aspects of the trial would also raise methodological or ethical concerns.

“Given the subjective nature of the primary outcomes, broadcasting testimonials from those who had received interventions under study would seem to violate a basic tenet of research design, and potentially introduce substantial reporting and information bias,” he said. “I am hard-pressed to recall a precedent for such an approach in other therapeutic trials. Under the circumstances, an independent review of the trial conducted by experts not involved in the design or conduct of the study would seem to be very much in order.”

Trial By Error, Continued: Did the PACE Trial Really Prove that Graded Exercise Is Safe?

By Julie Rehmeyer and David Tuller, DrPH

Julie Rehmeyer is a journalist and Ted Scripps Environmental Journalism Fellow at the University of Colorado, Boulder, who has written extensively about ME/CFS.

David Tuller is academic coordinator of the concurrent masters degree program in public health and journalism at the University of California, Berkeley.

Joining me for this episode of our ongoing saga is my friend and colleague Julie Rehmeyer. In my initial series, I only briefly touched on the PACE trial’s blanket claim of safety. Here we examine in more detail this key aspect of the study, which is complicated and requires a deep dive into technicalities. Sorry about that, but the claim is too consequential to ignore.


One of the most important and controversial claims from the PACE Trial was that graded exercise therapy is safe for patients with chronic fatigue syndrome (or ME/CFS, as U.S. government agencies now call it).

“If this treatment is done by skilled people in an appropriate way, it actually is safe and can stand a very good chance of benefiting [patients],” Michael Sharpe, one of the principal PACE investigators, told National Public Radio in 2011, shortly after The Lancet published the first results.

But to many in the ME/CFS community, this safety claim goes against the very essence of the disease. The hallmark of chronic fatigue syndrome, despite the name, is not actually fatigue but the body’s inability to tolerate too much exertion — a phenomenon that has been documented in exercise studies. All other symptoms, like sleep disorders, cognitive impairment, blood pressure regulation problems, and muscle pain, are exacerbated by physical or mental activity. An Institute of Medicine report this year even recommended that the illness be renamed to emphasize this central problem, choosing the name “systemic exertion intolerance disease,” or SEID. [see correction below]

A careful analysis shows that the PACE researchers’ attempts to prove safety were as flawed as their attempts to prove efficacy. However, while the trial reports gave enough information to establish that the treatments were not effective (in spite of the claims of success and “recovery”), they did not give enough information to establish whether they were safe (also in spite of their claims). We simply do not know.

“I would be very skeptical in recommending a blanket statement that GET is safe,” says Bruce Levin, a biostatistician at Columbia University, who has reviewed the PACE trial and found other methodological aspects indefensible. “The aphorism that absence of evidence is not evidence of absence applies here. There is real difficulty interpreting these results.”

*          *          *          *          *          *

Assessing the PACE team’s safety claims is critical, because the belief that graded exercise is safe has had enormous consequences for patients. In the UK, graded exercise therapy is recommended for all mild to moderate ME/CFS patients by the National Institute for Health and Care Excellence, which strongly influences treatment across the country. In the US, the Centers for Disease Control and Prevention also recommends graded exercise.

Exertion intolerance—also called “post-exertional malaise”—presents ME/CFS patients with a quandary: They want to do as much as they can when they’re able, while not doing so much that they make themselves sicker later. Among themselves, they’ve worked out a strategy to accomplish that, which they call “pacing.” Because their energy levels fluctuate, they carefully monitor how they are feeling and adapt their activities to stay within the day’s “energy envelope.”  This requires sensitive attunement to their symptoms in order to pick up on early signs of exacerbation and avoid exceeding their limits.

But according to the hypothesis behind the PACE study, this approach is all wrong. Because the investigators believe physical deconditioning rather than an organic disease perpetuated the many symptoms, they theorized that the key to getting better was to deliberately exceed current limits, gradually training the body to adapt to greater levels of activity. Rather than being sensitively attuned to symptoms, patients should ignore them, on the grounds that they have become obsessed about sensations most people would consider normal. Any increase in symptoms from exertion was explained as expected, transient and unimportant—the result of the body’s current state of weakness, not an underlying disease.

Many patients in the UK have tested this theory, since graded exercise therapy, or GET, is one of the few therapies available to patients there. And patient reports on the approach are very, very bad. In May 2015, the ME Association, a British charity, released a survey of patients’ experiences with GET, cognitive behavioral therapy, and pacing. The results suggested that GET was far and away the most dangerous. Of patients who received GET, 74 percent said that it had made them worse. In contrast, 18 percent said they were worse after cognitive behavior therapy and only 14 percent after pacing.

The survey is filled with reports similar to this one: “My condition deteriorated significantly, becoming virtually housebound, spending most of my day in bed in significant pain and with extreme fatigue.”

Anecdotal reports, however, don’t provide the proof of a randomized clinical trial. So this was one of the central issues at stake in the PACE study: Is it safe for patients to increase their activity on a set schedule while ignoring their symptoms?

*          *          *          *          *          *

In the 2011 Lancet article with the first PACE results, the researchers reported that eight percent of all participants experienced a “serious deterioration” and less than two percent experienced a “serious adverse reaction” over the course of the year, without significant differences between the arms of the trial.

For patients to have a “serious deterioration,” their physical function score needed to drop by at least 20 points and they needed to report that their overall health was “much worse” or “very much worse” at two consecutive assessment periods (out of a total of three).

To have a “serious adverse reaction,” the patient needed to experience a persistent “severe, i.e. significant deterioration,” which was not defined, or to experience a major health-related event, such as a hospitalization or even death. Furthermore, a doctor needed to determine that the event was directly caused by the treatment—a decision that was made after the doctor was told which arm of the trial the patient was in.

Subsequent “safety” results were published in a 2014 article in the Journal of Psychosomatic Research. And this paper revealed a critical detail unmentioned in the Lancet paper: the six centers around England participating in the study appear to have applied the methods for assessing safety differently. That raises questions about how to interpret the results and whether the overall claims of “safety” can be taken at face value.

Beyond that issue, a major problem with the PACE investigators’ reporting on harms from exercise is that it looks as though participants might not have actually done much exercise. While the researchers stated the ambitious goal that participants would exercise for at least 30 minutes five times a week, they gave no information on how much exercise participants in fact did.

The trial’s objective outcomes suggest it may not have been much. The exercise patients were only able to walk 11 percent further in a walking test at the end of the trial than patients who hadn’t exercised. Even with this minimal improvement, participants were still severely disabled, with a poorer performance than patients with chronic heart failure, severe multiple sclerosis, or chronic obstructive pulmonary disease.

On top of that, almost a third of those in the exercise arm who finished other aspects of the trial never completed the final walking test; if they couldn’t because they were too sick, that would skew the results. In addition, the participants in GET showed no improvement at all on a step test designed to measure fitness. Presumably, if the trial’s theory that patients suffered from deconditioning was correct, participants who had managed to exercise should have become more fit and performed better on these tests.

Tom Kindlon, a long-time patient and an expert on the clinical research, suggests that even if those in the exercise arm performed more graded exercise under the guidance of trial therapists, they may have simply cut back on other activities to compensate, as has been found in other studies of graded activity. He also notes that the therapists in the trial were possibly more cautious than therapists in everyday practice.

“In the PACE Trial, there was a much greater focus on the issue of safety [than in previous studies of graded activity], with much greater monitoring of adverse events,” says Kindlon, who published an analysis of the reporting of harms from trials of graded activity in ME/CFS, including PACE. “In this scenario, it seems quite plausible that those running the trial and the clinicians would be very cautious about pushing participants to keep exercising when they had increased symptoms, as this could increase the chances the patients would say such therapies caused adverse events.”

*          *          *          *          *          *

Had the investigators stuck to their original plan, we would have more evidence to evaluate participants’ activity levels.  Originally, participants were going to wear a wristwatch-sized ankle band called an actometer, similar to a FitBit, that would measure how many steps they took for a week at the beginning of the trial and for a week at the end.

A substantial increase in the number of steps over the course of the trial would have definitively established both that participants were exercising and that they weren’t decreasing other activity in order to do so.

But in reviewing the PACE Trial protocol, which was published in 2007, Kindlon noticed, to his surprise, that the researchers had abandoned this plan. Instead, they were asking participants to wear the actometers only at the beginning of the trial, but not at the end. Kindlon posted a comment on the journal’s website questioning this decision. He pointed out that in previous studies of graded activity, actometer measurements showed that patients were not moving more, even if they reported feeling better. Hence, the “exercise program” in that case in fact did not raise their overall activity levels.

In a posted response, White and his colleagues explained that they “decided that a test that required participants to wear an actometer around their ankle for a week was too great a burden at the end of the trial.” However, they had retained the actometer as a baseline measure, they wrote, to test as “a moderator of outcome”—that is, to determine factors that predicted which participants improved. The investigators also noted that the trial contained other objective outcome measures. (They subsequently dismissed the relevance of these objective measures after they failed to demonstrate efficacy.)

That answer didn’t make sense to Kindlon. “They clearly don’t find it that great a burden that they drop it altogether as it is being used on patients before the start,” he wrote in a follow-up comment. “If they feel it was that big of a burden, it should probably have been dropped altogether.”

*          *          *          *          *

The other major flaws that make it impossible to assess the validity of their safety claims are related to those that affected the PACE trial as a whole.  In particular, problems related to four issues affected their methods for reporting harms: the case definition, changes in outcome measures after the trial began, lack of blinding, and encouraging participants to discount symptoms in a trial that relied on subjective endpoints.

First, the study’s primary case definition for identifying participants, called the Oxford criteria, was extremely broad; it required only six months of medically unexplained fatigue, with no other symptoms necessary. Indeed, 16% of the participants didn’t even have exertion intolerance—now recognized as the primary symptom of ME/CFS—and hence would not be expected to suffer serious exacerbations from exercise. The trial did use two additional case definitions to conduct sub-group analyses, but the researchers didn’t break down the results on harms by the definition used. So we don’t know if the participants who met one of the more stringent definitions suffered more setbacks due to exercise.

Second, after the trial began, the researchers tightened their definition of harms, just as they had relaxed their methods of assessing improvement. In the protocol, for example, either a steep drop in physical function since the previous assessment or a slight decline in reported overall health qualified as a “serious deterioration.” However, as reported in The Lancet, the steep drop in physical function had to be sustained across two of the trial’s three assessments rather than just since the previous one. And reported overall health had to be “much worse” or “very much worse,” not just slightly worse. The researchers also changed their protocol definition of a “serious adverse reaction,” making it more stringent.

The third problem was that the study was unblinded, so both participants and therapists knew the treatment being administered. Many participants were probably aware that the researchers themselves favored graded exercise therapy and another treatment, cognitive behavior therapy, which also involved increasing activity levels. Such information has been shown in other studies to lead to efforts to cooperate, which in this case could lead to lowered reporting of harms.

And finally, therapists were explicitly instructed to urge patients in the graded exercise and cognitive behavioral therapy arms to “consider increased symptoms as a natural response to increased activity”—a direct encouragement to downplay potential signals of physiological deterioration. Since the researchers were relying on self-reports about changes in functionality to assess harms, these therapeutic suggestions could have influenced the outcomes.

“Clinicians or patients cannot take from this trial that it is safe to undertake graded exercise programs,” Kindlon says. “We simply do not know how much activity was performed by individual participants in this trial and under what circumstances; nor do we know what was the effect on those that did try to stick to the programs.”

Correction: The original text stated that the Institute of Medicine report came out “this” year; that was accurate when it was written in late December but inaccurate by the time of publication.

Trial By Error, Continued: Questions for Dr. White and his PACE Colleagues

By David Tuller, DrPH

David Tuller is academic coordinator of the concurrent masters degree program in public health and journalism at the University of California, Berkeley.

I have been seeking answers from the PACE researchers for more than a year. At the end of this post, I have included the list of questions I’d compiled by last September, when my investigation was nearing publication. Most of these questions remain unanswered.

The PACE researchers are currently under intense criticism for having rejected as “vexatious” a request for trial data from psychologist James Coyne—an action called “unforgivable” by Columbia statistician Andrew Gelman and “absurd” by Retraction Watch. Several colleagues and I have filed a subsequent request for the main PACE results, including data for the primary outcomes of fatigue and physical function and for “recovery” as defined in the trial protocol. The PACE team has two more weeks to release this data, or explain why it won’t.

Any data from the PACE trial will likely confirm what my Virology Blog series has already revealed: The results cannot stand up to serious scrutiny. But the numbers will not provide answers to the questions I find most compelling. Only the researchers themselves can explain why they made so many ill-advised choices during the trial.

In December 2014, after months of research, I e-mailed Peter White, Trudie Chalder and Michael Sharpe—the lead PACE researcher and his two main colleagues—and offered to fly to London to meet them. They declined to talk with me. In an email, Dr. White cited my previous coverage of the illness as a reason. (The investigators and I had already engaged in an exchange of letters in The New York Times in 2011, involving a PACE-related story I had written.) “I have concluded that it would not be worthwhile our having a conversation,” Dr. White wrote in his e-mail.

I decided to postpone further attempts to contact them for the story until it was near completion. (Dr. Chalder and I did speak in January 2015 about a new study from the PACE data, and I previously described our differing memories of the conversation.) In the meantime, I wrote and rewrote the piece and tweaked it and trimmed it and then pasted back in stuff that I’d already cut out. Last June, I sent a very long draft to Retraction Watch, which had agreed to review it for possible publication.

I still hoped Dr. White would relent and decide to talk with me. Over the summer, I drew up a list of dozens of questions that covered every single issue addressed in my investigation.

I had noticed the kinds of non-responsive responses Dr. White and his colleagues provided in journal correspondence and other venues whenever patients made cogent and incontrovertible points. They appeared to excel at avoiding hard questions, ignoring inconvenient facts, and misstating key details. I was surprised and perplexed that smart journal editors, public health officials, reporters and others accepted their replies without pointing out glaring methodological problems—such as the bizarre fact that the study’s outcome thresholds for improvement on its primary measures indicated worse health status than the entry criteria required to demonstrate serious disability.

So my list of questions included lots of follow-ups that would help me push past the PACE team’s standard portfolio of evasions. And if, as I suspected, I wouldn’t get the chance to pose the questions myself, I hoped the list would be a useful guide for anyone who wanted to conduct a rigorous interview with Dr. White or his colleagues about the trial’s methodological problems. (Dr. White never agreed to talk with me; I sent my questions to Retraction Watch as part of the fact-checking process.)

In September, Retraction Watch interviewed Dr. White in connection with my piece, as noted in a recent post about Dr. Coyne’s data request. Retraction Watch and I subsequently determined that we differed on the best approach and direction for the story. From October 21st to 23rd, Virology Blog ran my 14,000-word investigation.

But I still don’t have the answers to my questions.


List of Questions, September 1, 2015:

I am posting this list verbatim, although if I were pulling it together today I would add, subtract and rephrase some questions. (I might have misstated a statistical concept or two.) The list is by no means exhaustive. Patients and researchers could easily come up with a host of additional items. The PACE team seems to have a lot to answer for.

1) In June, a report commissioned by the National Institutes of Health declared that the Oxford criteria should be “retired” because the case definition impeded progress and possibly caused harm. As you know, the concern is that it is so non-specific that it leads to heterogeneous study samples that include people with many illnesses besides ME/CFS. How do you respond to that concern?

2) In published remarks after Dr. White’s presentation in Bristol last fall, Dr. Jonathan Edwards wrote: “What Dr White seemed not to understand is that a simple reason for not accepting the conclusion is that an unblinded trial in a situation where endpoints are subjective is valueless.” What is your response to Dr. Edwards’s position?

3) The December 2008 PACE participants’ newsletter included an article about the UK NICE guidelines. The article noted that the recommended treatments, “based on the best available evidence,” included two of the interventions being studied–CBT and GET. (The article didn’t mention that PACE investigator Jessica Bavington also served on the NICE guidelines committee.) The same newsletter included glowing testimonials from satisfied participants about their positive outcomes from the trial “therapy” and “treatment” but included no statements from participants with negative outcomes. According to the graph illustrating recruitment statistics in the same newsletter, about 200 participants were still slated to undergo one or more of their assessments after publication of the newsletter.

Were you concerned that publishing such statements would bias the remaining study subjects? If not, why not? A biostatistics professor from Columbia told me that for investigators to publish such information during a trial was “the height of clinical trial amateurism,” and that at the very least you should have assessed responses before and after disseminating the newsletter to ensure that there was no bias resulting from the statements. What is your response? Also, should the article about the NICE guidelines have disclosed that Jessica Bavington was on the committee and therefore playing a dual role?

4) In your protocol, you promised to abide by the Declaration of Helsinki. The declaration mandates that obtaining informed consent requires that prospective participants be “adequately informed” about “any possible conflicts of interest” and “institutional affiliations of the researcher.” In the Lancet and other papers, you disclosed financial and consulting ties with insurance companies as “conflicts of interest.” But trial participants I have interviewed said they did not find out about these “conflicts of interest” until after they completed the trial. They felt this violated their rights as participants to informed consent. One demanded her data be removed from the study after the fact. I have reviewed participant information and consent forms, including those from version 5.0 of the protocol, and none contain the disclosures mandated by the Declaration of Helsinki.

Why did you decide not to inform prospective participants about your “conflicts of interest” and “institutional affiliations” as part of the informed consent process? Do you believe this omission violates the Declaration of Helsinki’s provisions on disclosure to participants? Can you document that any PACE participants were told of your “possible conflicts of interest” and “institutional affiliations” during the informed consent process?

5) For both fatigue and physical function, your thresholds for “normal range” (Lancet) and “recovery” (Psych Med) indicated a greater level of disability than the entry criteria, meaning participants could be fatigued or physically disabled enough for entry but “recovered” at the same time. Thirteen percent of the sample was already “within normal range” on physical function, fatigue or both at baseline, according to information obtained under a freedom-of-information request.

Can you explain the logic of that overlap? Why did the Lancet and Psych Med papers not specifically mention or discuss the implication of the overlaps, or disclose that 13 percent of the study sample were already “within normal range” on an indicator at baseline? Do you believe that such overlaps affect the interpretation of the results? If not, why not? What oversight committee specifically approved this outcome measure? Or was it not approved by any committee, since it was a post-hoc analysis?

6) You have explained these “normal ranges” as the product of taking the mean value +/- 1 SD of the scores of  representative populations–the standard approach to obtaining normal ranges when data are normally distributed. Yet the values in both those referenced source populations (Bowling for physical function, Chalder for fatigue) are clustered toward the healthier ends, as both papers make clear, so the conventional formula does not provide an accurate normal range. In a 2007 paper, Dr. White mentioned this problem of skewed populations and the challenge they posed to calculation of normal ranges.

Why did you not use other methods for determining normal ranges from your clustered data sets from Bowling and Chalder, such as basing them on percentiles? Why did you not mention the concern or limitation about using conventional methods in the PACE papers, as Dr. White did in the 2007 paper? Is this application of conventional statistical methods for non-normally distributed data the reason why you had such broad normal ranges that ended up overlapping with the fatigue and physical function entry criteria?

7) According to the protocol, the main finding from the primary measures would be rates of “positive outcomes”/”overall improvers,” which would have allowed for individual-level assessments. Instead, the main finding was a comparison of the mean performances of the groups–aggregate results that did not provide important information about how many got better or worse. Who approved this specific change? Were you concerned about losing the individual-level assessments?

8) The other two methods of assessing the primary outcomes were both post-hoc analyses. Do you agree that post-hoc analyses carry significantly less weight than pre-specified results? Did any PACE oversight committees specifically approve the post-hoc analyses?

9) The improvement required to achieve a “clinically useful benefit” was defined as 8 points on the SF-36 scale and 2 points on the continuous scoring for the fatigue scale. In the protocol, categorical thresholds for a “positive outcome” were designated as 75 on the SF-36 and 3 on the Chalder fatigue scale, so achieving that would have required an increase of at least 10 points on the SF-36 and 3 points (bimodal) for fatigue. Do you agree that the protocol measure required participants to demonstrate greater improvements to achieve the “positive outcome” scores than the post-hoc “clinically useful benefit”?

10) When you published your protocol in BMC Neurology in 2007, the journal appended an “editor’s comment” that urged readers to compare the published papers with the protocol “to ensure that no deviations from the protocol occurred during the study.” The comment urged readers to “contact the authors” in the event of such changes. In asking for the results per the protocol, patients and others followed the suggestion in the editor’s comment appended to your protocol. Why have you declined to release the data upon request? Can you explain why Queen Mary has considered requests for results per the original protocol “vexatious”?

11) In cases when protocol changes are absolutely necessary, researchers often conduct sensitivity analyses to assess the impact of the changes, and/or publish the findings from both the original and changed sets of assumptions. Why did you decide not to take either of these standard approaches?

12) You made it clear, in your response to correspondence in the Lancet, that the 2011 paper was not addressing “recovery.” Why, then, did Dr. Chalder refer at the 2011 press conference to the “normal range” data as indicating that patients got “back to normal”–i.e. they “recovered”? And since you had input into the accompanying commentary in the Lancet before publication, according to the Press Complaints Commission, why did you not dissuade the writers from declaring a 30 percent “recovery” rate? Do you agree with the commentary that PACE used “a strict criterion for recovery,” given that in both of the primary outcomes participants could get worse and be counted as “recovered,” or “back to normal” in Dr. Chalder’s words?

13) Much of the press coverage focused on “recovery,” even though the paper was making no such claim. Were you at all concerned that the media was misinterpreting or over-interpreting the results, and did you feel some responsibility for that, given that Dr. Chalder’s statement of “back to normal” and the commentary claim of a 30 percent “recovery” rate were prime sources of those claims?

14) You changed your fatigue outcome scoring method from bimodal to continuous mid-trial, but cited no references in support of this change, nor any new information that might have caused you to change your mind since the protocol. Specifically, you did not explain that the FINE trial reported benefits for its intervention only in a post-hoc re-analysis of its fatigue data using continuous scoring.

Were the FINE findings the impetus for the change in scoring in your paper? If so, why was this reason not mentioned or cited? If not, what specific change prompted your mid-trial decision to alter the protocol in this way? And given that the FINE trial was promoted as the “sister study” to PACE, why were that trial and its negative findings not mentioned in the text of the Lancet paper? Do you believe those findings are irrelevant to PACE? Moreover, since the Likert-style analysis of fatigue was already a secondary outcome in PACE, why did you not simply provide both bimodal and continuous analyses rather than drop the bimodal scoring altogether?

15) The “number needed to treat” (NNT) for CBT and GET was 7, as Dr. Sharpe indicated in an Australian radio interview after the Lancet publication. But based on the “normal range” data, the NNT for SMC was also 7, since those participants achieved a 15% rate of “being within normal range”–roughly half the rate experienced under the rehabilitative interventions.

Is that what Dr. Sharpe meant in the radio interview when he said: “What this trial wasn’t able to answer is how much better are these treatments and really not having very much treatment at all”? If not, what did Dr. Sharpe mean? Wasn’t the trial designed to answer the very question Dr. Sharpe cited? Since each of the rehabilitative intervention arms as well as the SMC arm had an NNT of 7, would it be accurate to interpret the “normal range” findings as demonstrating that CBT and GET worked as well as SMC, but not any better?
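The NNT arithmetic behind this question is straightforward: NNT is the reciprocal of the absolute difference in outcome rates between two conditions. Using the approximate rates described above (roughly 30% of CBT/GET participants “within normal range” versus 15% under SMC alone), both comparisons land at about 7:

```python
def nnt(rate_treatment, rate_comparator):
    """Number needed to treat = 1 / absolute difference in outcome rates."""
    return 1 / (rate_treatment - rate_comparator)

# Approximate "within normal range" rates, as described in the text above
# (illustrative figures, not the trial's exact published percentages):
print(round(nnt(0.30, 0.15)))  # CBT or GET over and above SMC
print(round(nnt(0.15, 0.00)))  # SMC against a notional 0% baseline
```

Both calculations yield an NNT of about 7, which is the basis for asking whether CBT and GET did any better than SMC on this measure.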

16) The PACE paper was widely interpreted, based on your findings and statements, as demonstrating that “pacing” isn’t effective. Yet patients describe “pacing” as an individual, flexible, self-help method for adapting to the illness. Would packaging and operationalizing it as a “treatment” to be administered by a “therapist” alter its nature and therefore its impact? If not, why not? Why do you think the evidence from APT can be extrapolated to what patients themselves call “pacing”? Also, given your partnership with Action4ME in developing APT, how do you explain the organization’s rejection of the findings in the statement issued after the study was published?

17) In your response to correspondence in the Lancet, you acknowledged a mistake in describing the Bowling sample as a “working age” rather than “adult” population–a mistake that changes the interpretation of the findings. Comparing the PACE participants to a sicker group but mislabeling it a healthier one makes the PACE results look better than they were; the percentage of participants scoring “within normal range” would clearly have been even lower had they actually been compared to the real “working age” population rather than the larger and more debilitated “adult” population. Yet the Lancet paper itself has not been corrected, so current readers are provided with misinformation about the measurement and interpretation of one of the study’s two primary outcomes.

Why hasn’t the paper been corrected? Do you believe that everyone who reads the paper also reads the correspondence, making it unnecessary to correct the paper itself? Or do you think the mistake is insignificant and so does not warrant a correction in the paper itself? Lancet policy calls for corrections–not mentions in correspondence–for mistakes that affect interpretation or replicability. Do you disagree that this mistake affects interpretation or replicability?

18) In our exchange of letters in the NYTimes four years ago, you argued that PACE provided “robust” evidence for treatment with CBT and GET “no matter how the illness is defined,” based on the two sub-group analyses. Yet Oxford requires that fatigue be the primary complaint–a requirement that is not a part of either of your other two sub-group case definitions. (“Fatigue” per se is not part of the ME definition at all, since post-exertional malaise is the core symptom; the CDC obviously requires “fatigue,” but not that it be the primary symptom, and patients can present with post-exertional malaise or cognitive problems as being their “primary” complaint.)

Given that discrepancy, why do you believe the PACE findings can be extrapolated to others “no matter how the illness is defined,” as you wrote in the NYTimes? Is it your assumption that everyone who met the other two criteria would automatically be screened in by the Oxford criteria, despite the discrepancies in the case definitions?

19) None of the multiple outcomes you cited as “objective” in the protocol supported the subjective outcomes suggesting improvement (excluding the extremely modest increase in the six-minute walking test for the GET group). Does this lack of objective support for improvement and recovery concern you? Should the failure of the objective measures raise questions about whether people have achieved any actual benefits or improvements in performance?

20) If wearing the actometer was considered too much of a burden for patients to wear at the end of the trial, when presumably many of them would have been improved, why wasn’t it too much of a burden for patients at the beginning of the trial? In retrospect, given that your other objective findings failed, do you regret having made that decision?

21) In your response to correspondence after publication of the Psych Med paper, you mentioned multiple problems with the “objectivity” of the six-minute walking test that invalidated comparisons with other studies. Yet PACE started assessing people using this test when the trial began recruitment in 2005, and the serious limitations–the short corridors requiring patients to turn around more than was standard, the decision not to encourage patients during the test, etc.–presumably became apparent quickly.

Why then, in the published protocol in 2007, did you describe the walking test as an “objective” measure of function? Given that the study had been assessing patients for two years already, why had you not already recognized the limitations of the test and realized that it was apparently useless as an objective measure? When did you actually recognize these limitations?

22) In the Psych Med paper, you described “recovery” as recovery only from the current episode of illness–a limitation of the term not mentioned in the protocol. Since this definition describes what most people would refer to as “remission,” not “recovery,” why did you choose to use the word “recovery”–in the protocol and in the paper–in the first place? Would the term “remission” have been more accurate and less misleading? Not surprisingly, the media coverage focused on “recovery,” not on “remission.” Were you concerned that this coverage gave readers and viewers an inaccurate impression of the findings, since few readers or viewers would understand that what the Psych Med paper examined was in fact “remission” and not “recovery,” as most people would understand the terms?

23) In the Psychological Medicine definition of “recovery,” you relaxed all four of the criteria. For the first two, you adopted the “normal range” scores for fatigue and physical function from the Lancet paper, with “recovery” thresholds lower than the entry criteria. For the Clinical Global Impression scale, “recovery” in the Psych Med paper required a 1 or 2, rather than just a 1, as in the protocol. For the fourth element, you split the single category of not meeting any of the three case definitions into two separate categories–one less restrictive (‘trial recovery’) than the original one proposed in the protocol (now renamed ‘clinical recovery’).

What oversight committee approved the changes in the overall definition of recovery from the protocol, including the relaxation of all four elements of the definition? Can you cite any references for your reconsideration of the CGI scale, and explain what new information prompted this reconsideration after the trial? Can you provide any references for the decision to split the final “recovery” element into two categories, and explain what new information prompted this change after the trial?

24) The Psychological Medicine paper, in dismissing the original “recovery” threshold of 85 on the SF-36, asserted that 50 percent of the population would score below this mean value and that it was therefore not an appropriate cut-off. But that statement conflates the mean and median values; given that this is not a normally distributed sample and that the median value is much higher than the mean in this population, the statement about 50 percent performing below 85 is clearly wrong.

Since the source populations were skewed and not normally distributed, can you explain this claim that 50 percent of the population would perform below the mean? And since this reasoning for dismissing the threshold of 85 is wrong, can you provide another explanation for why that threshold needed to be revised downward so significantly? Why has this erroneous claim not been corrected?
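The conflation of mean and median can be demonstrated with a toy sample (invented for illustration, not the actual SF-36 reference data). The statement that 50 percent of people score below the mean is true only for symmetric distributions; in a sample clustered toward the top of the scale, the mean falls well below the median and far fewer than half the population scores below it:

```python
import random
random.seed(1)

# Invented left-skewed sample on a 0-100 scale (not the real SF-36 data):
# 70% of people cluster near the top, 30% trail off below.
scores = ([random.uniform(85, 100) for _ in range(7000)]
          + [random.uniform(0, 85) for _ in range(3000)])

mean = sum(scores) / len(scores)
median = sorted(scores)[len(scores) // 2]
below_mean = sum(x < mean for x in scores) / len(scores)

print(f"mean {mean:.1f}, median {median:.1f}")
print(f"share scoring below the mean: {below_mean:.0%}")
```

In this skewed sample only about a quarter of people score below the mean, and the median sits roughly a dozen points above it, which is exactly why the “50 percent would score below 85” reasoning fails for a skewed reference population.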

25) What are the results, per the protocol definition of “recovery”?

26) The PLoS One paper reported that a sensitivity analysis found that the findings of the societal cost-effectiveness of CBT and GET would be “robust” even when informal care was measured not by replacement cost of a health-care worker but using alternative assumptions of minimum wage or zero pay. When readers challenged this claim that the findings would be “robust” under these alternative assumptions, the lead author, Paul McCrone, agreed in his responses that changing the value for informal care would, in fact, change the outcomes. He then criticized the alternative assumptions because they did not adequately value the family’s caregiving work, even though they had been included in the PACE statistical plan.

Why did the PLoS One paper include an apparently inaccurate sensitivity analysis that claimed the societal cost-effectiveness findings for CBT and GET were “robust” under the alternative assumptions, even though that wasn’t the case? And if the alternative assumptions were “controversial” and “restrictive,” as the lead author wrote in one of his posted responses, then why did the PACE team include them in the statistical plan in the first place?

Revisiting the PLoS One economics analysis of PACE

On October 23rd, virology blog published the third installment of David Tuller’s investigative report about the PACE study of treatments for ME/CFS. In the post, Dr. Tuller demonstrated that the key finding of an economic analysis of the PACE trial, published in PLoS One in 2012, was almost certainly false. The finding–that cognitive behavior therapy and graded exercise therapy were cost-effective treatments–relied on an inaccuracy in the paper about whether the results of sensitivity analyses were “robust.”

Since the publication of the virology blog series, the PACE study has come under sustained and blistering public criticism for its many flaws. The PLoS One paper is currently the center of attention as a result of the researchers’ insulting response to Dr. James Coyne, a well-known psychologist and PLoS blogger. Dr. Coyne requested data to verify the results from the PLoS One paper, and was told that his request was “vexatious.” The researchers have called patients “vexatious” for years, of course, but the effort to use this strategy against a respected researcher has caused an uproar. Several colleagues and I, including Dr. Tuller, cited this rejection recently in our own request for a different set of PACE-related data.

Because of the open data policies of the PLoS journals, requesting data on that basis was a smart move by Dr. Coyne, and he has done a brilliant job of rousing support for the larger issue of access to data in scientific research. The PACE authors must recognize by now that at some point they will have to release all of their data.

The PLoS One study reported that cognitive behavior therapy and graded exercise therapy, the two treatments long favored by the main investigators, were more cost-effective than other approaches. The investigators have routinely cited these findings in promoting use of the two treatments. The truth or falsity of these claims from the PLoS One study is at the heart of the current controversy.

In fact, it is already clear that the claim is highly unlikely to withstand serious scrutiny, based on the public record. In the October 23rd post, Dr. Tuller demonstrated that subsequent public comments of the lead author contradicted a critical statement in the paper about the PLoS One study’s sensitivity analyses.

The relevant excerpt from virology blog is below:

In another finding, the PLoS One paper argued that the graded exercise and cognitive behavior therapies were the most cost-effective treatments from a societal perspective. In reaching this conclusion, the investigators valued so-called “informal” care—unpaid care provided by family and friends–at the replacement cost of a homecare worker. The PACE statistical analysis plan (approved in 2010 but not published until 2013) had included two additional, lower-cost assumptions. The first valued informal care at minimum wage, the second at zero compensation.

The PLoS One paper itself did not provide these additional findings, noting only that “sensitivity analyses revealed that the results were robust for alternative assumptions.” Commenters on the PLoS One website, including Tom Kindlon, challenged the claim that the findings would be “robust” under the alternative assumptions for informal care. In fact, they pointed out, the lower-cost conditions would reduce or fully eliminate the reported societal cost-benefit advantages of the cognitive behavior and graded exercise therapies.

In a posted response, the paper’s lead author, Paul McCrone, conceded that the commenters were right about the impact that the lower-cost, alternative assumptions would have on the findings. However, McCrone did not explain or even mention the apparently erroneous sensitivity analyses he had cited in the paper, which had found the societal cost-benefit advantages for graded exercise therapy and cognitive behavior therapy to be “robust” under all assumptions. Instead, he argued that the two lower-cost approaches were unfair to caregivers because families deserved more economic consideration for their labor.

“In our opinion, the time spent by families caring for people with CFS/ME has a real value and so to give it a zero cost is controversial,” McCrone wrote. “Likewise, to assume it only has the value of the minimum wage is also very restrictive.”

In a subsequent comment, Kindlon chided McCrone, pointing out that he had still not explained the paper’s claim that the sensitivity analyses showed the findings were “robust” for all assumptions. Kindlon also noted that the alternative, lower-cost assumptions were included in PACE’s own statistical plan.

“Remember it was the investigators themselves that chose the alternative assumptions,” wrote Kindlon. “If it’s ‘controversial’ now to value informal care at zero value, it was similarly ‘controversial’ when they decided before the data was looked at, to analyse the data in this way. There is not much point in publishing a statistical plan if inconvenient results are not reported on and/or findings for them misrepresented.”
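The arithmetic behind Kindlon’s objection can be sketched with hypothetical figures (all numbers below are invented for illustration; they are not the trial’s actual cost data). An intervention’s “societal saving” from reduced informal care is the hours saved multiplied by the unit value assigned to those hours, so shrinking that unit value from a homecare worker’s replacement cost to minimum wage or zero can flip the cost-effectiveness verdict:

```python
# Hypothetical figures for illustration only; NOT the PACE trial's actual data.
# Net societal saving = (informal-care hours saved x unit value of that care)
#                       - extra direct cost of delivering the intervention.
HOURS_SAVED = 100          # informal-care hours the intervention is assumed to save
EXTRA_DIRECT_COST = 1000   # assumed extra direct cost of the intervention (GBP)

for label, unit_value in [("homecare replacement cost", 15.0),
                          ("minimum wage", 6.0),
                          ("zero valuation", 0.0)]:
    net = HOURS_SAVED * unit_value - EXTRA_DIRECT_COST
    verdict = "saves money" if net > 0 else "costs more"
    print(f"{label:>25}: net societal saving = {net:+.0f} GBP ({verdict})")
```

Under these toy numbers the intervention “saves money” only when informal care is valued at replacement cost; at minimum wage or zero the advantage disappears, which is why the choice of valuation, not the trial data alone, drives the “robust” cost-effectiveness claim.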

A request for data from the PACE trial

Mr. Paul Smallcombe
Records & Information Compliance Manager
Queen Mary University of London
Mile End Road
London E1 4NS

Dear Mr Smallcombe:

The PACE study of treatments for ME/CFS has been the source of much controversy since the first results were published in The Lancet in 2011. Patients have repeatedly raised objections to the study’s methodology and results. (Full title: “Comparison of adaptive pacing therapy, cognitive behaviour therapy, graded exercise therapy, and specialist medical care for chronic fatigue syndrome: a randomized trial.”)

Recently, journalist and public health expert David Tuller documented that the trial suffered from many serious flaws that raise concerns about the validity and accuracy of the reported results. We cited some of these flaws in an open letter to The Lancet that urged the journal to conduct a fully independent review of the trial. (Dr. Tuller did not sign the open letter, but he is joining us in requesting the trial data.)

These flaws include, but are not limited to: major mid-trial changes in the primary outcomes that were not accompanied by the necessary sensitivity analyses; thresholds for “recovery” on the primary outcomes that indicated worse health than the study’s own entry criteria; publication of positive testimonials about trial outcomes and promotion of the therapies being investigated in a newsletter for participants; rejection of the study’s objective outcomes as irrelevant after they failed to support the claims of recovery; and the failure to inform participants about investigators’ significant conflicts of interest, and in particular financial ties to the insurance industry, contrary to the trial protocol’s promise to adhere to the Declaration of Helsinki, which mandates such disclosures.

Although the open letter was sent to The Lancet in mid-November, editor Richard Horton has not yet responded to our request for an independent review. We are therefore requesting that Queen Mary University of London provide some of the raw trial data, fully anonymized, under the provisions of the U.K.’s Freedom of Information law.

In particular, we would like the raw data for all four arms of the trial for the following measures: the two primary outcomes of physical function and fatigue (both bimodal and Likert-style scoring), and the multiple criteria for “recovery” as defined in the protocol published in 2007 in BMC Neurology, not as defined in the 2013 paper published in Psychological Medicine. The anonymized, individual-level data for “recovery” should be linked across the four criteria so it is possible to determine how many people achieved “recovery” according to the protocol definition.

We are aware that previous requests for PACE-related data have been rejected as “vexatious.” This includes a recent request from psychologist James Coyne, a well-regarded researcher, for data related to a subsequent study about economic aspects of the illness published in PLoS One—a decision that represents a violation of the PLoS policies on data-sharing.

Our request clearly serves the public interest, given the methodological issues outlined above, and we do not believe any exemptions apply. We can assure Queen Mary University of London that the request is not “vexatious,” as defined in the Freedom of Information law, nor is it meant to harass. Our motive is easy to explain: We are extremely concerned that the PACE studies have made claims of success and “recovery” that appear to go beyond the evidence produced in the trial. We are seeking the trial data based solely on our desire to get at the truth of the matter.

We appreciate your prompt attention to this request.


Ronald W. Davis, PhD
Professor of Biochemistry and Genetics
Stanford University

Bruce Levin, PhD
Professor of Biostatistics
Columbia University

Vincent R. Racaniello, PhD
Professor of Microbiology and Immunology
Columbia University

David Tuller, DrPH
Lecturer in Public Health and Journalism
University of California, Berkeley