Trial By Error, Continued: Did the PACE Trial Really Prove that Graded Exercise Is Safe?

By Julie Rehmeyer and David Tuller, DrPH

Julie Rehmeyer is a journalist and Ted Scripps Environmental Journalism Fellow at the University of Colorado, Boulder, who has written extensively about ME/CFS.

David Tuller is academic coordinator of the concurrent master’s degree program in public health and journalism at the University of California, Berkeley.

Joining me for this episode of our ongoing saga is my friend and colleague Julie Rehmeyer. In my initial series, I only briefly touched on the PACE trial’s blanket claim of safety. Here we examine this key aspect of the study in more detail. The subject is complicated and requires a deep dive into technicalities. Sorry about that, but the claim is too consequential to ignore.


One of the most important and controversial claims from the PACE Trial was that graded exercise therapy is safe for patients with chronic fatigue syndrome (or ME/CFS, as U.S. government agencies now call it).

“If this treatment is done by skilled people in an appropriate way, it actually is safe and can stand a very good chance of benefiting [patients],” Michael Sharpe, one of the principal PACE investigators, told National Public Radio in 2011, shortly after The Lancet published the first results.

But to many in the ME/CFS community, this safety claim goes against the very essence of the disease. The hallmark of chronic fatigue syndrome, despite the name, is not actually fatigue but the body’s inability to tolerate too much exertion — a phenomenon that has been documented in exercise studies. All other symptoms, like sleep disorders, cognitive impairment, blood pressure regulation problems, and muscle pain, are exacerbated by physical or mental activity. An Institute of Medicine report this year even recommended that the illness be renamed to emphasize this central problem, choosing the name “systemic exertion intolerance disease,” or SEID. [see correction below]

A careful analysis shows that the PACE researchers’ attempts to prove safety were as flawed as their attempts to prove efficacy. However, while the trial reports gave enough information to establish that the treatments were not effective (in spite of the claims of success and “recovery”), they did not give enough information to establish whether they were safe (also in spite of their claims). We simply do not know.

“I would be very skeptical in recommending a blanket statement that GET is safe,” says Bruce Levin, a biostatistician at Columbia University, who has reviewed the PACE trial and found other methodological aspects indefensible. “The aphorism that absence of evidence is not evidence of absence applies here. There is real difficulty interpreting these results.”

*          *          *          *          *          *

Assessing the PACE team’s safety claims is critical, because the belief that graded exercise is safe has had enormous consequences for patients. In the UK, graded exercise therapy is recommended for all mild to moderate ME/CFS patients by the National Institute for Health and Care Excellence, which strongly influences treatment across the country. In the US, the Centers for Disease Control and Prevention also recommends graded exercise.

Exertion intolerance—also called “post-exertional malaise”—presents ME/CFS patients with a quandary: They want to do as much as they can when they’re able, while not doing so much that they make themselves sicker later. Among themselves, they’ve worked out a strategy to accomplish that, which they call “pacing.” Because their energy levels fluctuate, they carefully monitor how they are feeling and adapt their activities to stay within the day’s “energy envelope.”  This requires sensitive attunement to their symptoms in order to pick up on early signs of exacerbation and avoid exceeding their limits.

But according to the hypothesis behind the PACE study, this approach is all wrong. Because the investigators believed that physical deconditioning, rather than an organic disease, perpetuated the many symptoms, they theorized that the key to getting better was to deliberately exceed current limits, gradually training the body to adapt to greater levels of activity. Rather than being sensitively attuned to symptoms, patients should ignore them, on the grounds that they had become obsessed with sensations most people would consider normal. Any increase in symptoms from exertion was explained as expected, transient and unimportant: the result of the body’s current state of weakness, not an underlying disease.

Many patients in the UK have tested this theory, since graded exercise therapy, or GET, is one of the few therapies available to patients there. And patient reports on the approach are very, very bad. In May 2015, the ME Association, a British charity, released a survey of patients’ experiences with GET, cognitive behavioral therapy, and pacing. The results suggested that GET was far and away the most dangerous. Of patients who received GET, 74 percent said that it had made them worse. In contrast, 18 percent said they were worse after cognitive behavioral therapy and only 14 percent after pacing.

The survey is filled with reports similar to this one: “My condition deteriorated significantly, becoming virtually housebound, spending most of my day in bed in significant pain and with extreme fatigue.”

Anecdotal reports, however, don’t provide the proof of a randomized clinical trial. So this was one of the central issues at stake in the PACE study: Is it safe for patients to increase their activity on a set schedule while ignoring their symptoms?

*          *          *          *          *          *

In the 2011 Lancet article with the first PACE results, the researchers reported that eight percent of all participants experienced a “serious deterioration” and less than two percent experienced a “serious adverse reaction” over the course of the year, without significant differences between the arms of the trial.

For patients to have a “serious deterioration,” their physical function score needed to drop by at least 20 points and they needed to report that their overall health was “much worse” or “very much worse” at two consecutive assessment periods (out of a total of three).

To have a “serious adverse reaction,” the patient needed to experience a persistent “severe, i.e. significant deterioration,” which was not defined, or to experience a major health-related event, such as a hospitalization or even death. Furthermore, a doctor needed to determine that the event was directly caused by the treatment—a decision that was made after the doctor was told which arm of the trial the patient was in.

Subsequent “safety” results were published in a 2014 article in the Journal of Psychosomatic Research. And this paper revealed a critical detail unmentioned in the Lancet paper: the six centers around England participating in the study appear to have applied the methods for assessing safety differently. That raises questions about how to interpret the results and whether the overall claims of “safety” can be taken at face value.

Beyond that issue, a major problem with the PACE investigators’ reporting on harms from exercise is that it looks as though participants might not have actually done much exercise. While the researchers stated the ambitious goal that participants would exercise for at least 30 minutes five times a week, they gave no information on how much exercise participants in fact did.

The trial’s objective outcomes suggest it may not have been much. At the end of the trial, patients in the exercise arm were able to walk only 11 percent farther in a walking test than patients who hadn’t exercised. Even with this minimal improvement, participants were still severely disabled, with a poorer performance than patients with chronic heart failure, severe multiple sclerosis, or chronic obstructive pulmonary disease.

On top of that, almost a third of those in the exercise arm who finished other aspects of the trial never completed the final walking test; if they couldn’t because they were too sick, that would skew the results. In addition, the participants in GET showed no improvement at all on a step test designed to measure fitness. Presumably, if the trial’s theory that patients suffered from deconditioning was correct, participants who had managed to exercise should have become more fit and performed better on these tests.

Tom Kindlon, a long-time patient and an expert on the clinical research, suggests that even if those in the exercise arm performed more graded exercise under the guidance of trial therapists, they may have simply cut back on other activities to compensate, as has been found in other studies of graded activity. He also notes that the therapists in the trial were possibly more cautious than therapists in everyday practice.

“In the PACE Trial, there was a much greater focus on the issue of safety [than in previous studies of graded activity], with much greater monitoring of adverse events,” says Kindlon, who published an analysis of the reporting of harms from trials of graded activity in ME/CFS, including PACE. “In this scenario, it seems quite plausible that those running the trial and the clinicians would be very cautious about pushing participants to keep exercising when they had increased symptoms, as this could increase the chances the patients would say such therapies caused adverse events.”

*          *          *          *          *          *

Had the investigators stuck to their original plan, we would have more evidence to evaluate participants’ activity levels. Originally, participants were going to wear a wristwatch-sized ankle band called an actometer, similar to a Fitbit, that would measure how many steps they took for a week at the beginning of the trial and for a week at the end.

A substantial increase in the number of steps over the course of the trial would have definitively established both that participants were exercising and that they weren’t decreasing other activity in order to do so.

But in reviewing the PACE Trial protocol, which was published in 2007, Kindlon noticed, to his surprise, that the researchers had abandoned this plan. Instead, they were asking participants to wear the actometers only at the beginning of the trial, but not at the end. Kindlon posted a comment on the journal’s website questioning this decision. He pointed out that in previous studies of graded activity, actometer measurements showed that patients were not moving more, even if they reported feeling better. Hence, the “exercise programs” in those studies did not in fact raise patients’ overall activity levels.

In a posted response, Peter White, the trial’s lead investigator, and his colleagues explained that they “decided that a test that required participants to wear an actometer around their ankle for a week was too great a burden at the end of the trial.” However, they had retained the actometer as a baseline measure, they wrote, to test as “a moderator of outcome” (that is, to determine factors that predicted which participants improved). The investigators also noted that the trial contained other objective outcome measures. (They subsequently dismissed the relevance of these objective measures after they failed to demonstrate efficacy.)

That answer didn’t make sense to Kindlon. “They clearly don’t find it that great a burden that they drop it altogether as it is being used on patients before the start,” he wrote in a follow-up comment. “If they feel it was that big of a burden, it should probably have been dropped altogether.”

*          *          *          *          *          *

The other major flaws that make it impossible to assess the validity of the investigators’ safety claims are related to those that affected the PACE trial as a whole. In particular, problems related to four issues affected their methods for reporting harms: the case definition, changes in outcome measures after the trial began, lack of blinding, and the encouragement of participants to discount symptoms in a trial that relied on subjective endpoints.

First, the study’s primary case definition for identifying participants, called the Oxford criteria, was extremely broad; it required only six months of medically unexplained fatigue, with no other symptoms necessary. Indeed, 16 percent of the participants didn’t even have exercise intolerance (now recognized as the primary symptom of ME/CFS) and hence would not be expected to suffer serious exacerbations from exercise. The trial did use two additional case definitions to conduct sub-group analyses, but the investigators didn’t break down the results on harms by the definition used. So we don’t know if the participants who met one of the more stringent definitions suffered more setbacks due to exercise.

Second, after the trial began, the researchers tightened their definition of harms, just as they had relaxed their methods of assessing improvement. In the protocol, for example, a steep drop in physical function since the previous assessment, or a slight decline in reported overall health, both qualified as a “serious deterioration.” However, as reported in The Lancet, the steep drop in physical function had to be sustained across two out of the trial’s three assessments rather than just since the previous one. And reported overall health had to be “much worse” or “very much worse,” not just slightly worse. The researchers also changed their protocol definition of a “serious adverse reaction,” making it more stringent.

The third problem was that the study was unblinded, so both participants and therapists knew the treatment being administered. Many participants were probably aware that the researchers themselves favored graded exercise therapy and another treatment, cognitive behavioral therapy, which also involved increasing activity levels. Such information has been shown in other studies to lead to efforts to cooperate, which in this case could lead to lowered reporting of harms.

And finally, therapists were explicitly instructed to urge patients in the graded exercise and cognitive behavioral therapy arms to “consider increased symptoms as a natural response to increased activity”—a direct encouragement to downplay potential signals of physiological deterioration. Since the researchers were relying on self-reports about changes in functionality to assess harms, these therapeutic suggestions could have influenced the outcomes.

“Clinicians or patients cannot take from this trial that it is safe to undertake graded exercise programs,” Kindlon says. “We simply do not know how much activity was performed by individual participants in this trial and under what circumstances; nor do we know what was the effect on those that did try to stick to the programs.”

Correction: The original text stated that the Institute of Medicine report came out “this” year; that was accurate when it was written in late December but inaccurate by the time of publication.

  • Safety Inspector

    A few brief points:

    1) There is much unpublished harms data from the PACE trial. Non-serious adverse events were recorded in grades of severity, but the published results were aggregated together, so while the number of events was similar between groups, it is possible that the CBT and GET groups suffered more severe events. Even worse, no data whatsoever has been published for non-serious adverse reactions. Many of the significant relapses that patients frequently experience, such as a few weeks of significantly declined function, were regarded as “non-serious”. Given the strictness of the definition of serious harms and the downplaying of “non-serious” incidents, the real story of safety in the PACE trial may remain hidden within this unpublished data.

    2) Look at the individual patient data from the FINE trial for a clue of how much the redefinition of one of the measures of harms (now a 20 point reduction in SF-36 physical function over two consecutive followup assessments instead of one) has decreased the occurrence, i.e. by several fold, just as redefinition of improvement and recovery has increased those by several fold. These drastic changes have been unjustified and clearly make the results appear more effective and safer than they really are.

    3) Objective measures used in the PACE trial and similar trials dispute the notion that patients are increasing their activity levels. Therefore it is dangerous when proponents of CBT and GET claim that these therapies safely increase function and activity.

  • disqus_Rv8tqVZbOP

    Right, but you need to deprogram. PACE was not studying “ME/CFS,” they were studying CFS.

  • davetuller

    Hi, Sasha–by my calculation, it should be Jan 18th–a week from Monday. Now it’s possible, I guess, that they could argue that the weekdays during the holidays don’t count, if the university was closed. that would presumably give them two more weeks. I think the standard is regular workdays, not university workdays–but I don’t know. I assume they will reject it but not call it “vexatious.” I would guess they will use patient confidentiality reasons. But who knows? They could surprise.

  • Sasha

    Thanks, David. We had, in the UK, some public holidays, but I don’t know what the universities do.

  • davetuller

    we shall see!

  • davetuller

    Hi, I’m interested in writing about the impact of PACE in the U.S. Can we talk about your experience with Kaiser Permanente?

  • tomkindlon

    What Peter White actually said was:

    “The PACE trial paper refers to chronic fatigue syndrome (CFS) which is operationally defined; it does not purport to be studying CFS/ME”.

    It is bizarre to see people who say ME is not CFS or CFS/ME suddenly claim that CFS/ME equals ME, i.e. claim that Peter White said they were not studying ME.

    If one reads material from the PACE trial such as the Lancet paper, it is clear that they do consider that they studied ME:
    “Subgroup analysis of 427 participants meeting international criteria for chronic fatigue syndrome and 329 participants meeting London criteria for myalgic encephalomyelitis yielded equivalent results.”

    My reading of the quote from Peter White is that he may have been thinking of the Canadian ME/CFS criteria.

  • weyland

    I’m not going to argue about the lack of equivalence of ME and CFS because I think we both agree on that point. You do need to read the 2011 Lancet PACE paper though because you seem to have missed the fact that they did purport to be studying ME, regardless of what doublespeak Peter White spews out in other venues. In addition to applying the Oxford criteria, they also applied version 2 of the London criteria for ME:

    “Participants were also assessed by international criteria for chronic fatigue syndrome,12 requiring four or more accompanying symptoms, and the London criteria13 for myalgic encephalomyelitis (version 2), requiring postexertional fatigue, poor memory and concentration, symptoms that fluctuate, and no primary depressive or anxiety disorder (interpreted as an absence of any such disorder).”

    “Subgroup analysis of 427 participants meeting international criteria for chronic fatigue syndrome and 329 participants meeting London criteria for myalgic encephalomyelitis yielded equivalent results.”

    “Participant subgroups meeting international criteria for chronic fatigue syndrome, London criteria for myalgic encephalomyelitis, and depressive disorder criteria did not differ in the pattern of treatment effects”

    Convincing everyone that CFS and ME are different does not make the PACE trial go away for ME patients. I wish that were true but it’s not. Attacking the PACE trial for its methodological flaws is what will make it go away.

  • patrick holland

    Criminal Aspects
    It is a criminal offence in Britain (and other countries) to aid and abet or support in any way a person(s) involved in criminal activity. Criminal charges and civil damages charges could be brought against certain psychiatrists, insurance companies and medical doctors. The dismissal and ignoring of all biological research into ME / CFS in over 5000 research papers, and of known biological markers, while accepting false psychiatric claims has had dire consequences for patients, ranging from terrible suffering and deterioration for many years, financial losses, break up of family and relationships, and premature deaths. The high number of deaths from ME / CFS and the enormous suffering inflicted on many patients over many years through medical neglect born of dismissal of the disease as a psychiatric illness by medical authorities represent criminal neglect, assault, grievous bodily harm, and may constitute manslaughter in certain cases. The harms caused by CBT and GET treatments constitute criminal acts, ranging from assault and grievous bodily harm to manslaughter. The failure to inform patients of these harms and risks represents medical negligence and a breach of the law concerning consent. Refusals of insurance companies to pay benefits to ME / CFS patients can be prosecuted in criminal courts under fraud, breach of contract, and RICO charges and sued in civil damages courts. In the USA RICO charges and suing for benefits and other costs have been successfully undertaken by patients. Also, insurance companies or doctors which recommend CBT and GET as treatments for ME / CFS may be financially liable for the harms caused.

    The denial of government funds for biological research into ME / CFS, a denial which was orchestrated by certain influential psychiatrists, meant patients were deprived of vital biological research which would provide more accurate and effective diagnostics and treatments, and this had the effect of prolonging the suffering and deterioration of patients, and causing premature deaths; there are further criminal offences here. The conflicts of interest in research, advice given to government and government bodies, advice and guidelines given to medical doctors and medical bodies, and advice and services offered to insurance companies, along with the enrichment of some psychiatrists while patients were neglected, suffered and died prematurely, include more criminal offences.

  • disqus_Rv8tqVZbOP

    Yes but this is probably the real doublespeak that Tuller and Rehmeyer miss or ignore. This is rather obscure within the paper to begin with, and is mainly more fudging of terms and definitions. PACE Trial studied: “CFS defined simply as a principal complaint of fatigue that is disabling, having lasted six months, with no alternative medical explanation (Oxford criteria).”

    This whole “version 2” of London criteria is really murky, more of a rewrite and not ME but the usual shifting of the meaning of “ME” to “CFS.”

    So here’s proof. PACE trial entry protocol excluded patients with infection or inflammation: “Methods In our parallel-group randomised trial, patients meeting Oxford criteria for chronic fatigue syndrome” [2], and thus excluded any really ill patients.

    Thus, not ME, no matter what they say or call it.

  • patrick holland

    a carnival of more lies and deception by PACE researchers. It’s time for court, Tom.
    Any persons with some backbone (courage) and determination in the ME community?

  • Boka

    ‘Almost a third of those in the exercise arm who finished other aspects of the trial never completed the final walking test’. What??? Really??? Why??? I am frankly shocked. I knew for a long time that the PACE Trial was poor. But I put this down to the use of the Oxford criteria leading to a selection of patients with a broad spectrum of illnesses. I didn’t realise how appalling every aspect of it actually was. The above fact ALONE should be enough to throw the whole thing out.

  • Husserl

    It is a pleasure to hear about the court plans. I hope they go ahead.

  • patrick holland

    always a pleasure, mate. Even more of a pleasure when you and many other thousands of ME and CFS people rise up and take your cases to court and stop being walked on, s****d on, mocked, despised and destroyed by the psychiatrists and their disciples.

  • patrick holland

    unfortunately another example of the dumbing down of the masses of people in Europe and North America. They would swallow any number of lies and deceptions from supposedly “reputable sources” dressed up in suits and ties.

