Vincent Racaniello

Would you be interested in this person’s science?

If you are reading this you know that I am a strong proponent of communicating science to the general public. My favorite media are blogging and podcasting, but I’ve also dabbled in video. That’s why I was startled by a study which concludes that facial appearance affects science communication.

It is well known that people will base their impression of a person’s abilities and character, including trustworthiness, competence, and sociability, on their facial appearance. Is the same true for scientists, practitioners of what has been called a ‘dispassionate sifting of evidence’? The answer is important because of the impact that scientists can have on the public’s understanding of our world.

Three different approaches were used to answer this question.

First, study participants were asked to rate several hundred faces of scientists on a variety of social traits, including intelligence, attractiveness, and age. The results show that people are more interested in learning about the work of scientists who are considered physically attractive, competent, and moral. There was little effect of sex, race, or age. In contrast, the ability to conduct high-quality science was judged to be negatively related to attractiveness, and older scientists were thought more likely to do better research.

Does facial appearance affect interest in a scientist’s work? In a study in which participants could choose to watch a video describing research or read an article about the work (accompanied by a photo of the scientist), an interesting-looking scientist was more likely to generate appeal.

Can your face influence what people think about your research? In this study, articles were paired with faces that did or did not score as a good scientist in the first study. Research done by scientists who were considered good was judged to be of higher quality. Scientists thought to be more competent, based on their facial appearance, were more likely to be selected to win a prize for excellence.

The authors summarize their study:

People reported more interest in the research of scientists who appear competent, moral, and attractive; when judging whether a researcher does ‘good science’, people again preferred scientists who look competent and moral, but also favored less sociable and more physically unattractive individuals.

Given that it is crucial that scientists communicate their work to the public, what should we take away from these observations? Should unattractive scientists not bother to communicate? Nonsense! All scientists should communicate!  There are many forms of communication that do not involve seeing the scientist, such as blogging and podcasting (be sure to omit photos from your website). The results of this study are not absolute – even scientists perceived as unattractive garnered interest, albeit less than attractive ones.

So go ahead and communicate, scientists. If you make science videos, and no one watches, they are probably not well done, and you need to go back to the drawing board – not to the stylist.

By David Tuller, DrPH

[June 30, 2017: This post has been corrected and revised.]

Professor Peter White and colleagues have published yet another study in The Lancet promoting graded exercise as an appropriate intervention for the illness they refer to as “chronic fatigue syndrome” but that is more appropriately called “myalgic encephalomyelitis.” (Two compromise terms, ME/CFS and CFS/ME, satisfy no one.) This new article exhibits the range of problems found repeatedly in this body of research, including the reliance on subjective outcomes for an open-label trial, unusual outcome-switching, and self-serving presentations of data.

In short, this latest study seeks to bolster the crumbling evidence base for the PACE/CBT/GET paradigm by reporting modest benefits for graded exercise. But as with previous research espousing this approach, even the unimpressive results reported here cannot be taken seriously by scientists who understand basic research standards.

The full name of the article is: “Guided graded exercise self-help plus specialist medical care vs. specialist medical care alone for chronic fatigue syndrome (GETSET): a pragmatic randomised controlled trial.” It involved 211 patients, split into two groups. Both groups received at least one meeting with a doctor—what the study called “specialist medical care.” The intervention group also received up to 90 minutes of instruction from a physiotherapist on how to pursue a self-guided graded exercise program.

The results presented are short-term, measured 12 weeks after trial entry. The investigators reported very modest benefits for the intervention in self-reported fatigue scores, along with improvements in self-reported physical function scores that did not reach clinical significance. There has already been some terrific analysis of the study’s shortcomings on patient forums, so I’m just going to make a few points here.

The study design itself incorporates a huge and fundamental flaw: The unreliable combination of an open-label study with subjective outcomes. Experts outside the influence of Sir Simon Wessely, most notably Jonathan Edwards, an emeritus professor of medicine from University College London, have repeatedly highlighted this feature as rendering the findings meaningless. As Professor Edwards wrote recently in his commentary for the Journal of Health Psychology, this flaw alone makes the PACE trial “a non-starter in the eyes of any physician or clinical pharmacologist familiar with problems of systemic bias in trial execution.” Studies with this design, he explained, have been abandoned in other fields of medicine.

The difficulty of shielding such trials from systematic bias is the reason that studies are blinded in the first place. It is common sense that if you tell people in one group that the intervention they are getting should help them, while people in another group receive no intervention and no encouragement that they will improve, more people from the first group than the second are likely to tell you in the short term that they feel a bit better.

It does not mean that they have experienced any objective improvements. It also doesn’t mean that these self-reported benefits of the intervention will be apparent at long-term follow-up. In fact, follow-up studies in this body of research do not provide evidence of long-term differences between study groups.
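To make concrete how large a problem this is, here is a minimal simulation sketch. It is my own illustration, not trial data: the group sizes, the identical underlying condition, and the size of the reporting shift are all invented. It simply shows that in an unblinded comparison with a subjective outcome, a modest expectation-driven shift in how the intervention group answers a questionnaire can register as a “significant” benefit even when nothing has actually changed.

```python
# A minimal sketch with invented numbers (not trial data): in an open-label
# trial with a subjective outcome, reporting bias alone can masquerade as a
# treatment effect even when both arms' underlying condition is identical.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 100                                     # hypothetical participants per arm

# Identical underlying state in both arms (no true treatment effect).
control_state = rng.normal(0.0, 1.0, n)
treated_state = rng.normal(0.0, 1.0, n)

# Unblinded participants told to expect improvement nudge their self-reports.
reporting_bias = 0.4                        # hypothetical shift, in SD units
control_reports = control_state
treated_reports = treated_state + reporting_bias

t, p = stats.ttest_ind(treated_reports, control_reports)
print(f"t = {t:.2f}, p = {p:.4f}")          # usually "significant" despite no real change
```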

Unlike in the PACE trial, the investigators chose not to test these subjective findings against objective outcomes. They acknowledge the absence of objective outcomes as a limitation but do not explain why they chose to exclude them. Presumably they remembered that PACE’s own objective measures—the six-minute walking test, the step-test for fitness, and whether people got off benefits and back to work—all failed to confirm the trial’s claims of success. In other trials, objective measurements of participants’ movements have also failed to document benefits from the non-pharmacological interventions tested for the illness.

[Correction/revision: The following three paragraphs replace material included in the original version posted on June 28, 2017]

In this new article, Professor White and his colleagues refer to the GETSET intervention as a “management approach.” The investigators fail to mention that a 2013 paper in Psychological Medicine purported to have proven that people could actually “recover” with GET. They mention patient surveys on reported harms from graded exercise, but they choose to omit the growing peer-reviewed literature on immunological and other dysfunctions of ME/CFS, from leading medical research centers around the world.

They also ignore the major 2015 report from the U.S. Institute of Medicine (now called the National Academy of Medicine). This report, which involved an extensive review of the literature, identified “post-exertional malaise” as the cardinal symptom, in the process proposing to rename the illness “systemic exertion intolerance disease.” Other research has also shed light on possible pathophysiological pathways involved in causing the severe relapses that characterize the disease.

If post-exertional malaise or (per the IOM) exertion intolerance is the cardinal symptom, then graded exercise in any form could be contraindicated. Professor White and his colleagues obviously do not have to agree with this interpretation of the recent research and reports. But the failure to mention and discuss these findings in an article investigating a graded exercise intervention demonstrates the investigators’ apparent unwillingness or inability to grapple with the current state of scientific knowledge in the field.

In terms of outcome-switching, the investigators report that they made a confusing (to me) change midway through the trial. Here’s how they explain it:

The original protocol had only one primary outcome measure, the SF-36 PF. However, when some eligible participants were found to have high SF-36 PF scores at randomisation (because of their illness affecting cognitive or social functions but not physical function), we decided to also include fatigue, using the CFQ, as a co-primary outcome. This decision was made mid-way through trial recruitment (on June 20, 2013, after recruitment of 99 [47%] patients), before any outcome data had been examined, and was approved by the Research Ethics Committee, the Data Monitoring and Ethics Committee, and the Trial Steering Committee.

I’m not a statistician or an epidemiologist, but it struck me as unusual that investigators would be allowed to make a change of this magnitude in the middle of a trial. If the specialized treatment centers were following the NICE guidelines and yet diagnosing people with high SF-36 physical function scores, one obvious possible explanation is that they were misdiagnosing people as having chronic fatigue syndrome when they were actually suffering from chronic fatigue for other reasons. In that case, I can understand that adding a fatigue outcome measure might make it easier to demonstrate evidence of improvement. But could this addition be justified from a methodological and statistical perspective?

For an answer, I turned to Bruce Levin, a professor of biostatistics at Columbia. Since Professor Levin first reviewed the PACE study at my request in 2015, he has been a staunch critic of the trial’s methodological choices and the decision by journals like The Lancet and Psychological Medicine to publish the questionable results. Here’s Professor Levin’s perspective on the choice to add fatigue [July 3, 2017: changed “physical function” to “fatigue”; as previously explained, “physical function” was already the primary outcome] as a co-primary outcome midway through the GETSET trial:

I think the main problem with this follows from the overarching issue of bias in unblinded studies.  Inflation of the type I error rate isn’t a problem because the significance criterion was adjusted to control that.  My concern is what the investigators could have observed and surmised regarding the treatment effect mid-way through the trial, even though they claim not to have looked at any A versus B comparisons.  Even an inkling that the pooled mean physical functioning outcome was too high could suggest a lack of treatment effect.

Obviously they were looking at baseline data in order to notice that “too many” subjects had non-disabled physical functioning.  There should have been no concern about imbalance between the groups in that regard, because they planned to adjust for baseline physical functioning, which would remove any chance imbalance.  [I assume that is the case—I haven’t seen the trial’s SAP.]  No, it seems that (once again) the investigators were second-guessing their own protocol and worrying about having “too little” room for improvement.  If the decision to change the primary endpoint (by adding a co-primary endpoint) was based on what they could see in this unblinded study, that would incur bias.

I find it astonishing that the investigators’ remedy for the perceived problem of “too many” non-disabled subjects was to add a co-primary endpoint.  If one is concerned about low power, the last thing one would ordinarily think of is adding a co-primary endpoint which reduces power, because the adjustment to control type I error makes it less likely to correctly declare statistical significance when the alternative hypothesis is true.

Furthermore, although mid-trial changes in protocol can be implemented without bias in so-called adaptive trial designs, it is important to note that such adaptations are contemplated a priori and built into the design of the study before it begins. This is the so-called “adaptive-by-design” feature.  Other ad hoc or post-hoc adaptations are to be avoided, especially in unblinded studies with self-reported endpoints.
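Professor Levin’s point about power can be made concrete with a quick simulation. The sketch below is my own illustration, not the GETSET analysis: the group size, effect size, and the way the significance criterion is split between two co-primary outcomes are all hypothetical. It simply shows that tightening the per-endpoint significance threshold to control type I error lowers the probability of declaring a genuine effect significant.

```python
# A minimal sketch (hypothetical numbers, not GETSET's): Monte Carlo estimate of
# power when the significance criterion is tightened, e.g. split across two
# co-primary endpoints to control the overall type I error rate.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def estimated_power(n_per_arm, effect_size, alpha, n_sims=20_000):
    """Fraction of simulated trials in which a two-sample t-test is significant
    at `alpha` when a true standardized effect of `effect_size` exists."""
    hits = 0
    for _ in range(n_sims):
        control = rng.normal(0.0, 1.0, n_per_arm)
        treated = rng.normal(effect_size, 1.0, n_per_arm)
        if stats.ttest_ind(treated, control).pvalue < alpha:
            hits += 1
    return hits / n_sims

n, d = 100, 0.35   # hypothetical group size and standardized effect
print("single primary outcome, alpha = 0.05 :", estimated_power(n, d, 0.05))
print("two co-primary outcomes, alpha = 0.025:", estimated_power(n, d, 0.025))
# Roughly 0.70 vs. 0.60 with these made-up inputs: the stricter criterion
# required by the second primary outcome reduces power, exactly as noted above.
```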

So that’s the blunt assessment from an unimpeachable expert. The statisticians and numbers experts out there will understand the inner details of Professor Levin’s comments much better than me, but the gist is certainly clear. That The Lancet has once again chosen to publish work of this low caliber is sad but predictable, given the journal’s record in this domain. Although the 2011 paper was “fast-tracked” to publication, editor Richard Horton stated in a radio interview not long after the publication of the results that it had undergone “endless rounds” of peer review. He has not explained this apparent contradiction, despite my efforts to extract an answer from him.

This new publication again raises questions about the thoroughness and integrity of The Lancet’s reviewing and editorial processes. And the decision confirms what has been demonstrated repeatedly in this whole saga. Those who have tied their prestige and reputations to the PACE paradigm, like the editors of The Lancet, are willing to make themselves look ridiculous in their misguided and unconvincing efforts to defend this indefensible enterprise.

**********

[June 30, 2017: Explanation for changes]:

In the original version, I quoted the following phrase from the new Lancet article and suggested it represented a change in direction for Professor White and his colleagues: “It is important to note that this trial was not designed to test causative factors in chronic fatigue syndrome, and the relative efficacy of a behavioural intervention does not imply that chronic fatigue syndrome is caused by psychological factors.”

I was wrong. Similar phrasing appeared in the 2011 Lancet article as well as other publications. My mistake was that I failed to recognize the distinction in the PACE vocabulary between “causative” and “perpetuating” factors of the illness. I have removed the inaccuracy and statements arising from it, and have revised the section to accommodate the correction. For full transparency, I include the original paragraphs below. Of course, I apologize to Professor White and his colleagues for the error.

**********

[Original version]:

In this new article, Professor White appears to be back-pedaling away from a central claim in PACE. GETSET avoids arguing that chronic fatigue syndrome (per the article’s usage) is “reversible” with a graded exercise program. The investigators also fail to mention that a 2013 paper in Psychological Medicine purported to have proven that people could “recover” with GET. Instead, they here refer to the intervention as a “management approach.”

They also insist that any apparent benefit for this management approach would not suggest anything about the cause of the illness. “It is important to note that this trial was not designed to test causative factors in chronic fatigue syndrome, and the relative efficacy of a behavioural intervention does not imply that chronic fatigue syndrome is caused by psychological factors,” they write.

To those who understand the history of this body of research, this statement represents somewhat of a surprising shift. By directly contradicting the description of GET in The Lancet’s 2011 paper, the statement appears to eviscerate the rationale for the graded exercise intervention in the first place. The description in 2011 is very clear. Deconditioning and avoidance of activity are the causes of the continuing symptoms, and the syndrome is “reversible” by addressing these specific problems:

GET was done on the basis of deconditioning and exercise intolerance theories of chronic fatigue syndrome. These theories assume that the syndrome is perpetuated by reversible physiological changes of deconditioning and avoidance of activity. These changes result in the deconditioning being maintained and an increased perception of effort, leading to further inactivity. The aim of treatment was to help the participant gradually return to appropriate physical activities, reverse the deconditioning, and thereby reduce fatigue and disability.

There is no room in this 2011 description of GET for whatever other “causative factors” Professor White now seems to acknowledge could be implicated in the disease. Presumably those possible “causative factors” include ongoing pathological organic processes independent of the “reversible physiological changes” arising from deconditioning. (Professor White has long acknowledged that acute biological illnesses can trigger the onset of chronic fatigue syndrome. This initial illness is then presumed to have launched the downward cycle of deconditioning and fear of activity–the factors seen as responsible for perpetuating the symptoms after the acute illness has been resolved.)

The difficulty for this new study is that GET in PACE is based on the hypothesis that people are experiencing “reversible physiological changes” arising from nothing other than serious deconditioning. Does Professor White still believe in this hypothesis, or not? On the evidence of the current paper, he has disavowed it, without explicitly acknowledging this disavowal.

Since the PACE version of GET was designed to address reversible symptoms occurring in the explicit absence of organic disorders, what is Professor White’s current rationale for recommending a graded exercise approach? And if the underlying rationale for the intervention is no longer the absence of organic disorders, how can PACE itself or the NICE guidelines or Cochrane’s systematic reviews be credibly cited in support of it?

If I interpret the new paper correctly, it appears that Professor White and his team do not believe the PACE hypothesis that deconditioning and avoidance of activity are the sole causes of the perpetuation of the symptoms. They acknowledge that other possible “causative factors” could be involved. Given that change in perspective about the illness itself, I don’t understand the basis on which they are recommending graded exercise. I think they need to provide a new and plausible scientific explanation to support the continued testing of this approach on human subjects diagnosed with this disease.

This is especially so given the wealth of information that has emerged since 2011. The investigators mention patient surveys on reported harms from graded exercise, but they choose to ignore the growing peer-reviewed literature from leading medical research centers around the world. They also ignore the major 2015 report from the U.S. Institute of Medicine (now called the Academy of Medicine), which identified “post-exertional malaise” as the cardinal symptom, in the process renaming it “exertion intolerance.”

If “post-exertional malaise” or “exertion intolerance” is the cardinal symptom, then graded exercise in any form could be contraindicated. Professor White and his colleagues obviously do not have to agree with this reasonable interpretation of the recent research and reports. But the failure to mention and discuss these findings in an article investigating a graded exercise intervention demonstrates the investigators’ apparent unwillingness or inability to grapple with the current state of scientific knowledge in the field.

The glorious TWiVerati un-impact their email backlog, answering questions about viruses, viruses, viruses, viruses, viruses, and more. You should listen – our fans ask great questions!

Click arrow to play
Download TWiV 447 (67 MB .mp3, 110 min)
Subscribe (free): iTunes, RSS, email

Show notes at microbe.tv/twiv

Become a patron of TWiV!

By David Tuller, DrPH

[June 25, 2017: The last section of this post, about the PLoS One study, has been revised and corrected.]

I have tip-toed around the question of research misconduct since I started my PACE investigation. In my long Virology Blog series in October 2015, I decided to document the trial’s extensive list of flaws—or as many as I could fit into 15,000 words, which wasn’t all of them—without arguing that this constituted research misconduct. My goal was simply to make the strongest possible case that this was very bad science and that the evidence did not support the claims that cognitive behavior therapy and graded exercise therapy were effective treatments for the illness.

Since then, I have referred to PACE as “utter nonsense,” “complete bullshit,” “a piece of crap,” and “this f**king trial.” My colleague and the host of Virology Blog, Professor Racaniello, has called it a “sham.” Indeed, subsequent events have only strengthened the argument against PACE, despite the unconvincing attempts of the investigators and Sir Simon Wessely to counter what they most likely view as my disrespectful and “vexatious” behavior.

Virology Blog’s open letters to The Lancet and Psychological Medicine have demonstrated that well-regarded experts from the U.S., U.K., and many other countries find the methodological lapses in PACE to be such egregious violations of standard scientific practice that the reported results cannot be taken seriously. In the last few months, more than a dozen peer-reviewed commentaries in the Journal of Health Psychology, a respected U.K.-based academic publication, have further highlighted the international dismay at the study’s self-evident and indisputable lapses in judgement, logic and common sense.

And here’s a key piece of evidence that the trial has lost all credibility among those outside the CBT/GET ideological brigades: The U.S. Centers for Disease Control still recommends the therapies but now insists that they are only “generic” management strategies for the disease. In fact, the agency explicitly denies that the recommendations are related to PACE. As far as I can tell, since last year the agency no longer cites the PACE trial as evidence anywhere on its current pages devoted to the illness. (If there is a reference tucked away in there somewhere, I’m sure a sharp-eyed sleuth will soon let me know.)

It must be said that the CDC’s history with this illness is awful—another “bad science” saga that I documented on Virology Blog in 2011. In past years, the agency cited PACE prominently and has collaborated closely with British members of the biopsychosocial school of thought. So it is ridiculous and—let’s be frank—blatantly dishonest for U.S. public health officials to now insist that the PACE-branded treatments they recommend have nothing to do with PACE and are simply “generic” management strategies. Nevertheless, it is significant that the agency has decided to “disappear” PACE from its site, presumably in response to the widespread condemnation of the trial.

Many of the PACE study’s myriad flaws represent bad science but clearly do not rise to the level of research misconduct. Other fields of medicine, for example, have abandoned the use of open-label trials with subjective outcomes because they invite biased results; Jonathan Edwards, an emeritus professor of medicine from University College London, has made this point repeatedly. But clearly large segments of the psychological and psychiatric fields do not share this perspective and believe such trials can provide reliable and authoritative evidence.

Moreover, the decision to use the very broad Oxford criteria to identify patients is bad science because it conflates the symptom of “chronic fatigue” with the specific disease entity known often as “chronic fatigue syndrome” but more appropriately called “myalgic encephalomyelitis.” This case definition generates heterogeneous samples that render it virtually impossible for such studies to identify accurate information about causes, diagnostic tests and treatments. Although a 2015 report from the National Institutes of Health recommended that it should be “retired” from use, the Oxford definition remains in the published literature. Studies relying on it should be discredited and their findings ignored or dismissed. But that’s probably as far as it goes.

Many definitions of “research misconduct” exist, but they generally share common elements. In Britain, the Medical Research Council, the main funder of PACE, endorses the definition from Research Councils U.K., an organization which outlines its principles in a statement called “Policy and Guidelines on Governance of Good Research Conduct.” In exploring this question, I will focus here on just two of the planks of the definition cited by the MRC: “misrepresentation of interests” and “misrepresentation of data.”

Let me be clear: I am not trained as a bioethicist. I have never been involved in determining if any particular study involves research misconduct. And I am not making any such claim here. However, when a clinical trial includes so many documented flaws that more than 100 experts from around the world are willing and even eager to sign a letter demanding immediate retraction of key findings, the question of whether there has been research misconduct will inevitably arise. Although people with different perspectives could clearly disagree on the answer, the final and authoritative determination will likely not emerge until the PACE study and the details involved in its conduct and the publication of the results are subjected to a fully independent investigation.

In the meantime, let’s look at how research misconduct is defined and examine some of the possible evidence that might be reviewed. For starters, the cited definition of “misrepresentation of interests” includes “the failure to declare material interests either of the researcher or of the funders of the research.”

I have repeatedly pointed out that the investigators have misled participants about their “material interests” in whether the trial reached certain conclusions—namely, that CBT and GET are effective treatments. The three main investigators have had longstanding links with insurance companies, advising them that rehabilitative approaches such as the interventions under study could get ME/CFS claimants off benefits and back to work. No reliable evidence actually supports this claim—certainly the PACE results failed to confirm it. And yet the investigators did not disclose these consulting and/or financial ties in the information leaflets and consent forms provided to participants.

Why is that a problem? Well, the investigators promised in their protocol to adhere to the Declaration of Helsinki, among other ethical guidelines. The declaration, an international human rights document enacted after WWII to protect human research subjects, is very specific about what researchers must do in order to obtain informed consent: They must tell prospective participants of “any possible conflicts of interest” and “institutional affiliations.”

Without such disclosures, in fact, any consent obtained is not informed but, per Helsinki’s guidelines, uninformed. Investigators cannot simply pick and choose from among their protocol promises and decide which ones they will implement and which ones they won’t. They cannot decide not to disclose “any possible conflicts of interest,” once they have promised to do so, even if it is inconvenient or uncomfortable or might make people reluctant to enter a trial. I have interviewed four PACE participants. Two said they would likely or definitely not have agreed to be in the study had they been told of these conflicts of interest; in fact, one withdrew her consent for her data to be used after she had already completed all the trial assessments because she found out about these insurance affiliations later on and was outraged at not having been told from the start.

The PACE investigators have responded to this concern, but their answers do not actually address the criticism, as I have previously pointed out. It is irrelevant that they made the appropriate disclosures in the journals that published their work; the Declaration of Helsinki does not concern itself with protecting journal editors and journal readers but with protecting human research subjects. The investigators have also argued that insurance companies were not directly involved in the study, thereby implying that no conflict of interest in fact existed. This is also a specious argument, relying as it does on an extremely narrow interpretation of what constitutes a conflict of interest.

Shockingly, the PACE trial’s ethical review board approved the consent forms, even without the disclosures clearly mandated by the Declaration of Helsinki. The Lancet and Psychological Medicine have been made aware of the issue but have no apparent problem with this breach of research ethics. Notwithstanding such moral obtuseness, the fact remains that the PACE investigators made a promise to disclose “any possible conflicts of interest” to trial participants, and failed to honor it. Case closed. In the absence of legitimate informed consent, they should not have been allowed to publish any of the data they collected from their 641 participants.

Does this constitute “misrepresentation of material interests” within the context of the applicable definition of research misconduct? I will leave it to others to make that determination. Certainly the PACE authors and their cheerleaders—including Sir Simon, Esther Crawley, Lancet editor Richard Horton and Psychological Medicine editors Robin Murray and Kenneth Kendler—would reject any such interpretation.

Turning to the category of “misrepresentation of data,” the MRC/RCUK definition cites the “suppression of relevant findings and/or data, or knowingly, recklessly or by gross negligence, presenting a flawed interpretation of data.” One of the PACE trial’s most glaring problems, of course, is the odd fact that 13% of participants met the physical function outcome threshold at baseline. (A smaller number, slightly more than one percent, met the fatigue outcome threshold at baseline.) In the Lancet study, participants who met these very poor outcome thresholds were referred to as being “within normal ranges” for these indicators. In the Psychological Medicine paper, these same participants were referred to as being “recovered” for these indicators.

Of course, it was obvious from the papers themselves that some participants could have met these thresholds at baseline. But the number of participants who actually did meet these thresholds at baseline became public only after the information was released pursuant to a freedom-of-information request. (This was an earlier data request than the one that eventually led to the release of all the raw trial data for some of the main results.)

The decision-making behind this earlier release remains a mystery to me, since the data make clear that the study is bogus. While the bizarre nature of the overlap in entry and outcome thresholds already raised serious questions about the trial’s credibility, the fact that a significant minority of participants actually met both the “disability” and “normal range”/”recovery” thresholds for physical function at baseline certainly adds salient and critical information. Any interpretation of the study made without the benefit of that key information is by definition incomplete and deficient.
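For readers who want to see the arithmetic behind the overlap, here is a tiny sketch. The thresholds are the ones widely reported for PACE’s physical function measure and are used here purely for illustration: entry (“disability”) required a score at or below 65, while the “normal range” began at 60, so any score from 60 to 65 satisfies both definitions at once.

```python
# Illustrative only: SF-36 physical function thresholds as widely reported for
# PACE. The scale runs 0-100 in steps of 5; scores in the overlap qualify a
# participant as "disabled" and "within normal range" simultaneously.
ENTRY_MAX = 65    # at or below this: impaired enough to enter the trial
NORMAL_MIN = 60   # at or above this: counted as "within normal range"

overlap = [score for score in range(0, 101, 5)
           if score <= ENTRY_MAX and score >= NORMAL_MIN]
print(overlap)    # [60, 65]
```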

Given that meeting an outcome threshold at baseline should be a logical impossibility, it is understandable why the authors made no mention of the fact that so many participants were simultaneously found to be “disabled” and “within normal range”/“recovered” for physical function. Any paper on breast cancer or multiple sclerosis or any other illness recognized as a medical disease would clearly have been rejected if it featured such an anomaly.

The PACE team compounded this error by highlighting these findings as evidence of the study’s success. At the press conference promoting the Lancet paper, Trudie Chalder, one of the three principal investigators, touted these “normal range” results by declaring that twice as many people in the CBT and GET groups as in the other groups “got back to normal”—even though some of these “back-to-normal” participants still qualified as “disabled” under the study’s entry criteria. Moreover, the PACE authors themselves were allowed a pre-publication review of an accompanying Lancet commentary about the PACE trial by two Dutch colleagues. The commentary argued that the “normal range” analyses represented a “strict criterion” for recovery and declared that 30 percent of the participants had met this recovery standard.

Yet this statement is clearly preposterous, given that participants who met this “strict criterion” could have had scores indicating worse health than the scores required to demonstrate disability at trial entry. The ensuing headlines and news stories highlighted both Professor Chalder’s statement that CBT and GET were effective in getting people “back to normal” and that 30 percent had “recovered” according to a “strict definition.” This misinformation has since impacted treatment guidelines around the world.

I have previously criticized the authors’ attempts to explain away this problem. They have essentially stated that it makes no difference if some participants were “recovered” on one “recovery” threshold at baseline because the study included other “recovery” criteria as well. Moreover, they point out that the “normal range” analyses in The Lancet were not the main findings—instead, they have argued, the comparison of averages between the groups, the revised primary outcome of the study, was the definitive evidence that the treatments work.

Sorry. Those excuses simply do not wash. The inclusion of these overlapping entry and outcome thresholds, and the failure to mention or explain in the papers themselves how anyone could be “within normal range” or “recovered” while simultaneously being sick enough to enter the study, casts doubt on the entire enterprise. No study including such a bogus analysis should ever have passed peer review and been published, much less in journals presumed to subject papers to rigorous scientific scrutiny. That The Lancet and Psychological Medicine have rejected the calls of international experts to address the issue is a disgrace.

But does this constitute “misrepresentation of data” within the context of the applicable definition of research misconduct? Again, I leave it to others to make that determination. I know some people—in particular, the powerful cohort of PACE supporters—have reviewed the same set of facts and have expressed little or no concern about this unusual aspect of the trial.

[This section about the PLoS One study has been revised and corrected. At the end of the post, I have explained the changes. For full transparency, I have also re-posted the original paragraphs for anyone who wants to track the changes.]

Now let’s turn to the PLoS One paper published in 2012, which has been the subject of much dispute over data access. And yet that dispute is a distraction. We don’t need the data to determine that the paper included an apparently false statement that has allowed the investigators to claim that CBT and GET are the most “cost-effective” treatments from the societal perspective—a concept that factors in other costs along with direct health-care costs. PLoS One, like the other journals, has failed to address this concern. (The journal did post an “expression of concern” recently over the authors’ refusal to share data from the trial in accordance with the journal’s policies.)

The PACE statistical analysis plan included three separate assumptions for how to measure the costs of what they called “informal care”–the care provided by family and friends—in assessing cost-effectiveness from the societal perspective. The investigators promised to analyze the data based on valuing this informal care at: 1) the cost of a home-care worker; 2) the minimum wage; and 3) zero cost. The latter, of course, is what happens in the real world—families care for loved ones without getting paid anything by anyone.

In PLoS One, the main analysis for assessing informal care presented only the results under a fourth assumption not mentioned in the statistical analysis plan—valuing this care at the mean national wage. The paper did not explain the reasons for this switch. Under this new assumption, the authors reported, CBT and GET proved more cost-effective than the two other PACE treatment arms. The paper did not include the results based on any of the three ways of measuring informal care promised in the statistical analysis plan. But the authors noted that sensitivity analyses using alternative approaches “did not make a substantial difference to the results” and that the findings were “robust” under other assumptions for informal care.

Sensitivity analyses are statistical tests used to determine whether, and to what extent, different assumptions lead to changes in results. The “alternative approaches” mentioned in the study as being included in the sensitivity analyses were the first two approaches cited in the statistical analysis plan—valuing informal care at the cost of a home-care worker and at minimum wage. The paper did not explain why it had dropped any mention of the third promised method of valuing informal care—the zero-cost assumption.
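To see why the valuation assumption matters so much, here is a minimal sketch with entirely invented numbers (the hours of informal care, direct costs, and wage rates below are not taken from PACE): the higher the hourly value placed on informal care, the larger any apparent societal “saving” for an arm that reports fewer informal care hours, and at a zero valuation the comparison can flip.

```python
# Invented numbers, not PACE data: how the hourly value assigned to informal
# care drives a "societal cost" comparison between two trial arms.
def societal_cost(direct_costs, informal_hours, hourly_value):
    """Societal cost = direct health/social-care costs + valued informal care."""
    return direct_costs + informal_hours * hourly_value

# Hypothetical per-patient figures over the follow-up period.
arms = {
    "rehabilitative arm": {"direct": 1500, "informal_hours": 200},
    "comparison arm":     {"direct": 1200, "informal_hours": 260},
}

for label, hourly in [("mean national wage, e.g. 15/hr", 15.0),
                      ("minimum wage, e.g. 6/hr", 6.0),
                      ("zero valuation", 0.0)]:
    costs = {arm: societal_cost(v["direct"], v["informal_hours"], hourly)
             for arm, v in arms.items()}
    saving = costs["comparison arm"] - costs["rehabilitative arm"]
    print(f"{label:32s} -> apparent saving for rehabilitative arm: {saving:6.0f}")
# With these made-up figures the "saving" is 600 at the mean wage, 60 at the
# minimum wage, and -300 (a higher societal cost) at zero valuation, so the
# result is anything but robust to the assumption.
```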

In the comments, a patient-researcher, Simon McGrath, pointed out that this claim of “robust” results under other assumptions could not possibly be accurate, given that the minimum wage was much lower than the mean national wage and would therefore alter the results and the sensitivity analyses. In response, Paul McCrone, the King’s College London expert in health economics who served as the study’s lead author, conceded the point.

“You are quite correct that valuing informal care at a lower rate will reduce the savings quite substantially, and could even result in higher societal costs for CBT and GET,” wrote Professor McCrone. So much for the paper’s claim that sensitivity analyses showed that alternative assumptions “did not make a substantial difference to the results” and were “robust” no matter how informal care was valued.

Surprisingly, given this acknowledgement, Professor McCrone did not explain why the paper included a contradictory statement about the sensitivity analyses under alternative assumptions. Nor did he offer to correct the paper to conform to this revised interpretation he presented in his comments. Instead, he presented a new rationale for highlighting the results based on the assumption that unpaid informal care was being valued at the mean national wage, rather than using the other assumptions outlined in the protocol.

“In our opinion, the time spent by families caring for people with CFS/ME has a real value and so to give it a zero cost is controversial,” Professor McCrone wrote. “Likewise, to assume it only has the value of the minimum wage is also very restrictive. In other studies we have costed informal care at the high rate of a home care worker. If we do this then this would show increased savings shown [sic] for CBT and GET.”

This concern for patients’ families is certainly touching and, in a general sense, laudable. But it must be pointed out that what they did in earlier studies is irrelevant to PACE, given that they had included the assumptions they planned to use in their statistical analysis plan. Moreover, it does not explain why Professor McCrone and his colleagues then decided to include an apparently false statement about the sensitivity analyses in the paper.

Another patient-researcher, Tom Kindlon, pointed out in a subsequent comment that the investigators themselves chose the alternative assumptions, which they were now dismissing as unfair to caregivers. “If it’s ‘controversial’ now to value informal care at zero value, it was similarly ‘controversial’ when they decided before the data was looked at, to analyse the data in this way,” wrote Kindlon. “There is not much point in publishing a statistical plan if inconvenient results are not reported on and/or findings for them misrepresented.”

Whatever their reasons, the PACE investigators’ inclusion in the paper of the apparently false statement about the sensitivity analyses represents a serious lapse in professional ethics and judgement. So does the unwillingness to correct the paper itself, given the exchanges in the comments. Does this constitute “misrepresentation of data” within the context of the MRC/RCUK definition of research misconduct?

As I have said, I will leave it to others to make that determination. I look forward to the day when an international group of experts finally pursues a thorough investigation of how and why everything went so terribly wrong with this highly influential five-million-pound trial.

A post-script: I did not contact the PACE authors prior to posting this blog. After my initial series ran in October 2015, Virology Blog posted their full response to my concerns. Since then, I have repeatedly tried to solicit their comments for subsequent blog posts, and they have repeatedly declined to respond. I saw no point in repeating that exercise this time around. I also did not try to solicit a response from Professor McCrone, since he has not responded to multiple earlier requests seeking an explanation for why the PLoS One paper contains the apparently false statement about sensitivity analyses.

However, I would be happy to post on Virology Blog a response of any length from any of the investigators, should they decide to send one. I would of course also correct any documented factual errors in what I have written, which is something I have done whenever necessary throughout my journalism career. (June 25, 2017: Of course, I have now made such corrections, per my professional obligations.)

**********

Next post: The Lancet’s awful new GET trial

**********

*Explanation for the changes: In the original version, I should have made clear that my concerns involved an analysis of what the investigators called cost-effectiveness from the societal perspective, which included not only the direct health-care costs but other considerations as well, including the informal costs. I also mistakenly wrote that the paper only presented the results under the assumption that informal care was valued at the cost of a home-care worker. In fact, for unexplained reasons, the paper’s main analysis was based on none of the three assumptions mentioned in the statistical analysis plan but on a fourth assumption based on the national mean wage.

In addition, I mistakenly assumed, based on the statistical analysis plan, that the sensitivity analyses conducted for assessing the impact of different approaches included both the minimum wage and zero-cost assumptions. In fact, the sensitivity analyses cited in the paper focused on the assumptions that informal care was valued at the cost of a home-care worker and at the minimum wage. The zero-cost assumption also promised in the protocol was not included at all. I apologize to Professor McCrone and his colleagues for the errors and am happy to correct them.

However, this does not change the fact that Professor McCrone’s subsequent comments contradicted the paper’s claim that, per the sensitivity analyses, changes in how informal care was valued “did not make a substantial difference to the results” and that the findings were “robust” for the alternative assumptions. This apparently false claim in the paper itself still needs to be explained or corrected. The paper also does not explain why the investigators included the zero-cost assumption in the detailed statistical analysis plan and then decided to drop it entirely in the paper itself.

**********

Here is the original version of the section on the PLoS One paper, for anyone who wants to compare the two and track the changes:

Now let’s turn to the PLoS One paper published in 2012, which has been the subject of much dispute over data access. And yet that dispute is a distraction—we don’t need the data to determine that the paper included an apparently false statement that has allowed the investigators to claim that CBT and GET are the most “cost-effective” treatments. PLoS One, like the other journals, has failed to address this concern, despite an open letter about it posted on Virology Blog last year. (The journal did post an “expression of concern” recently over the authors’ refusal to share data from the trial in accordance with the journal’s policies.)

The PACE statistical analysis plan included three separate assumptions for how to measure the costs of “informal care”–the care provided by family and friends. The investigators promised to provide results based on valuing this informal care at: 1) the average wage paid to health-care workers; 2) the minimum wage; and 3) at zero pay. The latter, of course, is what happens in the real world—families care for loved ones without getting paid anything by anyone.

In PLoS One, the main analysis only presented the results under the first assumption—costing the informal care at the average wage of a health-care worker. Under that assumption, the authors reported, CBT and GET proved more cost-effective than the two other PACE treatment arms. The paper did not include the results based on the other two ways of measuring “informal care” but declared that “alternative approaches were used in the sensitivity analyses and these did not make a substantial difference to the results.” (Sensitivity analyses are statistical tests used to determine whether, and to what extent, different assumptions lead to changes in results.)

Yet in the comments, two patient researchers contradicted this statement, pointing out that the claim that all three assumptions would essentially yield the same results could not possibly be accurate. In response, Paul McCrone, the King’s College London expert in health economics who served as the study’s lead author, conceded the point. Let me repeat that: Professor McCrone agreed that the cost savings would indeed be lower under the minimum wage assumption, and that under the third assumption any cost advantages for CBT and GET would disappear.

“If a smaller unit cost for informal care is used, such as the minimum wage rate, then there would remain a saving in informal care costs in favour of CBT and GET but this would clearly be less than in the base case used in the paper,” wrote Professor McCrone. “If a zero value for informal care is used then the costs are based entirely on health/social care (which were highest for CBT, GET and APT) and lost employment which was not much different between arms.” So much for the paper’s claim that sensitivity analyses showed that alternative assumptions “did not make a substantial difference to the results.”

Surprisingly, given these acknowledged facts, Professor McCrone did not explain why the paper included a completely contradictory statement. Nor did he offer to correct the paper itself to conform to his revised interpretation of the results of the sensitivity analyses. Instead, he presented a new rationale for highlighting only the results based on the assumption that unpaid informal care was being reimbursed at the average salary of a health-care worker.

“In our opinion, the time spent by families caring for people with CFS/ME has a real value and so to give it a zero cost is controversial,” Professor McCrone wrote. “Likewise, to assume it only has the value of the minimum wage is also very restrictive. In other studies we have costed informal care at the high rate of a home care worker. If we do this then this would show increased savings shown [sic] for CBT and GET.”

This concern for patients’ families is certainly touching and, in a general sense, laudable. But it must be pointed out that what they did in earlier studies is irrelevant to PACE, given that they had included the alternative assumptions in their own statistical analysis plan. Moreover, it does not explain why Professor McCrone and his colleagues then decided to include an apparently false statement about the sensitivity analyses in the paper.

One of the commenters, patient-researcher Tom Kindlon from Dublin, pointed out in a subsequent comment that the investigators themselves chose the alternative assumptions that they were now dismissing as unfair to caregivers. “If it’s ‘controversial’ now to value informal care at zero value, it was similarly ‘controversial’ when they decided before the data was looked at, to analyse the data in this way,” he wrote. “There is not much point in publishing a statistical plan if inconvenient results are not reported on and/or findings for them misrepresented.”

Whatever their reasons, the PACE investigators’ inclusion in the paper of the apparently false statement about the sensitivity analyses represents a serious lapse in professional ethics and judgement. So does the unwillingness to correct the paper itself to reflect Professor McCrone’s belated acknowledgement of the actual results from the sensitivity analyses, rather than the inaccurate results reported in the paper. Does this constitute “misrepresentation of data” within the context of the MRC/RCUK definition of research misconduct?

**********

If you appreciate my work, please consider supporting my crowdfunding effort, which ends June 30th.

By David Tuller, DrPH

Last week, I e-mailed a letter to Sue Paterson, director of legal services at the University of Bristol, to express my concerns about Professor Esther Crawley’s false claim that I had libeled her in reporting on her research for Virology Blog. On Friday, I received a two-sentence response from Ms. Paterson. She addressed it to “Mr. Tuller” and wrote this:

“Thank you for your email of 14 June. Your comments have been noted.”

I wasn’t sure what to make of this terse reply. Was that all that Bristol’s director of legal services had to say about the fact that a well-known faculty member issued an absurd allegation during a high-profile event sponsored by this august institution? Had she simply noted my concerns herself, or had she actually conveyed them to Professor Crawley, as I’d requested? Did the university feel any sense of responsibility for what had occurred? Would Ms. Paterson’s “noting” of my comments be followed by any further missive or perhaps even an apology after the university investigated the event? Who knows?

I responded at somewhat greater length but prefer for the moment to keep the exact phrasing private. My tone was what I would describe as very pointed but within bounds, although I have come to realize that the British tend to interpret “within bounds” somewhat more narrowly than Americans.

Here’s the first paragraph:

“Thank you for your response. (For the record, it should be Dr. Tuller, not Mr. Tuller. I have a doctorate in public health, as I indicated in the sign-off to my letter. However, please feel free to call me David.)”

I have a feeling the pro-PACE camp might have difficulty with the “Dr” thing because my behavior—like tearing up Lancet papers at public events—does not fit their preconceived notions of how people with advanced academic degrees should act. However, this group apparently thinks it’s fine to accuse patients of being “vexatious” just because they want to know the actual answers that the PACE investigators promised to provide in exchange for five million pounds of public funds.

Anyway, my letter went on from there. To paraphrase: I noted that the prolonged silence from Professor Crawley indicated to any reasonable observer that she could not defend her allegation, and that I took this as her tacit acknowledgment of error. I also noted that Ms. Paterson’s own minimalist, content-free response included no documentation or evidence that anything I wrote about Professor Crawley’s research was inaccurate. I stated that, as far as I was concerned, no further communication about the matter was necessary, since at this point it was obvious to all that I had not written “libellous blogs” about Professor Crawley.

I also wrote that I hoped someone would explain to Professor Crawley the distinction between opinions she dislikes and libel. And I expressed the expectation that the offending slide would be retired for good and that Professor Crawley would no longer repeat her false libel accusation in public. I explained as well that whatever she said about me in private was obviously her own business.

I have no idea if I will hear back again from Ms. Paterson or anyone else in Bristol’s legal department, but I will provide an update if I do.

The TWiV hosts review an analysis of gender parity trends at virology conferences, and the origin and unusual pathogenesis of the 1918 pandemic H1N1 influenza virus.

Click arrow to play
Download TWiV 446 (68 MB .mp3, 112 min)
Subscribe (free): iTunes, RSS, email

Become a patron of TWiV!

Show notes at microbe.tv/twiv

On the wall of a Columbia University Medical Center building just across the street from my laboratory is a plaque commemorating two participants in the discovery of a mosquito vector for yellow fever virus.

The plaque reads:

Aristides Agramonte, Jesse William Lazear, Graduates of the Columbia University College of Physicians and Surgeons, class of 1892. Acting Assistant Surgeons, U.S. Army. Members of the USA Yellow Fever Commission with Drs. Walter Reed and James Carroll. Through devotion and self-sacrifice they helped to eradicate a pestilence of man.

Yellow fever, known in tropical countries since the 15th century, was responsible for devastating epidemics associated with high rates of mortality. The disease can be mild, with symptoms that include fever and nausea, but more severe cases are accompanied by major organ failure. The name of the illness is derived from yellowing of the skin (jaundice) caused by destruction of the liver. For most of its history, little was known about how yellow fever was spread, although it was clear that the disease was not transferred directly from person to person.

Cuban physician Carlos Juan Finlay proposed in 1880 that a bloodsucking insect, probably a mosquito, was involved in yellow fever transmission. The United States Army Yellow Fever Commission was formed in 1899 to study the disease, in part because of its high incidence among soldiers occupying Cuba. Also known as the Reed Commission, it comprised four infectious disease specialists: U.S. Army Major Walter Reed, who chaired the commission; Columbia graduates Lazear and Agramonte; and James Carroll. Lazear confirmed Finlay’s hypothesis in 1900 when he acquired yellow fever after being experimentally bitten by mosquitoes that had fed on sick patients. Days later, he died of the disease.

The results of the Reed Commission’s study proved conclusively that mosquitoes are the vectors for this disease. Aggressive mosquito control in Cuba led to a drastic decline in cases by 1902.

The nature of the yellow fever agent was established in 1901, when Reed and Carroll injected filtered serum from the blood of a yellow fever patient into three healthy individuals. Two of the volunteers developed yellow fever, causing Reed and Carroll to conclude that a “filterable agent,” which we now know as yellow fever virus, was the cause of the disease.

Sometimes you don’t have to wander far to find some virology history.

Update 6/16/17: The statement on the plaque that Agramonte and Lazear “helped to eradicate a pestilence of man” is of course incorrect, as yellow fever has never been eradicated. Recent large outbreaks of yellow fever in Brazil and Angola are examples of the continuing threat the virus poses, despite the availability of a vaccine since 1938.

By David Tuller, DrPH

This morning I e-mailed the following letter to Sue Paterson, the University of Bristol’s Director of Legal Services and Deputy University Secretary, to protest Professor Esther Crawley’s accusation that I libeled her in blogging about her work. I cc’d the office of the university’s vice-chancellor, Professor Hugh Brady.

**********

Dear Ms. Paterson:

I have recently learned that Professor Esther Crawley of the University of Bristol’s Centre for Child and Adolescent Health, in her inaugural lecture on February 24th of this year, accused me of libel. During her talk, she showed a slide with the phrase “libellous blogs,” accompanied by a screen shot of one of my blog posts on Virology Blog. While that slide was on the screen, she also mentioned “libellous blogs,” obviously referring to the Virology Blog post, among others.

This libel accusation is false. Given that Professor Crawley made this unsupported charge in such a high-profile academic setting, I felt that it was important to bring the matter to your attention and express my surprise and displeasure. (I have also cc’d the office of the university’s vice-chancellor, Professor Hugh Brady.)

Virology Blog is a well-regarded science site hosted by Professor Vincent Racaniello, a prominent virologist at Columbia University. (I have also cc’d Professor Racaniello.) For the last year and a half, I have been writing an investigative series for Virology Blog called “Trial by Error,” about the many flaws of the PACE trial and related research, including Professor Crawley’s work. In accusing me of libel, she was also accusing my colleague, Professor Racaniello, of publishing libellous material. Professor Crawley used this slide again during a talk in April to the British Renal Society. I have written several subsequent posts about the libel accusation itself.

It is certainly true that the post highlighted in the slide, titled “The New FITNET Trial for Kids,” is harsh on Professor Crawley’s recent work. It is my opinion, as a public health expert from the University of California, Berkeley, that her research and the FITNET-NHS protocol are highly problematic in their presentation of the illness variously called chronic fatigue syndrome, myalgic encephalomyelitis, ME/CFS, or CFS/ME. In the post in question, I outlined these issues and carefully documented the facts on which I based my arguments. My concerns are shared by many leading scientists and experts in study design and research methodology.

In my post, I explained how Professor Crawley has misstated the NICE guidelines in both her research and her FITNET-NHS proposal, in ways that appear to eliminate post-exertional malaise as a required symptom. I also noted that she has conflated the symptom of “chronic fatigue,” a hallmark of many illnesses, with the specific disease entity she prefers to call “chronic fatigue syndrome.” As many have previously noted, this conflation generates samples that are far too heterogeneous to yield reliable and valid conclusions about prevalence, causes and treatments.

I acknowledge that I have expressed myself in sharp, colorful and, some would say, offensive terms. That just makes me sharp, colorful and possibly offensive. It does not make me libellous. Professor Crawley has a right to disagree with my interpretation of the facts and explain why I am wrong. And she is free to make her points in hard-hitting language, as I have chosen to do. But without providing evidence or documentation that what I wrote was inaccurate, she has no legitimate grounds to accuse Professor Racaniello and me of libel.

I have e-mailed Professor Crawley several times asking her to explain her charge of libel, or to apologize. In my e-mails, I have let her know that I would be happy to post her full statement on Virology Blog. In other words, I have offered her the opportunity to make her case, at whatever length she wants, in the same forum in which I purportedly libeled her. Moreover, should she document any factual errors in my work, I am of course happy to correct the public record, as I have done throughout my career as a journalist. Even though she has not so far responded with evidence to back up her accusation, the offer to post her full statement on Virology Blog and correct any documented factual errors still stands.

My main goal in sending this letter is to let you know that Professor Crawley’s accusation will not deter me from my work. Nor will it affect Professor Racaniello’s support for this project, which involves accurate reporting and opinionated commentary on PACE and other issues involving ME/CFS. In the meantime, I suggest that someone should explain to Professor Crawley that accusing other academics, or anyone else, of libel without providing evidence, and then refusing to respond to reasonable requests for clarification, is unacceptable, unjustified and reckless on many levels. Professor Crawley should not make public accusations that she cannot or will not defend when challenged.

I have not cc’d Professor Crawley on this letter. Because she has declined to respond to my recent requests for an explanation and my offers to publish her full statement on Virology Blog, I see no point in further efforts to communicate with her. I therefore trust you will convey to Professor Crawley the concerns I have expressed here on behalf of Professor Racaniello and myself, as well as our determination to keep pursuing this investigation.

Sincerely—

David Tuller, DrPH

 

**********

If you appreciate my PACE-busting efforts, I urge you to help me continue with this project by supporting my crowdfunding campaign:

https://www.crowdrise.com/virology-blogs-trial-by-error-more-reporting-on-pace-mecfs-and-related-issues1

From Nido2017 in Kansas City, Vincent meets up with three virologists to talk about their careers and their work on nidoviruses.

Show notes at microbe.tv/twiv

Download TWiV 445 (39 MB .mp3, 64 min)
Subscribe (free): iTunes, RSS, email

Become a patron of TWiV!

At the ASM Microbe 2017 meeting last week in New Orleans, Ed Yong interviewed astronaut Kate Rubins for the keynote address. The large theatre was packed, and overflow crowds watched the event on monitors throughout the New Orleans Convention Center. But I think that a scientist should have interviewed Dr. Rubins.

Rich Condit and I had the good fortune to interview astronaut Astro Kate for TWiV 444 at ASM Microbe 2017. Several hours later, she was on stage with Ed Yong. It’s clear why ASM wanted Yong speaking with Rubins: he would draw the biggest possible audience. His science writing is outstanding, and his first book, I Contain Multitudes, sold very well. In fact, Ed was at ASM Microbe to autograph copies of his book.

In 2016 the keynote speaker at ASM Microbe was Bill Gates, for the same reason: to draw a crowd. He was interviewed by Dr. Richard Besser, formerly of ABC News.

I have nothing against Ed Yong; I think he’s doing a great job communicating science. But I think that a scientist should have interviewed Kate Rubins. Why? Because the public views scientists as the most trustworthy spokespersons for science (source: ResearchAmerica). Not bloggers, or journalists, or elected officials, but scientists. And I want scientists to showcase their field, especially in front of other scientists.

What living scientist would have been as popular as Ed Yong at ASM 2017? Stephen Hawking or Neil deGrasse Tyson, surely, as both are widely known. But they are not microbiologists. The only life scientist as well known as Ed Yong who would draw a big crowd might be Richard Dawkins. Bill Nye is not on this list because he’s an engineer, not a scientist, but he would be a huge draw, bigger than Yong. I would not be surprised to see him at a future ASM Microbe meeting.

We need more celebrity life scientists who are loved by millions, who can explain the nuts and bolts of biology, microbiology, biotechnology, cell biology, and more, and who draw huge crowds. I’m not one of them – my blog and podcasts have many followers, but I would not draw a crowd the way Ed Yong did at ASM Microbe (our TWiV with astronaut Kate Rubins attracted 50 people). But I believe that my work in science communication shows young scientists that they can appeal to a broad range of science-minded people, and perhaps become very popular themselves.

Let the Yong-Rubins keynote be a call to early-career life scientists to communicate their science, build their visibility, and become the next Carl Sagan, who reached millions with his television shows and books. It’s not easy, especially combined with a career in research and teaching. But Sagan and others have shown that it can be done. And hopefully you will one day be a big draw at a keynote address!