De-discovering pathogens: Viral contamination strikes again

Qiagen spin column at right. The silica layer is white. The spin column is placed in the microcentrifuge tube, left, to remove liquids and elute nucleic acids.

Do you remember the retrovirus XMRV, initially implicated as the cause of chronic fatigue syndrome, and later shown to be a murine virus that contaminated human cells grown in mice? Another virus thought to be associated with human disease has recently been shown to be a contaminant, derived from a piece of laboratory plasticware that is commonly used to purify nucleic acids from clinical samples.

During a search for the causative agent of seronegative hepatitis (disease not caused by hepatitis A, B, C, D, or E virus) in Chinese patients, a novel virus was discovered in sera by next generation sequencing. This virus, provisionally called NIH-CQV, has a single-stranded DNA genome that is a hybrid between parvoviruses and circoviruses. When human sera were screened by polymerase chain reaction (PCR), 63 of 90 patient samples (70%) were positive for the virus, while sera from 45 healthy controls were negative. Furthermore, 84% of patients were positive for IgG antibodies against the virus, and 31% were positive for IgM antibodies (suggesting a recent infection). Among healthy controls, 78% were positive for IgG and all were negative for IgM. The authors concluded that this virus was highly prevalent in some patients with seronegative hepatitis.

A second independent laboratory also identified the same virus (which they called PHV-1) in sera from patients in the United States with non-A-E hepatitis, while a third group identified the virus in diarrheal stool samples from Nigeria.

The first clue that something was amiss was the observation that the novel virus identified in all three laboratories shared 99% nucleotide and amino acid identity. This would not be expected in virus samples from such geographically, temporally, and clinically diverse samples. Another problem was that in the US non-A-E study, all patient sample pools were positive for viral sequences. These observations suggested the possibility of viral contamination.
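To make the identity figure concrete, here is a minimal sketch of how percent nucleotide identity can be computed for two sequences that have already been aligned. The function name and the toy sequences are made up for illustration; they are not the viral sequences from these studies, and real comparisons would start from a proper alignment of the genomes.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned, ungapped positions at which two sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip alignment columns that contain a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


# Hypothetical 20-nucleotide fragments differing at one position.
lab_a = "ATGGCTAGCTAGGATCCGTA"
lab_b = "ATGGCTAGCTAGGATCCGTT"
print(percent_identity(lab_a, lab_b))  # 95.0
```

Near-identical values from samples collected on different continents, in different years, and from patients with different diseases are what raised the suspicion of a common contaminating source.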

When nucleic acids were re-purified from the US non-A-E samples using a different method, none of the samples were positive for the novel virus. Presence of the virus was ultimately traced to the use of column-based purification kits manufactured by Qiagen, Inc. Nearly the entire novel viral genome could be detected by deep sequencing in water that was passed through these columns.

The nucleic acid purification columns contaminated with the novel virus were used to purify nucleic acid from patient samples. These columns (pictured), produced by a number of manufacturers, are typically a few inches in length and contain a silica gel membrane that binds nucleic acids. The clinical samples are added to the column, which is then centrifuged briefly to remove liquids (hence the name ‘spin’ columns). The nucleic acid adheres to the silica gel membrane. Contaminants are washed away, and then the nucleic acids are released from the silica by the addition of a buffer.

Why were the Qiagen spin columns contaminated with the parvovirus-circovirus hybrid? A search of the publicly available environmental metagenomic datasets revealed the presence of sequences highly related to PHV-1 (87-99% nucleotide identity). The datasets containing PHV-1 sequences were obtained from sampled seawater off the Pacific coast of North America, and coastal regions of Oregon and Chile. Silica, a component of spin columns, may be produced from diatoms. If the silica in the Qiagen spin columns was produced from diatoms, and if PHV-1 is a virus of ocean-dwelling diatoms, this could explain the source of contamination.

In retrospect it was easy to be fooled into believing that NIH-CQV might be a human pathogen because it was detected by PCR only in sick patients, and not in healthy controls. Why antibodies to the virus were detected in samples from both sick and healthy individuals remains to be explained. However, NIH-CQV/PHV-1 is likely not associated with any human illness: when non-Qiagen spin columns were used, PHV-1 was not found in any patient sample.

The lesson to be learned from this story is clear: deep sequencing is a very powerful and sensitive method and must be applied with great care. Every step of the virus discovery process must be carefully controlled, from the water used to the plastic reagents. Most importantly, laboratories involved in pathogen discovery must share their sequence data, something that took place during this study.

Trust science, not scientists.

A retrovirus makes chicken eggshells blue

When you purchase chicken eggs at the market, they usually have white or brown shells. But some breeds of chicken produce blue or green eggs. The blue color is caused by insertion of a retrovirus into the chicken genome, which activates a gene involved in the production of blue eggs.

The Araucana, a chicken breed from Chile, and Dongxiang and Lushi chickens in China lay blue eggs. Blue eggshell color is controlled by an autosomal dominant gene: eggs produced by homozygote chickens are darker blue than those from heterozygotes. The gene causing blue eggshell color is called oocyan (O) and was previously mapped to the short arm of chromosome 1.

To further refine the location of the O gene, genetic crosses were performed using molecular markers on chromosome 1. The O gene was then located in a ~120 kb region which contained four genes. Of these, only SLCO1B3 was expressed in the uterus of Dongxiang chickens that produce blue eggs; it was not expressed in chickens that produce brown eggs.

Sequence analysis of SLCO1B3 revealed that an endogenous avian retrovirus called EAV-HP has inserted just upstream of the gene. This insertion places a promoter sequence in front of the SLCO1B3 gene. As a consequence, the SLCO1B3 gene is transcribed. In chickens that produce brown eggs, no retrovirus is inserted before the SLCO1B3 gene, and no mRNA encoding the protein is produced.

The retrovirus insertion has occurred at different positions in the Chilean and Chinese chicken genomes. This observation indicates that the insertion arose independently during breeding of chicken strains several hundred years ago to produce blue egg layers. The chicken genome contains multiple copies of endogenous retroviruses, which can duplicate and move to other locations. We can assume that a random insertion upstream of the SLCO1B3 gene was selected for by breeding procedures that were aimed at producing blue egg-laying chickens.

The SLCO1B3 gene encodes a membrane transporter protein that mediates the uptake of a wide range of organic compounds into the cell. The blue eggshell color is produced by deposition of biliverdin on the eggshell as it develops in the uterus. Biliverdin is a component of bile, and bile salts are among the compounds transported by SLCO1B3, providing a plausible hypothesis for the role of the protein in making blue eggshells.

Blue eggshell color is another example of the important roles that retroviruses have played in animal development. One other is the help provided by retroviruses in producing the placenta of mammals. Not all retroviral insertions are beneficial – integration next to an oncogene can lead to transformation and oncogenesis.

TWiV 242: I want my MMTV

On episode #242 of the science show This Week in Virology, the complete TWiV team talks about how two different viruses shape the evolution of an essential housekeeping protein.

You can find TWiV #242 at

Dual virus-receptor duel

Viruses are obligate intracellular parasites: they must enter a cell to reproduce. To gain access to the cell interior, a virus must first bind to one or more specific receptor molecules on the cell surface. Cell receptors for viruses do not exist only to serve viruses: they also have cellular functions. An example is the transferrin receptor, which regulates iron uptake and assists in the entry of viruses from three different families. It might appear that such dual-use proteins cannot evolve to block virus entry because their cellular function would then be compromised. A study of two viruses that bind to the same cell surface receptor protein reveals how a cellular protein can change to prevent infection without affecting its role in the cell.

The virus-cell receptor interaction is one of the many arenas where the evolution of host-virus conflict can be studied. Because the virus-receptor interaction is essential for viral replication, host cells with a mutation in the receptor gene that prevents virus infection survive and eventually dominate the population. A virus could overcome this block with an amino acid change allowing binding to the altered receptor. Mutations that alter the interaction to favor the virus or the host are called ‘positively selected’ mutations. Such back-and-forth evolution between viruses and their host cells has been called host-virus arms races. Most have been identified by studying antiviral genes. This study is unusual in that it involves a housekeeping gene that has been usurped for viral attachment.

Evidence for positive selection of host genes can be detected by comparing gene sequences of phylogenetically related species. Nonsynonymous mutations lead to a change in the amino acid sequence, while synonymous mutations do not. The rate at which nonsynonymous mutations occur in the genome is typically much slower than synonymous mutations. The reason for this difference is that most mutations that change the amino acid sequence of a protein are lethal to the host. When genes have been subjected to positive selection by a virus, the ratio of nonsynonymous to synonymous mutations is higher, typically in host amino acids that interact with viral proteins. Computer programs have been designed to scan gene sequences and identify codons which are under positive selection by virtue of a high ratio of nonsynonymous to synonymous mutations.
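As a rough illustration of the distinction between the two kinds of mutations, the sketch below compares two aligned coding sequences codon by codon and counts how many differing codons leave the amino acid unchanged (synonymous) and how many alter it (nonsynonymous). It is a deliberately simplified count, not the method used in the study: the programs used for real analyses also normalize by the number of possible synonymous and nonsynonymous sites and fit models across a phylogeny. The function name and the example sequences are invented for illustration.

```python
# Standard genetic code, built from the conventional TCAG ordering of codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((a, b, c) for a in BASES for b in BASES for c in BASES), AMINO_ACIDS
    )
}


def count_codon_changes(ref: str, alt: str) -> tuple[int, int]:
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Both sequences must be in frame, aligned, and of equal length.
    Codons containing gaps or ambiguous bases are skipped.
    """
    ref, alt = ref.upper(), alt.upper()
    if len(ref) != len(alt) or len(ref) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and equal in length")
    synonymous = 0
    nonsynonymous = 0
    for i in range(0, len(ref), 3):
        codon_ref, codon_alt = ref[i:i + 3], alt[i:i + 3]
        if codon_ref == codon_alt:
            continue
        if codon_ref not in CODON_TABLE or codon_alt not in CODON_TABLE:
            continue  # gap or ambiguous base: cannot be classified
        if CODON_TABLE[codon_ref] == CODON_TABLE[codon_alt]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Hypothetical sequences: Met-Leu-Lys-Gly versus Met-Leu-Arg-Gly.
species_1 = "ATGCTGAAAGGT"
species_2 = "ATGCTAAGAGGC"
print(count_codon_changes(species_1, species_2))  # (2, 1)
```

In this toy example two codons changed without altering the protein, and one changed lysine to arginine. A cluster of codons where the nonsynonymous changes outnumber what the synonymous background would predict is the kind of signal that flags a site as a candidate for positive selection.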

To determine if the transferrin receptor (TfR1) has evolved to prevent virus attachment, sequences of the protein from seven different rodent species were compared. The analysis revealed that much of the protein is highly conserved, but a small part, comprising six amino acids, is evolving rapidly. Three of these amino acids are located on the part of TfR1 that binds arenaviruses, and three are at the binding site for the retrovirus mouse mammary tumor virus (MMTV) (see illustration). Changing these three amino acids of TfR1 of the house mouse, which is susceptible to MMTV, to the sequence found in TfR1 of the MMTV-resistant vesper mouse, blocked entry of the virus into cells. In turn, changing these three amino acids of TfR1 of the MMTV-resistant short-tailed zygodont to the sequence of the house mouse enabled virus entry into cells. None of these changes had an effect on ferritin binding by TfR1.

Evidence for positive selection can also be detected in viral genes encoding proteins that interact with the host. The arenavirus glycoprotein, GP, is known to bind to TfR1. Ten GP amino acids were identified that are under positive selection, and four of these directly contact TfR1.

These findings demonstrate that there has been an arms race between TfR1 and both an arenavirus and a retrovirus. An interesting question is whether human TfR1 will enter into an arms race with arenaviruses. As these viruses emerge into the human population, it is expected that humans with mutations that make them less susceptible to infection or severe disease will be positively selected. Amino acid 212 of human TfR1, which is near the positively selected residues in murine TfR1, varies in the human population. When this amino acid change (leucine to valine) is introduced into TfR1, it confers some protection against arenavirus entry. Curiously, this polymorphism has only been found in Asian populations, where arenaviruses that bind TfR1 are not found. The polymorphism is probably neutral with respect to TfR1 function, and if TfR1-binding arenaviruses are introduced into Asia, this change could be positively selected.

Because all viruses depend on many host proteins for replication, it will be interesting to use this approach to see how other highly conserved cell proteins balance cell function with the ability to resist virus infections. There are likely to be many cell proteins that cannot change to evade viral use without destroying their cell function. Fortunately for cells there are exceptions.

TWiV 206: Viral turducken

On episode #206 of the science show This Week in Virology, Vincent, Alan, Dickson, and Kathy discuss how the innate immune response to viral infection influences the production of pluripotent stem cells, and the diverse mobilome of giant viruses.

You can find TWiV #206 at

Museum pelts help date the koala retrovirus

The genomes of most higher organisms contain sequences from retroviral genomes called endogenous retroviruses (ERVs). These are DNA copies of retroviral RNAs that are integrated into the germ line DNA of the host, and passed from parent to offspring. In most species the infections that lead to germ line ERVs appear to have occurred millions of years ago. The koala retrovirus, KoRV, is the only retrovirus that we know of that is currently invading the germ line of its host species. A study of koala pelts preserved in museums suggests how recently the virus infected this animal.

The koala is native to Australia, and all koalas in northern Australia are infected with KoRV. However, not all animals in the southeast or on southern islands are infected. It is believed that KoRV crossed into koalas from another species (possibly the Asian mouse Mus caroli) some time within the past two hundred years. To test this hypothesis, DNA was extracted from 28 koala skins that were held in museums and which had been collected from the late 1800s to the 1980s. Polymerase chain reaction was used to detect KoRV DNA in the koala genome. The results show that KoRV was already widespread in northern Australian koalas by the late 1800s. It has since spread slowly, as the virus is still not ubiquitous in southern koalas. The slow dispersal may be due to the sedentary and solitary nature of koalas. Examination of mitochondrial DNA from the koala skins confirmed that there has been limited movement of the animals within Australia.

The sequence of the KoRV gene encoding the viral glycoprotein, env, was also determined. The results reveal that env sequences from museum specimens are remarkably similar to those of KoRV found in contemporary koalas. At first glance this result might not seem surprising: the endogenous KoRV genomes are evolving at the same slow rate as the host DNA into which they are integrated. However, there appear to have been multiple transmissions and germ line invasions by KoRV, leading the authors to suggest that in all cases very similar retroviruses were involved.

Infection with KoRV in captive animals is believed to cause immunosuppression, leading to fatal lymphomas or Chlamydia infection. A Chlamydia epidemic is believed to have killed many koalas in 1887-1889, consistent with the PCR results indicating that KoRV was widely present at that time.

Update: I had meant to discuss the possibility of dating the invasion of Koalas by KoRV by using older samples, but neglected to include this in the original article. Several days after it was published, Professor Paul Young sent me a note expressing exactly this sentiment:

What would be even better would be to have access to fossilised material that predates European settlement, that we could examine. We collaborated with an “Ancient DNA” specialist and tried this several years ago but we weren’t able to recover usable template DNA. Still worth some future effort though.

Avila-Arcos MC et al (2012). 120 years of koala retrovirus evolution determined from museum skins. Mol Biol Evol. 2012 Sep 14.

TWiV Special: A paradigm for pathogen de-discovery

On this special episode of the science show This Week in Virology, Vincent and Ian review a multicenter blinded analysis which finds no association between chronic fatigue syndrome/myalgic encephalomyelitis and XMRV or polytropic murine leukemia virus.

You can find this TWiV Special at

A viral mashup in snakes

If you know anything about snakes you might be familiar with snake inclusion body disease, or IBD. This transmissible and fatal disease affects snakes of a variety of species but has been best studied in boas. The name comes from the presence of large masses (inclusions) in the cytoplasm of cells from infected snakes. IBD might be caused by a novel arenavirus.

To identify an etiologic agent of IBD, RNA was extracted from multiple organs of snakes with the disease, and subjected to deep sequencing. This analysis revealed the presence of two distinct arenaviruses. One virus, called CASV (California Academy of Sciences virus) was found in diseased annulated tree boas, and the second, GGV (Golden Gate virus) was detected in boa constrictors. These sequences were found in 6 of 8 IBD snakes but not in 18 disease-free controls.

The finding of arenaviruses in snakes is interesting because these viruses are thought to infect only mammals. Rodents are believed to be the natural host of arenaviruses, which are classified as Old World or New World depending on where they are isolated. In rodents, arenavirus infection is typically asymptomatic. When arenaviruses infect humans, severe disease can result, such as hemorrhagic fever caused by Lassa virus. How CASV and GGV are transmitted to snakes is not known. One possibility is that they are introduced into snakes when they consume mice. The viruses might be transmitted among snakes by contact or via vectors such as blood-sucking mites. The genome sequences of CASV and GGV are very different from those of rodent arenaviruses. If similar viruses circulate in rodents, they have not yet been detected; alternatively, CASV- and GGV-like viruses might have diverged from Old- and New World arenaviruses after many years of transmission among snakes.

Another surprise emerged from analysis of the CASV and GGV viral proteins. Arenaviral genomes encode four main proteins: an RNA polymerase, L; a nucleoprotein, NP; a transmembrane glycoprotein, GPC; and a zinc-binding protein, Z. The amino acid sequences of CASV and GGV L and NP, but not Z and GPC, resemble those of known arenaviruses. The CASV and GGV glycoproteins are instead related to glycoproteins of filoviruses and retroviruses. This observation suggests that recombination took place between the genomes of arenaviruses and filoviruses or retroviruses, likely a very long time ago.

Whether these novel arenaviruses actually cause snake IBD is not proven by this work. This question is underscored by the observation that no arenaviruses were detected in two of the 8 IBD positive snakes in this study. In addition, two of the virus-positive snakes that were diagnosed with IBD did not have symptoms of the disease. It is possible that the arenaviruses are present but do not cause symptoms. As the authors write,

….sequencing can only ever identify candidate etiologic agents, and demonstration of causality requires significant additional experimental effort.

This additional work would include the demonstration that infectious virus can be consistently recovered from diseased snakes, and that the disease can be induced by inoculation of snakes with the virus. As a first step towards answering these questions, kidney and liver extracts were added to cultured boa constrictor kidney cells. By 5 days post-infection, viral RNA could be detected in the cell supernatant, but it is not known if the viruses produced are infectious.

This work shows convincingly that the host range of arenaviruses is much broader than we thought: they do not just infect mammals. The zoonotic pool continues to grow, and there are now more potential sources of new human arenaviruses. The work also emphasizes that our knowledge of all the viruses on the planet remains minuscule.

Mark D. Stenglein, Chris Sanders, Amy L. Kistler, J. Graham Ruby, Jessica Y. Franco, Drury R. Reavill, Freeland Dunker, and Joseph L. DeRisi. 2012. Identification, Characterization, and In Vitro Culture of Highly Divergent Arenaviruses from Boa Constrictors and Annulated Tree Boas: Candidate Etiological Agents for Snake Inclusion Body Disease. mBio 3:e00180-12.

TWiV 194: Five postdocs in North America

On episode #194 of the science show This Week in Virology, Vincent returns to Madison, Wisconsin and meets with postdocs to discuss their science and their careers.

You can find TWiV #194 at

Cleaning up after XMRV

The retrovirus XMRV does not cause prostate cancer or chronic fatigue syndrome – that hypothesis was disproved by the finding that the virus was produced in the laboratory in the 1990s by passage of a prostate tumor in nude mice. A trio of new papers on the virus attempt to address questions about the serological detection of XMRV in prostate cancer, and further emphasize that XMRV is not a human pathogen.

Absence of XMRV and Closely Related Viruses in Primary Prostate Cancer Tissues Used to Derive the XMRV-Infected Cell Line 22Rv1. The human cell line 22Rv1, which was established from a human prostate tumor (CWR22), produces infectious XMRV. It was previously shown that DNA from various passages of the prostate tumor in nude mice (called xenografts) did not contain XMRV, but cells from the mice do contain two related proviruses, called PreXMRV-1 and PreXMRV-2, which recombined to form XMRV between 1993 and 1996. In a new study, samples of the original prostate tumor CWR22 were examined for the presence of XMRV or related viruses. PCR assays targeting the viral gag, pol, and env sequences failed to provide evidence of XMRV in CWR22 tissue. These assays could detect endogenous murine leukemia virus DNA in mouse DNA, indicating that the CWR22 tumor contained neither XMRV nor related viruses. In addition, no XMRV sequences were detected when sections from the CWR22 tumor were examined by in situ hybridization. The same assay previously detected XMRV sequences in stromal cells of prostate tumors. The authors conclude that “Our findings conclusively show an absence of XMRV or related viruses in prostate of patient CWR22, thereby strongly supporting a mouse origin of XMRV.”

An important question not addressed by this study is why XMRV was originally detected in multiple prostate tumors obtained from patients at the Cleveland Clinic. The authors seem to be working on this problem, as they state that “…the sequence of XMRV present in 22Rv1 cells is virtually identical with XMRV cloned using human prostate samples, thus suggesting laboratory contamination with XMRV nucleic acid from 22Rv1 cells as the source. Further experiments designed to confirm or refute this hypothesis are currently underway.”

No biological evidence of XMRV in blood or prostatic fluid from prostate cancer patients. Samples from individuals with prostate cancer were tested for the presence of infectious XMRV and for antibodies against the virus. Neither infectious virus nor antibodies were detected in blood plasma (n = 29) or prostate secretions (n = 5). Among these were five specimens that had previously tested positive for XMRV DNA, including two from the original study. The authors conclude that the results “support the conclusion from other studies that XMRV has not entered the human population”.

Susceptibility of human lymphoid tissue cultured ex vivo to Xenotropic murine leukemia virus-related virus (XMRV) infection. Although XMRV is not known to cause human disease, whether it has the potential to do so is unknown. The virus can infect a variety of cultured human cells including peripheral blood mononuclear cells and neuronal cells. In this study the authors placed human tonsillar tissue in culture and infected it with XMRV. Proviral (integrated) DNA could be detected in the cells several weeks after infection and virus particles were released into the medium. However, these released viruses could not infect fresh tonsillar tissue, possibly due to modification by innate antiviral restriction factors such as APOBEC, which is known to inhibit XMRV infectivity.

Based on their findings the authors conclude that “laboratories working with XMRV producing cell lines should be aware of the potential biohazard risk of working with this replication-competent retrovirus”.

It is clear that XMRV does not cause chronic fatigue syndrome; the original findings of Lombardi and colleagues linking the virus to this disease have been retracted by the journal. However there are still two papers in the literature that report the presence of XMRV in prostate – the original XMRV discovery paper and one from Ila Singh’s laboratory. In both papers XMRV detection in tissues was accomplished by using serological procedures. Based on the papers summarized here, the assays did not detect XMRV – but a satisfactory explanation for the positive signals has not yet been provided.