A huge host contribution to virus mutation rates

The high mutation rate of RNA viruses enables them to evolve in the face of different selection pressures, such as entering a new host or countering host defenses. It has always been thought that the sources of such mutations are the enzymes that copy viral RNA genomes: they make random errors which they cannot correct. Now it appears that a cell enzyme makes an even greater contribution to the mutation rate of an RNA virus.

Deep sequencing was used to determine the mutation rate of HIV-1 in the blood of AIDS patients by searching for premature stop codons in open reading frames of viral RNA. Because stop codons terminate protein synthesis, they do not allow production of infectious viruses. Therefore they can be used to calculate the mutation rate in the absence of selection. The mutation rate calculated in this way, 0.000093 mutations per base per cell, was slightly higher than previously calculated from studies in cell culture.
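The search for premature stop codons can be pictured with a short sketch. This is only an illustration and not the pipeline used in the study: the function name `premature_stops` and the example sequences are hypothetical, and the reading frame is assumed to begin at the first base.

```python
# Hypothetical illustration: find in-frame stop codons that occur before
# the final codon of an open reading frame (ORF).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def premature_stops(orf):
    """Return 0-based codon indices of in-frame stops before the last codon.

    Assumes the reading frame starts at the first base of `orf`.
    RNA input (with U) is converted to the DNA alphabet first.
    """
    seq = orf.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # The final codon is the normal terminator, so it is not counted.
    return [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]


intact = "ATGTGGAAATAA"   # Met-Trp-Lys-stop
mutated = "ATGTAGAAATAA"  # TGG (Trp) changed to TAG (stop)
print(premature_stops(intact))   # []
print(premature_stops(mutated))  # [1]
```

Dividing the number of such stops by the number of bases at which a single change could create one would give a per-base rate; that normalization is left out here.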

When HIV-1 infects a cell, the enzyme reverse transcriptase converts its RNA genome to DNA, which then integrates into the host cell genome. Identification of stop codons in integrated viral DNA should provide an even better estimate of the mutation rate of reverse transcriptase, because mutations that block the production of infectious virus have not yet been removed by selection. The mutation rate calculated by this approach was 0.0041 mutations per base per cell, or one mutation every 250 bases. This mutation rate is 44 times higher than the value calculated from viral RNA in patient plasma (illustrated).
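The comparison between the two rates is simple arithmetic, sketched here using only the two values quoted above (the variable names are mine):

```python
# Mutation rates quoted in the text (mutations per base per cell).
rate_plasma_rna = 0.000093     # from viral RNA in patient plasma
rate_integrated_dna = 0.0041   # from integrated viral DNA

# How many times higher is the rate measured in integrated DNA?
fold_difference = rate_integrated_dna / rate_plasma_rna

# On average, how many bases are copied per mutation?
bases_per_mutation = 1 / rate_integrated_dna

print(round(fold_difference))     # 44
print(round(bases_per_mutation))  # 244
```

The second figure comes out at about 244 bases per mutation, which the text rounds to 250.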

Sequencing of integrated viral DNA from many patients revealed that the vast majority of mutations leading to insertion of stop codons – 98% – were the consequence of editing by the cellular enzyme APOBEC3G. This enzyme is a deaminase that changes dC to dU in the first strand of viral DNA synthesized by reverse transcriptase. APOBEC3G constitutes an intrinsic defense against HIV-1 infection, because extensive mutation of the viral DNA reduces viral infectivity. Indeed, most integrated HIV proviruses are not infectious as a consequence of APOBEC3G-induced mutations. That infection proceeds at all is due to incorporation of the viral protein Vif in the virus particles. Vif binds APOBEC3G, leading to its degradation in cells.

The mutation rate of integrated HIV-1 DNA calculated by this method is much higher than that of other RNA viruses. This high mutation rate is driven by the cellular enzyme, APOBEC3G. At least half of the mutations observed in plasma viral RNAs are also contributed by this enzyme.

It has always been thought that error-prone viral RNA polymerases are largely responsible for the high mutation rates of RNA viruses. The results of this study add a new driver of viral variation, a cellular enzyme. APOBEC enzymes are known to introduce mutations in the genomes of other viruses, including hepatitis B virus, papillomaviruses, and herpesviruses. Furthermore, the cellular adenosine deaminase enzyme can edit the genomes of RNA viruses such as measles virus, parainfluenza virus, and respiratory syncytial virus. Cellular enzymes may therefore play a much greater role in the generation of viral diversity than previously imagined.

TWiV 360: From Southeastern Michigan

On episode #360 of the science show This Week in Virology, Vincent visits the University of Michigan where he and Kathy speak with Michael, Adam, and Akira about polyomaviruses, virus evolution, and virus assembly, on the occasion of naming the department of Microbiology & Immunology a Milestones in Microbiology site.

You can find TWiV #360 at www.microbe.tv/twiv.

TWiV 337: Steamer

On episode #337 of the science show This Week in Virology, Vincent meets up with Michael and Steve to discuss their finding of a transmissible tumor in soft-shell clams associated with a retrovirus-like element in the clam genome.

You can find TWiV #337 at www.microbe.tv/twiv.

Retroviral influence on human embryonic development

About eight percent of human DNA is viral: it consists of retroviral genomes produced by infections that occurred many years ago. These endogenous retroviruses are passed from parent to child in our DNA. Some of these viral genomes are activated for a brief time during human embryogenesis, suggesting that they may play a role in development.

There are over 500,000 endogenous retroviruses in the human genome, about 20 times more than human genes. They were acquired millions of years ago after retroviral infection. In this process, viral RNA is converted to DNA, which then integrates into cell DNA. If the retroviral infection takes place in the germ line, the integrated DNA may be passed on to offspring.

The most recent human retroviral infections leading to germ line integration took place with a subgroup of human endogenous retroviruses called HERVK(HML-2). The human genome contains ~90 copies of these viral genomes, which might have infected human ancestors as recently as 200,000 years ago. HERVs do not produce infectious virus: not only are the viral genomes silenced – no mRNAs are produced – but they are also littered with lethal mutations that have accumulated over time.

A recent study revealed that HERVK mRNAs are produced during normal human embryogenesis. Viral RNAs were detected beginning at the 8-cell stage, through epiblast cells in preimplantation embryos, until formation of embryonic stem cells (illustrated). At this point the production of HERVK mRNA ceases. Viral capsid protein was detected in blastocysts, and electron microscopy revealed the presence of virus-like particles similar to those found in reconstructed HERVK particles. These results indicate that retroviral proteins and particles are present during human development, up until implantation.

Retroviral particles in blastocysts are accompanied by induction of synthesis of an antiviral protein, IFITM1, that is known to block infection with a variety of viruses, including influenza virus. A HERVK protein known as Rec, produced in blastocysts, binds a variety of cell mRNAs and either increases or decreases their association with ribosomes.

Is there a function for HERVK expression during human embryogenesis? The authors speculate that modulation of the ribosome-binding activities of specific cell mRNAs by the viral Rec protein could influence aspects of early development. As Rec sequences are polymorphic in humans, the effects could even extend to individuals. In addition, HERVK induction of IFITM1 might conceivably protect embryos against infection with other viruses.

The maintenance of open reading frames in HERV genomes, over many years of evolution, suggests a functional role for these elements. Evidence for such function comes from the syncytin proteins, which are essential for placental development: the genes encoding these proteins originated from HERV glycoproteins. However, not all endogenous retroviruses are beneficial: a number of malignant diseases have been associated with HERV-K expression.

TWiV 320: Retroviruses and cranberries

On episode #320 of the science show This Week in Virology, Vincent speaks with John Coffin about his career studying retroviruses, including working with Howard Temin, endogenous retroviruses, XMRV, chronic fatigue syndrome and prostate cancer, HIV/AIDS, and his interest in growing cranberries.

You can find TWiV #320 at www.microbe.tv/twiv.

Amyotrophic lateral sclerosis and viruses

Many people have a new awareness of the disease known as amyotrophic lateral sclerosis, or ALS, thanks to the Ice Bucket Challenge initiated by the ALS Association. Fewer might know that retroviruses have been proposed to play a role in the development of the disease.

I previously summarized a 2008 paper on ALS in a piece called Retroviruses and amyotrophic lateral sclerosis. Sera from some ALS patients had previously been shown to contain elevated levels of reverse transcriptase, an enzyme found in retrovirus particles. In the 2008 paper, RNAs encoding this enzyme were reported in the brains of ALS patients, and their origin appears to be the human endogenous retrovirus HERV-K.

The progress made in understanding the relationship of endogenous retroviruses with ALS is summarized in a review published in August of 2014 entitled Retroviruses and amyotrophic lateral sclerosis (the paper is open access). The authors conclude:

A comprehensive study of the expression or reactivation of endogenous retroviral elements in ALS has not yet been undertaken. The literature on HERV-W involvement in ALS is difficult to interpret. Two independent reports, however, have shown increased HERV-K expression in both serum and brain tissue in ALS patients. It remains unknown if HERV-K expression is an epiphenomenon or plays a pathophysiological role in the disease.

I am pleased to participate in the Ice Bucket Challenge to help raise awareness of ALS and raise money to work on the disease.

De-discovering pathogens: Viral contamination strikes again

Spin column

Qiagen spin column at right. The silica layer is white. The spin column is placed in the microcentrifuge tube, left, to remove liquids and elute nucleic acids.

Do you remember the retrovirus XMRV, initially implicated as the cause of chronic fatigue syndrome, and later shown to be a murine virus that contaminated human cells grown in mice? Another virus thought to be associated with human disease has recently been shown to be a contaminant, derived from a piece of laboratory plasticware that is commonly used to purify nucleic acids from clinical samples.

During a search for the causative agent of seronegative hepatitis (disease not caused by hepatitis A, B, C, D, or E virus) in Chinese patients, a novel virus was discovered in sera by next generation sequencing. This virus, provisionally called NIH-CQV, has a single-stranded DNA genome that is a hybrid between parvoviruses and circoviruses. When human sera were screened by polymerase chain reaction (PCR), 63 of 90 patient samples (70%) were positive for the virus, while sera from 45 healthy controls were negative. Furthermore, 84% of patients were positive for IgG antibodies against the virus, and 31% were positive for IgM antibodies (suggesting a recent infection). Among healthy controls, 78% were positive for IgG and all were negative for IgM. The authors concluded that this virus was highly prevalent in some patients with seronegative hepatitis.

A second independent laboratory also identified the same virus (which they called PHV-1) in sera from patients in the United States with non-A-E hepatitis, while a third group identified the virus in diarrheal stool samples from Nigeria.

The first clue that something was amiss was the observation that the novel virus identified in all three laboratories shared 99% nucleotide and amino acid identity. Such near-identity would not be expected among viruses from such geographically, temporally, and clinically diverse samples. Another problem was that in the US non-A-E study, all patient sample pools were positive for viral sequences. These observations suggested the possibility of viral contamination.

When nucleic acids were re-purified from the US non-A-E samples using a different method, none of the samples were positive for the novel virus. Presence of the virus was ultimately traced to the use of column-based purification kits manufactured by Qiagen, Inc. Nearly the entire novel viral genome could be detected by deep sequencing in water that was passed through these columns.

The nucleic acid purification columns contaminated with the novel virus were used to purify nucleic acid from patient samples. These columns (pictured), produced by a number of manufacturers, are typically a few inches in length and contain a silica gel membrane that binds nucleic acids. The clinical samples are added to the column, which is then centrifuged briefly to remove liquids (hence the name ‘spin’ columns). The nucleic acid adheres to the silica gel membrane. Contaminants are washed away, and then the nucleic acids are released from the silica by the addition of a buffer.

Why were the Qiagen spin columns contaminated with the parvovirus-circovirus hybrid? A search of the publicly available environmental metagenomic datasets revealed the presence of sequences highly related to PHV-1 (87-99% nucleotide identity). The datasets containing PHV-1 sequences were obtained from sampled seawater off the Pacific coast of North America, and coastal regions of Oregon and Chile. Silica, a component of spin columns, may be produced from diatoms. If the silica in the Qiagen spin columns was produced from diatoms, and if PHV-1 is a virus of ocean-dwelling diatoms, this could explain the source of contamination.

In retrospect it was easy to be fooled into believing that NIH-CQV might be a human pathogen because it was detected only in sick patients and not in healthy controls. Why antibodies to the virus were detected in samples from both sick patients and healthy controls remains to be explained. However, NIH-CQV/PHV-1 is likely not associated with any human illness: when non-Qiagen spin columns were used, PHV-1 was not found in any patient sample.

The lesson to be learned from this story is clear: deep sequencing is a very powerful and sensitive method and must be applied with great care. Every step of the virus discovery process must be carefully controlled, from the water used to the plastic reagents. Most importantly, laboratories involved in pathogen discovery must share their sequence data, something that took place during this study.

Trust science, not scientists.

A retrovirus makes chicken eggshells blue

When you purchase chicken eggs at the market, they usually have white or brown shells. But some breeds of chicken produce blue or green eggs. The blue color is caused by insertion of a retrovirus into the chicken genome, which activates a gene involved in the production of blue eggs.

The Araucana, a chicken breed from Chile, and Dongxiang and Lushi chickens in China lay blue eggs. Blue eggshell color is controlled by an autosomal dominant gene: eggs produced by homozygote chickens are darker blue than those from heterozygotes. The gene causing blue eggshell color is called oocyan (O) and was previously mapped to the short arm of chromosome 1.

To further refine the location of the O gene, genetic crosses were performed using molecular markers on chromosome 1. The O gene was then located in a ~120 kb region which contained four genes. Only the SLCO1B3 gene was expressed in the uterus of Dongxiang chickens that produce blue eggs; it was not expressed in chickens that produce brown eggs.

Sequence analysis of the SLCO1B3 gene revealed that an endogenous avian retrovirus called EAV-HP has inserted just upstream of the gene. This insertion places a promoter sequence in front of the SLCO1B3 gene. As a consequence, the SLCO1B3 gene is transcribed. In chickens that produce brown eggs, no retrovirus is inserted before the SLCO1B3 gene, and no mRNA encoding the protein is produced.

The retrovirus insertion has occurred at different positions in the Chilean and Chinese chicken genomes. This observation indicates that the insertion arose independently during breeding of chicken strains several hundred years ago to produce blue egg layers. The chicken genome contains multiple copies of endogenous retroviruses, which can duplicate and move to other locations. We can assume that a random insertion upstream of the SLCO1B3 gene was selected for by breeding procedures that were aimed at producing blue egg-laying chickens.

The SLCO1B3 gene encodes a membrane transporter protein that mediates the uptake of a wide range of organic compounds into the cell. The blue eggshell color is produced by deposition of biliverdin on the eggshell as it develops in the uterus. Biliverdin is one component of bile salts, which are transported by SLCO1B3, providing a plausible hypothesis for the role of the protein in making blue eggshells.

Blue eggshell color is another example of the important roles that retroviruses have played in animal development. One other is the help provided by retroviruses in producing the placenta of mammals. Not all retroviral insertions are beneficial – integration next to an oncogene can lead to transformation and oncogenesis.

TWiV 242: I want my MMTV

On episode #242 of the science show This Week in Virology, the complete TWiV team talks about how two different viruses shape the evolution of an essential housekeeping protein.

You can find TWiV #242 at www.microbe.tv/twiv.

Dual virus-receptor duel

Viruses are obligate intracellular parasites: they must enter a cell to reproduce. To gain access to the cell interior, a virus must first bind to one or more specific receptor molecules on the cell surface. Cell receptors for viruses do not exist only to serve viruses: they also have cellular functions. An example is the transferrin receptor, which regulates iron uptake and assists in the entry of viruses from three different families. It might appear that such dual-use proteins cannot evolve to block virus entry because their cellular function would then be compromised. A study of two viruses that bind to the same cell surface receptor protein reveals how a cellular protein can change to prevent infection without affecting its role in the cell.

The virus-cell receptor interaction is one of the many arenas where the evolution of host-virus conflict can be studied. Because the virus-receptor interaction is essential for viral replication, host cells with a mutation in the receptor gene that prevents virus infection survive and eventually dominate the population. A virus could overcome this block with an amino acid change allowing binding to the altered receptor. Mutations that alter the interaction to favor the virus or the host are called ‘positively selected’ mutations. Such back-and-forth evolution between viruses and their host cells has been called host-virus arms races. Most have been identified by studying antiviral genes. This study is unusual in that it involves a housekeeping gene that has been usurped for viral attachment.

Evidence for positive selection of host genes can be detected by comparing gene sequences of phylogenetically related species. Nonsynonymous mutations lead to a change in the amino acid sequence, while synonymous mutations do not. The rate at which nonsynonymous mutations accumulate in the genome is typically much lower than the rate for synonymous mutations. The reason for this difference is that most mutations that change the amino acid sequence of a protein are lethal to the host. When genes have been subjected to positive selection by a virus, the ratio of nonsynonymous to synonymous mutations is higher, typically in host amino acids that interact with viral proteins. Computer programs have been designed to scan gene sequences and identify codons which are under positive selection by virtue of a high ratio of nonsynonymous to synonymous mutations.
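To make the distinction concrete, here is a minimal sketch that classifies codon differences between two aligned coding sequences as synonymous or nonsynonymous, using the standard genetic code. It is a deliberate simplification and not one of the programs mentioned above: real analyses normalize by the number of synonymous and nonsynonymous sites and use statistical models across many species. The function name `count_changes` and the example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_changes(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Both sequences must be aligned, gap-free, in frame, and of equal length.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3:
        raise ValueError("sequences must be the same length, a multiple of 3")
    syn = nonsyn = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            syn += 1      # different codon, same amino acid
        else:
            nonsyn += 1   # different codon, different amino acid
    return syn, nonsyn


# CTG -> CTA keeps leucine (synonymous); AAA -> AGA changes lysine to
# arginine (nonsynonymous).
print(count_changes("ATGCTGAAA", "ATGCTAAGA"))  # (1, 1)
```

A region with an excess of nonsynonymous over synonymous differences, after proper normalization, is the kind of signal that flags candidate codons under positive selection.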

To determine if the transferrin receptor (TfR1) has evolved to prevent virus attachment, sequences of the protein from seven different rodent species were compared. The analysis revealed that much of the protein is highly conserved, but a small part, comprising six amino acids, is evolving rapidly. Three of these amino acids are located on the part of TfR1 that binds arenaviruses, and three are at the binding site for the retrovirus mouse mammary tumor virus (MMTV) (see illustration). Changing these three amino acids of TfR1 of the house mouse, which is susceptible to MMTV, to the sequence found in TfR1 of the MMTV-resistant vesper mouse, blocked entry of the virus into cells. In turn, changing these three amino acids of TfR1 of the MMTV-resistant short-tailed zygodont to the sequence of the house mouse enabled virus entry into cells. None of these changes had an effect on ferritin binding by TfR1.

Evidence for positive selection can also be detected in viral genes encoding proteins that interact with the host. The arenavirus glycoprotein, GP, is known to bind to TfR1. Ten GP amino acids were identified that are under positive selection, and four of these directly contact TfR1.

These findings demonstrate that there has been an arms race between TfR1 and both an arenavirus and a retrovirus. An interesting question is whether human TfR1 will enter into an arms race with arenaviruses. As these viruses emerge into the human population, it is expected that humans with mutations that make them less susceptible to infection or severe disease will be positively selected. Amino acid 212 of human TfR1, which is near the positively selected residues in murine TfR1, varies in the human population. When this amino acid change (leucine to valine) is introduced into TfR1, it confers some protection against arenavirus entry. Curiously, this polymorphism has only been found in Asian populations, where arenaviruses that bind TfR1 are not found. The polymorphism is probably neutral with respect to TfR1 function, and if TfR1-binding arenaviruses are introduced into Asia, this change could be positively selected.

Because all viruses depend on many host proteins for replication, it will be interesting to use this approach to see how other highly conserved cell proteins balance cell function with the ability to resist virus infections. There are likely to be many cell proteins that cannot change to evade viral use without destroying their cell function. Fortunately for cells, there are exceptions.