Exaptation: A cell enzyme becomes a viral capsid protein

The acquisition of a capsid is thought to be a key event in the evolution of viruses from the self-replicating genetic elements that existed during the pre-cellular stage on Earth. The origin of viral capsids has been obscure because their components are not similar to cellular proteins. The discovery that a viral capsid protein evolved from a CRISPR-associated nuclease provides insight into how viruses emerged.

Thermoproteus tenax virus 1 (TTV1) infects the hyperthermophilic archaeon Thermoproteus tenax, which grows at 86°C. The enveloped virus particles are flexible filaments 400 nm long and 40 nm in diameter, built with four capsid proteins, TP1-TP4. The basic proteins TP1 and TP2 bind the 16 kb double-stranded DNA genome to form the nucleocapsid.

Thirty years after the discovery of TTV1, the capsid proteins remained ORFans – meaning that they had no sequence homology with viral or cellular proteins. Recently a more sensitive homology analysis revealed that TP1 is similar to Cas4, a nuclease that is a part of the prokaryotic CRISPR-Cas defense system.

Although TP1 clearly matches the Cas4 protein, it is not complete: codons at the carboxy-terminus are missing. A re-examination of the TTV1 genome sequence revealed a previously undetected open reading frame of 74 codons, just downstream of the TP1 gene, that encodes the missing C-terminal residues of the Cas4 nuclease. It is not known if this protein, called gp7, is produced in infected cells; it is not part of the virus particle.

Together, the TP1 and gp7 proteins represent a full-length Cas4 nuclease. TP1 is probably not catalytically active due to amino acid changes in the active site of the enzyme.

Why does TP1 lack the carboxy-terminal residues of Cas4? The amino terminus of the TP1 protein comprises a positively charged surface that might be involved in binding the viral DNA genome. The same surface in Cas4 is covered by the carboxy-terminal domain of the protein. This observation suggests that transformation of Cas4 from a nuclease into a viral capsid protein probably required removal of this shielding domain, so that the protein could bind the DNA genome.

How did a nuclease become a viral capsid protein? An ancestor of TTV1 might have encoded a Cas4-like protein with nuclease activity with a role in genome replication or repair. Mutations causing loss of nuclease activity might have been followed by truncation of the protein to expose the DNA binding domain, which then became a viral capsid protein. Support for this idea comes from the observation that a Cas4-like protein encoded in the genome of another archaeal virus, the rudivirus SIRV2, has nuclease activity.

Exaptation, a change in the function of a protein during evolution, is known to have taken place in the viral world. The case of Cas4 and TP1 shows that capsid components can evolve from proteins with a very different function.

TWiV 365: Blood, feuds, and a foodborne disease

On episode #365 of the science show This Week in Virology, Vincent, Alan, and Kathy trace the feud over genome editing, a new virus discovered in human blood, and the origins of hepatitis A virus.

You can find TWiV #365 at www.microbe.tv/twiv.

A huge host contribution to virus mutation rates

The high mutation rate of RNA viruses enables them to evolve in the face of different selection pressures, such as entering a new host or countering host defenses. It has always been thought that the sources of such mutations are the enzymes that copy viral RNA genomes: they make random errors which they cannot correct. Now it appears that a cell enzyme makes an even greater contribution to the mutation rate of an RNA virus.

Deep sequencing was used to determine the mutation rate of HIV-1 in the blood of AIDS patients by searching for premature stop codons in open reading frames of viral RNA. Because stop codons terminate protein synthesis, they do not allow production of infectious viruses. Therefore they can be used to calculate the mutation rate in the absence of selection. The mutation rate calculated in this way, 0.000093 mutations per base per cell, was slightly higher than previously calculated from studies in cell culture.
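
The search for premature stop codons can be sketched as a scan of an in-frame coding sequence. This is a minimal illustration, not the authors' pipeline: it assumes the read has already been aligned to the start of an open reading frame, and the function name and example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def premature_stops(orf):
    """Return 0-based codon indices of stop codons located before the final codon."""
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [orf[i:i + 3] for i in range(0, len(orf) - len(orf) % 3, 3)]
    # The last codon is the normal terminator, so it is excluded from the scan.
    return [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]

print(premature_stops("ATGTGGAAATAA"))  # hypothetical intact ORF
print(premature_stops("ATGTAGAAATAA"))  # same ORF with TGG changed to TAG
```

Counting such codons per base sequenced gives a mutation frequency that is not distorted by selection, because none of the affected genomes can produce infectious virus.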

When HIV-1 infects a cell, the enzyme reverse transcriptase converts its RNA genome to DNA, which then integrates into the host cell genome. Identification of stop codons in integrated viral DNA should provide an even better estimate of the mutation rate of reverse transcriptase, because mutations that block the production of infectious virus have not yet been removed by selection. The mutation rate calculated by this approach was 0.0041 mutations per base per cell, or about one mutation every 250 bases. This mutation rate is 44 times higher than the value calculated from viral RNA in patient plasma.
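
The comparison between the two estimates is simple arithmetic. The sketch below uses only the two rates quoted in the text; the variable names are illustrative.

```python
# Rates quoted in the text (mutations per base per cell).
rna_rate = 0.000093  # estimated from viral RNA in patient plasma
dna_rate = 0.0041    # estimated from integrated viral DNA

fold_difference = dna_rate / rna_rate
bases_per_mutation = 1 / dna_rate

print(f"fold difference: {fold_difference:.0f}")        # 44
print(f"bases per mutation: {bases_per_mutation:.0f}")  # 244
```

The second figure, about 244, is what the text rounds to one mutation every 250 bases.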

Sequencing of integrated viral DNA from many patients revealed that the vast majority (98%) of mutations leading to insertion of stop codons were the consequence of editing by the cellular enzyme APOBEC3G. This enzyme is a deaminase that changes dC to dU in the first strand of viral DNA synthesized by reverse transcriptase. APOBEC3G constitutes an intrinsic defense against HIV-1 infection, because extensive mutation of the viral DNA reduces viral infectivity. Indeed, most integrated HIV proviruses are not infectious as a consequence of APOBEC3G-induced mutations. That infection proceeds at all is due to incorporation of the viral protein Vif in the virus particles. Vif binds APOBEC3G, leading to its degradation in cells.
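
To see why this kind of editing so readily creates stop codons, note that a dC-to-dU change in the first (minus) DNA strand is read as a G-to-A change in the coding strand. The sketch below is an illustration rather than the authors' analysis: it applies every possible G-to-A change to the tryptophan codon TGG, and the function name is hypothetical.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}

def g_to_a_variants(codon):
    """Return all codons reachable by changing one or more G's to A."""
    # Each G may stay G or become A; other bases are left unchanged.
    options = [("G", "A") if base == "G" else (base,) for base in codon]
    variants = {"".join(combo) for combo in product(*options)}
    variants.discard(codon)  # drop the unedited codon
    return sorted(variants)

for variant in g_to_a_variants("TGG"):
    print(variant, "stop" if variant in STOP_CODONS else "sense")
```

Every edited form of TGG (TAA, TAG, and TGA) is a stop codon, so each tryptophan codon is a potential site for an inactivating mutation.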

The mutation rate of integrated HIV-1 DNA calculated by this method is much higher than that of other RNA viruses. This high mutation rate is driven by the cellular enzyme, APOBEC3G. At least half of the mutations observed in plasma viral RNAs are also contributed by this enzyme.

It has always been thought that error-prone viral RNA polymerases are largely responsible for the high mutation rates of RNA viruses. The results of this study add a new driver of viral variation, a cellular enzyme. APOBEC enzymes are known to introduce mutations in the genomes of other viruses, including hepatitis B virus, papillomaviruses, and herpesviruses. Furthermore, the cellular adenosine deaminase enzyme can edit the genomes of RNA viruses such as measles virus, parainfluenza virus, and respiratory syncytial virus. Cellular enzymes may therefore play a much greater role in the generation of viral diversity than previously imagined.

Viral variation in single cells

It is well known that virus populations display phenomenal diversity. Virus populations are dynamic distributions of nonidentical but related members called a quasispecies. This diversity is restricted in single cells, but is restored within two infectious cycles.

Single cells infected with vesicular stomatitis virus (VSV) were isolated using a glass microcapillary, and incubated overnight to allow completion of virus replication. Replication in a single cell imposes a genetic bottleneck, as few viral genomes are present. Virus-containing culture fluids were then subjected to plaque assay, during which two viral replication cycles took place. For each infected cell, 7-10 plaques were picked and used for massively parallel genome sequencing. A total of 881 plaques from 90 individual cells were analyzed in this way. Of the 532 single nucleotide differences identified, 36 were also present in the parental virus stock.

An interesting observation was that over half of the infected cells contained multiple parental variants. However, the multiplicity of infection (MOI) that was used should have resulted in multiple infections in only 15% of the cells. The results cannot be explained by RNA recombination, as this process occurs at a very low rate in VSV-infected cells. The key is that MOI only describes the infectious virus particles that are delivered to cells. Because the particle-to-pfu ratio of VSV is high, it seems likely that many cells received both infectious and non-infectious particles. Furthermore, it is known that some RNA viruses may be transmitted to other cells in groups, either by aggregation of particles or within a membrane vesicle.
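
The expected share of multiply infected cells follows from the usual assumption that infectious particles are distributed among cells according to a Poisson distribution. The sketch below is a minimal illustration under that assumption; the MOI values are hypothetical, because the MOI used in the study is not stated here.

```python
import math

def fraction_multiply_infected(moi, among_infected=False):
    """Poisson estimate of the fraction of cells receiving two or more infectious particles."""
    p0 = math.exp(-moi)        # cells receiving no infectious particle
    p1 = moi * math.exp(-moi)  # cells receiving exactly one
    multiple = 1.0 - p0 - p1
    if among_infected:
        # Express the result relative to infected cells only.
        return multiple / (1.0 - p0)
    return multiple

for moi in (0.1, 0.3, 1.0):  # hypothetical MOI values
    print(moi,
          round(fraction_multiply_infected(moi), 3),
          round(fraction_multiply_infected(moi, among_infected=True), 3))
```

This calculation counts only infectious particles, which is why it can underestimate how many distinct genomes actually enter a cell when the particle-to-pfu ratio is high.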

The conclusion from these results is very important: a single plaque-forming unit can contain multiple, genetically diverse particles. Plaque purification has been used for years in virology to produce clonal virus stocks, but at least for VSV, a plaque is not produced by a single viral genome.

The 496 single nucleotide changes that were not present in the parent virus arose after the bottleneck imposed by single cell replication. Between 0 and 17 changes were identified in the 7-10 plaques isolated from each cell. The single-cell bottleneck restricted the parental virus diversity to 36 nucleotide changes. In contrast, within 2 viral generations, the viral diversity was over ten times greater (496 changes). This observation illustrates the capacity of the RNA virus genome to restore diversity after a bottleneck.

The number of changes identified in the 7-10 plaques isolated from each cell, between 0 and 17, shows that some cells produce more diverse progeny than others. At least two sources of this variation were identified. The viral yield per cell varied greatly, from 0 to over 3000 PFU. Greater virus yields mean more viral RNA replication, and more chance for diversity. Indeed, greater virus yields per cell were associated with more mutations in the progeny.

Another explanation for the variation in single-cell diversity comes from analysis of cell #36. This infected cell produced viruses with 17 changes not found in the parental virus, more than any other cell. One of these changes led to a single amino acid change in the viral RNA polymerase. This amino acid change appears to increase the mutation rate of the enzyme. Similar mutators, changes that increase the error frequency, have also been described in the poliovirus RNA polymerase.

RNA viruses must carry out error-prone replication to adapt to new environments. A consequence is that RNA virus populations exist close to an error threshold beyond which infectivity is lost. How the balance is maintained is not understood. The results of this study suggest that some infected cells may produce a highly diverse population, while in others a more conserved sequence is maintained. This distribution of diversity might permit the necessary evolvability without the lethality conferred by having too many mutations.

I would be very interested to know if the conclusions of this work would be changed by the ability to determine the sequences of all the viral genomes recovered from a single infected cell. The authors note that this is not technically possible, but surely will be in the future.

TWiV 360: From Southeastern Michigan

On episode #360 of the science show This Week in Virology, Vincent visits the University of Michigan where he and Kathy speak with Michael, Adam, and Akira about polyomaviruses, virus evolution, and virus assembly, on the occasion of naming the department of Microbiology & Immunology a Milestones in Microbiology site.

You can find TWiV #360 at www.microbe.tv/twiv.

Lassa virus origin and evolution

I have a soft spot in my heart for Lassa virus: a non-fictional account of its discovery in Africa in 1969 inspired me to become a virologist. Hence papers on this virus always catch my attention, such as one describing its origin and evolution.

Lassa virus, a member of the Arenavirus family, is very different from Ebolavirus (a filovirus), but both are zoonotic pathogens that may cause hemorrhagic fever. Lassa virus is responsible for tens of thousands of hospitalizations and thousands of deaths each year, mainly in Sierra Leone, Guinea, Liberia, and Nigeria. Most human Lassa virus outbreaks are caused by multiple exposures to urine or feces from the multimammate mouse, Mastomys natalensis, which is the reservoir of the virus in nature. In contrast, outbreaks of Ebolavirus infection typically originate with a crossover from an animal reservoir, followed by human to human transmission. Although Lassa virus has been studied for nearly 50 years, until recently the nucleotide sequences of only 12 Lassa virus genomes had been determined.

To remedy this lack of Lassa virus genome information, the authors collected clinical samples from patients in Sierra Leone and Nigeria between 2008 and 2013. From these and other sources they determined the sequences of 183 Lassa virus genomes from humans, 11 viral genomes from M. natalensis, and two viral genomes from laboratory stocks. All the data are publicly available at NCBI. Analysis of the data led to the following conclusions:

  • Lassa virus forms four clades, three in Nigeria and one in Sierra Leone/Liberia (members of a clade evolved from a common ancestor).
  • Most Lassa virus infections are a consequence of multiple, independent transmissions from the rodent reservoir.
  • Modern-day Lassa virus strains probably originated at least 1,000 years ago in Nigeria, then spread to Sierra Leone as recently as 150 years ago. The lineage is most likely much older, but how much cannot be calculated from the data.
  • The genetic diversity of Lassa virus in individual hosts is an order of magnitude greater than the diversity of Ebolavirus. Furthermore, Lassa virus diversity in the rodent host is greater than in humans, likely a consequence of the longer, persistent infections that take place in the mouse.
  • The gene encoding the Lassa virus glycoprotein is subject to high selection in hosts, leading to variants that interfere with antibody binding.
  • Genetic variants that arise in one rodent are not transmitted to another.

Perhaps the most important result from this work is the establishment of laboratories in Sierra Leone and Nigeria that can safely collect and process samples from patients infected with Lassa virus, a BSL-4 pathogen.

TWiV 348: Chicken shift

On episode #348 of the science show This Week in Virology, Vincent and Rich discuss fruit fly viruses, one year without polio in Nigeria, and a permissive Marek’s disease viral vaccine that allows transmission of virulent viruses.

You can find TWiV #348 at www.microbe.tv/twiv.

Permissive vaccines and viral virulence

A permissive vaccine prevents disease in the immunized host, but does not block virus infection. Would a permissive vaccine lead to the emergence of more virulent viruses?

This hypothesis is based on the notion that viruses which kill their hosts too quickly are not efficiently transmitted, and are therefore removed by selection. However, a vaccine that prevents disease, but not viral replication in the host, would allow virulent viruses to be maintained in the host population. It has been suggested that in this scenario, viruses with increased virulence would be selected if such a property aids transmission between hosts.

On the surface this hypothesis seems reasonable, but in my opinion it is flawed. One problem is that increased transmission might not always be associated with increased virulence. The more serious flaw lies in making anthropomorphic assessments of what we think viruses require, such as concluding that increased viral transmission is a desired trait. Our assumptions fail to recognize the main goal of evolution: survival. Evolution does not move a virus along a trajectory aimed at perfection. Change comes about by eliminating those viruses that are not well adapted for the current conditions, not by building a virus that will fare better tomorrow. All the viruses on Earth today transmit well enough, or they would not be here; yet some kill their hosts clearly much faster than others. The fact is that humans have little understanding of what drives virus evolution in large populations. Our assumptions of what constitute the selective forces are usually tainted by anthropomorphism.

This long preamble is an introduction to a series of findings which are purported to support the idea that permissive vaccines (the authors call them ‘leaky’ and ‘imperfect’ vaccines but I dislike both names because they imply defects) can lead to the selection of more virulent viruses. The subject of the paper is Marek’s disease virus (MDV), a herpesvirus that infects chickens. MDV is shed from feather follicles of infected chickens and is spread to other birds when they inhale contaminated dust. Vaccines have been used to prevent MDV infection since the early 1970s. These vaccines prevent disease, but do not block viral replication, and vaccinated, infected birds can shed wild type virus. The virulence of MDV has been increasing since the 1950s, initially from a paralytic disease, to paralysis and death. The authors wonder if the use of permissive Marek’s vaccines has led to the selection of more virulent viruses.

To address their hypothesis, the authors inoculated vaccinated or unvaccinated chickens with a series of MDV isolates that range from low to high virulence. Unvaccinated chickens inoculated with the most virulent MDV died within a week and shed little virus. In contrast, most vaccinated birds survived infection with virulent viruses, and shed virus for the length of the experiment, 56 days.

A transmission experiment was done to determine if shed virus could infect other birds. The authors infected vaccinated or unvaccinated birds and asked if sentinel, unvaccinated chickens became infected. Unvaccinated birds died within 10 days after infection with virulent MDV, and did not transmit infection. In contrast, vaccinated birds survived at least 30 days, and co-housed sentinel animals became infected and died.

The experiments are well done and the conclusions are clear: more virulent Marek’s disease viruses replicate longer in vaccinated than unvaccinated chickens, and can be readily transmitted to other chickens. But these results do not prove that more virulent MDV arose because of permissive vaccines. Nor do the results prove in general that leaky vaccines lead to selection of more virulent viruses. The results simply show that a vaccine that does not prevent replication will allow transmission of virulent viruses.

To prove that vaccinated chickens can allow the selection of more virulent viruses, vaccinated chickens could be infected with an avirulent virus, and the shed virus collected and used to infect additional, vaccinated birds. This process could be repeated to determine if more virulent viruses arise. While the results of this gain-of-function experiment would be informative, the experiment would be done in a controlled laboratory setting which would not duplicate all the selective forces present on a poultry farm.

The authors note that most human vaccines do prevent replication of infecting virus. They do not mention the one important exception: the Salk poliovirus vaccines. People who are immunized with the Salk vaccine can be infected with poliovirus, which will then replicate in the intestines, be shed in the feces, and transmitted to others. This behavior has been well documented in human populations, yet the virulence of poliovirus has not increased for the 60 years during which the Salk vaccine has been used.

I do not feel that these experimental results have general implications for the use of any animal vaccine. It is unfortunate that the work has been covered in many news sources with the incorrect implication that vaccines may be responsible for the emergence of more virulent viruses.

TWiV 335: Ebola lite

On episode #335 of the science show This Week in Virology, the TWiVumvirate discusses a whole Ebolavirus vaccine that protects primates, the finding that Ebolavirus is not undergoing rapid evolution, and a proposal to increase the pool of life science researchers by cutting money and time from grants.

You can find TWiV #335 at www.microbe.tv/twiv.

Describing a viral quasispecies

Virus populations do not consist of a single member with a defined nucleic acid sequence, but are dynamic distributions of nonidentical but related members called a quasispecies. While next-generation sequencing methods have the capability of describing a quasispecies, the errors associated with this technology have limited progress in our understanding of the genetic structure of virus populations. A new method called CirSeq reduces next-generation sequencing errors to allow an accurate description of viral quasispecies.

The key to eliminating sequencing errors is a clever approach based on the conversion of viral RNAs to circular molecules. When copied with reverse transcriptase, tandemly repeated cDNAs are produced. Mutations in the original viral RNA will be shared by all repeats derived from a circle, but not errors produced during copying or sequencing. The latter can be computationally subtracted, reducing sequencing error to a point that is much lower than the estimated mutation rate of an RNA virus.
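
The error-subtraction idea can be sketched as a per-position majority vote across the tandem repeats in one read. This is a simplified illustration, not the published CirSeq pipeline: it assumes the repeats have already been identified and aligned to equal length, and the function name and example sequences are hypothetical.

```python
from collections import Counter

def repeat_consensus(repeats):
    """Majority-vote consensus across aligned tandem repeats of one circle."""
    if len({len(r) for r in repeats}) != 1:
        raise ValueError("repeats must be aligned to the same length")
    consensus = []
    for column in zip(*repeats):
        base, count = Counter(column).most_common(1)[0]
        # Keep a base only when a strict majority of repeats agree; otherwise mark N.
        consensus.append(base if count * 2 > len(column) else "N")
    return "".join(consensus)

# Hypothetical read containing three copies of the same circular template.
# A real mutation would appear in all three copies; the A at position 6 of
# the second copy appears only once and is voted out.
copies = ["ACUGAUGC", "ACUGAAGC", "ACUGAUGC"]
print(repeat_consensus(copies))  # ACUGAUGC
```

A change present in every repeat survives the vote and is reported as a variant, while a change seen in only one repeat is treated as a copying or sequencing error.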

CirSeq was used to characterize poliovirus populations produced by seven serial passages in HeLa cells. The calculated mutation frequency, 2 × 10⁻⁴ mutations per nucleotide, was substantially lower than estimates determined by conventional sequence analysis. Over 200,000 sequence reads per nucleotide position were used to detect >16,500 variants per population per passage. This number represents ~74% of all possible alleles. Many mutations were detected at nearly all positions in the viral RNA. Most mutations occur at a frequency between 1 in 1,000 and 1 in 100,000. The conclusion is that the virus population produced in HeLa cells consists mainly of genomes with the consensus sequence, and small amounts of many variant genomes. These variants are only those that give rise to viable viruses; lethal mutations are not observed.

CirSeq was also used to calculate the mutation rate of poliovirus. The rates vary according to type: transitions occurred at a rate of 2.5 × 10⁻⁵ to 2.6 × 10⁻⁴ substitutions per site, while transversions were observed at a rate of 1.2 × 10⁻⁶ to 1.5 × 10⁻⁵ substitutions per site. Nucleotide-specific differences in mutation rate were also observed: C to U and G to A transitions were 10 times more frequent than U to C and A to G. These rates are consistent with previously determined values using other methods.

This method can also be used to determine the fitness of each base at every position in the genome, according to changes observed during the seven passages in HeLa cells. This analysis allows determination of which bases are neutral, and which are selected, and when combined with analysis of protein structure, can provide new insights into viral functions.

By enabling a sequencing approach that gives an accurate description of virus populations at a single-nucleotide level, CirSeq can be used to provide an unprecedented view of how virus populations change during evolution.