Viral variation in single cells

It is well known that virus populations display phenomenal diversity. Virus populations are dynamic distributions of nonidentical but related members called a quasispecies. This diversity is restricted in single cells, but is restored within two infectious cycles.

Single cells infected with vesicular stomatitis virus (VSV) were isolated using a glass microcapillary, and incubated overnight to allow completion of virus replication. Replication in a single cell imposes a genetic bottleneck, as few viral genomes are present. Virus-containing culture fluids were then subjected to plaque assay, during which two viral replication cycles took place. For each infected cell, 7-10 plaques were picked and used for massively parallel genome sequencing. A total of 881 plaques from 90 individual cells were analyzed in this way. Of the 532 single nucleotide differences identified, 36 were also present in the parental virus stock.

An interesting observation was that over half of the infected cells contained multiple parental variants. However, the multiplicity of infection (MOI) that was used should have only resulted in multiple infections in 15% of the cells. The results cannot be explained by RNA recombination as this process occurs at a very low rate in VSV-infected cells. The key is that MOI only describes the infectious virus particles that are delivered to cells.  Because the particle-to-pfu ratio of VSV is high, it seems likely that many cells received both infectious and non-infectious particles. Furthermore, it is known that some RNA viruses may be transmitted to other cells in groups, either by aggregation of particles or within a membrane vesicle.
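The expectation about multiple infections comes from Poisson statistics, which can be sketched in a few lines. This is only an illustration: the study's actual MOI is not given here, the function names are hypothetical, and I am assuming that the 15% figure refers to the fraction of infected cells that receive two or more infectious particles.

```python
import math

def fraction_multiply_infected(moi: float) -> float:
    """Among infected cells, the fraction hit by 2 or more infectious
    particles, assuming particles land on cells as a Poisson process."""
    p_zero = math.exp(-moi)        # cells that receive no infectious particle
    p_one = moi * math.exp(-moi)   # cells that receive exactly one
    infected = 1.0 - p_zero
    return (infected - p_one) / infected

def moi_for_fraction(target: float, low: float = 1e-6, high: float = 10.0) -> float:
    """Bisection search for the MOI that gives the target fraction.
    Valid because the fraction increases steadily with MOI."""
    for _ in range(100):
        mid = (low + high) / 2
        if fraction_multiply_infected(mid) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2

# Assumption: 15% of infected cells are expected to be multiply infected.
estimated_moi = moi_for_fraction(0.15)
print(f"MOI consistent with 15% multiple infection: {estimated_moi:.2f}")
```

Under these assumptions the MOI works out to a little over 0.3 infectious particles per cell. The calculation counts only infectious particles, which is exactly why it underestimates how many genomes actually enter each cell.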

The conclusion from these results is very important: a single plaque-forming unit can contain multiple, genetically diverse particles.  Plaque purification has been used for years in virology to produce clonal virus stocks, but at least for VSV, a plaque is not produced by a single viral genome.

The 496 single nucleotide changes that were not present in the parent virus arose after the bottleneck imposed by single cell replication. Between 0 and 17 changes were identified in the 7-10 plaques isolated from each cell. The single-cell bottleneck restricted the parental virus diversity to 36 nucleotide changes. In contrast, within 2 viral generations, the viral diversity was over ten times greater (496 changes). This observation illustrates the capacity of the RNA virus genome to restore diversity after a bottleneck.

The number of changes identified in the 7-10 plaques isolated from each cell, between 0 and 17, shows that some cells produce more diverse progeny than others. At least two sources of this variation were identified. The viral yield per cell varied greatly, from 0 to over 3000 PFU. Greater virus yields mean more viral RNA replication, and more chance for diversity. Indeed, greater virus yields per cell were associated with more mutations in the progeny.

Another explanation for the variation in single-cell diversity comes from analysis of cell #36. This infected cell produced viruses with 17 changes not found in the parental virus, more than any other cell. One of these changes led to a single amino acid change in the viral RNA polymerase. This amino acid change appears to increase the mutation rate of the enzyme. Similar mutators – changes that increase the error frequency – have also been described in the poliovirus RNA polymerase.

RNA viruses must carry out error-prone replication to adapt to new environments. A consequence is that RNA virus populations exist close to an error threshold beyond which infectivity is lost. How the balance is maintained is not understood. The results of this study suggest that some infected cells may produce a highly diverse population, while in others a more conserved sequence is maintained. This distribution of diversity might permit the necessary evolvability without the lethality conferred by having too many mutations.

I would be very interested to know if the conclusions of this work would be changed by the ability to determine the sequences of all the viral genomes recovered from a single infected cell. The authors note that this is not technically possible, but surely will be in the future.

Reassortment of the influenza virus genome

Mutation is an important source of RNA virus diversity that is made possible by the error-prone nature of RNA synthesis. Viruses with segmented genomes, such as influenza virus, have another mechanism for generating diversity: reassortment.

When an influenza virus infects a cell, the individual RNA segments enter the nucleus. There they are copied many times to form RNA genomes for new infectious virions. The new RNA segments are exported to the cytoplasm, and then are incorporated into new virus particles which bud from the cell.

If a cell is infected with two different influenza viruses, the RNAs of both viruses are copied in the nucleus. When new virus particles are assembled at the plasma membrane, each of the 8 RNA segments may originate from either infecting virus. The progeny that inherit RNAs from both parents are called reassortants. This process is illustrated in the diagram below, which shows a cell that is co-infected with two influenza viruses L and M. The infected cell produces both parental viruses as well as a reassortant R3 which inherits one RNA segment from strain L and the remainder from strain M.
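To get a feel for how many different progeny a co-infection can produce, here is a small sketch that enumerates every way the 8 segments can be drawn from the two parents L and M. It assumes that each segment is inherited independently and with no packaging preference, which is a simplification; the variable names are mine.

```python
from itertools import product

SEGMENTS = 8
PARENTS = ("L", "M")

# Every genotype is a string of 8 letters, one per RNA segment,
# naming the parent that supplied that segment.
genotypes = ["".join(choice) for choice in product(PARENTS, repeat=SEGMENTS)]

# Two of these are simply the parental viruses themselves.
parental = {"L" * SEGMENTS, "M" * SEGMENTS}
reassortants = [g for g in genotypes if g not in parental]

# Reassortants like R3: one segment from strain L, the rest from strain M.
one_segment_from_L = [g for g in reassortants if g.count("L") == 1]

print(len(genotypes))           # 256 possible genotypes (2 to the 8th power)
print(len(reassortants))        # 254 of them are reassortants
print(len(one_segment_from_L))  # 8 carry a single segment from L
```

Whether all of these combinations are viable, or arise at equal frequency, is a separate question that this count does not address.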


One example of the evolutionary importance of reassortment is the exchange of RNA segments between mammalian and avian influenza viruses that gives rise to pandemic influenza. For example, the 2009 H1N1 pandemic strain is a reassortant of avian, human, and swine influenza viruses, as illustrated.


Reassortment can only occur between influenza viruses of the same type. Why influenza A viruses never exchange RNA segments with influenza B or C viruses is not understood. However, the reason is probably linked to the packaging mechanism that ensures that each influenza virion contains at least one copy of each RNA segment.

Trifonov, V., Khiabanian, H., & Rabadan, R. (2009). Geographic Dependence, Surveillance, and Origins of the 2009 Influenza A (H1N1) Virus. New England Journal of Medicine. DOI: 10.1056/NEJMp0904572

Pushing viruses over the error threshold

The capacity of RNA viruses to produce prodigious numbers of mutations is a powerful advantage. But remember that selection and survival must balance genetic fidelity and mutation rate. Many mutations are not compatible with viral replication. Consequently, if mutation rates are high, at some point accumulating base changes lead to lethal mutagenesis – the population is driven to extinction. The error threshold is a mathematical measurement of the genetic information that must be maintained to ensure survival of the population. The results of experiments clearly show that RNA viruses evolve close to their error threshold.

The first clues that viruses exist on the edge came from studies in which cells infected with the RNA viruses poliovirus and vesicular stomatitis virus were treated with base analogs that act as mutagens. These compounds, such as 5-azacytidine, are incorporated into the growing RNA chain during replication, leading to incorporation of incorrect bases. Treatment of virus-infected cells with such drugs dramatically inhibits the production of new infectious virus particles, but stimulates mutagenesis of the viral RNA by only 2- or 3-fold. When the same experiment is done with cells infected with a DNA virus, the mutation frequency increases by several orders of magnitude.

A direct demonstration of error catastrophe was achieved by treating poliovirus-infected cells with the mutagen ribavirin. Treatment with a concentration of ribavirin that causes a 9.7-fold increase in mutagenesis led to a 99.3% loss in poliovirus infectivity. A graph of the results shows that poliovirus exists at the edge of viability.


The line on the graph shows the infectivity of poliovirus RNA as a percentage of untreated viral RNA. The introduction of 2 mutations per genome reduces infectivity to about 30% of wild type RNA, while infectivity is nearly eliminated with 7 mutations per genome.
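A toy calculation shows why the two numbers quoted above hang together. The model here is my own assumption, not the one used in the study: suppose each mutation independently leaves the genome infectious with some fixed probability, so that infectivity falls off as a power of the number of mutations. Fitting that probability to the 30% value at 2 mutations gives a prediction for 7 mutations.

```python
def per_mutation_tolerance(infectivity: float, mutations: int) -> float:
    """Probability that a single mutation is tolerated, assuming every
    mutation acts independently (a deliberately crude toy model)."""
    return infectivity ** (1.0 / mutations)

def predicted_infectivity(tolerance: float, mutations: int) -> float:
    """Fraction of genomes still infectious after the given number of mutations."""
    return tolerance ** mutations

# From the text: about 30% infectivity at 2 mutations per genome.
tolerance = per_mutation_tolerance(0.30, 2)
at_seven = predicted_infectivity(tolerance, 7)

print(f"Chance a single mutation is tolerated: {tolerance:.2f}")
print(f"Predicted infectivity at 7 mutations: {at_seven:.1%}")
```

With these assumptions, roughly half of all single mutations are tolerated, and infectivity at 7 mutations per genome drops to between 1% and 2% of wild type, in line with the description of infectivity being nearly eliminated.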

Ribavirin pushes poliovirus beyond the error threshold because it causes the RNA polymerase to make more errors than it already does. What would be the property of a poliovirus mutant resistant to this drug? A poliovirus mutant resistant to ribavirin was selected by passing the virus in the presence of increasing concentrations of the drug. Resistance to ribavirin is caused by a single amino acid change, G64S, in the viral RNA-dependent RNA polymerase. RNA synthesis of ribavirin-resistant poliovirus is characterized by fewer errors, in the absence of drug, compared with the parental virus. Therefore resistance to the mutagen ribavirin is achieved through an amino acid change that gives the viral RNA polymerase greater fidelity.

Now that we have an RNA polymerase that makes fewer errors, we can use it to limit viral diversity and test the theory that viral populations, not individual mutants, are the target of selection.

Crotty, S. (2001). RNA virus error catastrophe: Direct molecular test by using ribavirin. Proceedings of the National Academy of Sciences, 98 (12), 6895-6900. DOI: 10.1073/pnas.111085598

Pfeiffer, J. (2003). A single mutation in poliovirus RNA-dependent RNA polymerase confers resistance to mutagenic nucleotide analogs via increased fidelity. Proceedings of the National Academy of Sciences, 100 (12), 7289-7294. DOI: 10.1073/pnas.1232294100

Viral quasispecies and bottlenecks

The genome sequence of an RNA virus population clusters around a consensus or average sequence, but each genome is different. A rare genome with a particular mutation may survive a selection event, and the mutation will then be found in all progeny genomes. The selection process is illustrated in this diagram:


The diagram on the left shows a small subset of the viral genomes that are present in a virus stock. Genomes are indicated by lines, and mutations are shown by different symbols. The consensus sequence for this population is shown as a line at the bottom. There are no mutations in the consensus sequence, even though every viral genome contains mutations. One of these genomes, indicated by the arrow, is able to survive a selection event (also called a genetic bottleneck), such as passage to a new host. This virus multiplies in the host and a new population of viruses emerges, shown by the diagram on the right. The consensus sequence for this population indicates that three mutations selected to survive the bottleneck are found in every member of the population. Error-prone replication ensures that the members of the new population have many other mutations in their genomes.
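The idea that a consensus sequence can be free of mutations even though every genome carries one is easy to show with a few made-up sequences. In this sketch the 10-base "genomes", the positions of the mutations, and the helper names are all hypothetical; the consensus is simply the most common base at each position.

```python
from collections import Counter

def consensus(genomes):
    """Most common base at each position across equal-length genomes."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*genomes))

def mutate(genome, position, base):
    """Return a copy of the genome with one base substituted."""
    return genome[:position] + base + genome[position + 1:]

def other_base(base):
    """Pick some base that differs from the given one (arbitrary choice)."""
    return "A" if base == "T" else "T"

master = "ACGTACGTAC"  # hypothetical consensus of the virus stock

# Virus stock: five genomes, each with one mutation at a different position.
stock = [mutate(master, i, other_base(master[i])) for i in range(5)]

# Bottleneck: a single genome founds the next population.
founder = stock[2]

# Error-prone replication adds new mutations at other positions.
progeny = [mutate(founder, i, other_base(founder[i])) for i in range(5, 10)]

print(consensus(stock) == master)     # True: the consensus has no mutations
print(master in stock)                # False: yet no genome matches it
print(consensus(progeny) == founder)  # True: the founder's mutation is now fixed
```

After the bottleneck, the founder's mutation appears in every progeny genome and therefore in the new consensus, while the newly acquired mutations remain individually rare.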

The type of population selection illustrated above most likely took place during the emergence of the new influenza H1N1 virus that is currently circulating globally. Imagine that the diagram on the left represents the sequences of one viral RNA segment of an influenza virus that is infecting a pig. The animal sneezes and several million viral particles are inhaled by a human who happens to be nearby. Of all the virions inhaled by the human, only the one near the arrow can replicate efficiently in human cells. The three mutations are then present in that RNA segment of all the viruses that multiply in the human’s respiratory tract. Imagine similar selection events leading to a new population of viruses that are well adapted for transmission from person to person.

The quasispecies theory predicts that viruses are not just a collection of random mutants, but an interactive group of variants. Diversity of the population is critical for propagation of the viral infection. Recently it became experimentally feasible to test the idea that viral populations, not individual mutants, are the target of selection. We’ll examine those data next.

The error-prone ways of RNA synthesis

Now that we have examined influenza viral RNA synthesis, it’s a good time to step back and look at a very important property of this step in viral replication. All nucleic acid polymerases insert incorrect nucleotides during chain elongation. This misincorporation is one of the major sources of diversity that allows viral evolution to take place at an unprecedented scale. Put another way, viruses are so successful because they make a lot of mistakes.

Nucleic acids are amazing molecules not only because they can encode proteins, but because they can be copied or replicated. Copying is done by nucleic acid polymerases that ‘read’ a strand of DNA or RNA and synthesize the complementary strand. Let’s start by examining DNA synthesis. Below is a DNA chain, which consists of the bases A, G, C or T strung together in a way that codes for a specific protein. In this example, the template strand is at the bottom, and consists of the bases A, C, C, T, G, A, C, G, and G (from left to right). A DNA polymerase is copying this template strand to form a complementary strand. So far the complementary bases T, G, G, A, and C have been added to the growing  DNA chain. The next step is the addition of a T, which is the complementary base for the A on the template strand:


So far all is well. But all nucleic acid polymerases are imperfect – they make mistakes now and then. This means that they insert the wrong base. In the next step below, the DNA polymerase has inserted an A instead of the correct G:


Insertion of the wrong base leads to a mutation – a change in the sequence of the DNA. In general, it’s not a good idea to make new DNAs with a lot of mutations, because the encoded protein won’t function well (but there are exceptions, as we will see). But in this case, there is a solution – DNA-dependent DNA polymerases (enzymes that copy DNA templates into DNA) have proofreading abilities. The proofreader is an enzyme called exonuclease, which recognizes the mismatched A-C base pair, and removes the offending A. DNA polymerase then tries again, and this time inserts the correct G:


Even though DNA polymerases have proofreading abilities, they still make mistakes – on the order of about one misincorporation per 10^7 to 10^9 nucleotides polymerized. But the RNA polymerases of RNA viruses are the kings of errors – these enzymes screw up as often as one time for every 1,000 to 100,000 nucleotides polymerized. This high rate of mutation comes from the lack of proofreading ability in RNA polymerases. These enzymes make mistakes, but they can’t correct them. Therefore the mutations remain in the newly synthesized RNA.
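The copying, misincorporation, and proofreading steps described above can be sketched as a short program, using the same template strand as the example. This is a simplification: the position of the error is supplied by hand rather than drawn at random, proofreading is treated as always succeeding (real proofreading is not perfect), and the function name is my own.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def copy_strand(template, errors=None, proofread=True):
    """Synthesize the complementary strand one base at a time.

    errors maps a template position to the wrong base that the
    polymerase inserts there on its first attempt.
    """
    errors = errors or {}
    new_strand = []
    for position, base in enumerate(template):
        correct = COMPLEMENT[base]
        inserted = errors.get(position, correct)
        if proofread and inserted != correct:
            # The exonuclease removes the mismatched base and the
            # polymerase tries again, this time inserting the right one.
            inserted = correct
        new_strand.append(inserted)
    return "".join(new_strand)

template = "ACCTGACGG"
mistake = {6: "A"}  # an A is inserted opposite the C at the seventh position

print(copy_strand(template))                            # TGGACTGCC
print(copy_strand(template, mistake, proofread=False))  # TGGACTACC
print(copy_strand(template, mistake, proofread=True))   # TGGACTGCC
```

With proofreading switched off, which is the situation for the RNA polymerases of RNA viruses, the misincorporated A stays in the new strand as a mutation.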

Given a typical RNA viral genome of 10,000 bases, a mutation frequency of 1 in 10,000 corresponds to an average of 1 mutation in every replicated genome. If a single cell infected with poliovirus produces 10,000 new virus particles, this error rate means that in theory, about 10,000 new viral mutants have been produced. This enormous mutation rate explains why RNA viruses evolve so readily. For example, it is the driving force behind influenza viral antigenic drift.
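The arithmetic in this paragraph is worth spelling out. The genome length, error rate, and yield per cell below are the round numbers from the text; the last step additionally assumes that mutations are scattered at random among genomes (a Poisson distribution), which is my assumption and not something stated above.

```python
import math

genome_length = 10_000   # bases in a typical RNA virus genome
error_rate = 1 / 10_000  # misincorporations per base copied
burst_size = 10_000      # new virus particles from one infected cell

mutations_per_genome = genome_length * error_rate
total_mutations = mutations_per_genome * burst_size

# Poisson assumption: fraction of genomes that happen to carry no mutation.
fraction_unmutated = math.exp(-mutations_per_genome)

print(f"Average mutations per genome: {mutations_per_genome:.1f}")
print(f"Mutations among the progeny of one cell: {total_mutations:,.0f}")
print(f"Genomes with no mutation: {fraction_unmutated:.0%}")
```

An average of one mutation per genome does not mean that every genome has exactly one. Under the Poisson assumption a bit more than a third of the genomes have none, and the rest carry one or more, so the 10,000 mutations are spread unevenly across the progeny.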

Here is a stunning example of the consequences of RNA polymerase error rates. Tens of millions of humans are infected with HIV-1, and every infected person produces billions of viral genomes per day, each with one mutation. Over 10^16 genomes are produced daily on the entire planet. As a consequence, thousands of mutants arise by chance every day that are resistant to every combination of antiviral compounds in use or in development.
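The worldwide total follows from multiplying the two quantities in the paragraph. The specific values below are placeholders that I chose as the low end of "tens of millions" and "billions"; they are not measurements.

```python
infected_people = 10_000_000                # low end of "tens of millions"
genomes_per_person_per_day = 1_000_000_000  # low end of "billions"

genomes_per_day = infected_people * genomes_per_person_per_day

print(f"{genomes_per_day:.0e} viral genomes per day")  # 1e+16 viral genomes per day
```

Even with both numbers at their lower bounds, the product is 10 to the 16th power genomes every day, so any larger values push the total above that.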

I cannot overemphasize the importance of error-prone nucleic acid synthesis in RNA viral evolution and disease production. We’ll spend the next two days examining the consequences of error-prone replication. First we’ll consider the implications for viruses as a population, and then we’ll discuss the outcome when a virus produces an RNA polymerase that makes fewer mistakes. Please bear with me as we diverge slightly from influenza virus; these concepts will be an important and enduring component of your toolbox of virology knowledge.