Evolution proceeds by selection of mutants that arise through error-prone duplication of nucleic acid genomes. The mutations selected in a gene are thought to depend on those that preceded them, an effect known as epistasis. Analysis of a sequence of changes in the influenza virus nucleoprotein provides clear evidence that protein stability explains the epistasis observed during evolution of this protein.
Evolutionary biologist John Maynard Smith used an analogy with a word game to explain how epistasis constrains the evolution of a protein. In this game, single-letter changes are made to a four-letter word to convert it into another valid word:
WORD->WORE->GORE->GONE->GENE
Although all the intermediates are valid words, the order of the changes is important. For example, the G in GENE, if introduced directly into WORD, would produce GORD, which is not a word. D must be changed to E before W is changed to G. In a similar way, mutations in a gene are likely to depend on the changes that have previously taken place.
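The word game can be sketched in a few lines of Python to show how strongly the order of changes matters. This is only an illustration of the analogy, not a model of protein evolution: the small dictionary below is an assumption chosen for the example (a real dictionary would contain many more words), and the function names are hypothetical.

```python
from itertools import permutations

# Toy dictionary (an assumption): the words on the path, plus WERE,
# a real word that turns out to be a dead end.
VALID_WORDS = {"WORD", "WORE", "WERE", "GORE", "GONE", "GENE"}

START = "WORD"
TARGET = "GENE"


def apply_changes(start, order, target):
    """Change one letter at a time, in the given order of positions,
    and return the start word followed by every intermediate word."""
    letters = list(start)
    path = [start]
    for pos in order:
        letters[pos] = target[pos]
        path.append("".join(letters))
    return path


def is_viable(path, valid_words):
    """A path is viable only if every word along it is a valid word."""
    return all(word in valid_words for word in path)


# Positions at which the start and target words differ (all four here).
positions = [i for i in range(len(START)) if START[i] != TARGET[i]]

# Try every possible order of the four single-letter changes.
all_orders = list(permutations(positions))
viable_orders = [
    order
    for order in all_orders
    if is_viable(apply_changes(START, order, TARGET), VALID_WORDS)
]

for order in viable_orders:
    print("->".join(apply_changes(START, order, TARGET)))
print(f"{len(viable_orders)} of {len(all_orders)} orders are viable")
```

With this toy dictionary, only the order used in the text survives: each later change is acceptable only in the background created by the earlier ones, which is the essence of epistasis.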
Whether similar constraints affect protein evolution has been studied with the nucleoprotein (NP) of influenza virus. Between 1968 and 2007, 39 mutations appeared in the NP RNA of influenza virus H3N2. Because sequences of this viral RNA are available for each year, it was possible to deduce the order in which these changes appeared in the viral genome. Plasmids encoding 39 different NP proteins, representing viral NP sequences present from 1968 through 2007, were then constructed. All of the NP proteins were found to support similar levels of viral RNA synthesis.
The 39 mutations were then introduced singly into the NP RNA, and RNA synthesis was measured. Three of the altered proteins had large decreases in activity, and their presence also substantially reduced the growth of infectious virus. However, when each of these NP changes was combined with the amino acid changes that preceded it during evolution, replication was normal. The three NP changes that reduce viral RNA synthesis and replication also decrease the thermal stability of the protein.
These findings show that, between 1968 and 2007, three amino acid changes were fixed in the influenza virus NP protein whose deleterious effects on protein stability were compensated by previously accumulated changes in the protein. The three amino acids are located in a part of the protein that harbors sequences recognized by T cells. These changes likely allow the virus to escape the host immune response.
Protein stability clearly mediates the epistasis observed in the influenza virus NP protein. It will be important to determine which other protein properties determine the order in which mutations are fixed in a viral genome. Influenza viruses are ideal for this work because sequences of all of the viral RNAs are determined for multiple isolates on an annual basis. Studies of what regulates epistasis in other RNA and DNA viruses are also needed to understand the constraints on viral evolution.