Retroviruses R us

About eight percent of human DNA is viral – remnants of ancestral infections with retroviruses. These endogenous retroviral sequences do not produce infectious viruses, and most are considered to be junk DNA. But some of them provide important functions. The protein called syncytin, which is essential for formation of the placenta, originally entered the genome of our ancestors, and those of other mammals, via a retrovirus infection. Another amazing role of endogenous retroviruses is that they regulate the stem cells that are the precursors of all the cells in our body.

The genetic material of retroviruses is RNA, but during infection it is converted to DNA which then integrates into the chromosome of the cell. If the infected cell happens to be a germ cell, then the viral DNA, now called an endogenous retrovirus, becomes a permanent part of the animal and its offspring. One of our endogenous retroviruses, called HERV-H, infected human ancestors about 25 million years ago. HERV-H has been found to be important for the properties of human embryonic stem cells.

Embryonic stem cells (ES cells), which are derived from the inner cell mass of a blastocyst (which forms 4-5 days after fertilization), are pluripotent – they can differentiate into every cell type in the human body. Being pluripotent means expressing a very different set of genes compared with somatic cells – the cells of skin, muscle, organs, to name a few. The genes that are expressed in ES cells are controlled by a small number of key proteins that regulate mRNA synthesis. If these proteins – just four – are produced in a differentiated cell, it will turn into an ES-like cell – an induced pluripotent stem cell, or iPSC. This observation garnered Shinya Yamanaka the Nobel Prize in 2012.

The first clue that HERV-H might be important for the pluripotency of ES cells was the finding that this DNA is preferentially expressed in human ES cells (the figure [credit] shows the expression of HERV-H in ES and two other cell types). When the levels of HERV-H RNAs are reduced (by RNA interference) in ES cells, the morphology of the cells changes – they become fibroblast-like, a sign of differentiation. In contrast, when fibroblasts are reprogrammed to become iPSCs, the levels of HERV-H RNAs rise. These findings suggest that HERV-H is essential for keeping ES cells pluripotent, and for making somatic cells pluripotent.

The HERV-H DNA in our genome is flanked by viral sequences called long terminal repeats, or LTRs. These provide initiation sites for the synthesis of viral mRNAs. In human ES cells the HERV-H LTRs appear to be enhancing the transcription of nearby human genes that are important for maintaining pluripotency. In an interesting twist, the HERV-H viral RNA is important for this activity: it appears to bind proteins involved in the regulation of mRNAs important for pluripotency. This observation explains why reducing HERV-H viral RNA leads to loss of pluripotency.

The HERV-H RNA made in human ES cells is not translated into protein because it contains many mutations that have accumulated over the past 25 million years. Therefore HERV-H is a long, non-coding RNA (lncRNA), a relatively recently discovered class of regulatory RNAs. There are about 35,000 lncRNAs in human cells that are involved in controlling a variety of processes such as splicing, translation and epigenetic modifications. Now we know that endogenous retroviruses can also produce lncRNAs.

Without endogenous retroviruses, humans might not be recognizable as the Homo sapiens that today walk the Earth. They might also be egg-layers – but the eggs would be white. Viruses don’t just make us sick.

TWiV 268: Transmission is inevitable

On episode #268 of the science show This Week in Virology, Vincent, Alan, Kathy, and Ashlee discuss fomites in physicians’ offices, plant virus factories involved in aphid transmission, and clues from the bat genome about flight and immunity.

You can find TWiV #268 at

TWiV 267: Snow in the headlights

On episode #267 of the science show This Week in Virology, Vincent, Alan, Rich and Kathy review a protease essential for influenza pathogenesis in mice, and directionality of rhinovirus RNA exit from the capsid.

You can find TWiV #267 at

TWiV 246: Pandora, pandemics, and privacy

On episode #246 of the science show This Week in Virology, Vincent, Alan, Rich, and Kathy discuss the huge Pandoravirus, virologists planning H7N9 gain of function experiments, and limited access to the HeLa cell genome sequence.

You can find TWiV #246 at

We recorded this episode of TWiV as a Google hangout on air. Consequently the audio is not the same quality as you might be used to. But the tradeoff is that you can see each of us on video.


Pandoravirus, bigger and unlike anything seen before

The discovery of the giant Mimivirus and Megavirus amazed virologists (and also many others). Their virions (750 nanometers) and DNA genomes (1,259,000 base pairs) were the biggest ever discovered, shattering the notions that viruses could not be seen with a light microscope, and that viral genomes were smaller than bacterial genomes. Now two even bigger viruses have been discovered, which are physically and genetically unlike any previously known viruses. They have been called Pandoraviruses.

Both new viruses were isolated by culturing environmental samples in the amoeba Acanthamoeba castellanii. Pandoravirus salinus was isolated from shallow marine sediment in a river at the coast of central Chile, and Pandoravirus dulcis was obtained from mud at the bottom of a freshwater pond near Melbourne, Australia. The P. salinus genome is at least 2.77 megabases in length (there is some uncertainty in the actual length due to the presence of repeated sequences at the ends of the DNA), while the P. dulcis genome is 2.47 megabases in length. The smaller P. dulcis genome is a subset of the P. salinus genome.

These new genomes are about twice as large as those of previously described viruses, and bigger than the genomes of intracellular bacteria such as Tremblaya (138,927 base pairs) and Rickettsia (1,111,523 bp), some free-living bacteria, and many free-living Archaea.
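For readers who like to check the arithmetic, here is a minimal sketch in Python that compares the genome sizes quoted in this article. The dictionary keys and the helper name `size_ratio` are my own labels for illustration, and the P. salinus value is the lower bound of 2.77 megabases, so its ratios are minimum estimates.

```python
# Genome sizes in base pairs, taken from the figures quoted in the text.
# The dictionary keys and helper name are illustrative, not from the paper.
GENOME_SIZES_BP = {
    "P. salinus": 2_770_000,        # "at least 2.77 megabases" (lower bound)
    "P. dulcis": 2_470_000,         # 2.47 megabases
    "previous record": 1_259_000,   # largest giant-virus genome quoted earlier
    "Rickettsia": 1_111_523,
    "Tremblaya": 138_927,
}


def size_ratio(larger: str, smaller: str) -> float:
    """Return how many times bigger one genome is than another."""
    return GENOME_SIZES_BP[larger] / GENOME_SIZES_BP[smaller]


if __name__ == "__main__":
    for other in ("previous record", "Rickettsia", "Tremblaya"):
        ratio = size_ratio("P. salinus", other)
        print(f"P. salinus is {ratio:.1f}x the size of {other}")
```

With these numbers P. salinus comes out at roughly 2.2 times the earlier record holder, and P. dulcis at a little under 2 times.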

While the huge sizes of the Pandoravirus virion and genomes are amazing, I find three other features of these viruses even more remarkable. The first is their atypical replication cycle. The virions are taken into amoebae by phagocytic vacuoles, and upon fusing with the vacuole membrane, the virion contents are released into the cytoplasm via a pore on the virion apex. Within 2-4 hours the cell nucleus is reorganized, and by 8-10 hours new particles appear where the nucleus once was. Pandoravirus DNA and virions are synthesized and assembled simultaneously, in contrast to eukaryotic DNA viruses and phages which fill pre-formed capsids with DNA. Virions are released by 10-15 hours as the cells lyse.

A second amazing feature is that most of the P. salinus open reading frames encode brand-new proteins. Of the 2,556 putative protein coding sequences in the P. salinus genome, 93% have no recognizable counterparts among known proteins. Some of the genes found in large DNA viruses are present, such as those encoding DNA polymerase and DNA-dependent RNA polymerase, and several aminoacyl-tRNA synthetases, like members of the Megaviridae. Curiously, many of the Pandoravirus coding regions contain intervening sequences, which must be removed by RNA splicing. This process is known to occur only in the cell nucleus, suggesting that some Pandoravirus transcription occurs in that organelle. The lack of gene homology leads the authors to conclude that ‘no microorganism closely related to P. salinus has ever been sequenced’.
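To put the 93% figure into absolute numbers, here is a small Python sketch. The function name `split_orfs` is hypothetical, and because the 93% in the text is itself a rounded value, the counts it produces are approximate.

```python
TOTAL_ORFS = 2556          # putative protein coding sequences in P. salinus
FRACTION_NO_MATCH = 0.93   # share with no recognizable counterpart (rounded)


def split_orfs(total: int, fraction_no_match: float) -> tuple[int, int]:
    """Return (ORFs without a known counterpart, ORFs with one)."""
    no_match = round(total * fraction_no_match)
    return no_match, total - no_match


if __name__ == "__main__":
    no_match, matched = split_orfs(TOTAL_ORFS, FRACTION_NO_MATCH)
    print(f"about {no_match} ORFs have no known counterpart")
    print(f"about {matched} ORFs resemble a known protein")
```

In other words, only about 180 of the predicted proteins look like anything already in the databases.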

I am also impressed by what the authors describe as the ‘alien morphological features’ of the virions. The oval-shaped particles are 1 micron in length and 0.5 microns in diameter, easily visible by light microscopy. They are wrapped in a three-layered envelope with a pore at one end of the particle, and resemble nothing that has ever been seen before (see photograph).

How much bigger can viruses get? I don’t know the answer but I would guess even bigger than Pandoraviruses. The membranous Pandoravirus particle could easily accommodate even larger genomes. How big can a virus get and still be a virus? The answer to that question is easy: it is a virus as long as it requires a cell for replication.

These remarkable findings further emphasize the need for scientists to pursue their curiosity, and not only work on problems of obvious medical relevance. As the authors write,

This work is a reminder that our census of the microbial diversity is far from comprehensive and that some important clues about the fundamental nature of the relationship between the viral and the cellular world might still lie within unexplored environments.

Continuing their playful naming of giant viruses, the authors note that the name Pandoravirus reflects their ‘lack of similarity with previously described microorganisms and the surprises expected from their future study’.

Henrietta Lacks (HeLa) genome sequence published then withdrawn

Earlier this month the European Molecular Biology Laboratory (EMBL) published the DNA sequence of the genome of HeLa cells, the cell line that is widely used for research in virology, cell biology, and many other areas. This cell line was produced from a tumor taken from Henrietta Lacks in 1951. Unfortunately the EMBL did not receive permission from Ms. Lacks’ family to publish her genome sequence, and has withdrawn the information from public databases.

The history of HeLa cells has been well chronicled in Johns Hopkins Magazine and by Rebecca Skloot in The Immortal Life of Henrietta Lacks. In early 1951, Ms. Lacks was found to have a malignant tumor of the cervix. During her examination at Johns Hopkins Hospital in Baltimore, MD, a sample of the tumor was removed and used to produce the HeLa cell line. But Ms. Lacks’ family never learned about the important cells that were derived from her until 24 years after her death.

It is quite clear that permission to publish the HeLa cell genome sequence should have been obtained from the Lacks family. This issue is discussed in an opinion piece by Rebecca Skloot in the New York Times.

I was honored to work with Rebecca Skloot during the preparation of Immortal Life, and I am flattered that Ms. Skloot thanked me in the afterword of the book. I have also written about my work with HeLa cells (that’s me in the photo with a spinner of the cells). You might also be interested in my conversation with Philip Marcus, who was the first to produce single cell clones of HeLa cells.

TWiV 215: Illuminating rabies and unwrapping a SARI

On episode #215 of the science show This Week in Virology, Vincent, Alan, and Kathy review the finding that rabies virus infection alters but does not kill neurons, and provide an update on the novel coronavirus in the Middle East.

You can find TWiV #215 at

An RNA virus that infects Archaea?

Every different life form on earth can probably be infected with at least one type of virus, if not many more. Most of these viruses have not yet been discovered: just over 2,000 viral species are recognized. While the majority of the known viruses infect bacteria and eukaryotes, there are only about 50 known viruses of the Archaea, and these all have DNA genomes. The first archaeal RNA viruses might have been recently discovered in a hot, acidic spring in Yellowstone National Park.

Archaea are single-cell organisms that are similar in size and shape to bacteria, but are evolutionarily and biochemically quite distinct. They inhabit a broad range of environments including those with extreme conditions such as high temperature, acidity, and salinity. Identification of archaeal RNA viruses is important because their study could provide information about the ancestors of RNA viruses that infect eukaryotes. Direct sequencing of viral communities from the environment, known as viral metagenomics, is one approach being taken to discover archaeal viruses.

The acidic (pH <4) and hot (>80°C) springs in Yellowstone National Park were examined for the presence of archaeal RNA viruses because these bodies of water contain mainly Archaea. Samples were obtained from 28 different sites and extracted nucleic acids were treated with DNase (to remove DNA genomes) and then reverse transcriptase (to copy RNA to DNA). If reverse transcription was reduced by treatment with RNase, it was concluded that the sample contained mostly RNA. The results narrowed the candidate samples to three, all from Nymph Lake. New samples obtained twelve months later also showed a predominance of RNA and were used for metagenomic analysis by deep sequencing.

Analysis of the RNA viral sequences revealed coding regions for a predicted RNA-dependent RNA polymerase (RdRp), a hallmark of RNA viruses. One assembled sequence of 5,662 nucleotides, believed to be a complete viral genome, encodes a single open reading frame containing an RdRp and a putative capsid protein similar to that of the positive-strand RNA containing nodaviruses, tetraviruses, and birnaviruses. Another viral sequence encoded a protein with 70% amino acid homology to the predicted RdRp. The sequences are from a novel virus which does not belong to any known virus family.

These results clearly show that at least two related but distinct RNA viruses are present in Nymph Lake. However, whether the hosts of these viruses are Archaea or Bacteria cannot be determined by these metagenomic analyses. What is needed to resolve this question is old-fashioned virology: isolating RNA virus particles that can infect an archaeal host and produce new infectious viruses.

B Bolduc, DP Shaughnessy, YI Wolf, EV Koonin, FF Roberto and M Young J. Virol. 2012, 86(10):5562. DOI: 10.1128/JVI.07196-11.

Museum pelts help date the koala retrovirus

The genomes of most higher organisms contain sequences from retroviral genomes called endogenous retroviruses (ERVs). These are DNA copies of retroviral RNAs that are integrated into the germ line DNA of the host, and passed from parent to offspring. In most species the infections that lead to germ line ERVs appear to have occurred millions of years ago. The koala retrovirus, KoRV, is the only retrovirus that we know of that is currently invading the germ line of its host species. A study of koala pelts preserved in museums suggests how recently the virus infected this animal.

The koala is native to Australia, and all koalas in northern Australia are infected with KoRV. However, not all animals in the southeast or on southern islands are infected. It is believed that KoRV crossed into koalas from another species (possibly the Asian mouse Mus caroli) some time within the past two hundred years. To test this hypothesis, DNA was extracted from 28 koala skins that were held in museums and which had been collected from the late 1800s to the 1980s. Polymerase chain reaction was used to detect KoRV DNA in the koala genome. The results show that KoRV was already widespread in northern Australian koalas by the late 1800s. It appears to have spread slowly since then, as the virus is still not ubiquitous in southern koalas. The slow dispersal may be due to the sedentary and solitary nature of koalas. Examination of mitochondrial DNA from the koala skins confirmed that there has been limited movement of the animals within Australia.

The sequence of the KoRV gene encoding the viral glycoprotein, env, was also determined. The results reveal that env sequences from museum specimens are remarkably similar to those of KoRV found in contemporary koalas. At first glance this result might not seem surprising: the endogenous KoRV genomes are evolving at the same slow rate as the host DNA into which they are integrated. However, there appear to have been multiple transmissions and germ line invasions by KoRV, leading the authors to suggest that in all cases very similar retroviruses were involved.

Infection with KoRV in captive animals is believed to cause immunosuppression, leading to fatal lymphomas or Chlamydia infection. A Chlamydia epidemic is believed to have killed many koalas in 1887-1889, consistent with the PCR results indicating that KoRV was widely present at that time.

Update: I had meant to discuss the possibility of dating the invasion of koalas by KoRV by using older samples, but neglected to include this in the original article. Several days after it was published, Professor Paul Young sent me a note expressing exactly this sentiment:

What would be even better would be to have access to fossilised material that predates European settlement, that we could examine. We collaborated with an “Ancient DNA” specialist and tried this several years ago but we weren’t able to recover usable template DNA. Still worth some future effort though.

Avila-Arcos MC et al (2012). 120 years of koala retrovirus evolution determined from museum skins. Mol Biol Evol. 2012 Sep 14.

TWiV 187: The mummy

On episode #187 of the science show This Week in Virology, Vincent and Rich discuss recovery of a hepatitis B viral genome from a 16th century Korean mummy, and personal omics profiling of an individual over a 14 month period.

You can find TWiV #187 at