Henrietta Lacks (HeLa) genome sequence published then withdrawn

Earlier this month the European Molecular Biology Laboratory (EMBL) published the DNA sequence of the genome of HeLa cells, the cell line that is widely used for research in virology, cell biology, and many other areas. This cell line was produced from a tumor taken from Henrietta Lacks in 1951. Unfortunately the EMBL did not receive permission from Ms. Lacks’ family to publish her genome sequence, and has withdrawn the information from public databases.

The history of HeLa cells has been well chronicled in Johns Hopkins Magazine and by Rebecca Skloot in The Immortal Life of Henrietta Lacks. In early 1951, Ms. Lacks was found to have a malignant tumor of the cervix. During her examination at Johns Hopkins Hospital in Baltimore, MD, a sample of the tumor was removed and used to produce the HeLa cell line. But Ms. Lacks’ family never learned about the important cells that were derived from her until 24 years after her death.

It is quite clear that permission to publish the HeLa cell genome sequence should have been obtained from the Lacks family. This issue is discussed in an opinion piece by Rebecca Skloot in the New York Times.

I was honored to work with Rebecca Skloot during the preparation of Immortal Life, and I am flattered that Ms. Skloot thanked me in the afterword of the book. I have also written about my work with HeLa cells (that’s me in the photo with a spinner of the cells). You might also be interested in my conversation with Philip Marcus, who was the first to produce single cell clones of HeLa cells.

Spread of koala retrovirus in Australia

The koala retrovirus (KoRV) continues to spread within Australia, according to results of a new analysis of a larger sample size from a wider geographical range than was previously studied.

Blood or tissue samples were collected from koalas in different regions of Australia, and polymerase chain reaction (PCR) was used to detect the presence of KoRV proviral DNA, a DNA copy of the retroviral genome integrated into host cell DNA. Most of the koalas from the Australian mainland were positive for KoRV proviral DNA (442/466; 94.8%). All samples from animals in Queensland and New South Wales were KoRV positive. In mainland Victoria 65 of 89 animals contained KoRV DNA (73%). On the Victorian islands prevalence of KoRV ranged from 0% on Phillip Island (0/11) to 50% on Snake Island (6/12). On the previously KoRV-free Kangaroo Island, 24 of 162 animals (14.8%) were KoRV positive. These results suggest that KoRV initially entered the koala population in the north of Australia and has been slowly spreading to the south. There are also other potential explanations for the results: there may be differences in KoRV susceptibility in northern versus southern animals, and the rate of transmission might differ in the two areas.
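The percentages quoted above follow directly from the positive/tested counts. As a quick check, here is a minimal Python sketch; the function name `prevalence` and the dictionary layout are my own choices for illustration, and only the counts come from the study as quoted in the text.

```python
# Counts (KoRV positive, animals tested) copied from the paragraph above.
samples = {
    "Australian mainland": (442, 466),
    "Mainland Victoria": (65, 89),
    "Snake Island": (6, 12),
    "Kangaroo Island": (24, 162),
}


def prevalence(positive, tested):
    """Return the percentage of tested animals that were KoRV positive."""
    if tested <= 0:
        raise ValueError("tested must be a positive number")
    return 100.0 * positive / tested


for region, (positive, tested) in samples.items():
    pct = prevalence(positive, tested)
    print(f"{region}: {positive}/{tested} = {pct:.1f}%")
```

Note that these are simple proportions; the sketch says nothing about sampling error, which matters for the small island samples.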

The genomes of Queensland koalas contain far more copies of KoRV per cell, 165, than those of animals in Victoria, which ranged from less than one to 1.5 copies per cell. The Queensland koalas are likely fully endogenized – that is, the integrated KoRV DNA is passed from parent to offspring in the germline, and hence every koala cell contains viral DNA. In contrast, in Victorian koalas KoRV has either recently entered the germline (1.5 copies/cell) or has not yet entered this state (<1 copy/cell). In animals with less than one proviral copy per cell, KoRV infection was likely acquired exogenously from one animal to another. The mode of transmission of KoRV among koalas is not known, but might involve animal-animal contact or arthropod transmission.

It seems likely that eventually all wild koalas will be endogenized by KoRV. Whether this process will impact the long-term survival of the species is not known, especially since the disease caused by KoRV infection is poorly understood.

GS Simmons, PR Young, JJ Hanger, K Jones, D Clarke, JJ McKee, J Meers. 2012. Prevalence of koala retrovirus in geographically diverse populations in Australia. Aust. Vet. J. 90(10):404-9.

From a food blender to real-time fluorescent imaging

Although Avery, MacLeod, and McCarty showed in 1944 that nucleic acid was both necessary and sufficient for the transfer of bacterial genetic traits, protein was still suspected to be a critical component of viral heredity. Alfred Hershey and Martha Chase showed that this hypothesis was incorrect with a simple experiment involving the use of a food blender. The Hershey-Chase conclusion has since been upheld numerous times*, most recently by a modern-day experiment using real-time fluorescence.

Hershey and Chase made preparations of the tailed bacteriophage T2 with the viral proteins labeled with radioactive sulfur, and the nucleic acids labeled with radioactive phosphorus. The virions were added to a bacterial host, and after a short period of time were sheared from the cell surface by agitation in a blender. After this treatment, the radioactive phosphorus, but not the radioactive sulfur, remained associated with bacterial cells. These infected cells went on to produce new virus particles, showing that DNA contained all the information needed to produce a bacteriophage.

In a modern validation of the Hershey-Chase experiment, bacteriophages are mixed with a cyanine dye which binds to the viral DNA (illustrated). Upon infection of the bacterial host, the phage DNA is injected into the cell together with the dye. In time the dye leaves the phage DNA and binds to the host genome. This process can be observed in real-time (as it happens) by fluorescence microscopy.

This technique was used to visualize single bacteriophages infecting an E. coli host cell. It takes about 5 minutes on average for 80% of bacteriophage lambda DNA to exit the capsid, with a range of 1-20 minutes.

These experiments do not simply provide a visual counterpart to the Hershey-Chase conclusion, but reveal additional insights into how viral DNA leaves the capsid. One interesting observation is that the amount of DNA that remains in the capsid apparently is not the sole determinant of how quickly ejection occurs. The amount of DNA ejected from the capsid does appear to regulate the dynamics of the process.

The kitchen blender experiment contrasts vividly with the complexity of real-time fluorescent imaging. Hershey and Chase did not have the technology to visualize phage DNA entering the host cell; they used what was available to them at the time. While improved technology is important for pushing research forward, simple experiments will always make important contributions to our understanding of science.

*The infectivity of cloned viral DNA is one validation of the Hershey-Chase experiment.

Hershey, AD, Chase, M. 1952. Independent functions of viral protein and nucleic acid in growth of bacteriophage. J. Gen. Physiol. 36:39-56. 

Van Valen, D., Wu, D., Chen, Y-J, Tuson, H, Wiggins, P, Phillips, R. 2012. A single-molecule Hershey-Chase experiment. Current Biol 22:1339-1343. 

A DNA virus with the capsid of an RNA virus

Viral genomes are unusual because they can be based on RNA or DNA, in contrast to all cellular life forms, which have DNA as their genetic information. An unusual new virus has been discovered that appears to have sequences from both an RNA and a DNA virus.

The new virus was identified during a study of viral diversity in an extreme environment, Boiling Spring Lake. You would never want to swim there: it is acidic (pH 2.5) and hot (52° − 95° C). But the lake is not devoid of living things: it is inhabited by various bacteria, Archaea, and unicellular eukaryotes. Where there is life, there are viruses, which leads us to an expedition to determine what kinds of viruses can be found in Boiling Spring Lake.

To answer this question, Geoff Diemer and Kenneth Stedman sequenced viral DNA extracted from purified viral particles from Boiling Spring Lake water. Their analyses revealed the presence of a virus with a circular, single-stranded DNA genome similar to that found in members of the Circoviridae (this virus family includes porcine circovirus and chicken anemia virus). What surprised the investigators was that the gene encoding the viral capsid protein was similar to that from viruses with single-stranded RNA genomes, including viruses that infect plants (Tombusviridae) or fungi. The authors call it ‘RNA-DNA hybrid virus’, or RDHV. The host of RDHV is unknown but could be one of the eukaryotes that inhabit Boiling Spring Lake.

RDHV probably arose when a circovirus acquired the capsid protein gene of an RNA virus by DNA recombination. This event likely occurred in a cell infected with both viruses. A cellular reverse transcriptase might have converted the RNA virus genome to DNA to allow recombination to occur. RDHV is unusual because genetic exchanges among viruses are generally thought to be restricted to those with similar genome types.

To determine if RDHV is an oddity, the authors searched the database of DNA sequences obtained from the Global Ocean Survey. They found three RDHV-like genomes, indicating that these viruses exist in the ocean. Whether they are present elsewhere is a question that should certainly be answered. It is important to determine whether recombination between RNA and DNA viruses is a common means of gene exchange, or whether it is a rarity.

The discovery of RDHV could have implications for viral evolution. It has been suggested that the first organisms that evolved on earth were based on RNA molecules with coding and catalytic capabilities. Later, DNA based life evolved, and today both DNA based and RNA based organisms co-exist. Viruses like RDHV could have emerged during the transition from an RNA to a DNA world, when a new DNA virus captured the gene encoding an RNA virus capsid. In other words, RNA genes that had already evolved were not discarded but appropriated by DNA viruses. This scenario would have required some mechanism for converting RNA into DNA (reverse transcriptases?). The finding of RDHV-like viruses in the ocean suggests that a common ancestor emerged some time ago which diversified into different environments. More RDHV-like viruses must be isolated and studied before we can determine whether or not these viruses are very old, and to deduce their implications for viral evolution.

Diemer GS, Stedman KM. 2012. A novel virus discovered in an extreme environment suggests recombination between unrelated groups of RNA and DNA viruses. Biology Direct 7:13.

Renato Dulbecco, 1914-2012

For the second time in a week I note the passing of an important virologist. Renato Dulbecco, together with David Baltimore and Howard Temin, received the 1975 Nobel Prize in Physiology or Medicine for discoveries about how tumor viruses interact with the genetic material of the cell. Dulbecco also devised my favorite virological method, the plaque assay, for determining the virus titer, the number of animal viruses in a sample.

Since the early 1920s bacteriologists had used the plaque assay to quantify the number of infectious bacteriophages (viruses that infect bacteria). Dulbecco noted in 1952 that “research on the growth characteristics and genetic properties of animal viruses has stood greatly in need of improved quantitative techniques, such as those used in the related field of bacteriophage studies.” One limiting factor was the development of suitable animal cell cultures that could be used to determine viral titer. By the 1950s the techniques for reliably producing and propagating human cell cultures were developed, and in 1951 the first immortal human cell line, HeLa, was isolated. Dulbecco took advantage of these advances and showed in 1952 that western equine encephalitis virus formed plaques on monolayers of chicken embryo fibroblasts (figure). Dulbecco also made the important observation that one virus particle is sufficient to produce one plaque. He drew this conclusion from his observation of a linear dependence of the number of plaques on virus concentration. This seminal advance made possible the application of genetic techniques to the study of animal viruses.
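The arithmetic behind the plaque assay can be sketched in a few lines of Python. This is an illustration rather than Dulbecco's own procedure: the function names, the dilution and volume parameters, and the example numbers are all hypothetical. The first function converts a plaque count into a titer; the second captures the reasoning behind the one-particle conclusion, under the assumption that at low virus concentrations the number of plaques would scale with concentration raised to the number of particles needed to start a plaque.

```python
def titer_pfu_per_ml(plaques, dilution, volume_ml):
    """Plaque-forming units (PFU) per ml of the undiluted virus stock.

    plaques   -- number of plaques counted on the monolayer
    dilution  -- dilution of the stock that was plated (e.g. 1e-6)
    volume_ml -- volume of diluted virus added to the cells, in ml
    """
    if dilution <= 0 or volume_ml <= 0:
        raise ValueError("dilution and volume_ml must be positive")
    return plaques / (dilution * volume_ml)


def expected_plaques(relative_conc, plaques_at_full, hits=1):
    """Expected plaque count if `hits` particles were needed per plaque.

    With hits=1 the count is proportional to concentration (a straight
    line); with hits=2 it would fall with the square of the dilution.
    """
    return plaques_at_full * relative_conc ** hits


# Hypothetical example: 50 plaques from 0.1 ml of a 10^-6 dilution.
print(f"{titer_pfu_per_ml(50, 1e-6, 0.1):.2e} PFU/ml")

# Halving the virus concentration halves the count only for one-hit kinetics.
print(expected_plaques(0.5, 100, hits=1), expected_plaques(0.5, 100, hits=2))
```

A straight-line relationship between plaque number and virus concentration is therefore what one particle per plaque predicts, which is the observation described above.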

Dulbecco’s work on tumor viruses was focused on polyomaviruses – small DNA-containing viruses such as murine polyomavirus and SV40. He found that cells from the natural host of the virus – mice for polyomavirus and monkeys for SV40 – were killed as the viruses replicated and produced new viral progeny. However, these viruses did not replicate in or kill cells from other animals. For example, when hamster cells were infected with murine polyomavirus, no viral replication took place, the cells survived, and a few rare cells were transformed – their growth properties in culture were altered and they induced tumors when injected into hamsters. Dulbecco later found that the polyomaviral DNA is a circular, double-stranded molecule, and that in non-permissive cells (in which the virus does not replicate) the viral DNA became integrated into the host cell chromosome. He also suspected that a viral protein called T (for tumor) antigen was a key to cell transformation.

Today we understand why polyomaviruses transform cells in which they do not replicate: infection does not kill these cells, and the rare transformed cells contain only viral DNA encoding T antigen. This protein is needed for viral replication in permissive cells because it drives cell proliferation, activating cellular DNA replication systems that are required for producing more viral DNA. In a non-permissive cell, T antigen drives the cell to divide endlessly, immortalizing it and allowing the accumulation of mutations in the cell genome that make the cells tumorigenic.

While the details of how DNA tumor viruses transform cells were being elucidated, other investigators were attempting to understand how another class of viruses – with RNA genomes – had similar effects on cells. In the 1950s a young scientist named Howard Temin joined Dulbecco’s laboratory to study how Rous sarcoma virus (RSV) caused tumors. This virus had been discovered by Peyton Rous in 1911, but would only cause tumors in chickens, limiting progress. In Dulbecco’s laboratory, Temin found that RSV induced transformation of cultured chicken embryo fibroblasts – the same types of cells that were being used to develop the plaque assay for animal viruses. Temin took this transformation assay to his own laboratory, where he reasoned that a DNA copy of the RSV viral genome must be integrated into the chromosome of transformed cells. This led him to discover the enzyme reverse transcriptase in RSV particles, which produces a DNA copy of the viral RNA.

By embracing a new technology for the study of animal viruses – cell culture – Dulbecco set the study of both DNA and RNA tumor viruses on a path that would lead to understanding viral transformation, an achievement recognized by the 1975 Nobel Prize.

Dulbecco, R. (1952). Production of Plaques in Monolayer Tissue Cultures by Single Particles of an Animal Virus Proceedings of the National Academy of Sciences, 38 (8), 747-752 DOI: 10.1073/pnas.38.8.747

Har Gobind Khorana, master decoder

Most students of elementary biology will have seen the table at left that depicts the genetic code. HG Khorana was one of several scientists who determined, in the 1960s, the amino acids specified by each three-letter combination of bases. As long as I have been at Columbia University I have had a copy of this table on my office wall.

Khorana reminisces how the finding that genes are nucleic acids set the stage for his work on decoding:

While it is always difficult, perhaps impossible, to determine or clearly define the starting point in any area of science, the idea that genes make proteins was an important step and this concept was brought into sharp focus by the specific one gene-one enzyme hypothesis of Beadle and Tatum. The field of biochemical genetics was thus born. The next step was taken when it was established that genes are nucleic acids. The transformation experiments of Avery and coworkers followed by the bacteriophage experiments of Hershey and Chase established this for DNA and the work with TMV-RNA a few years later established the same for RNA. By the early 1950’s it was, therefore, clear that genes are nucleic acids and that nucleic acids direct protein synthesis, the direct involvement of RNA in this process being suggested by the early work of Caspersson and of Brachet.

The first triplet to be decoded was UUU, which specifies the amino acid phenylalanine. This work was done by Nirenberg who found that an RNA consisting of repeating U residues could be translated into a protein containing only phenylalanine. Codons for lysine (AAA) and proline (CCC) were similarly discovered using RNAs containing only A or C. Khorana used both enzymatic and chemical synthesis of oligonucleotides to decipher much of the remaining code. A good account of his work can be found in his Nobel lecture (pdf).
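The logic of the homopolymer experiments can be illustrated with a short Python sketch. The lookup table below holds only the three codons named above (the full code has 64 entries), and the function name `translate` and the `???` placeholder are my own conventions for the illustration.

```python
# Only the three codons mentioned in the text; the full table has 64 entries.
CODON_TABLE = {
    "UUU": "Phe",  # phenylalanine
    "AAA": "Lys",  # lysine
    "CCC": "Pro",  # proline
}


def translate(rna):
    """Translate an RNA string three bases at a time.

    Codons missing from the partial table are reported as '???';
    trailing bases that do not fill a complete codon are ignored.
    """
    rna = rna.upper()
    usable = len(rna) - len(rna) % 3
    return [CODON_TABLE.get(rna[i:i + 3], "???") for i in range(0, usable, 3)]


# An RNA of repeating U residues yields a chain of phenylalanine only.
print(translate("U" * 9))
print(translate("AAACCCUUUGGG"))
```

The last call shows why homopolymers could only go so far: any codon built from mixed bases is unknown to this table, which is the gap that the synthetic oligonucleotides filled.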

Khorana, together with Robert Holley (structure of tRNA) and Marshall Nirenberg, received the Nobel Prize in Physiology or Medicine in 1968 for their work on deciphering the genetic code.

Viral bioinformatics: Sequence searcher

This week’s addition to the virology toolbox was written by Chris Upton.

Sequence Searcher is a Java program that allows users to search for specific sequence motifs in protein or DNA sequences. For example, it can be used to identify restriction enzyme cleavage sites or find similar sequence patterns among multiple sequences. Most searches run in a few seconds.

Sequence Searcher is part of the Virology.ca suite of programs available at the University of Victoria.


Some of the key features of Sequence Searcher include:

  • Searching through multiple sequences
  • Use of regular expressions or fuzzy search patterns
  • Searching for patterns on both strands of a DNA sequence
  • Graphical representation of results and ability to save search results
  • It can run on multiple computer platforms (Java)

For DNA, searches look for the motif from the 5’ to the 3’ end of the top strand, and then from the 5’ to the 3’ end of the bottom strand. As a result, bases are numbered from 1 starting at the 5’ end of whichever strand, top or bottom, contains the match.
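To make the two-strand search and its numbering concrete, here is a minimal Python sketch. It is not Sequence Searcher's code (the program itself is written in Java); the function names and the tuple layout are assumptions made for this example, and Python's `re` module stands in for whatever regular expression engine the program uses.

```python
import re

# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the bottom strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def search_both_strands(seq, pattern):
    """Yield (strand, start, stop, match) for each regular expression hit.

    Positions are 1-based and counted from the 5' end of the strand on
    which the match was found. Overlapping matches are not reported.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    strands = (("top", seq), ("bottom", reverse_complement(seq)))
    for strand, strand_seq in strands:
        for m in regex.finditer(strand_seq):
            yield strand, m.start() + 1, m.end(), m.group()


# The EcoRI site GAATTC is palindromic, so it is found on both strands,
# but at different coordinates because each strand is numbered from its 5' end.
for hit in search_both_strands("AGAATTCGGG", "GAATTC"):
    print(hit)
```

Because each strand is numbered from its own 5’ end, the same restriction site is reported at positions 2-7 on the top strand and 4-9 on the bottom strand of this 10-base example.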

Regular expression and fuzzy pattern searches are available:

Fuzzy searches allow the program to accept a specified number of mismatches, at any position, between the input motif and the sequence. Note that the maximum number of mismatches that the program allows is 40% of the length of the sequence motif.

Regular expression allows for inputs of precise motifs along with considerable user-specified flexibility at specific positions.
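A fuzzy search of this kind can be sketched as a sliding window that counts mismatches. The Python below is only an illustration, not the program's algorithm: it assumes that mismatches are substitutions only (no insertions or deletions) and that the 40% cap is rounded down, neither of which is stated above. The function name and return format are also my own.

```python
def fuzzy_search(seq, motif, max_mismatches):
    """Return (start, stop, mismatches) for every window of `seq` that
    differs from `motif` at no more than `max_mismatches` positions.

    Positions are 1-based. Only substitutions are counted (assumption).
    """
    # Cap from the text: at most 40% of the motif length.
    # Rounding down is an assumption made for this sketch.
    limit = (2 * len(motif)) // 5
    if max_mismatches < 0 or max_mismatches > limit:
        raise ValueError(f"max_mismatches must be between 0 and {limit}")

    seq, motif = seq.upper(), motif.upper()
    width = len(motif)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatches:
            hits.append((i + 1, i + width, mismatches))
    return hits


# GATTTC differs from the motif GAATTC at a single position.
print(fuzzy_search("AAGATTTCGG", "GAATTC", 1))
```

For a six-base motif the cap works out to two mismatches, so asking for three raises an error in this sketch.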


Figure 1. The input tab is where you can import DNA or protein sequences (must be in FASTA format) and type in the specific pattern to search for within the sequence(s). The search type can be selected as “Regular expression” or “Fuzzy” by using the drop down menu.


Figure 2. When a search has been completed, the results tab is presented in a table format. The results in the table can be sorted depending on the column header (sequence, match, start, stop, confidence, and strand). The results can also be filtered by sequence and strand by selecting the drop down menus at the top.

Marass, F., & Upton, C. (2009). Sequence Searcher: A Java tool to perform regular expression and fuzzy searches of multiple DNA and protein sequences BMC Research Notes, 2 (1) DOI: 10.1186/1756-0500-2-14

Unexpected endogenous viruses

During the replication of retroviruses, a double-stranded DNA copy of the viral RNA genome is synthesized by reverse transcription and integrated into the genome of the infected cell. When retroviral DNA is integrated into the DNA of germ line cells, it is passed on to future generations in Mendelian fashion as an endogenous provirus. Until very recently, retroviruses were the only known endogenous viruses. This honor has now been extended to other RNA viruses, and to circoviruses and parvoviruses, which possess single-stranded DNA genomes. Such integration events constitute a fossil record from which it is possible to determine the age of viruses.

The first non-retroviral endogenous virus described was bornavirus, a virus with a negative-stranded RNA genome. Bornaviral sequences were found in the genomes of humans, non-human primates, rodents, and elephants. Phylogenetic analyses revealed that these sequences entered the primate genome over 40 million years ago. Endogenous filovirus (ebolavirus, marburgvirus) sequences were subsequently identified in the genomes of bats, rodents, shrews, tenrecs and marsupials. Based on these analyses it was estimated that filoviruses are at least tens of millions of years old. The presence of endogenous bornavirus and filovirus sequences was subsequently confirmed and extended to 19 different vertebrate species. Endogenous hepadnaviruses probably entered the genome of the zebra finch 19 million years ago.

Recent additions to the endogenous virus catalog are the circoviruses and parvoviruses. The genomes of circoviruses are composed of circular single-stranded DNA, while those of parvoviruses are linear single-stranded DNAs with base-paired ends (figure). Phylogenetic analyses of these endogenous viral sequences reveal that both virus families are 40 to 50 million years old. Examination of insect genomes has revealed endogenous viral sequences from members of the Bunyaviridae, Rhabdoviridae, Orthomyxoviridae, Reoviridae, and Flaviviridae.

With the exception of retroviruses, these endogenous viral sequences have no role in viral replication – they are accidentally integrated into host DNA. Such sequences are highly mutated and typically comprise only fragments of the viral genome, and therefore cannot give rise to infectious virus. Whether these sequences confer any biological advantage to the host is an interesting question. It is possible that some of the endogenous viral sequences are copied into RNA, or translated into protein, and could have consequences for the host. For example, it has been suggested that synthesis of the bornaviral N protein from endogenous sequences might render the host resistant to infection with bornaviruses.

How are non-retroviral genomes integrated into the host DNA? For viruses with an RNA genome, the nucleic acid must enter the nucleus (perhaps accidentally for viruses without a nuclear phase) and be converted to a DNA copy by reverse transcriptase encoded by endogenous retroviruses. Hepadnaviruses encode a reverse transcriptase which produces the genomic DNA from an RNA template. In all cases, recombination could lead to integration of viral DNA into the host chromosome.

Almost half of the human genome is made up of mobile genetic elements, which include endogenous proviruses and other sequences derived from retroviruses such as retrotransposons, retroposons, and processed pseudogenes. It seems likely that even more diverse viral sequences lurk in cellular genomes, awaiting discovery.

Horie M, Honda T, Suzuki Y, Kobayashi Y, Daito T, Oshida T, Ikuta K, Jern P, Gojobori T, Coffin JM, & Tomonaga K (2010). Endogenous non-retroviral RNA virus elements in mammalian genomes. Nature, 463 (7277), 84-7 PMID: 20054395

Taylor DJ, Leach RW, & Bruenn J (2010). Filoviruses are ancient and integrated into mammalian genomes. BMC evolutionary biology, 10 PMID: 20569424

Belyi VA, Levine AJ, & Skalka AM (2010). Unexpected inheritance: multiple integrations of ancient bornavirus and ebolavirus/marburgvirus sequences in vertebrate genomes. PLoS pathogens, 6 (7) PMID: 20686665

Gilbert C, & Feschotte C (2010). Genomic fossils calibrate the long-term evolution of hepadnaviruses. PLoS biology, 8 (9) PMID: 20927357

Katzourakis A, & Gifford RJ (2010). Endogenous viral elements in animal genomes. PLoS genetics, 6 (11) PMID: 21124940

Belyi VA, Levine AJ, & Skalka AM (2010). Sequences from ancestral single-stranded DNA viruses in vertebrate genomes: the parvoviridae and circoviridae are more than 40 to 50 million years old. Journal of virology, 84 (23), 12458-62 PMID: 20861255

TWiV 106: Making viral DNA II

Hosts: Vincent Racaniello, Dickson Despommier, and Rich Condit

On episode #106 of the podcast This Week in Virology, Vincent, Dickson, and Rich continue Virology 101 with a second installment of their discussion of how viruses with DNA genomes replicate their genetic information.

Download TWiV #106 (69 MB .mp3, 95 minutes): http://traffic.libsyn.com/twiv/TWiV106.mp3

Subscribe to TWiV (free) in iTunes, at the Zune Marketplace, by the RSS feed, or by email, or listen on your mobile device with Stitcher Radio.

Links for this episode:

  • Figures for this episode (pdf)
  • Letters read on TWiV 106
  • Video of this episode – download .mov or .wmv or view below

Weekly Science Picks

Rich – Google Health
Dickson – The Neandertal genome
Vincent – Lab techniques videos (thanks, Erik!)

Send your virology questions and comments (email or mp3 file) to twiv@microbe.tv or leave voicemail at Skype: twivpodcast. You can also post articles that you would like us to discuss at microbeworld.org and tag them with twiv.

TWiV 96: Making viral DNA

Hosts: Vincent Racaniello, Dickson Despommier, and Rich Condit

On episode #96 of the podcast This Week in Virology, Vincent, Dickson, and Rich continue Virology 101 with a discussion of how viruses with DNA genomes replicate their genetic information.

Download TWiV #96 (65 MB .mp3, 90 minutes): http://traffic.libsyn.com/twiv/TWiV096.mp3



Weekly Science Picks

Rich – Breast milk sugars give infants a protective coat (NY Times and PNAS article)
Vincent – The Great American University by Jonathan R. Cole
