TWiV 394: Cards in a hand

Vincent and Alan speak with Erica Ollmann Saphire about her career and her work on understanding the functions of proteins of Ebolaviruses, Marburg virus, and other hemorrhagic fever viruses, at ASM Microbe 2016 in Boston, MA.

You can find TWiV #394 at, or listen or watch the video below.

Download TWiV 394 (65 MB .mp3, 89 min)

TWiV 220: Flu watches the clock while T7 gets a CAT scan

On episode #220 of the science show This Week in Virology, Vincent, Rich, Alan, and Kathy discuss regulation of influenza virus replication by splicing, and the bacteriophage T7 random walk.

You can find TWiV #220 at

TWiM 30: Unraveling melioidosis and insulin resistance

On episode #30 of the science show This Week in Microbiology, Vincent, Elio, and Michael review how a toxin from Burkholderia pseudomallei inhibits protein synthesis, and the role of the gut microbiome in modulating insulin resistance in mice lacking an innate immune sensor.

You can find TWiM #30 at

Viral bioinformatics: Sequence searcher

This week’s addition to the virology toolbox was written by Chris Upton

Sequence Searcher is a Java program that allows users to search for specific sequence motifs in protein or DNA sequences. For example, it can be used to identify restriction enzyme cleavage sites or find similar sequence patterns among multiple sequences. Most searches run in a few seconds.

Sequence Searcher is part of the suite of programs available at the University of Victoria.

Help files:

Some of the key features of Sequence Searcher include:

  • Searching through multiple sequences
  • Use of regular expressions or fuzzy search patterns.
  • Searching for patterns on both strands of a DNA sequence
  • Graphical representation of results and ability to save search results
  • It can run on multiple computer platforms (Java)

For DNA, the motif is searched for from the 5’ to 3’ end of the top strand, and then from the 5’ to 3’ end of the bottom strand. As a result, bases are numbered from 1 starting at the 5’ end of whichever strand, top or bottom, contains the match.
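As a rough illustration of this two-strand numbering, here is a minimal Python sketch of an exact-match search. It is not Sequence Searcher's own code, and the function names `reverse_complement` and `find_motif` are made up for this example.

```python
def reverse_complement(seq):
    """Return the bottom strand, written 5' to 3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def find_motif(seq, motif):
    """Exact-match search on both strands (hypothetical helper).

    Returns (strand, start, stop) tuples. Positions are 1-based and
    counted from the 5' end of the strand on which the match was found.
    """
    hits = []
    motif = motif.upper()
    strands = (("top", seq.upper()), ("bottom", reverse_complement(seq)))
    for strand, text in strands:
        index = text.find(motif)
        while index != -1:
            hits.append((strand, index + 1, index + len(motif)))
            # Advance by one base so overlapping matches are also reported.
            index = text.find(motif, index + 1)
    return hits


print(find_motif("GGATGCAAAA", "ATGC"))
print(find_motif("GGATGCAAAA", "TTTT"))
```

In this toy sequence, ATGC is found on the top strand at positions 3 to 6, while TTTT is found only on the bottom strand at positions 1 to 4, counted from that strand's own 5' end.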

Regular expression and fuzzy pattern searches are available:

Fuzzy searches provide the option for the program to allow a certain number of mismatches from a sequence input at any position.  Note that the maximum number of mismatches that the program allows is 40% of the length of the sequence motif.
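To make the idea of a mismatch-tolerant search concrete, here is a minimal Python sketch. It assumes that a "mismatch" means a substitution only (no insertions or deletions) and that the 40% limit is rounded down; neither detail is stated above, and `fuzzy_find` is a made-up name, not part of Sequence Searcher.

```python
def fuzzy_find(seq, motif, max_mismatches):
    """Slide the motif along seq and report windows with few mismatches.

    Returns (start, stop, mismatches) tuples with 1-based positions.
    Only substitutions are counted (an assumption of this sketch).
    """
    seq, motif = seq.upper(), motif.upper()
    # Limit from the text: at most 40% of the motif length.
    # Rounding down is an assumption of this sketch.
    cap = (len(motif) * 4) // 10
    if max_mismatches > cap:
        raise ValueError(f"at most {cap} mismatches allowed for this motif")
    hits = []
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        mismatches = sum(1 for a, b in zip(window, motif) if a != b)
        if mismatches <= max_mismatches:
            hits.append((start + 1, start + len(motif), mismatches))
    return hits


print(fuzzy_find("ACGTACGAAC", "ACGT", 1))
```

With one mismatch allowed, the four-base motif matches exactly at positions 1 to 4 and with a single substitution at positions 5 to 8. Asking for two mismatches would be rejected, because 40% of four bases rounds down to one.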

Regular expression allows for inputs of precise motifs along with considerable user-specified flexibility at specific positions.
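For readers who have not used regular expressions, the following Python sketch shows the general idea with Python's standard `re` module. The pattern syntax that Sequence Searcher itself accepts may differ, and `regex_find` is a made-up helper name.

```python
import re


def regex_find(seq, pattern):
    """Return (start, stop, matched_text) for each non-overlapping match.

    Positions are 1-based. The pattern uses Python's `re` syntax, which
    is an assumption here and may not match Sequence Searcher's syntax.
    """
    return [
        (m.start() + 1, m.end(), m.group())
        for m in re.finditer(pattern, seq.upper())
    ]


# GG[AT]CC: fixed G, G, C, C with either A or T allowed in the middle.
print(regex_find("AAGGACCTTGGTCCAA", "GG[AT]CC"))
```

Here the first, second, fourth, and fifth positions of the motif are fixed, while the middle position may be A or T, so both GGACC (positions 3 to 7) and GGTCC (positions 10 to 14) are reported.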

figure 1

Figure 1. The input tab is where you can import DNA or protein sequences (must be in FASTA format) and type in the specific pattern to search for within the sequence(s). The search type can be selected as “Regular expression” or “Fuzzy” by using the drop-down menu.

figure 2

Figure 2. When a search has been completed, the results are presented in a table on the results tab. The table can be sorted by any column header (sequence, match, start, stop, confidence, and strand). The results can also be filtered by sequence and strand by using the drop-down menus at the top.

Marass, F., & Upton, C. (2009). Sequence Searcher: A Java tool to perform regular expression and fuzzy searches of multiple DNA and protein sequences. BMC Research Notes, 2(1). DOI: 10.1186/1756-0500-2-14

Viral bioinformatics: Introduction to multiple sequence alignment

This week’s addition to the virology toolbox was written by Chris Upton

Generating multiple sequence alignments (MSA) is one of the most commonly used bioinformatics techniques. The “sequences” to be compared can be DNA (promoters, genes, genomes) or proteins. Note that the length and number of sequences to be aligned have an impact on the methods (algorithms) that can be used; what is suitable for aligning 20 proteins probably won’t work for alignment of 5 poxvirus genomes (200 kb each).

Some useful links:

So you see, there are lots of options (did you say: “too many!”?). Further confusion may arise because 1) the same algorithm may be used in many different software programs, and 2) referencing a software package may give no clue to the algorithm used. For many molecular biologists, Clustal is synonymous with sequence alignment. However, newer algorithms such as T-Coffee and MUSCLE are often offered in current software packages, and may be faster and more accurate.

Specialized alignment tools are almost always needed for long, genome-sized DNA sequences.

In this set of posts, I’ll provide some information on favorite general MSA tools (that are free) that should be useful to the average molecular virologist. The lists noted above provide a multitude of tools, but many are for specific analyses.

Detection of antigens or antibodies by ELISA

A more rapid method than Western blot analysis to detect a specific protein in a cell, tissue, organ, or body fluid is enzyme-linked immunosorbent assay, or ELISA. This method, which does not require fractionation of the sample by gel electrophoresis, relies on the tendency of proteins to bind readily to a plastic surface.

To detect viral proteins in serum or clinical samples, a capture antibody, directed against the protein, is linked to a solid support such as a plastic 96-well microtiter plate, or a bead. The clinical specimen is added, and if viral antigens are present, they will be captured by the bound antibody. The bound viral antigen is then detected by using a second antibody linked to an enzyme. A chromogenic molecule, one that is converted by the enzyme to an easily detectable product, is then added. The enzyme amplifies the signal because a single catalytic enzyme molecule can generate many product molecules.

To detect antibodies to viruses, viral protein is linked to the plastic support, and then the clinical specimen is added. If antibodies against the virus are present in the specimen, they will bind to the immobilized antigen. The bound antibodies are then detected by using a second antibody that binds to the first antibody.

ELISA is used in both experimental and diagnostic virology. It is a highly sensitive assay that can detect proteins at the picomolar to nanomolar range (10⁻¹² to 10⁻⁹ moles per liter). It is the mainstay for the diagnosis of infections by many different viruses, including HIV-1, HTLV-1, adenovirus, and cytomegalovirus.

Influenza virus RNA: Translation into protein


figure 1

Let’s resume our discussion of the influenza virus genome. Last time we established that there are eight negative-stranded RNAs within the influenza virion, each coding for one or two proteins. Now we’ll consider how proteins are made from these RNAs.

Figure 1 shows influenza RNA segment 2, which encodes two proteins: PB1 and PB1-F2. The (-) strand viral RNA is copied to form a (+) strand mRNA, which in turn is used as a template for protein synthesis. Figure 2 (below) shows the nucleotide sequence of the first 180 bases of this mRNA.

The top line, mostly in small letters, is the nucleotide sequence of the viral mRNA. During translation this sequence is read in triplets, each of which specifies an amino acid (the one-letter code for amino acids is used here). Translation usually begins with an ATG which specifies the amino acid methionine; the next triplet, gat, specifies aspartic acid, and so on. Only the first 60 amino acids of the PB1 protein are shown; the protein contains a total of 758 amino acids.
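Reading a sequence in triplets is easy to mimic in a few lines of code. The sketch below is only an illustration: it uses the standard genetic code, the function name `translate` is made up, and the 15-base test sequence is invented for the example rather than copied from the PB1 mRNA in the figure.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: the 64 codons in TCAG order, mapped to
# one-letter amino acids ("*" marks a stop codon).
CODON_TABLE = dict(zip(
    (a + b + c for a in BASES for b in BASES for c in BASES),
    AMINO_ACIDS,
))


def translate(mrna):
    """Translate from the first ATG until a stop codon or the sequence end."""
    seq = mrna.upper().replace("U", "T")
    start = seq.find("ATG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Invented 15-base sequence for illustration (not the real PB1 mRNA).
print(translate("atggatgtcaattga"))
```

The triplets atg, gat, gtc, and aat are read as M, D, V, and N, and translation stops at tga.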

Most of the influenza viral RNAs code for only one protein. However, RNA 2 and two other RNAs each code for two proteins. In the case of RNA 2, the second protein is made by translation of what is known as an overlapping reading frame.

On the second line of the RNA sequence in figure 2 is an atg highlighted in red. You can see that this atg is not in the reading frame of the PB1 protein. However, it is the start codon for the second protein encoded in RNA 2, the PB1-F2 protein (F2 stands for frame 2, because the protein is translated from the second open reading frame). Figure 3 shows how PB1-F2 is translated. The sequence of the viral RNA is shown from the beginning, except that reading frame 1, which begins at the first ATG, is not translated. Rather, we have begun translation with the internal atg, which is in the second reading frame. This open reading frame encodes the PB1-F2 protein which, in this case, is 90 amino acids in length (its length varies in different isolates). The protein is much shorter than PB1 because translation stops at a termination codon (tga) long before the end of the RNA. Because PB1-F2 is encoded in reading frame 2, its amino acid sequence is completely different from that of PB1.
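The effect of starting at an internal atg in a different reading frame can be shown with a small Python sketch. The 19-base sequence is invented for the example (it is not the 1918 PB1 sequence), and the names `translate_from` and `reading_frames` are made up; frames are numbered relative to the first ATG, which is an assumption of this sketch.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code in TCAG order ("*" marks a stop codon).
CODON_TABLE = dict(zip(
    (a + b + c for a in BASES for b in BASES for c in BASES),
    AMINO_ACIDS,
))


def translate_from(seq, start):
    """Translate triplets from index `start` until a stop codon or the end."""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def reading_frames(mrna):
    """Return (position, frame, protein) for every ATG in the sequence.

    Position is 1-based; frame 1 is the frame of the first ATG.
    """
    seq = mrna.upper().replace("U", "T")
    starts = [i for i in range(len(seq) - 2) if seq[i:i + 3] == "ATG"]
    if not starts:
        return []
    first = starts[0]
    return [(s + 1, (s - first) % 3 + 1, translate_from(seq, s)) for s in starts]


# Invented sequence with a second atg one base out of frame.
for position, frame, protein in reading_frames("atggatgtcaattgactaa"):
    print(position, frame, protein)
```

The first ATG (position 1, frame 1) gives MDVN, while the internal atg at position 5 lies in frame 2 and gives MSID: the same bases, read one position out of step, specify an entirely different set of amino acids.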

figure 2

figure 3

The sequences used for this example are from the 1918 H1N1 strain of influenza. Notice the amino acid of PB1-F2 which is highlighted in blue. This amino acid has an important role in the biological function of the protein, which we will consider in a future post.

My apologies if the figures and text are not optimally aligned. A blog post is not the optimal format for such information, but in the interest of time I have not explored other options. Suggestions for improvement are welcome.

Send your questions to