The development of high-throughput nucleic acid sequencing tools has rapidly increased the pace of virus discovery in the past 20 years. Yet in that time, while the largest known DNA viral genomes have increased by nearly ten times, the largest known RNA viral genome has only increased in size by a tenth. This situation has now changed with the discovery of new RNA viruses of planarians and mollusks.
Until very recently, the biggest RNA virus genome known was 33.5 kb (ball python nidovirus), which is much larger than the average-sized RNA virus genome of 10 kb. Most RNA virus genomes are small because RNA polymerases make errors, and most do not have proofreading capabilities. The nidoviruses encode a proofreading exoribonuclease, which improves replication fidelity and allows for larger genomes. But how much larger can these genomes get?
Even with a proofreading enzyme, the size of the biggest RNA virus genome is much smaller than the minimal cellular DNA genome, which is 200 kb. Why the difference? Is the inherent fragility of RNA a problem? But in the RNA World – when self-replicating RNA molecules existed before the evolution of DNA and proteins – didn’t RNA molecules achieve large sizes? Two new preprints show that we can find larger virus RNAs, suggesting that we have not yet reached the size limit.
A close study of the transcriptome of a planarian revealed a new nidovirus with an RNA genome of 41,103 nucleotides. Viral RNA was detected in cells of the secretory system in whole animals, leading the authors to call this virus Planarian Secretory Cell Nidovirus, or PSCNV. Virus-like particles were seen in planarians, but no attempts were made to isolate them and infect new animals.
The genome of PSCNV is unusual because it encodes a single, long open reading frame (ORF) of 13,556 amino acids. It is the longest viral ORF discovered so far. All the other known nidoviruses encode multiple ORFs. Phylogenetic analysis of known nidoviruses suggests that PSCNV arose from viruses with multiple ORFs, after which its single ORF expanded in size.
The other nidovirus with a large RNA genome was discovered by searching all the available RNA sequences of the mollusk Aplysia californica. With a simple nervous system of 20,000 neurons, the California sea hare has been studied as a model system in many laboratories. Aplysia californica nido-like virus (AcNV) has an RNA genome of 35,906 nucleotides with ORFs that encode two polyproteins.
Analysis of transcriptome data from multiple cell types suggests that AcNV infects multiple tissues, with a particular predilection for the central nervous system (CNS).
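To put these sizes in perspective, here is a small Python sketch that uses only the figures quoted in this post. It rests on two simplifying assumptions of mine: that 33.5 kb means exactly 33,500 nucleotides, and that each amino acid corresponds to three nucleotides with no stop codon counted. The function names are purely illustrative.

```python
# Genome and ORF sizes taken from the figures quoted in this post.
PREVIOUS_RECORD_NT = 33_500  # ball python nidovirus; assumes 33.5 kb == 33,500 nt
PSCNV_GENOME_NT = 41_103     # Planarian Secretory Cell Nidovirus
ACNV_GENOME_NT = 35_906      # Aplysia californica nido-like virus
PSCNV_ORF_AA = 13_556        # length of the single PSCNV ORF, in amino acids


def percent_increase(new_nt: int, old_nt: int) -> float:
    """Return how much larger new_nt is than old_nt, as a percentage."""
    return 100.0 * (new_nt - old_nt) / old_nt


def orf_coding_fraction(orf_aa: int, genome_nt: int) -> float:
    """Return the fraction of the genome covered by an ORF of orf_aa codons.

    Assumes three nucleotides per amino acid and ignores the stop codon.
    """
    return 3 * orf_aa / genome_nt


if __name__ == "__main__":
    pscnv_gain = percent_increase(PSCNV_GENOME_NT, PREVIOUS_RECORD_NT)
    acnv_gain = percent_increase(ACNV_GENOME_NT, PREVIOUS_RECORD_NT)
    coding = orf_coding_fraction(PSCNV_ORF_AA, PSCNV_GENOME_NT)

    print(f"PSCNV vs previous record: +{pscnv_gain:.1f}%")
    print(f"AcNV vs previous record:  +{acnv_gain:.1f}%")
    print(f"PSCNV ORF share of genome: {coding:.1%}")
```

Under those assumptions, PSCNV is roughly 23% larger than the previous record holder and AcNV about 7% larger, while the single PSCNV ORF accounts for about 99% of its genome.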
From the perspective of genome size, the discovery of PSCNV and AcNV suggests that viruses with even larger RNAs remain to be discovered. In both cases the viruses were identified from sequences that had been deposited in public databases. The obvious conclusion is that you get what you look for. Nevertheless, many organisms have not yet had their genomes sequenced and it is likely that many RNA viruses remain to be discovered in them. Declaring an upper limit on RNA genome size no longer seems reasonable if we have not sampled every species.
I wonder if PSCNV and AcNV have any effects on their hosts. PSCNV particles were observed in laboratory stocks of planarians, and sequences of both viruses were found in apparently normal animals. Can uninfected planarians or Aplysia be found, and if so what would be the effect of adding virus to them? This experiment would require obtaining virus stocks, which so far has not been reported for either virus. It’s time for virologists to step in and do their magic.