Epstein-Barr virus (EBV) is a human herpesvirus that infects more than 9 out of 10 adults. EBV was the first cancer-causing virus to be discovered when, in 1964, it was identified in a type of blood cancer known as Burkitt’s lymphoma. Since these early discoveries, EBV has been found to play roles in many additional common and rare cancers (e.g. Hodgkin’s lymphomas, nasopharyngeal carcinomas, AIDS-related lymphomas, and more). EBV causes lifelong infections, known as latent infections, where the virus exists in a dormant state that produces few, or no, proteins in order to “hide” from the immune system. The virus does, however, produce a number of ribonucleic acids (RNAs) that affect host white blood cells in order to maintain viral infection. It is through these interactions, in part, that infected white blood cells may become cancerous.
A combination of sequence analysis, RNA structural bioinformatics, and RNA-Seq was used to discover novel conserved structured RNAs in EBV (“Genome-wide analyses of Epstein-Barr virus reveal conserved RNA structures and a novel stable intronic sequence RNA” Moss, W.N. and Steitz, J.A. BMC Genomics 2013, 14:543). I was able to identify most known EBV non-coding RNAs (ncRNAs) as well as 249 novel regions in the EBV genome that, when transcribed, fold into unusually thermodynamically stable structures that are evolutionarily conserved within EBV (and, in some cases, also conserved in other herpesviruses). Conserved and stable RNA structures are the hallmarks of functional ncRNAs; thus, our work suggests a substantial fraction (~30%) of the EBV genome may generate RNAs with non-coding functions.
Of these novel structured RNAs, two are of particular interest. Within a repeat region (the EBV W repeats), a small (81 nt) intron (occurring 7X) accumulates to very high levels in EBV-infected cells. Indeed, this small stable intronic sequence (sis)RNA is the third most abundant EBV-generated RNA during latent infection, is highly conserved in sequence and structure between EBV and related herpesviruses, and likely mediates important roles in latency and, possibly, in EBV-related cancers. This region also contains a long (2,791 nt) intron that accumulates to high levels and, interestingly, contains extensive global, stable and conserved RNA structure. This structure includes a massive (586 nt) RNA hairpin that is the site of adenosine-to-inosine (A-to-I) editing.
As latent infection is marked by limited protein production, RNA represents a massive untapped reservoir of potential therapeutic targets. Additional studies of EBV non-coding RNAs (e.g. the sisRNAs) will fully characterize their structures and functions, which will facilitate the development of RNA-targeting drugs. For example, additional knowledge of the sisRNA-1 and -2 structures will allow for the rational design of small molecules, aptamers, or oligonucleotides that can bind and inhibit the functions of these likely important latency-associated RNAs.
Influenza virus is a serious threat to human health and one of the major killers of the 20th century (over 50 million deaths). One aspect of influenza virus biology that has yet to be fully characterized is the role of RNA structure. Bioinformatics scans for putative conserved structure were made throughout influenza A coding regions (“Identification of Potential Conserved RNA Secondary Structure throughout Influenza A Coding Regions,” Moss, W.N., Priore, S.F., and Turner, D.H., RNA (2011), 17, 991-1011). Predictions were based on nucleotide and codon sequence analysis, coupled with the identification of regions that showed unusual thermodynamic stability. Five of the twenty predicted structural regions occurred at or near known functional annotations and were then modeled based on the thousands of available influenza A sequences. When experimental data were obtained for one of the predicted regions, they greatly enriched our understanding of possible functions. In a region encompassing a 3′ splice site, either a pseudoknot or a hairpin was proposed based on the bioinformatics data.
Experimental analysis using biophysical and biochemical techniques revealed that both conformations are possible in solution, and that each conformation places splicing regulatory elements into very different structural contexts (“The 3′ Splice Site of Influenza A Segment 7 mRNA Can Exist in Two Conformations: A Pseudoknot and a Hairpin.” Moss, W.N., Dela-Moss, L.I., Kierzek, E., Kierzek, R., Priore, S.F., and Turner, D.H., PLoS ONE (2012), 7, e38323). Experiments thus allowed us to make a functional hypothesis that the observed conformational switch could influence splicing. This has implications for medicine: e.g. these structures could be targeted with small-molecule or oligonucleotide drugs.
The methods used to discover these RNA structures in influenza A were based on RNA folding thermodynamics. An analysis of the stabilities of influenza A RNA uncovered some very interesting trends. Folding thermodynamics strongly favored structure in the (+) sense coding RNA, whereas the (-) sense genomic RNA was found to be much less stable. Structure was globally conserved in four of the eight viral RNAs. Finally, the stability of structure was strongly dependent on the host-species specificity of the viral strain: avian strains were most stable, human strains were least stable, and swine strains fell in between (“Influenza A Virus Coding Regions Exhibit Host-Specific Global Ordered RNA Structure.” Priore, S.F., Moss, W.N., and Turner, D.H., PLoS ONE 7(4): e35989). Interestingly, the replication temperatures in each host are 41, 37, and 33°C for avian, swine and human, respectively. This finding may facilitate the rational attenuation of viral strains for vaccine production; this may be accomplished by altering RNA structural stability.
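One common way to put a number on "unusual thermodynamic stability" is a z-score that compares the predicted folding energy of a native sequence with the energies of shuffled versions of the same sequence. The sketch below illustrates that general idea only; it is not the procedure from the cited papers. The function names, the mononucleotide shuffle, and especially `toy_hairpin_energy` (a crude stand-in for a real folding program) are all assumptions made for illustration.

```python
import random
import statistics
from typing import Callable

# Watson-Crick and G-U wobble pairs, used only by the toy energy model below.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def toy_hairpin_energy(seq: str) -> float:
    """Hypothetical stand-in for a folding program: score -1 for every
    position i that can pair with its mirror position in a single hairpin.
    A real analysis would use a predicted minimum free energy instead."""
    n = len(seq)
    return -float(sum((seq[i], seq[n - 1 - i]) in PAIRS for i in range(n // 2)))


def shuffle_sequence(seq: str, rng: random.Random) -> str:
    """Mononucleotide shuffle: same base composition, random order."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def stability_z_score(
    seq: str,
    energy_fn: Callable[[str], float],
    n_shuffles: int = 100,
    seed: int = 0,
) -> float:
    """z = (E_native - mean(E_shuffled)) / stdev(E_shuffled).
    Negative values mean the native sequence is predicted to be more
    stable than shuffled sequences of the same composition."""
    rng = random.Random(seed)
    native = energy_fn(seq)
    controls = [energy_fn(shuffle_sequence(seq, rng)) for _ in range(n_shuffles)]
    spread = statistics.stdev(controls)
    if spread == 0:
        return 0.0  # all controls identical; no basis for a z-score
    return (native - statistics.mean(controls)) / spread


if __name__ == "__main__":
    hairpin = "GGGGGGGGAAAACCCCCCCC"  # made-up sequence, not viral data
    print(stability_z_score(hairpin, toy_hairpin_energy, n_shuffles=200))
```

In practice `energy_fn` would wrap the free energy reported by an RNA folding program, and the shuffling scheme would be chosen to preserve whatever sequence properties matter for the comparison (for coding regions, for example, the encoded protein).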
The R2 Retrotransposon
The R2 retrotransposon is an RNA transposon found throughout insect species. It site-specifically inserts its genome into host 28S rDNA loci. The method of genome insertion, known as target primed reverse transcription, was first discovered in the R2 from the silk moth Bombyx mori. During this process a copy of the R2 encoded protein binds towards the 5′ end and mediates the insertion process.
RNA structure was analyzed in the 5′ R2 RNA protein binding site (for the encoded R2 protein). Unlike influenza virus, where there are thousands of sequences, there are only five available R2 sequences, so incorporating experimental data was vitally important (“Secondary Structures for 5′ Regions of R2 Retrotransposon RNAs Reveal a Novel Conserved Pseudoknot and Regions that Evolve under Different Constraints,” Kierzek, E., Christensen, S.M., Eickbush, T.H., Kierzek, R., Turner, D.H., and Moss, W.N., J Mol Biol. (2009), 390, 428-42). In the R2 study, traditional techniques of biochemical structure probing were combined with a new method of interrogating structure via microarray technology and were applied towards the modeling. Interestingly, there were no conserved start codons in these sequences, and the region where protein coding begins to be conserved occurs after the uncapped 5′ end of the R2 RNA and numerous non-productive start and stop codons. The R2 structured region occurs at this transition point (from RNA only to protein coding only) and includes an interesting pseudoknot fold, which may be used in non-canonical translation initiation. The structural information derived from this study was made available to the scientific community and the public at large by creating an entry in the Rfam database that was also tied in to a Wikipedia page (“The R2 Retrotransposon RNA Families,” Moss, W.N., Eickbush, D.G., Lopez, M.J., Eickbush, T.H., and Turner, D.H., RNA Biology (2011), 8, 714-718).