Evolutionary Modeling and Prediction of Non-coding RNAs in Drosophila
Overview
We benchmarked phylogenetic grammar-based ncRNA gene prediction, experimenting with eight models of structural evolution and two genome alignment programs. We evaluated the models on alignments of twelve Drosophila genomes and found that ncRNA prediction performance can vary greatly between gene predictors and between subfamilies of ncRNA genes. Our false positive rate estimates are based on simulations that preserve local islands of conservation; with these simulations, we predict a higher false positive rate than previous computational ncRNA screens have reported. Using one of the tested prediction grammars, we provide an updated set of ncRNA predictions for D. melanogaster and compare them to previously published predictions and experimental data. Many of our predictions show correlations with protein-coding genes. We found a significant depletion of intergenic predictions near the 3' end of coding regions, as well as a depletion of predictions in the first intron of protein-coding genes. Some of our predictions are co-located with larger putative unannotated genes: for example, 17 of our predictions showing homology to the RFAM family snoR28 appear in a tandem array on the X chromosome, and the 4.5 kbp spanned by the predicted tandem array is contained within a FlyBase-annotated cDNA.