Utilizing Sequence Intrinsic Composition to Classify Protein-coding and Long Non-coding Transcripts

Overview
Specialty Biochemistry
Date 2013 Jul 30
PMID 23892401
Citations 947
Abstract

Classifying transcripts as protein-coding or non-coding is challenging, especially for transcripts reconstructed from high-throughput sequencing data of poorly annotated species. This study developed and evaluated a powerful signature tool, the Coding-Non-Coding Index (CNCI), which profiles adjoining nucleotide triplets to distinguish protein-coding from non-coding sequences independently of known annotations. CNCI is effective for classifying incomplete transcripts and sense-antisense pairs. Applied across species, CNCI offered highly accurate classification of transcripts assembled from whole-transcriptome sequencing data, demonstrated gene evolutionary divergence between vertebrates and invertebrates, or between plants, and provided a long non-coding RNA catalog of orangutan. CNCI software is available at http://www.bioinfo.org/software/cnci.
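To make "profiling adjoining nucleotide triplets" concrete, the sketch below counts pairs of adjacent, non-overlapping triplets in a reading frame and turns two sets of counts into a log-ratio table. It is not the CNCI implementation: the function names, the pseudocount, the log2 ratio, and the choice of taking the best of three frames are all assumptions made for illustration, and the abstract does not specify how CNCI scores or classifies a transcript.

```python
import math
from collections import Counter
from itertools import product

BASES = "ACGT"
TRIPLETS = ["".join(p) for p in product(BASES, repeat=3)]  # 64 triplets


def adjoining_triplet_counts(seq, frame=0):
    """Count pairs of adjacent, non-overlapping triplets in one frame."""
    seq = seq.upper().replace("U", "T")
    counts = Counter()
    # Step by 3 so each pair is (triplet i, triplet i+1) in the chosen frame.
    for i in range(frame, len(seq) - 5, 3):
        first, second = seq[i:i + 3], seq[i + 3:i + 6]
        if set(first + second) <= set(BASES):  # skip pairs with N, etc.
            counts[(first, second)] += 1
    return counts


def log_ratio_table(coding_seqs, noncoding_seqs, pseudocount=1.0):
    """Hypothetical scoring table: log2 of coding vs non-coding pair frequency."""
    coding, noncoding = Counter(), Counter()
    for s in coding_seqs:
        coding.update(adjoining_triplet_counts(s))
    for s in noncoding_seqs:
        noncoding.update(adjoining_triplet_counts(s))
    n_pairs = len(TRIPLETS) ** 2  # 64 x 64 possible pairs
    c_total = sum(coding.values()) + pseudocount * n_pairs
    n_total = sum(noncoding.values()) + pseudocount * n_pairs
    table = {}
    for a in TRIPLETS:
        for b in TRIPLETS:
            p_coding = (coding[(a, b)] + pseudocount) / c_total
            p_noncoding = (noncoding[(a, b)] + pseudocount) / n_total
            table[(a, b)] = math.log2(p_coding / p_noncoding)
    return table


def best_frame_score(seq, table):
    """Mean log-ratio per triplet pair, maximised over the three frames."""
    scores = []
    for frame in range(3):
        counts = adjoining_triplet_counts(seq, frame)
        n = sum(counts.values())
        total = sum(table[pair] * k for pair, k in counts.items())
        scores.append(total / n if n else 0.0)
    return max(scores)


if __name__ == "__main__":
    # Toy training data, invented purely to exercise the functions.
    table = log_ratio_table(["ATGGCC" * 10], ["TTTAAA" * 10])
    print(best_frame_score("ATGGCCATGGCC", table))
    print(best_frame_score("TTTAAATTTAAA", table))
```

Because the table depends only on sequence composition, a scheme of this kind needs no gene annotation for the species being scored, which matches the annotation-independent setting the abstract describes. A real classifier would be trained on curated coding and non-coding sets rather than the toy strings used here.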

Citing Articles

Interaction between the liver transcriptome and gut microbiota in mice during Toxoplasma gondii infection as identified by integrated transcriptomic and microbiome analysis.

Zou Y, Ma H, Yang X, Wei X, Chen C, Jiang J BMC Microbiol. 2025; 25(1):137.

PMID: 40087603 DOI: 10.1186/s12866-025-03852-5.


The long non-coding RNA MSTRG.32189-PcmiR399b- module regulates phosphate accumulation and disease resistance to in pear.

Yang Y, Lv S, Huang X, He Y, Zhang X, Liu Y Hortic Res. 2025; 12(4):uhae359.

PMID: 40066158 PMC: 11891480. DOI: 10.1093/hr/uhae359.


Comprehensive characterization of lncRNA N-methyladenosine modification dynamics throughout bovine skeletal muscle development.

Mao C, You W, Yang Y, Cheng H, Hu X, Lan X J Anim Sci Biotechnol. 2025; 16(1):36.

PMID: 40045371 PMC: 11884139. DOI: 10.1186/s40104-025-01164-2.


LTR retrotransposon-derived novel lncRNA2 enhances cold tolerance in Moso bamboo by modulating antioxidant activity and photosynthetic efficiency.

Zhao J, Ding Y, Ramakrishnan M, Zou L, Chen Y, Zhou M PeerJ. 2025; 13:e19056.

PMID: 40028216 PMC: 11871892. DOI: 10.7717/peerj.19056.


Full-length transcriptome analysis of a bloom-forming dinoflagellate Scrippsiella acuminata (Dinophyceae).

Li F, Yue C, Deng Y, Tang Y Sci Data. 2025; 12(1):352.

PMID: 40016213 PMC: 11868372. DOI: 10.1038/s41597-025-04699-1.

