LncDC: a Machine Learning-based Tool for Long Non-coding RNA Detection from RNA-Seq Data
Long non-coding RNAs (lncRNAs) play an essential role in diverse biological processes and disease development. Accurate classification of lncRNAs and mRNAs is important for identifying tissue- or disease-specific lncRNAs. Here, we present LncDC (Long non-coding RNA detection), a tool that predicts lncRNAs with an XGBoost model using features extracted from RNA sequences, secondary structures, and translated proteins. In benchmarking experiments, LncDC consistently outperformed six state-of-the-art tools in distinguishing lncRNAs from mRNAs. Notably, the sequence and secondary structure (SASS) k-mer score features and the flexible ORF features improved the classification capability of LncDC. We anticipate that LncDC will facilitate the discovery of novel disease-specific lncRNAs. LncDC is implemented in Python and freely available at https://github.com/lim74/LncDC.
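To make the idea of sequence-derived features concrete, the following is a minimal sketch of two generic feature types often used to separate coding from non-coding transcripts: normalized k-mer frequencies and the length of the longest open reading frame (ORF). It is not LncDC's implementation. The function names (`kmer_frequencies`, `longest_orf_length`, `extract_features`) are hypothetical, and the ORF definition used here (ATG to the first in-frame stop codon, forward strand only) is an assumption that may differ from the "flexible ORF" features mentioned in the abstract. Secondary-structure and protein-level features are omitted.

```python
from collections import Counter
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}


def _normalize(seq):
    """Upper-case the sequence and map RNA 'U' to DNA 'T'."""
    return seq.upper().replace("U", "T")


def kmer_frequencies(seq, k=3):
    """Return the normalized frequency of every A/C/G/T k-mer in seq.

    k-mers containing other symbols (e.g. N) are ignored.
    """
    seq = _normalize(seq)
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


def longest_orf_length(seq):
    """Length (nt, stop codon included) of the longest ATG..stop ORF.

    Assumption: only the three forward reading frames are scanned, and
    an ORF must end with an in-frame stop codon.
    """
    seq = _normalize(seq)
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def extract_features(seq, k=3):
    """Combine k-mer frequencies with two simple ORF features."""
    feats = kmer_frequencies(seq, k)
    orf_len = longest_orf_length(seq)
    feats["orf_length"] = orf_len
    feats["orf_coverage"] = orf_len / len(seq) if seq else 0.0
    return feats
```

A feature dictionary of this kind, computed per transcript, could then be arranged into a matrix and passed to a gradient-boosted classifier such as XGBoost. The choice of k and of additional features would need to be validated against labeled lncRNA and mRNA sets.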