
MiRBoost: Boosting Support Vector Machines for MicroRNA Precursor Classification

Overview
Journal RNA
Specialty Molecular Biology
Date 2015 Mar 22
PMID 25795417
Citations 12
Abstract

Identification of microRNAs (miRNAs) is an important step toward understanding post-transcriptional gene regulation and miRNA-related pathology. Difficulties in identifying miRNAs through experimental techniques, combined with the huge amount of data from new sequencing technologies, have made in silico discrimination of bona fide miRNA precursors from non-miRNA hairpin-like structures an important topic in bioinformatics. Among the various techniques developed for this classification problem, machine learning approaches have proved to be the most promising. However, these approaches require training data, which is problematic because the imbalance between the number of miRNAs (positive data) and non-miRNAs (negative data) degrades their performance. To address this issue, we present an ensemble method that uses a boosting technique with support vector machine components to deal with imbalanced training data. Classification is performed after feature selection on 187 novel and existing features. The algorithm, miRBoost, outperformed state-of-the-art methods on imbalanced human and cross-species data. It also showed the highest ability among the tested methods to discover novel miRNA precursors. In addition, miRBoost was over 1400 times faster than the second most accurate tool tested and was significantly faster than most of the other tools. miRBoost thus provides a good compromise between prediction efficiency and execution time, making it highly suitable for genome-wide miRNA precursor prediction. The miRBoost software is available on our web server http://EvryRNA.ibisc.univ-evry.fr.
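The abstract does not give the details of miRBoost's boosting scheme, kernel, or features, so the following is only an illustrative sketch of the general idea of boosting support vector machine components on imbalanced data, not the authors' implementation. It assumes a textbook AdaBoost loop, a linear SVM trained by subgradient descent on a weighted hinge loss, initial sample weights that give each class half of the total weight, and a synthetic two-feature data set; all function names and parameter values are hypothetical.

```python
import math
import random


def svm_score(w, b, x):
    """Raw linear SVM output w.x + b for one sample."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_weighted_svm(X, y, sample_w, epochs=20, lam=0.01, lr=0.1):
    """Fit a linear SVM by subgradient descent on the weighted hinge loss."""
    n = len(X)
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi, si in zip(X, y, sample_w):
            violated = yi * svm_score(w, b, xi) < 1
            w = [wj * (1 - lr * lam) for wj in w]  # L2 regularisation step
            if violated:
                step = lr * si * n  # heavier samples move the boundary more
                w = [wj + step * yi * xj for wj, xj in zip(w, xi)]
                b += step * yi
    return w, b


def boost_svms(X, y, rounds=5):
    """AdaBoost with linear SVM components; labels must be +1 or -1."""
    n_pos = sum(1 for t in y if t == 1)
    n_neg = len(y) - n_pos
    # Assumption: each class starts with half of the total weight,
    # so the rare positive class is not swamped by the negatives.
    weights = [0.5 / n_pos if t == 1 else 0.5 / n_neg for t in y]
    ensemble = []
    for _ in range(rounds):
        w, b = train_weighted_svm(X, y, weights)
        preds = [1 if svm_score(w, b, x) >= 0 else -1 for x in X]
        err = sum(s for s, p, t in zip(weights, preds, y) if p != t)
        if err >= 0.5:  # component no better than chance: stop boosting
            break
        err = max(err, 1e-10)  # avoid division by zero for a perfect component
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, w, b))
        # Increase the weight of misclassified samples, then renormalise.
        weights = [s * math.exp(-alpha * t * p)
                   for s, p, t in zip(weights, preds, y)]
        total = sum(weights)
        weights = [s / total for s in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of all SVM components."""
    vote = sum(a * (1 if svm_score(w, b, x) >= 0 else -1)
               for a, w, b in ensemble)
    return 1 if vote >= 0 else -1


# Synthetic imbalanced data: 20 positives versus 200 negatives.
rng = random.Random(0)


def cloud(center, n):
    return [[rng.gauss(center, 0.5), rng.gauss(center, 0.5)] for _ in range(n)]


X = cloud(2.0, 20) + cloud(-2.0, 200)
y = [1] * 20 + [-1] * 200
ensemble = boost_svms(X, y)
predictions = [predict(ensemble, x) for x in X]
recall = sum(p == 1 for p, t in zip(predictions, y) if t == 1) / 20
specificity = sum(p == -1 for p, t in zip(predictions, y) if t == -1) / 200
print("recall:", recall, "specificity:", specificity)
```

Reporting recall and specificity separately, rather than overall accuracy, is a deliberate choice in this sketch: with a 1:10 class ratio, a classifier that labels everything as negative would still reach about 91% accuracy.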

Citing Articles

Comparison and benchmark of deep learning methods for non-coding RNA classification.

Creux C, Zehraoui F, Radvanyi F, Tahi F PLoS Comput Biol. 2024; 20(9):e1012446.

PMID: 39264986 PMC: 11421803. DOI: 10.1371/journal.pcbi.1012446.


In silico analysis of SARS-CoV-2 genomes: Insights from SARS encoded non-coding RNAs.

Periwal N, Bhardwaj U, Sarma S, Arora P, Sood V Front Cell Infect Microbiol. 2022; 12:966870.

PMID: 36519126 PMC: 9742375. DOI: 10.3389/fcimb.2022.966870.


Integrative genomic phylogeography reveals signs of mitonuclear incompatibility in a natural hybrid goby population.

Hirase S, Tezuka A, Nagano A, Sato M, Hosoya S, Kikuchi K Evolution. 2020; 75(1):176-194.

PMID: 33165944 PMC: 7898790. DOI: 10.1111/evo.14120.


A Brief Survey for MicroRNA Precursor Identification Using Machine Learning Methods.

Guan Z, Li S, Zhang Z, Zhang D, Yang H, Ding H Curr Genomics. 2020; 21(1):11-25.

PMID: 32655294 PMC: 7324890. DOI: 10.2174/1389202921666200214125102.


CL-PMI: A Precursor MicroRNA Identification Method Based on Convolutional and Long Short-Term Memory Networks.

Wang H, Ma Y, Dong C, Li C, Wang J, Liu D Front Genet. 2019; 10:967.

PMID: 31681416 PMC: 6798641. DOI: 10.3389/fgene.2019.00967.

