
Ten Quick Tips for Machine Learning in Computational Biology

Overview
Journal BioData Min
Publisher Biomed Central
Specialty Biology
Date 2017 Dec 14
PMID 29234465
Citations 210
Abstract

Machine learning has become a pivotal tool for many projects in computational biology, bioinformatics, and health informatics. Nevertheless, beginners and biomedical researchers often lack the experience to run a data mining project effectively and may therefore follow incorrect practices that lead to common mistakes or over-optimistic results. In this review, we present ten quick tips for taking advantage of machine learning in any computational biology context while avoiding common errors that we have observed hundreds of times in multiple bioinformatics projects. We believe our ten suggestions can help any machine learning practitioner carry out a successful project in computational biology and related sciences.

Citing Articles

Distinctively black names and mechanisms of discrimination: Evidence from the early 20th century.

Castro C, Warren J, Helgertz J. Soc Sci Res. 2025; 126:103136.

PMID: 39909625 PMC: 11837947. DOI: 10.1016/j.ssresearch.2024.103136.


Research Trends and Dynamics in Single-cell RNA Sequencing for Musculoskeletal Diseases: A Scientometric and Visualization Study.

Cao S, Wei Y, Yue Y, Wang D, Xiong A, Yang J. Int J Med Sci. 2025; 22(3):528-550.

PMID: 39898252 PMC: 11783068. DOI: 10.7150/ijms.104697.


[18F]FDG PET/CT Radiomics in Cervical Cancer: A Systematic Review.

Hotton J, Beddok A, Moubtakir A, Papathanassiou D, Morland D. Diagnostics (Basel). 2025; 15(1.

PMID: 39795593 PMC: 11720459. DOI: 10.3390/diagnostics15010065.


Should Artificial Intelligence Play a Durable Role in Biomedical Research and Practice?

Bongrand P. Int J Mol Sci. 2025; 25(24.

PMID: 39769135 PMC: 11676049. DOI: 10.3390/ijms252413371.


ApisTox: a new benchmark dataset for the classification of small molecules toxicity on honey bees.

Adamczyk J, Poziemski J, Siedlecki P. Sci Data. 2025; 12(1):5.

PMID: 39747220 PMC: 11696378. DOI: 10.1038/s41597-024-04232-w.


References
1. Tarca A, Carey V, Chen X, Romero R, Draghici S. Machine learning and its applications to biology. PLoS Comput Biol. 2007; 3(6):e116. PMC: 1904382. DOI: 10.1371/journal.pcbi.0030116.

2. Chicco D, Masseroli M. Software Suite for Gene and Protein Annotation Prediction and Similarity Search. IEEE/ACM Trans Comput Biol Bioinform. 2015; 12(4):837-43. DOI: 10.1109/TCBB.2014.2382127.

3. Statnikov A, Wang L, Aliferis C. A comprehensive comparison of random forests and support vector machines for microarray-based cancer classification. BMC Bioinformatics. 2008; 9:319. PMC: 2492881. DOI: 10.1186/1471-2105-9-319.

4. Karimzadeh M, Hoffman M. Top considerations for creating bioinformatics software documentation. Brief Bioinform. 2017; 19(4):693-699. PMC: 6054259. DOI: 10.1093/bib/bbw134.

5. Prlic A, Procter J. Ten simple rules for the open development of scientific software. PLoS Comput Biol. 2012; 8(12):e1002802. PMC: 3516539. DOI: 10.1371/journal.pcbi.1002802.