A Guide to Machine Learning for Biologists

Overview
Date: 2021 Sep 14
PMID: 34518686
Citations: 464
Abstract

The expanding scale and inherent complexity of biological data have encouraged a growing use of machine learning in biology to build informative and predictive models of the underlying biological processes. All machine learning techniques fit models to data; however, the specific methods are quite varied and can at first glance seem bewildering. In this Review, we aim to provide readers with a gentle introduction to a few key machine learning techniques, including the most recently developed and widely used techniques involving deep neural networks. We describe how different techniques may be suited to specific types of biological data, and also discuss some best practices and points to consider when one is embarking on experiments involving machine learning. Some emerging directions in machine learning methodology are also discussed.

Citing Articles

Machine learning reveals glycolytic key gene in gastric cancer prognosis.

Li N, Zhang Y, Zhang Q, Jin H, Han M, Guo J Sci Rep. 2025; 15(1):8688.

PMID: 40082583 PMC: 11906761. DOI: 10.1038/s41598-025-93512-5.


Risk factors and prediction model of metabolic disorders in adult patients with pituitary stalk interruption syndrome.

Jiang D, Wang S, Xiao Y, Zhi P, Zheng E, Lyu Z Sci Rep. 2025; 15(1):7740.

PMID: 40044792 PMC: 11882961. DOI: 10.1038/s41598-025-91461-7.


Deep learning for hepatocellular carcinoma recurrence before and after liver transplantation: a multicenter cohort study.

Cao S, Yu S, Huang L, Seery S, Xia Y, Zhao Y Sci Rep. 2025; 15(1):7730.

PMID: 40044774 PMC: 11882823. DOI: 10.1038/s41598-025-91728-z.


The landscape of cell lineage tracing.

Feng Y, Liu G, Li H, Cheng L Sci China Life Sci. 2025; .

PMID: 40035969 DOI: 10.1007/s11427-024-2751-6.


Unifying fragmented perspectives with additive deep learning for high-dimensional models from partial faceted datasets.

Wu Y, Wu P, Chambliss A, Wirtz D, Sun S NPJ Biol Phys Mech. 2025; 2(1):5.

PMID: 40012561 PMC: 11850287. DOI: 10.1038/s44341-025-00009-3.

