
Mikropml: User-Friendly R Package for Supervised Machine Learning Pipelines

Overview
Date 2021 Aug 20
PMID 34414351
Citations 20
Abstract

Machine learning (ML) for classification and prediction based on a set of features is used to make decisions in healthcare, economics, criminal justice, and more. However, implementing an ML pipeline including preprocessing, model selection, and evaluation can be time-consuming, confusing, and difficult. Here, we present mikropml (pronounced "meek-ROPE em el"), an easy-to-use R package that implements ML pipelines using regression, support vector machines, decision trees, random forest, or gradient-boosted trees. The package is available on GitHub, CRAN, and conda.

Citing Articles

Is Short-Read 16S rRNA Sequencing of Oral Microbiome Sampling a Suitable Diagnostic Tool for Head and Neck Cancer?

Yeo K, Wu F, Li R, Smith E, Wormald P, Valentine R. Pathogens. 2024; 13(10).

PMID: 39452698 PMC: 11510575. DOI: 10.3390/pathogens13100826.


A cross-cohort analysis of dental plaque microbiome in early childhood caries.

Khan M, Fung D, Schroth R, Chelikani P, Hu P. iScience. 2024; 27(8):110447.

PMID: 39104404 PMC: 11298647. DOI: 10.1016/j.isci.2024.110447.


Gut community structure as a risk factor for infection in -colonized patients.

Vornhagen J, Rao K, Bachman M. mSystems. 2024; 9(8):e0078624.

PMID: 38975759 PMC: 11334466. DOI: 10.1128/msystems.00786-24.


Seed Imbibition and Metabolism Contribute Differentially to Initial Assembly of the Soybean Holobiont.

Gerna D, Clara D, Antonielli L, Mitter B, Roach T. Phytobiomes J. 2024; 8(1):21-33.

PMID: 38818306 PMC: 7616048. DOI: 10.1094/PBIOMES-03-23-0019-MF.


Identification of carbohydrate gene clusters obtained from in vitro fermentations as predictive biomarkers of prebiotic responses.

Kok C, Rose D, Cui J, Whisenhunt L, Hutkins R. BMC Microbiol. 2024; 24(1):183.

PMID: 38796418 PMC: 11127362. DOI: 10.1186/s12866-024-03344-y.

