Class Prediction and Feature Selection with Linear Optimization for Metagenomic Count Data

Overview
Journal PLoS One
Date 2013 Apr 5
PMID 23555553
Citations 4
Abstract

The amount of metagenomic data is growing rapidly, while computational methods for metagenome analysis are still in their infancy. Novel statistical learning tools are needed to predict associations between bacterial communities and disease phenotypes and to detect differentially abundant features. In this study, we present a statistical learning method for simultaneous association prediction and feature selection using count data from metagenomic samples drawn from two or more treatment populations. We developed a linear-programming-based support vector machine with L1 and joint L1,∞ penalties for binary and multiclass classification of metagenomic count data (metalinprog). We evaluated the method on several real and simulated datasets. The proposed method can simultaneously identify features and predict classes from metagenomic count data.
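To make the idea of a linear-programming SVM with an L1 penalty concrete, the following is a minimal sketch of the generic binary formulation, not the metalinprog implementation itself. It assumes the textbook L1-norm SVM (hinge-loss slacks plus an L1 penalty on the weights), solved with SciPy's `linprog`. The function name `l1_svm_lp`, the penalty weight `lam`, the `log1p` transform of the counts, and the simulated Poisson data are all illustrative assumptions that do not come from the paper.

```python
import numpy as np
from scipy.optimize import linprog


def l1_svm_lp(X, y, lam=1.0):
    """Generic L1-penalized linear SVM written as a linear program.

    minimize   sum(xi) + lam * sum(u + v)
    subject to y_i * ((u - v) @ x_i + b) >= 1 - xi_i,  with u, v, xi >= 0,
    where the weights are w = u - v and the intercept is b = bp - bm.
    This is an assumed textbook formulation, not the paper's exact model.
    """
    n, p = X.shape
    # Variable order: u (p), v (p), bp, bm, xi (n)
    c = np.concatenate([lam * np.ones(2 * p), [0.0, 0.0], np.ones(n)])
    Yx = y[:, None] * X
    # Margin constraints rearranged into the form A_ub @ z <= b_ub
    A_ub = np.hstack([-Yx, Yx, -y[:, None], y[:, None], -np.eye(n)])
    b_ub = -np.ones(n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    z = res.x
    w = z[:p] - z[p:2 * p]
    b = z[2 * p] - z[2 * p + 1]
    return w, b


# Simulated count table: 40 samples x 10 features, two classes of 20 samples.
# Only feature 0 is made differentially abundant (an assumption for the demo).
rng = np.random.default_rng(0)
y = np.repeat([-1.0, 1.0], 20)
counts = rng.poisson(5, size=(40, 10))
counts[y == 1, 0] += rng.poisson(40, size=20)

# log1p is one common way to tame count data; the paper may treat counts differently.
X = np.log1p(counts)

w, b = l1_svm_lp(X, y, lam=0.1)
pred = np.sign(X @ w + b)
accuracy = float(np.mean(pred == y))
print("largest-weight feature:", int(np.argmax(np.abs(w))))
print("training accuracy:", accuracy)
```

Splitting the weights into nonnegative parts `u` and `v` is what turns the absolute values of the L1 penalty into a linear objective. Features whose weight is driven to zero are effectively deselected, which is how classification and feature selection can happen in a single optimization.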

Citing Articles

Bacterial clade-specific analysis identifies distinct epithelial responses in inflammatory bowel disease.

D'Adamo G, Chonwerawong M, Gearing L, Marcelino V, Gould J, Rutten E Cell Rep Med. 2023; 4(7):101124.

PMID: 37467722 PMC: 10394256. DOI: 10.1016/j.xcrm.2023.101124.


Non-invasive monitoring of multiple wildlife health factors by fecal microbiome analysis.

Pannoni S, Proffitt K, Holben W Ecol Evol. 2022; 12(2):e8564.

PMID: 35154651 PMC: 8826075. DOI: 10.1002/ece3.8564.


Sparse support vector machines with L0 approximation for ultra-high dimensional omics data.

Liu Z, Elashoff D, Piantadosi S Artif Intell Med. 2019; 96:134-141.

PMID: 31164207 PMC: 6553498. DOI: 10.1016/j.artmed.2019.04.004.


Opportunities and obstacles for deep learning in biology and medicine.

Ching T, Himmelstein D, Beaulieu-Jones B, Kalinin A, Do B, Way G J R Soc Interface. 2018; 15(141).

PMID: 29618526 PMC: 5938574. DOI: 10.1098/rsif.2017.0387.
