
Zoo: Selecting Transcriptomic and Methylomic Biomarkers by Ensembling Animal-Inspired Swarm Intelligence Feature Selection Algorithms

Overview
Journal Genes (Basel)
Publisher MDPI
Date 2021 Nov 27
PMID 34828418
Citations 1
Abstract

Biological omics data such as transcriptomes and methylomes inherently follow the "large p, small n" paradigm, i.e., the number of features is much larger than the number of samples. A feature selection (FS) algorithm selects a subset of the transcriptomic or methylomic biomarkers in order to build a better prediction model. The hidden patterns in the FS solution space make it challenging to find a feature subset with satisfactory prediction performance. Swarm intelligence (SI) algorithms mimic the target-searching behaviors of various animals and have demonstrated promising capabilities in selecting features with good machine learning performance. Our study revealed that different SI-based feature selection algorithms contributed complementary searching capabilities in the FS solution space, and that their collaboration generated a better feature subset than the individual SI feature selection algorithms did. Nine SI-based feature selection algorithms were integrated to vote for the selected features, which were further refined by the dynamic recursive feature elimination framework. In most cases, the proposed Zoo algorithm outperformed the existing feature selection algorithms on transcriptomics and methylomics datasets.
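The vote-then-refine idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state the voting rule, so the `min_votes` threshold is an assumption, the three input subsets stand in for the outputs of hypothetical SI selectors, and a plain greedy backward elimination with a toy `score` function is swapped in for the dynamic recursive feature elimination framework, which is not reproduced here.

```python
from collections import Counter
from typing import Callable, Iterable, List, Sequence


def vote_features(subsets: Sequence[Iterable[int]], min_votes: int) -> List[int]:
    """Return the feature indices chosen by at least `min_votes` selectors."""
    votes = Counter()
    for subset in subsets:
        votes.update(set(subset))  # each selector votes at most once per feature
    return sorted(f for f, n in votes.items() if n >= min_votes)


def backward_eliminate(features: List[int],
                       score: Callable[[List[int]], float]) -> List[int]:
    """Greedy stand-in for the refinement step (NOT dRFE itself):
    drop a feature whenever removing it does not lower the score."""
    current = list(features)
    best = score(current)
    improved = True
    while improved and len(current) > 1:
        improved = False
        for f in list(current):
            trial = [g for g in current if g != f]
            s = score(trial)
            if s >= best:
                current, best, improved = trial, s, True
                break
    return current


# Hypothetical outputs of three selectors (the paper ensembles nine).
subsets = [{0, 3, 5, 7}, {3, 5, 8}, {1, 3, 5, 7}]
voted = vote_features(subsets, min_votes=2)  # [3, 5, 7]


def toy_score(fs: List[int]) -> float:
    """Toy objective (assumption): reward informative features, penalize size."""
    informative = {3, 5}
    return len(informative & set(fs)) - 0.1 * len(fs)


refined = backward_eliminate(voted, toy_score)  # [3, 5]
print(voted, refined)
```

In practice `score` would be something like the cross-validated accuracy of a classifier trained on the candidate subset; the toy objective only keeps the example self-contained.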

Citing Articles

Machine Learning Methods for Survival Analysis with Clinical and Transcriptomics Data of Breast Cancer.

Doan L, Angione C, Occhipinti A. Methods Mol Biol. 2022; 2553:325-393.

PMID: 36227551 DOI: 10.1007/978-1-0716-2617-7_16.
