
Feature Selection for Support Vector Regression Using a Genetic Algorithm

Overview
Journal: Biostatistics
Specialty: Public Health
Date: 2021 Sep 8
PMID: 34494086
Citations: 2
Abstract

Support vector regression (SVR) is particularly beneficial when the outcome and predictors are nonlinearly related. However, when many covariates are available, the method's flexibility can lead to overfitting and an overall loss in predictive accuracy. To overcome this drawback, we develop a feature selection method for SVR based on a genetic algorithm that iteratively searches across potential subsets of covariates to find those that yield the best performance according to a user-defined fitness function. In a simulation study, we evaluate the performance of our feature selection method for SVR and compare it to alternative methods, including LASSO and random forest. We find that our method yields higher predictive accuracy than SVR without feature selection. Our method outperforms LASSO when the relationship between covariates and outcome is nonlinear. Random forest performs equivalently to our method in some scenarios, but worse when covariates are correlated. We apply our method to predict donor kidney function 1 year after transplant using data from the United Network for Organ Sharing national registry.
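
The search described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the operators, so the binary tournament selection, single-point crossover, bit-flip mutation, elitism, and all default parameter values below are assumptions, as are the names `genetic_feature_search` and `toy_fitness`. To keep the example self-contained, the fitness function is a hypothetical stand-in that rewards three "informative" covariates and penalizes the rest; it does not fit an SVR.

```python
import random
from typing import Callable, Dict, List, Tuple

Mask = Tuple[int, ...]  # 1 = covariate included, 0 = excluded


def genetic_feature_search(
    n_features: int,
    fitness: Callable[[Mask], float],
    pop_size: int = 30,
    n_generations: int = 40,
    mutation_rate: float = 0.05,
    seed: int = 0,
) -> Tuple[Mask, float]:
    """Search covariate subsets with a simple GA; higher fitness is better."""
    if n_features < 2 or pop_size < 2:
        raise ValueError("need at least 2 features and a population of 2")
    rng = random.Random(seed)
    cache: Dict[Mask, float] = {}

    def score(mask: Mask) -> float:
        # Fitness may be expensive (e.g. a cross-validated fit), so memoize it.
        if mask not in cache:
            cache[mask] = fitness(mask)
        return cache[mask]

    # Start from the full covariate set (no selection) plus random subsets.
    population: List[Mask] = [tuple([1] * n_features)]
    while len(population) < pop_size:
        population.append(tuple(rng.randint(0, 1) for _ in range(n_features)))

    def tournament() -> Mask:
        # Binary tournament: the fitter of two random individuals is a parent.
        a, b = rng.sample(population, 2)
        return a if score(a) >= score(b) else b

    best = max(population, key=score)
    for _ in range(n_generations):
        children: List[Mask] = [best]  # elitism: the best subset always survives
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randint(1, n_features - 1)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation: toggle each covariate with small probability.
            child = tuple(b ^ 1 if rng.random() < mutation_rate else b for b in child)
            children.append(child)
        population = children
        best = max(population, key=score)
    return best, score(best)


# Hypothetical stand-in fitness: covariates 0, 3 and 7 are "informative";
# every other selected covariate costs 0.5. Not an SVR fit.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask: Mask) -> float:
    gain = sum(1.0 for j, bit in enumerate(mask) if bit and j in INFORMATIVE)
    cost = sum(0.5 for j, bit in enumerate(mask) if bit and j not in INFORMATIVE)
    return gain - cost


best_mask, best_score = genetic_feature_search(10, toy_fitness)
selected = [j for j, bit in enumerate(best_mask) if bit]
print(selected, best_score)
```

Because the full covariate set is placed in the initial population and the best subset is carried forward each generation, the returned subset never scores below the no-selection baseline under the chosen fitness. To apply the sketch to SVR, one plausible choice of `fitness` would be the negative cross-validated prediction error of an SVR fitted on the selected columns, for example with scikit-learn's `SVR` and `cross_val_score`; that choice is an assumption here, since the abstract only says the fitness function is user-defined.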

Citing Articles

Machine learning-driven prediction of medical expenses in triple-vessel PCI patients using feature selection.

Chen K, Huang Y, Liu C, Li S, Chen M. BMC Health Serv Res. 2025; 25(1):105.

PMID: 39833782 PMC: 11744989. DOI: 10.1186/s12913-025-12218-6.


Development and validation of an interpretable machine learning for mortality prediction in patients with sepsis.

He B, Qiu Z. Front Artif Intell. 2024; 7:1348907.

PMID: 39040922 PMC: 11262051. DOI: 10.3389/frai.2024.1348907.


Identification and verification of diagnostic biomarkers based on mitochondria-related genes related to immune microenvironment for preeclampsia using machine learning algorithms.

Huang P, Song Y, Yang Y, Bai F, Li N, Liu D. Front Immunol. 2024; 14:1304165.

PMID: 38259465 PMC: 10800455. DOI: 10.3389/fimmu.2023.1304165.
