Feature Selection and Cancer Classification Via Sparse Logistic Regression with the Hybrid L1/2 +2 Regularization

Overview
Journal PLoS One
Date 2016 May 3
PMID 27136190
Citations 18
Abstract

Cancer classification and feature (gene) selection play an important role in knowledge discovery in genomic data. Although logistic regression is one of the most popular classification methods, it does not by itself perform feature selection. In this paper, we presented a new hybrid L1/2 +2 regularization (HLR) function, a linear combination of the L1/2 and L2 penalties, to select the relevant genes in logistic regression. The HLR approach inherits desirable characteristics from both penalties: sparsity from L1/2, and the grouping effect from L2, whereby highly correlated variables enter or leave a model together. We also proposed a novel univariate HLR thresholding approach to update the estimated coefficients and developed a coordinate descent algorithm for the HLR-penalized logistic regression model. The empirical results and simulations indicate that the proposed method is highly competitive with several state-of-the-art methods.
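
To make the idea of a "linear combination of L1/2 and L2 penalties" concrete, the following is a minimal sketch of a hybrid-penalized logistic regression objective. The abstract does not state the exact parameterization, so the weights `lam` and `alpha`, the function names, and the omission of an intercept are assumptions made for illustration only; the thresholding operator and coordinate descent updates proposed in the paper are not reproduced here.

```python
import math


def hlr_penalty(beta, lam, alpha):
    """Hybrid penalty: lam * (alpha * sum|b|^(1/2) + (1 - alpha) * sum b^2).

    The (lam, alpha) weighting is an assumed parameterization, not
    necessarily the one used in the paper.
    """
    l_half = sum(math.sqrt(abs(b)) for b in beta)  # L1/2 term, promotes sparsity
    l_two = sum(b * b for b in beta)               # L2 term, promotes grouping
    return lam * (alpha * l_half + (1.0 - alpha) * l_two)


def logistic_loss(X, y, beta):
    """Mean negative log-likelihood for labels y in {0, 1} (no intercept)."""
    total = 0.0
    for row, label in zip(X, y):
        z = sum(x * b for x, b in zip(row, beta))
        # Numerically stable log(1 + exp(z))
        softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
        total += softplus - label * z
    return total / len(y)


def hlr_objective(X, y, beta, lam, alpha):
    """Penalized objective: logistic loss plus the hybrid penalty."""
    return logistic_loss(X, y, beta) + hlr_penalty(beta, lam, alpha)
```

Under this assumed weighting, `alpha = 1` leaves only the L1/2 term and `alpha = 0` leaves only the L2 (ridge) term, so intermediate values trade sparsity against the grouping effect.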

Citing Articles

Deep-GenMut: Automated genetic mutation classification in oncology: A deep learning comparative study.

Elsamahy E, Ahmed A, Shoala T, Maghraby F. Heliyon. 2024; 10(11):e32279.

PMID: 38912449 PMC: 11190593. DOI: 10.1016/j.heliyon.2024.e32279.


Chronic Obstructive Pulmonary Disease: Novel Genes Detection with Penalized Logistic Regression.

Gohari K, Kazemnejad A, Mostafaei S, Saberi S, Sheidaei A. Cell J. 2023; 25(3):203-211.

PMID: 37038700 PMC: 10105299. DOI: 10.22074/cellj.2022.557389.1048.


Prediction models with graph kernel regularization for network data.

Liu J, Chen H, Yang Y. J Appl Stat. 2023; 50(6):1400-1417.

PMID: 37025276 PMC: 10071950. DOI: 10.1080/02664763.2022.2028745.


Identification of biomarkers predictive of metastasis development in early-stage colorectal cancer using network-based regularization.

Peixoto C, Lopes M, Martins M, Casimiro S, Sobral D, Grosso A. BMC Bioinformatics. 2023; 24(1):17.

PMID: 36647008 PMC: 9841719. DOI: 10.1186/s12859-022-05104-z.


A novel meta-analysis based on data augmentation and elastic data shared lasso regularization for gene expression.

Huang H, Rao H, Miao R, Liang Y. BMC Bioinformatics. 2022; 23(Suppl 10):353.

PMID: 35999505 PMC: 9396780. DOI: 10.1186/s12859-022-04887-5.

