
A Prostate Cancer Model Build by a Novel SVM-ID3 Hybrid Feature Selection Method Using Both Genotyping and Phenotype Data from DbGaP

Overview
Journal: PLoS One
Date: 2014 Mar 22
PMID: 24651484
Citations: 5
Abstract

Genome-wide association studies (GWAS) make it possible to investigate many relations between single nucleotide polymorphisms (SNPs) and complex diseases. GWAS output can be both large in volume and high dimensional, and the relations among SNPs, phenotypes, and diseases are most likely nonlinear. To handle high-volume, high-dimensional data and to find these nonlinear relations, we used data mining approaches and designed a hybrid feature selection model that combines a support vector machine and a decision tree. The model was tested on prostate cancer data, and for the first time combined genotype and phenotype information was used to increase diagnostic performance. We were able to select phenotypic features such as ethnicity and body mass index, as well as SNPs that map to specific genes such as CRR9 and TERT. The performance of the proposed hybrid model on the prostate cancer dataset, with a sensitivity of 90.92% and an area under the ROC curve of 0.91, shows the potential of the approach for prediction and early detection of prostate cancer.
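As a rough illustration of the decision-tree half of such a hybrid, the sketch below ranks categorical features by information gain, the splitting criterion that ID3 uses. It is not the authors' pipeline: the feature names (`snp_a`, `snp_b`, `bmi_class`), the genotype codes, and the eight case/control labels are hypothetical toy values chosen only to make the calculation easy to follow, and the SVM stage is omitted entirely.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy reduction obtained by splitting the labels on one categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the label subsets produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: 1 = case, 0 = control (not taken from dbGaP).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "snp_a": ["AA", "AA", "AG", "AA", "GG", "GG", "AG", "GG"],
    "snp_b": ["CC", "CT", "CC", "CT", "CC", "CT", "CC", "CT"],
    "bmi_class": ["high", "high", "high", "normal",
                  "normal", "normal", "normal", "high"],
}

# Rank features from most to least informative, as an ID3 root split would.
ranking = sorted(features,
                 key=lambda name: information_gain(features[name], labels),
                 reverse=True)
print(ranking)
```

In this toy setting a genotype feature and a phenotype feature are scored on the same scale, which is one simple way to see how combined genotype and phenotype data could enter a single feature ranking.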

Citing Articles

Identifying Critical States of Complex Diseases by Single-Sample Jensen-Shannon Divergence.

Yan J, Li P, Gao R, Li Y, Chen L. Front Oncol. 2021; 11:684781.

PMID: 34150649 PMC: 8212786. DOI: 10.3389/fonc.2021.684781.


Precise diagnosis of three top cancers using dbGaP data.

Liu X, Liu X, Rong J, Gao F, Wu Y, Deng C. Sci Rep. 2021; 11(1):823.

PMID: 33436913 PMC: 7804208. DOI: 10.1038/s41598-020-80832-x.


Building a genetic risk model for bipolar disorder from genome-wide association data with random forest algorithm.

Chuang L, Kuo P. Sci Rep. 2017; 7:39943.

PMID: 28045094 PMC: 5206749. DOI: 10.1038/srep39943.


FHSA-SED: Two-Locus Model Detection for Genome-Wide Association Study with Harmony Search Algorithm.

Tuo S, Zhang J, Yuan X, Zhang Y, Liu Z. PLoS One. 2016; 11(3):e0150669.

PMID: 27014873 PMC: 4807955. DOI: 10.1371/journal.pone.0150669.


Incorporation of personal single nucleotide polymorphism (SNP) data into a national level electronic health record for disease risk assessment, part 3: an evaluation of SNP incorporated national health information system of Turkey for prostate cancer.

Beyan T, Aydin Son Y. JMIR Med Inform. 2015; 2(2):e21.

PMID: 25600087 PMC: 4288064. DOI: 10.2196/medinform.3560.
