
A Comparison of Classification Methods for Predicting Chronic Fatigue Syndrome Based on Genetic Data

Overview
Journal: J Transl Med
Publisher: BioMed Central
Date: 2009 Sep 24
PMID: 19772600
Citations: 34
Abstract

Background: In genomic association studies of disease susceptibility, it is essential to select a small number of genes that are more significant than the others. In this work, our goal was to compare computational tools with and without feature selection for predicting chronic fatigue syndrome (CFS) from genetic factors such as single nucleotide polymorphisms (SNPs).

Methods: We used the original dataset from a previous study by the CDC Chronic Fatigue Syndrome Research Group. To uncover relationships between CFS and SNPs, we applied three classification algorithms: naive Bayes, the support vector machine, and the C4.5 decision tree. We also used two feature selection methods to identify a subset of influential SNPs: a hybrid approach combining the chi-squared and information-gain methods, and a wrapper-based approach.
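As an illustration of the hybrid filter idea, the sketch below scores each SNP with the chi-squared statistic and with information gain, then keeps the SNPs that rank in the top k under both. The abstract does not say how the two scores were combined, so the intersection rule, the function names, the 0/1/2 genotype coding, and the toy data are all assumptions made for this example, not the study's procedure or data.

```python
import math
from collections import Counter

def chi_squared(feature, labels):
    """Pearson chi-squared statistic of the genotype-by-class contingency table."""
    n = len(labels)
    f_counts, l_counts = Counter(feature), Counter(labels)
    joint = Counter(zip(feature, labels))
    stat = 0.0
    for f, nf in f_counts.items():
        for l, nl in l_counts.items():
            expected = nf * nl / n
            stat += (joint.get((f, l), 0) - expected) ** 2 / expected
    return stat

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def information_gain(feature, labels):
    """Reduction in class entropy after splitting on the genotype."""
    n = len(labels)
    conditional = 0.0
    for f, nf in Counter(feature).items():
        subset = [l for x, l in zip(feature, labels) if x == f]
        conditional += (nf / n) * entropy(subset)
    return entropy(labels) - conditional

def hybrid_select(columns, labels, k):
    """Keep SNPs in the top k by BOTH scores (assumed combination rule)."""
    def top_k(score):
        ranked = sorted(columns, key=lambda name: score(columns[name], labels),
                        reverse=True)
        return set(ranked[:k])
    return sorted(top_k(chi_squared) & top_k(information_gain))

# Made-up toy data: 8 subjects, genotypes coded 0/1/2, label 1 = case.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
snps = {
    "snp_a": [2, 2, 2, 1, 0, 0, 0, 1],
    "snp_b": [0, 1, 2, 0, 1, 2, 0, 1],
    "snp_c": [1, 1, 1, 1, 0, 0, 0, 0],
}
selected = hybrid_select(snps, labels, k=2)
print(selected)
```

On this toy input, `snp_b` carries almost no class information and is filtered out by both scores. A filter like this never trains a classifier, which makes it cheap but blind to interactions between SNPs.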

Results: Among the predictive models, the naive Bayes model with the wrapper-based approach performed best at inferring disease susceptibility from the complex relationship between CFS and SNPs.
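A wrapper-based method evaluates candidate SNP subsets by training and scoring the classifier itself. The sketch below pairs a small categorical naive Bayes with greedy forward selection scored by leave-one-out accuracy. The search strategy, the Laplace smoothing, the evaluation scheme, the function names, and the toy data are assumptions for illustration; the abstract does not specify them.

```python
import math
from collections import Counter

def nb_predict(train_rows, train_labels, row, n_levels=3):
    """Categorical naive Bayes with Laplace smoothing; returns the likeliest class."""
    class_counts = Counter(train_labels)
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        n_c = class_counts[label]
        score = math.log(n_c / len(train_labels))  # log prior
        for j, value in enumerate(row):
            matches = sum(1 for r, l in zip(train_rows, train_labels)
                          if l == label and r[j] == value)
            score += math.log((matches + 1) / (n_c + n_levels))  # smoothed likelihood
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def loo_accuracy(columns, labels, subset):
    """Leave-one-out accuracy of naive Bayes restricted to the SNPs in `subset`."""
    rows = [tuple(columns[name][i] for name in subset) for i in range(len(labels))]
    correct = 0
    for i in range(len(labels)):
        train_rows = rows[:i] + rows[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        if nb_predict(train_rows, train_labels, rows[i]) == labels[i]:
            correct += 1
    return correct / len(labels)

def forward_select(columns, labels):
    """Greedily add the SNP that most improves accuracy; stop when none does."""
    selected, best = [], 0.0
    remaining = sorted(columns)
    while remaining:
        scored = [(loo_accuracy(columns, labels, selected + [name]), name)
                  for name in remaining]
        acc, name = max(scored, key=lambda pair: pair[0])
        if acc <= best:
            break
        selected.append(name)
        remaining.remove(name)
        best = acc
    return selected, best

# Made-up toy data: 8 subjects, genotypes coded 0/1/2, label 1 = case.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
snps = {
    "snp_a": [2, 2, 2, 1, 0, 0, 0, 1],
    "snp_b": [0, 1, 2, 0, 1, 2, 0, 1],
    "snp_c": [1, 1, 1, 1, 0, 0, 0, 0],
}
selected, accuracy = forward_select(snps, labels)
print(selected, accuracy)
```

Because every candidate subset requires retraining the classifier, a wrapper costs far more than a filter, but it selects SNPs for the specific model that will be used for prediction.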

Conclusion: We demonstrated that our approach is a promising method to assess the associations between CFS and SNPs.

Citing Articles

Risk score prediction model based on single nucleotide polymorphism for predicting malaria: a machine learning approach.

Tai K, Dhaliwal J, Wong K. BMC Bioinformatics. 2022; 23(1):325.

PMID: 35934714 PMC: 9358850. DOI: 10.1186/s12859-022-04870-0.


Machine Learning and Deep Learning for the Pharmacogenomics of Antidepressant Treatments.

Lin E, Lin C, Lane H. Clin Psychopharmacol Neurosci. 2021; 19(4):577-588.

PMID: 34690113 PMC: 8553527. DOI: 10.9758/cpn.2021.19.4.577.


Deep Learning with Neuroimaging and Genomics in Alzheimer's Disease.

Lin E, Lin C, Lane H. Int J Mol Sci. 2021; 22(15).

PMID: 34360676 PMC: 8347529. DOI: 10.3390/ijms22157911.


Prediction of Probable Major Depressive Disorder in the Taiwan Biobank: An Integrated Machine Learning and Genome-Wide Analysis Approach.

Lin E, Kuo P, Lin W, Liu Y, Yang A, Tsai S. J Pers Med. 2021; 11(7).

PMID: 34202750 PMC: 8308113. DOI: 10.3390/jpm11070597.


Prediction of functional outcomes of schizophrenia with genetic biomarkers using a bagging ensemble machine learning method with feature selection.

Lin E, Lin C, Lane H. Sci Rep. 2021; 11(1):10179.

PMID: 33986383 PMC: 8119477. DOI: 10.1038/s41598-021-89540-6.

