
Clinic-genomic Association Mining for Colorectal Cancer Using Publicly Available Datasets

Overview
Journal: Biomed Res Int
Publisher: Wiley
Date: 2014 Jul 3
PMID: 24987669
Citations: 5
Abstract

In recent years, a growing number of researchers have begun to focus on establishing associations between clinical and genomic data. To date, however, little research has mined clinic-genomic associations by comprehensively analysing the available gene expression data for a single disease. Colorectal cancer is a malignant tumour, and a number of genetic syndromes have been proven to be associated with it. This paper presents our research on mining clinic-genomic associations for colorectal cancer in a biomedical big data environment. The proposed method combines several techniques: extracting clinical concepts using the Unified Medical Language System (UMLS), extracting genes through literature mining, and mining clinic-genomic associations through statistical analysis. We applied this method to datasets extracted from both the Gene Expression Omnibus (GEO) and the Genetic Association Database (GAD). A total of 23,517 clinic-genomic associations between 139 clinical concepts and 7,914 genes were obtained, of which 3,474 associations between 31 clinical concepts and 1,689 genes were identified as highly reliable. Evaluation and interpretation were performed using UMLS, KEGG, and Gephi, and potential new discoveries were explored. The proposed method is effective in mining valuable knowledge from available biomedical big data and performs well in bridging clinical data with genomic data for colorectal cancer.
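The abstract does not say which statistical test was used to link clinical concepts to genes, so the following is only a minimal sketch of what such an association step could look like. It assumes a one-sided Fisher's exact test on a 2x2 contingency table and uses invented toy data; the names `fisher_right_tail`, `mine_associations`, `GENE_A`, `GENE_B`, and the 0.05 threshold are all hypothetical and are not taken from the paper.

```python
from math import comb


def fisher_right_tail(a, b, c, d):
    """One-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    a: samples with the clinical concept and the gene flagged
    b: samples with the concept, gene not flagged
    c: samples without the concept, gene flagged
    d: samples with neither
    """
    n = a + b + c + d
    row1 = a + b          # samples annotated with the concept
    col1 = a + c          # samples in which the gene is flagged
    denom = comb(n, col1)
    upper = min(row1, col1)
    # Sum hypergeometric probabilities of tables at least as extreme as observed.
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / denom


def mine_associations(samples, alpha=0.05):
    """Return (concept, gene, p) triples whose p-value is at most alpha.

    samples: list of (set_of_concepts, set_of_flagged_genes), one per sample.
    No multiple-testing correction is applied in this sketch.
    """
    concepts = set().union(*(cs for cs, _ in samples))
    genes = set().union(*(gs for _, gs in samples))
    results = []
    for concept in sorted(concepts):
        for gene in sorted(genes):
            a = b = c = d = 0
            for cs, gs in samples:
                if concept in cs and gene in gs:
                    a += 1
                elif concept in cs:
                    b += 1
                elif gene in gs:
                    c += 1
                else:
                    d += 1
            p = fisher_right_tail(a, b, c, d)
            if p <= alpha:
                results.append((concept, gene, p))
    return results


# Toy data (assumption): eight samples, one clinical concept, two genes.
samples = [
    ({"metastasis"}, {"GENE_A"}),
    ({"metastasis"}, {"GENE_A"}),
    ({"metastasis"}, {"GENE_A", "GENE_B"}),
    ({"metastasis"}, {"GENE_A"}),
    (set(), {"GENE_B"}),
    (set(), set()),
    (set(), {"GENE_B"}),
    (set(), set()),
]

for concept, gene, p in mine_associations(samples):
    print(f"{concept} ~ {gene}: p = {p:.4f}")
```

In this toy data only the pair of `metastasis` and `GENE_A` passes the threshold. With thousands of genes and over a hundred concepts, as in the reported results, a real analysis would also need some form of multiple-testing correction, which this sketch leaves out.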

Citing Articles

Text-mining in cancer research may help identify effective treatments.

Hsiao Y, Lu T. Transl Lung Cancer Res. 2020; 8(Suppl 4):S460-S463.

PMID: 32038938 PMC: 6987358. DOI: 10.21037/tlcr.2019.12.20.


A Comparison of High Dimensional Variable Selection Methods with Missing Covariates in a Prostate Cancer Study.

Chen C, Zhao J, Miecznikowski J, Markatou M. Commun Stat Case Stud Data Anal Appl. 2019; 4(2):82-95.

PMID: 31867439 PMC: 6924953. DOI: 10.1080/23737484.2018.1521315.


ERICH3 in Primary Cilia Regulates Cilium Formation and the Localisations of Ciliary Transport and Sonic Hedgehog Signaling Proteins.

Alsolami M, Kuhns S, Alsulami M, Blacque O. Sci Rep. 2019; 9(1):16519.

PMID: 31712586 PMC: 6848114. DOI: 10.1038/s41598-019-52830-1.


Quantitative Proteomic Analysis of Human Airway Cilia Identifies Previously Uncharacterized Proteins of High Abundance.

Blackburn K, Bustamante-Marin X, Yin W, Goshe M, Ostrowski L. J Proteome Res. 2017; 16(4):1579-1592.

PMID: 28282151 PMC: 5733142. DOI: 10.1021/acs.jproteome.6b00972.


Big Data and Comparative Effectiveness Research in Radiation Oncology: Synergy and Accelerated Discovery.

Trifiletti D, Showalter T. Front Oncol. 2015; 5:274.

PMID: 26697409 PMC: 4672039. DOI: 10.3389/fonc.2015.00274.
