
Text Mining in Cancer Gene and Pathway Prioritization

Overview
Journal Cancer Inform
Publisher Sage Publications
Date 2014 Nov 14
PMID 25392685
Citations 17
Abstract

Prioritization of cancer-implicated genes has received growing attention as an effective way to reduce wet-lab cost: computational analysis ranks candidate genes according to the likelihood that experimental verification will succeed. A multitude of gene prioritization tools have been developed, each integrating different data sources covering gene sequences, differential expression, functional annotations, gene regulation, protein domains, protein interactions, and pathways. This review places existing gene prioritization tools against the backdrop of an integrative Omic hierarchy view of cancer and focuses on the analysis of their text mining components. We explain the relatively slow progress of text mining in gene prioritization, identify several challenges to current text mining methods, and highlight a few directions where more effective text mining algorithms may improve the overall prioritization task and where prioritizing pathways may be more desirable than prioritizing genes alone.
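As a rough illustration of the kind of text mining component the review discusses, the sketch below ranks candidate genes by how many abstracts mention both the gene symbol and a disease term. This is a minimal co-occurrence baseline written for illustration only; it is not the method of any tool covered by the review. The function name `rank_genes_by_cooccurrence`, the toy abstracts, and the candidate list are all hypothetical.

```python
import re


def rank_genes_by_cooccurrence(abstracts, candidate_genes, disease_term):
    """Rank genes by the number of abstracts that mention both gene and disease.

    Hypothetical helper: a naive co-occurrence count, assumed here only as
    a stand-in for a real text mining component.
    """
    # Disease terms are matched case-insensitively as whole phrases.
    disease_pattern = re.compile(
        r"\b" + re.escape(disease_term) + r"\b", re.IGNORECASE
    )
    # Gene symbols are matched case-sensitively as whole tokens.
    gene_patterns = {
        gene: re.compile(r"\b" + re.escape(gene) + r"\b")
        for gene in candidate_genes
    }
    counts = {gene: 0 for gene in candidate_genes}

    for text in abstracts:
        if not disease_pattern.search(text):
            continue  # abstract does not mention the disease at all
        for gene, pattern in gene_patterns.items():
            if pattern.search(text):
                counts[gene] += 1

    # Highest count first; ties broken alphabetically for a stable ranking.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Toy corpus (invented sentences, not real abstracts).
abstracts = [
    "TP53 mutations are frequent in breast cancer.",
    "BRCA1 and TP53 are both implicated in hereditary breast cancer.",
    "BRCA1 loss impairs DNA repair in yeast models.",
    "EGFR signalling drives lung tumours.",
]
candidates = ["TP53", "BRCA1", "EGFR"]

ranking = rank_genes_by_cooccurrence(abstracts, candidates, "breast cancer")
print(ranking)
```

A baseline like this ignores gene synonyms, ambiguous symbols, and negated statements, which are among the reasons simple co-occurrence is usually combined with other evidence sources in a prioritization pipeline.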

Citing Articles

Text mining of verbal autopsy narratives to extract mortality causes and most prevalent diseases using natural language processing.

Mapundu M, Kabudula C, Musenge E, Olago V, Celik T. PLoS One. 2024; 19(9):e0308452.

PMID: 39298425 PMC: 11412533. DOI: 10.1371/journal.pone.0308452.


Machine Learning for Lung Cancer Diagnosis, Treatment, and Prognosis.

Li Y, Wu X, Yang P, Jiang G, Luo Y. Genomics Proteomics Bioinformatics. 2022; 20(5):850-866.

PMID: 36462630 PMC: 10025752. DOI: 10.1016/j.gpb.2022.11.003.


Deep learning for cancer type classification and driver gene identification.

Zeng Z, Mao C, Vo A, Li X, Nugent J, Khan S. BMC Bioinformatics. 2021; 22(Suppl 4):491.

PMID: 34689757 PMC: 8543824. DOI: 10.1186/s12859-021-04400-4.


DES-ROD: Exploring Literature to Develop New Links between RNA Oxidation and Human Diseases.

Essack M, Salhi A, Van Neste C, Raies A, Tifratene F, Uludag M. Oxid Med Cell Longev. 2020; 2020:5904315.

PMID: 32308806 PMC: 7142358. DOI: 10.1155/2020/5904315.


Cancer classification and pathway discovery using non-negative matrix factorization.

Zeng Z, Vo A, Mao C, Clare S, Khan S, Luo Y. J Biomed Inform. 2019; 96:103247.

PMID: 31271844 PMC: 6697569. DOI: 10.1016/j.jbi.2019.103247.

