
Co-clustering Phenome-genome for Phenotype Classification and Disease Gene Discovery

Overview
Specialty Biochemistry
Date 2012 Jun 28
PMID 22735708
Citations 37
Abstract

Understanding how human diseases are categorized is critical for reliably identifying disease-causal genes. Recently, genome-wide studies of abnormal chromosomal locations related to diseases have mapped >2000 phenotype-gene relations, which provide valuable information for classifying diseases and identifying candidate genes as drug targets. In this article, a regularized non-negative matrix tri-factorization (R-NMTF) algorithm is introduced to co-cluster phenotypes and genes and, at the same time, detect associations between the resulting phenotype clusters and gene clusters. The R-NMTF algorithm factorizes the phenotype-gene association matrix using prior knowledge from a phenotype similarity network and a protein-protein interaction network, and is supervised by label information from known disease classes and biological pathways. In experiments on disease phenotype-gene associations in OMIM and on KEGG disease pathways, R-NMTF significantly improved the classification of disease phenotypes and disease pathway genes compared with support vector machines and label propagation, as measured by cross-validation on the annotated phenotypes and genes. The newly predicted phenotypes in each disease class are highly consistent with Human Phenotype Ontology annotations. The roles of the new member genes in the disease pathways are examined and validated in protein-protein interaction subnetworks. An extensive literature review also confirmed many new members of the disease classes and pathways, as well as the predicted associations between disease phenotype classes and pathways.
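The factorization described in the abstract can be illustrated with a minimal sketch. It assumes a common graph-regularized form of non-negative matrix tri-factorization, X ≈ F S Gᵀ, fitted with standard multiplicative updates; it is not the authors' R-NMTF implementation and it omits the label supervision from disease classes and pathways. The function name, the weights `alpha` and `beta`, and the toy matrices are all hypothetical.

```python
import numpy as np


def graph_regularized_nmtf(X, A_p, A_g, k_p, k_g,
                           alpha=0.1, beta=0.1, n_iter=200, seed=0, eps=1e-9):
    """Sketch of graph-regularized NMTF (assumed form, not the paper's code).

    Minimizes ||X - F S G^T||_F^2 + alpha*tr(F^T L_p F) + beta*tr(G^T L_g G),
    where L = D - A is the Laplacian of each similarity network.
    X   : phenotype-by-gene association matrix (non-negative)
    A_p : phenotype similarity network (symmetric, non-negative)
    A_g : gene (e.g. protein-protein interaction) network (symmetric, non-negative)
    """
    rng = np.random.default_rng(seed)
    n_p, n_g = X.shape
    F = rng.random((n_p, k_p)) + eps   # phenotype cluster indicators
    S = rng.random((k_p, k_g)) + eps   # cluster-to-cluster associations
    G = rng.random((n_g, k_g)) + eps   # gene cluster indicators
    D_p = np.diag(A_p.sum(axis=1))
    D_g = np.diag(A_g.sum(axis=1))

    def objective():
        fit = np.linalg.norm(X - F @ S @ G.T) ** 2
        reg_p = alpha * np.trace(F.T @ (D_p - A_p) @ F)
        reg_g = beta * np.trace(G.T @ (D_g - A_g) @ G)
        return fit + reg_p + reg_g

    history = [objective()]  # objective value before any update
    for _ in range(n_iter):
        # Multiplicative updates keep every factor non-negative.
        F *= (X @ G @ S.T + alpha * A_p @ F) / (
            F @ S @ G.T @ G @ S.T + alpha * D_p @ F + eps)
        G *= (X.T @ F @ S + beta * A_g @ G) / (
            G @ S.T @ F.T @ F @ S + beta * D_g @ G + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
        history.append(objective())
    return F, S, G, history


def block_network(labels):
    """Toy similarity network: nodes sharing a label are linked (no self-loops)."""
    A = (labels[:, None] == labels[None, :]).astype(float)
    np.fill_diagonal(A, 0.0)
    return A


# Toy data (made up): 6 phenotypes and 8 genes forming two association blocks.
X = np.zeros((6, 8))
X[:3, :4] = 1.0
X[3:, 4:] = 1.0
A_p = block_network(np.array([0, 0, 0, 1, 1, 1]))
A_g = block_network(np.array([0, 0, 0, 0, 1, 1, 1, 1]))

F, S, G, history = graph_regularized_nmtf(X, A_p, A_g, k_p=2, k_g=2)
phenotype_clusters = F.argmax(axis=1)  # hard assignment from soft memberships
gene_clusters = G.argmax(axis=1)
```

In this sketch, each row of `F` and `G` gives soft cluster memberships for a phenotype or gene, and a large entry `S[i, j]` suggests an association between phenotype cluster `i` and gene cluster `j`. The two trace terms encourage phenotypes or genes that are linked in their networks to receive similar memberships.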

Citing Articles

Mining functional gene modules by multi-view NMF of phenome-genome association.

Jin X, He W, Liu M, Wang L, Zhang Y, Xu Y BMC Genomics. 2025; 23(Suppl 6):868.

PMID: 39789452 PMC: 11720361. DOI: 10.1186/s12864-024-11120-5.


Deciphering high-order structures in spatial transcriptomes with graph-guided Tucker decomposition.

Broadbent C, Song T, Kuang R Bioinformatics. 2024; 40(Suppl 1):i529-i538.

PMID: 38940176 PMC: 11256919. DOI: 10.1093/bioinformatics/btae245.


Multi-omics integration of scRNA-seq time series data predicts new intervention points for Parkinson's disease.

Mihajlovic K, Ceddia G, Malod-Dognin N, Novak G, Kyriakis D, Skupin A Sci Rep. 2024; 14(1):10983.

PMID: 38744869 PMC: 11094121. DOI: 10.1038/s41598-024-61844-3.


HetFCM: functional co-module discovery by heterogeneous network co-clustering.

Tan H, Guo M, Chen J, Wang J, Yu G Nucleic Acids Res. 2023; 52(3):e16.

PMID: 38088228 PMC: 10853805. DOI: 10.1093/nar/gkad1174.


Computational Genomics in the Era of Precision Medicine: Applications to Variant Analysis and Gene Therapy.

Wang Y, Wu Y, Choi J, Allington G, Zhao S, Khanfar M J Pers Med. 2022; 12(2).

PMID: 35207663 PMC: 8878256. DOI: 10.3390/jpm12020175.

