
CogNet: Classification of Gene Expression Data Based on Ranked Active-subnetwork-oriented KEGG Pathway Enrichment Analysis

Overview
Date 2021 Apr 5
PMID 33816987
Citations 16
Abstract

Most traditional gene selection approaches are borrowed from other fields such as statistics and computer science. However, they do not prioritize biologically relevant genes, since their ultimate goal is to determine features that optimize model performance metrics, not to build a biologically meaningful model. Therefore, there is a pressing need for new computational tools that integrate biological knowledge about the data into the process of gene selection and machine learning. Integrative gene selection enables the incorporation of biological domain knowledge from external biological resources. In this study, we propose a new computational approach named CogNet, an integrative gene selection tool that exploits biological knowledge to group genes for the computational modeling tasks of ranking and classification. In CogNet, pathfindR serves as the biological grouping tool, allowing the main algorithm to rank active-subnetwork-oriented KEGG pathway enrichment analysis results and build a biologically relevant model. CogNet provides a list of significant KEGG pathways that can classify the data with very high accuracy. The list also provides the differentially expressed genes belonging to these pathways, which are used as features in the classification problem. The list facilitates deep analysis and better interpretability of the role of KEGG pathways in classifying the data, thus better establishing the biological relevance of these differentially expressed genes. Even though the main aim of our study is not to improve the accuracy of any existing tool, CogNet outperforms a similar approach called maTE while performing comparably to other similar tools, including SVM-RCE. CogNet was tested on 13 gene expression datasets concerning a variety of diseases.
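The group-then-rank idea described in the abstract can be illustrated with a minimal Python sketch. This is not the CogNet implementation: the real tool obtains its gene groups from pathfindR's KEGG enrichment analysis, whereas here the groups (`pathway_A`, `pathway_B`), the gene names, the toy expression values, and the nearest-centroid scoring function are all hypothetical stand-ins chosen only to show the workflow of scoring each gene group by how well its genes alone classify the samples.

```python
from statistics import mean

def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose class centroid is closest to sample x."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, lab in zip(train_X, train_y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))

    # sorted() makes tie-breaking deterministic
    return min(sorted(centroids), key=lambda label: sq_dist(centroids[label]))

def loo_accuracy(X, y, genes, gene_index):
    """Leave-one-out accuracy using only the columns of the given genes."""
    cols = [gene_index[g] for g in genes]
    sub = [[row[c] for c in cols] for row in X]
    hits = 0
    for i in range(len(sub)):
        train_X = sub[:i] + sub[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += nearest_centroid_predict(train_X, train_y, sub[i]) == y[i]
    return hits / len(sub)

def rank_groups(X, y, groups, gene_index):
    """Score every gene group and return (name, score) pairs, best first."""
    scores = {name: loo_accuracy(X, y, genes, gene_index)
              for name, genes in groups.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy expression matrix (hypothetical values): rows are samples, columns genes.
gene_index = {"g1": 0, "g2": 1, "g3": 2, "g4": 3}
X = [
    [5.0, 4.8, 1.0, 2.0],
    [5.2, 5.1, 2.0, 1.0],
    [4.9, 5.0, 1.0, 1.0],
    [1.0, 1.2, 1.0, 2.0],
    [0.8, 1.1, 2.0, 1.0],
    [1.1, 0.9, 2.0, 2.0],
]
y = ["case", "case", "case", "control", "control", "control"]

# Hypothetical gene groups standing in for enriched KEGG pathways.
groups = {"pathway_A": ["g1", "g2"], "pathway_B": ["g3", "g4"]}

ranking = rank_groups(X, y, groups, gene_index)
top_genes = groups[ranking[0][0]]  # genes of the best group become the features
print(ranking)
print(top_genes)
```

In this toy setting the group whose genes separate cases from controls is ranked first, and its genes would then be passed to the final classifier. Any scoring function and classifier could be substituted for the nearest-centroid rule used here.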

Citing Articles

RCE-IFE: recursive cluster elimination with intra-cluster feature elimination.

Kuzudisli C, Bakir-Gungor B, Qaqish B, Yousef M PeerJ Comput Sci. 2025; 11:e2528.

PMID: 40062294 PMC: 11888879. DOI: 10.7717/peerj-cs.2528.


Topic selection for text classification using ensemble topic modeling with grouping, scoring, and modeling approach.

Voskergian D, Jayousi R, Yousef M Sci Rep. 2024; 14(1):23516.

PMID: 39384798 PMC: 11464685. DOI: 10.1038/s41598-024-74022-2.


microBiomeGSM: the identification of taxonomic biomarkers from metagenomic data using grouping, scoring and modeling (G-S-M) approach.

Bakir-Gungor B, Temiz M, Jabeer A, Wu D, Yousef M Front Microbiol. 2023; 14:1264941.

PMID: 38075911 PMC: 10703168. DOI: 10.3389/fmicb.2023.1264941.


TextNetTopics Pro, a topic model-based text classification for short text by integration of semantic and document-topic distribution information.

Voskergian D, Bakir-Gungor B, Yousef M Front Genet. 2023; 14:1243874.

PMID: 37867598 PMC: 10585361. DOI: 10.3389/fgene.2023.1243874.


GeNetOntology: identifying affected gene ontology terms via grouping, scoring, and modeling of gene expression data utilizing biological knowledge-based machine learning.

Ersoz N, Bakir-Gungor B, Yousef M Front Genet. 2023; 14:1139082.

PMID: 37671046 PMC: 10476493. DOI: 10.3389/fgene.2023.1139082.

