SoFoCles: Feature Filtering for Microarray Classification Based on Gene Ontology
Overview
Marker gene selection is an important research topic in the classification analysis of gene expression data. Current methods try to mitigate the "curse of dimensionality" either through statistical calculations within the feature set or through classifiers trained on the given dataset. In this paper, we present SoFoCles, an interactive tool that enables semantic feature filtering in microarray classification problems by using external, well-defined knowledge retrieved from the Gene Ontology. Semantic similarity is used to identify genes involved in the same biological pathway during the microarray experiment, by enriching a feature set that was initially produced with legacy methods. Among its other functionalities, SoFoCles offers a large repository of semantic similarity methods that are used to derive feature sets and marker genes. The structure and functionality of the tool are discussed in detail, as well as its ability to improve classification accuracy. In an experimental evaluation on two real datasets, using different semantic similarity computation approaches, SoFoCles is shown to outperform other classification schemes in terms of classification accuracy.
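The enrichment idea described above can be illustrated with a minimal sketch. It is not the SoFoCles implementation: the gene names, the toy annotation table, the `enrich_feature_set` helper, and the threshold are all hypothetical, and a plain Jaccard overlap of GO term sets stands in for the semantic similarity measures the tool provides.

```python
from typing import Dict, List, Set

# Hypothetical toy annotations: gene name -> set of GO term identifiers.
# A real analysis would load these from a Gene Ontology annotation file.
GO_ANNOTATIONS: Dict[str, Set[str]] = {
    "GENE_A": {"GO:0006915", "GO:0008219", "GO:0012501"},
    "GENE_B": {"GO:0006915", "GO:0008219"},
    "GENE_C": {"GO:0007049"},
    "GENE_D": {"GO:0007049", "GO:0051301"},
    "GENE_E": {"GO:0006412"},
}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Set-overlap similarity; a simple stand-in for a GO semantic similarity measure."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def enrich_feature_set(
    seed_genes: List[str],
    annotations: Dict[str, Set[str]],
    threshold: float,
) -> List[str]:
    """Extend an initial feature set with genes similar to at least one seed gene.

    seed_genes is assumed to come from a conventional filter (e.g. a
    statistical ranking); candidates are added when their best similarity
    to any seed gene reaches the threshold.
    """
    seeds = set(seed_genes)
    enriched = list(seed_genes)
    for gene in sorted(annotations):
        if gene in seeds:
            continue
        best = max(
            (jaccard(annotations[gene], annotations.get(s, set())) for s in seed_genes),
            default=0.0,
        )
        if best >= threshold:
            enriched.append(gene)
    return enriched


enriched = enrich_feature_set(["GENE_A", "GENE_C"], GO_ANNOTATIONS, threshold=0.5)
print(enriched)
```

In this toy setup, genes that share enough GO terms with a seed gene join the feature set, while unrelated genes are left out. Swapping `jaccard` for another similarity function changes which genes are added without touching the enrichment loop.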