
Advancing Forensic-based Investigation Incorporating Slime Mould Search for Gene Selection of High-dimensional Genetic Data

Overview
Journal: Sci Rep
Specialty: Science
Date: 2024 Apr 13
PMID: 38615048
Abstract

Advances in gene sequencing technology have produced large, high-dimensional genetic datasets, and processing these data is of great significance for clinical decision-making. Gene selection (GS) is an important data preprocessing technique that selects a subset of informative features in order to improve performance and reduce data dimensionality. This study proposes an improved wrapper GS method based on forensic-based investigation (FBI). The method incorporates the search mechanism of the slime mould algorithm (SMA) into FBI, and the resulting algorithm is named SMA_FBI. For GS, the continuous optimizer is converted to a binary optimizer through a transfer function. To evaluate SMA_FBI, experiments are first run on the 30-function CEC2017 test set, where it is compared with 10 original algorithms and 10 state-of-the-art algorithms. The experimental results show that SMA_FBI outperforms the other algorithms in finding the optimal solution, convergence speed, and robustness. In addition, BSMA_FBI (the binary version of SMA_FBI) is compared with 8 binary algorithms on 18 high-dimensional genetic datasets from the UCI repository. The results indicate that BSMA_FBI obtains high classification accuracy while selecting fewer features in GS applications. SMA_FBI is therefore considered an optimization tool with great potential for global optimization problems, and its binary version, BSMA_FBI, can be used for GS tasks.
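The abstract does not say which transfer function or fitness function is used, so the following is only a minimal sketch of how a continuous position vector might be turned into a binary gene mask and scored in a wrapper setting. The S-shaped (sigmoid) transfer function, the weight `alpha`, and the names `s_shaped_transfer`, `binarize`, and `wrapper_fitness` are all assumptions made for illustration, not the paper's definitions. The classification error is passed in as a number here; in a real wrapper it would come from training a classifier on the selected genes.

```python
import math
import random


def s_shaped_transfer(x):
    """Map one continuous position component to a probability in (0, 1).

    Assumption: a plain sigmoid; the paper may use a different transfer function.
    """
    # Two branches keep math.exp from overflowing for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def binarize(position, rng):
    """Turn a continuous position vector into a 0/1 gene mask.

    Gene i is selected (1) with probability s_shaped_transfer(position[i]).
    """
    return [1 if rng.random() < s_shaped_transfer(x) else 0 for x in position]


def wrapper_fitness(mask, error_rate, alpha=0.95):
    """Hypothetical wrapper fitness, lower is better.

    Weighted sum of the classifier's error rate and the fraction of genes kept.
    """
    if not any(mask):
        return 1.0  # an empty subset cannot be classified on; give the worst score
    selected_ratio = sum(mask) / len(mask)
    return alpha * error_rate + (1.0 - alpha) * selected_ratio


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    position = [-6.0, -0.5, 0.0, 0.8, 6.0]  # continuous search-agent position
    mask = binarize(position, rng)
    print(mask, wrapper_fitness(mask, error_rate=0.10))
```

A binary optimizer built this way would apply `binarize` to each search agent after the continuous update step and keep the mask with the lowest fitness. Weighting the error rate much more heavily than the subset size is a common design choice in wrapper feature selection, but the exact weighting used for BSMA_FBI is not given in the abstract.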
