
HemoFuse: Multi-feature Fusion Based on Multi-head Cross-attention for Identification of Hemolytic Peptides

Overview
Journal Sci Rep
Specialty Science
Date 2024 Sep 28
PMID 39342017
Abstract

Hemolytic peptides are therapeutic peptides that damage red blood cells. Because peptides used in medical treatment must exhibit low toxicity to red blood cells to achieve the desired therapeutic effect, accurate prediction of the hemolytic activity of therapeutic peptides is essential for the development of peptide therapies. In this study, we propose HemoFuse, a multi-feature cross-fusion model for hemolytic peptide identification. Peptide sequences are converted into feature vectors by a word embedding technique and four hand-crafted feature extraction methods. We apply a multi-head cross-attention mechanism to hemolytic peptide identification for the first time. It captures the interaction between the word embedding features and the hand-crafted features by computing attention across all positions in both, so that the multiple features are deeply fused. Moreover, we visualize the features produced by this module to enhance its interpretability. On the comprehensive integrated dataset, HemoFuse performs well, with ACC, SP, SN, MCC, F1, AUC, and AP of 0.7575, 0.8814, 0.5793, 0.4909, 0.6620, 0.8387, and 0.7118, respectively. Compared with HemoDL proposed by Yang et al., these values are 3.32%, 3.89%, 5.93%, 10.6%, 8.17%, 5.88%, and 2.72% higher, respectively. Ablation experiments further support the design and efficiency of the model. The code and datasets are available at https://github.com/z11code/Hemo.
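To make the fusion step concrete, the following is a minimal sketch of multi-head cross-attention in plain Python. It is not the authors' implementation (see the repository linked above for that). The sketch assumes that the word embedding features act as queries and the hand-crafted features act as keys and values; the abstract does not state which role each feature set plays. All function names, dimensions, and the random weights are hypothetical, and the output projection usually applied after concatenating the heads is omitted for brevity.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights or extracted features."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def cross_attention_head(queries, keys_values, w_q, w_k, w_v):
    """One head: queries come from one feature set, keys/values from the other."""
    q = matmul(queries, w_q)
    k = matmul(keys_values, w_k)
    v = matmul(keys_values, w_v)
    d_head = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    # Scaled dot-product attention: every query position attends to
    # every key position of the other feature set.
    weights = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
    return matmul(weights, v), weights


def multi_head_cross_attention(queries, keys_values, heads):
    """Run each head, then concatenate head outputs along the feature axis."""
    outputs, all_weights = [], []
    for w_q, w_k, w_v in heads:
        out, weights = cross_attention_head(queries, keys_values, w_q, w_k, w_v)
        outputs.append(out)
        all_weights.append(weights)
    fused = [sum((head_out[i] for head_out in outputs), []) for i in range(len(queries))]
    return fused, all_weights


# Toy dimensions, chosen only for illustration.
rng = random.Random(0)
seq_len, d_embed = 5, 8   # assumed: 5 residue positions, 8-dim embeddings
n_hand, d_hand = 4, 6     # assumed: 4 hand-crafted feature rows, 6 dims each
n_heads, d_head = 2, 4

embed_feats = random_matrix(seq_len, d_embed, rng)
hand_feats = random_matrix(n_hand, d_hand, rng)
heads = [
    (
        random_matrix(d_embed, d_head, rng),  # query projection
        random_matrix(d_hand, d_head, rng),   # key projection
        random_matrix(d_hand, d_head, rng),   # value projection
    )
    for _ in range(n_heads)
]

fused, attn = multi_head_cross_attention(embed_feats, hand_feats, heads)
print(len(fused), len(fused[0]))  # rows = seq_len, columns = n_heads * d_head
```

Each row of an attention matrix in `attn` sums to 1 and indicates how strongly one embedding position draws on each hand-crafted feature row. Per-head weights of this kind are one plausible quantity to plot when visualizing such a module, although the abstract does not specify what the authors visualize.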
