
Deep Learning Algorithms Achieved Satisfactory Predictions When Trained on a Novel Collection of Anticoronavirus Molecules

Overview
Journal: Front Genet
Date: 2021 Dec 16
PMID: 34912370
Citations: 7
Abstract

Drug discovery and repurposing against COVID-19 is a highly relevant topic, with huge efforts dedicated to delivering novel therapeutics targeting SARS-CoV-2. In this context, computer-aided drug discovery is of interest for guiding early high-throughput screening and optimizing the hit identification rate. We herein propose a pipeline for Ligand-Based Drug Discovery (LBDD) against SARS-CoV-2. Through an extensive literature search and multiple filtering steps, we integrated information on 2,610 molecules with a validated effect against SARS-CoV and/or SARS-CoV-2. The chemical structures of these molecules were encoded through multiple systems so that they can serve directly as input to conventional machine learning (ML) algorithms or deep learning (DL) architectures. We assessed the performance of seven ML algorithms and four DL algorithms in classifying molecules into two classes: active and inactive. The Random Forests (RF), Graph Convolutional Network (GCN), and Directed Acyclic Graph (DAG) models achieved the best performance. These models were further optimized through hyperparameter tuning and achieved cross-validated ROC-AUC scores of 85, 83, and 79% for the RF, GCN, and DAG models, respectively. External validation on the FDA-approved drugs collection revealed a superior potential of DL algorithms for drug repurposing against SARS-CoV-2 based on the dataset presented here. Namely, GCN and DAG each achieved a true positive rate above 50% on the confirmed hits of a PubChem bioassay.
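
The two evaluation metrics named in the abstract, ROC-AUC and true positive rate, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `roc_auc` and `true_positive_rate`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration and do not come from the study's data.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a randomly chosen active molecule
    receives a higher score than a randomly chosen inactive one.
    A tie between an active and an inactive counts as half a win."""
    actives = [s for y, s in zip(labels, scores) if y == 1]
    inactives = [s for y, s in zip(labels, scores) if y == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive molecule")
    wins = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(actives) * len(inactives))


def true_positive_rate(labels: Sequence[int], scores: Sequence[float],
                       threshold: float = 0.5) -> float:
    """Fraction of confirmed actives that the model predicts as active,
    i.e. whose score reaches the (assumed) decision threshold."""
    actives = [s for y, s in zip(labels, scores) if y == 1]
    if not actives:
        raise ValueError("need at least one active molecule")
    return sum(s >= threshold for s in actives) / len(actives)


# Toy example (hypothetical values): 1 = active, 0 = inactive.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

print(f"ROC-AUC: {roc_auc(labels, scores):.4f}")
print(f"TPR at 0.5: {true_positive_rate(labels, scores):.2f}")
```

ROC-AUC is independent of any threshold, since it only depends on how actives rank relative to inactives, whereas the true positive rate depends on the cut-off chosen to call a molecule active. In a cross-validation setting, the ROC-AUC would typically be computed on each held-out fold and then averaged.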

Citing Articles

Positional embeddings and zero-shot learning using BERT for molecular-property prediction.

Mswahili M, Hwang J, Rajapakse J, Jo K, Jeong Y J Cheminform. 2025; 17(1):17.

PMID: 39910649 PMC: 11800558. DOI: 10.1186/s13321-025-00959-9.


Transformer-based models for chemical SMILES representation: A comprehensive literature review.

Mswahili M, Jeong Y Heliyon. 2024; 10(20):e39038.

PMID: 39640612 PMC: 11620068. DOI: 10.1016/j.heliyon.2024.e39038.


cidalsDB: an AI-empowered platform for anti-pathogen therapeutics research.

Harigua-Souiai E, Masmoudi O, Makni S, Oualha R, Abdelkrim Y, Hamdi S J Cheminform. 2024; 16(1):134.

PMID: 39609715 PMC: 11605991. DOI: 10.1186/s13321-024-00929-7.


Approved drugs successfully repurposed against based on machine learning predictions.

Oualha R, Abdelkrim Y, Guizani I, Harigua-Souiai E Front Cell Infect Microbiol. 2024; 14:1403589.

PMID: 39391884 PMC: 11464777. DOI: 10.3389/fcimb.2024.1403589.


Knowledge mapping of graph neural networks for drug discovery: a bibliometric and visualized analysis.

Yao R, Shen Z, Xu X, Ling G, Xiang R, Song T Front Pharmacol. 2024; 15:1393415.

PMID: 38799167 PMC: 11116974. DOI: 10.3389/fphar.2024.1393415.

