
Efficient and Explainable Virtual Screening of Molecules Through Fingerprint-Generating Networks Integrated with Artificial Neural Networks

Overview
Journal ACS Omega
Specialty Chemistry
Date 2025 Feb 17
PMID 39959102
Abstract

A machine learning-based drug screening technique has been developed and optimized using a novel stitched neural network architecture, in which trainable, graph convolution-based fingerprints serve as the base that feeds into an artificial neural network. The architecture is efficient, explainable, and performant as a tool for the binary classification of ligands based on a user-chosen docking score threshold. Assessment on two standardized virtual screening databases substantiated the architecture's ability to learn molecular features and substructures and to predict ligand classes from binding affinity values more effectively than comparable contemporary methods. Furthermore, to highlight the architecture's utility to groups and laboratories with varying resources, experiments were carried out using small molecules randomly sampled from the ZINC database and their computational docking scores against six proteins relevant to drug design. The new architecture proved more efficient at screening out molecules that bind less favorably to a specific target while retaining top-hit molecules. Compared with similar protocols built on Morgan fingerprints, the neural fingerprint-based model is better at retaining the best ligands while filtering out molecules at a higher relative rate. Lastly, the explainability of the model was investigated: the intermediate fingerprint accurately emphasized important chemical substructures and atoms, which in turn contributed heavily to the final prediction that a ligand binds tightly to a given protein.
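
The screening setup described in the abstract, labeling ligands against a user-chosen docking score threshold and then judging a classifier by how many molecules it filters out and how many top hits it retains, can be illustrated with a minimal sketch. This is not the authors' code. The function names, the example scores, the example predictions, and the `top_fraction` parameter are all hypothetical, and the sketch assumes the common convention that a more negative docking score means more favorable binding.

```python
from typing import Dict, List, Sequence


def label_by_threshold(scores: Sequence[float], threshold: float) -> List[int]:
    """Assign class 1 to ligands whose docking score is at or below the threshold.

    Assumption: more negative scores indicate tighter binding.
    """
    return [1 if s <= threshold else 0 for s in scores]


def screening_summary(
    scores: Sequence[float],
    predicted: Sequence[int],
    top_fraction: float = 0.1,
) -> Dict[str, float]:
    """Summarize a screen: share of molecules filtered out and share of top hits kept.

    `predicted` holds the classifier's binary output (1 = keep, 0 = filter out).
    `top_fraction` (hypothetical parameter) defines which best-scoring ligands
    count as top hits.
    """
    n = len(scores)
    n_top = max(1, int(round(n * top_fraction)))
    # Rank ligands from best (most negative) to worst docking score.
    order = sorted(range(n), key=lambda i: scores[i])
    top_hits = set(order[:n_top])
    kept = {i for i, p in enumerate(predicted) if p == 1}
    return {
        "filtered_fraction": 1 - len(kept) / n,
        "top_hits_retained": len(top_hits & kept) / n_top,
    }


# Hypothetical docking scores (illustrative values only).
scores = [-10.2, -9.1, -8.7, -7.4, -6.9, -6.1, -5.5, -5.0, -4.2, -3.8]
labels = label_by_threshold(scores, threshold=-8.0)

# Hypothetical classifier output for the same ten ligands.
predicted = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
summary = screening_summary(scores, predicted, top_fraction=0.2)
print(labels)
print(summary)
```

In this framing, a useful screening model pushes `filtered_fraction` up while keeping `top_hits_retained` close to 1, which is the trade-off the abstract reports when comparing neural and Morgan fingerprints.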
