Boosting Virtual Screening Enrichments with Data Fusion: Coalescing Hits from Two-dimensional Fingerprints, Shape, and Docking
Medical Informatics
Virtual screening is an effective way to find hits in drug discovery, with approaches ranging from fast information-based similarity methods to more computationally intensive physics-based docking methods. However, the best approach for a given project is not clear before the screen is run. In this work, we show that combining results from multiple methods using a standard score (Z-score) can significantly improve virtual screening enrichments over any of the single screening methods. We show that an augmented Z-score, which considers the best two out of three scores for a given compound, outperforms previously published data fusion algorithms. We use three different virtual screening methods (two-dimensional (2D) fingerprint similarity, shape-based similarity, and docking) and study two different databases (DUD and MDDR). Compared with the top individual method, the average enrichment in the top 1% improved by 9% for DUD and 25% for MDDR. Compared with the average of the three individual methods, the improvements are 22% for DUD and 43% for MDDR. Statistics are presented that show the findings of this work to be highly significant.
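The fusion idea described in the abstract can be illustrated with a minimal sketch. The abstract states only that the augmented Z-score "considers the best two out of three scores" for each compound; averaging those two Z-scores is an assumption made here for illustration, not necessarily the authors' exact rule. The function names, the `higher_is_better` flag, and the five-compound score lists are all hypothetical.

```python
import numpy as np

def z_scores(scores, higher_is_better=True):
    """Standardize one method's raw scores to zero mean and unit variance."""
    s = np.asarray(scores, dtype=float)
    if not higher_is_better:
        s = -s  # e.g. docking energies, where more negative is better
    # Assumes the scores are not all identical (non-zero standard deviation).
    return (s - s.mean()) / s.std()

def fuse_best_two_of_three(z_2d, z_shape, z_dock):
    """Average the two highest Z-scores per compound (assumed combination rule)."""
    z = np.column_stack([z_2d, z_shape, z_dock])
    top_two = np.sort(z, axis=1)[:, -2:]  # two largest values in each row
    return top_two.mean(axis=1)

# Hypothetical raw scores for five compounds from the three methods.
fp_sim = [0.9, 0.2, 0.4, 0.1, 0.3]        # 2D fingerprint similarity
shape_sim = [0.8, 0.3, 0.2, 0.2, 0.9]     # shape-based similarity
dock = [-9.5, -6.0, -5.0, -4.0, -9.0]     # docking score (lower is better)

fused = fuse_best_two_of_three(
    z_scores(fp_sim),
    z_scores(shape_sim),
    z_scores(dock, higher_is_better=False),
)
ranking = np.argsort(-fused)  # compound indices, best fused score first
```

Standardizing each method separately puts similarity values and docking energies on a common scale before they are combined, and dropping the worst of the three Z-scores keeps a single poorly performing method from pulling down an otherwise well-ranked compound.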