
Machine Learning Consensus Scoring Improves Performance Across Targets in Structure-Based Virtual Screening

Overview
Date 2017 Jun 28
PMID 28654262
Citations 34
Abstract

In structure-based virtual screening, compound ranking through a consensus of scores from a variety of docking programs or scoring functions, rather than ranking by scores from a single program, provides better predictive performance and reduces performance variability across targets. Here we compare traditional consensus scoring methods with a novel, unsupervised gradient boosting approach. We also observed increased score variation among active ligands and developed a statistical mixture model consensus score based on combining score means and variances. To evaluate performance, we used two common metrics, ROC AUC (area under the receiver operating characteristic curve) and EF1 (enrichment factor in the top 1% of ranked compounds), on 21 benchmark targets from DUD-E. Traditional consensus methods, such as taking the mean of quantile-normalized docking scores, outperformed individual docking methods and were more robust to target variation. The mixture model and gradient boosting provided further improvements over the traditional consensus methods. These methods are readily applicable to new targets in academic research and overcome the potentially poor performance of using a single docking method on a new target.
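As a rough illustration of the traditional baseline named in the abstract (the mean of quantile-normalized docking scores) and of the two evaluation metrics, the following minimal NumPy sketch may help. It is not the authors' code, and it does not implement the mixture model or the gradient boosting approach. The function names, the array layout (one row per compound, one column per docking method), and the convention that a lower score means a better predicted binder are all assumptions made for this example.

```python
import numpy as np


def quantile_normalize(scores):
    """Map every column of a (n_compounds, n_methods) array onto one shared
    distribution: the row-wise mean of the column-wise sorted scores.
    Ties are broken by order of appearance (an arbitrary choice here)."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0, kind="stable")   # rank of each entry in its column
    reference = np.sort(scores, axis=0).mean(axis=1)   # shared target distribution
    return reference[ranks]


def mean_consensus(scores):
    """Consensus score per compound: mean of its quantile-normalized scores."""
    return quantile_normalize(scores).mean(axis=1)


def roc_auc(labels, scores):
    """ROC AUC as the fraction of (active, decoy) pairs in which the active
    has the lower (assumed better) score; tied pairs count one half."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    actives, decoys = scores[labels], scores[~labels]
    diff = decoys[None, :] - actives[:, None]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size


def enrichment_factor(labels, scores, fraction=0.01):
    """Active rate among the best-scored `fraction` of compounds divided by
    the active rate in the whole library (fraction=0.01 corresponds to EF1)."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    n_top = max(1, int(np.ceil(fraction * len(scores))))
    top = np.argsort(scores, kind="stable")[:n_top]
    return labels[top].mean() / labels.mean()


if __name__ == "__main__":
    # Synthetic data only: 20 "actives" and 980 "decoys" scored by three
    # hypothetical docking methods with different score scales.
    rng = np.random.default_rng(0)
    labels = np.zeros(1000, dtype=bool)
    labels[:20] = True
    raw = rng.normal(size=(1000, 3)) * np.array([1.0, 5.0, 20.0])
    raw[labels] -= np.array([1.0, 5.0, 20.0])  # actives tend to score lower
    consensus = mean_consensus(raw)
    print("ROC AUC:", roc_auc(labels, consensus))
    print("EF1:", enrichment_factor(labels, consensus))
```

Quantile normalization puts all methods on a common scale before averaging, so a method with a wide score range does not dominate the consensus. In practice, scores from programs that report higher-is-better values would need to be negated first to match the assumed convention.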

Citing Articles

Data-driven discovery of potent small molecule ice recrystallisation inhibitors.

Warren M, Biggs C, Bissoyi A, Gibson M, Sosso G Nat Commun. 2024; 15(1):8082.

PMID: 39278938 PMC: 11402961. DOI: 10.1038/s41467-024-52266-w.


Building shape-focused pharmacophore models for effective docking screening.

Moyano-Gomez P, Lehtonen J, Pentikainen O, Postila P J Cheminform. 2024; 16(1):97.

PMID: 39123240 PMC: 11312248. DOI: 10.1186/s13321-024-00857-6.


Performance Drift in Machine Learning Models for Cardiac Surgery Risk Prediction: Retrospective Analysis.

Dong T, Sinha S, Zhai B, Fudulu D, Chan J, Narayan P JMIRx Med. 2024; 5:e45973.

PMID: 38889069 PMC: 11217160. DOI: 10.2196/45973.


Consensus holistic virtual screening for drug discovery: a novel machine learning model approach.

Moshawih S, Bu Z, Goh H, Kifli N, Lee L, Goh K J Cheminform. 2024; 16(1):62.

PMID: 38807196 PMC: 11134635. DOI: 10.1186/s13321-024-00855-8.


DataPype: A Fully Automated Unified Software Platform for Computer-Aided Drug Design.

Khan M, Kandwal S, Fayne D ACS Omega. 2023; 8(42):39468-39480.

PMID: 37901539 PMC: 10601415. DOI: 10.1021/acsomega.3c05207.

