Machine Learning Consensus Scoring Improves Performance Across Targets in Structure-Based Virtual Screening
In structure-based virtual screening, ranking compounds by a consensus of scores from several docking programs or scoring functions, rather than by the scores of a single program, improves predictive performance and reduces performance variability across targets. Here we compare traditional consensus scoring methods with a novel, unsupervised gradient boosting approach. We also observed increased score variation among active ligands and developed a statistical mixture model consensus score that combines score means and variances. To evaluate performance, we used two common metrics, the area under the receiver operating characteristic curve (ROC AUC) and the enrichment factor at 1% (EF1), on 21 benchmark targets from DUD-E. Traditional consensus methods, such as taking the mean of quantile-normalized docking scores, outperformed individual docking methods and were more robust to target variation. The mixture model and gradient boosting provided further improvements over the traditional consensus methods. These methods are readily applicable to new targets in academic research and overcome the potentially poor performance of using a single docking method on a new target.
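The traditional consensus baseline named in the abstract (the mean of quantile-normalized docking scores) and the two evaluation metrics can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. It assumes a compounds-by-methods score matrix in which lower scores are better, breaks ties arbitrarily during normalization, and uses hypothetical function names (`quantile_normalize`, `mean_consensus`, `roc_auc`, `enrichment_factor`). The mixture model and gradient boosting methods are not reproduced here because the abstract does not specify them.

```python
import numpy as np

def quantile_normalize(scores):
    """Map every column (docking method) onto a shared reference distribution.

    scores: array of shape (n_compounds, n_methods); lower = better (assumption).
    """
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores, axis=0)        # per-column sort order
    ranks = np.argsort(order, axis=0)         # rank of each entry within its column
    # Reference distribution: mean of the sorted columns at each rank.
    reference = np.sort(scores, axis=0).mean(axis=1)
    return reference[ranks]

def mean_consensus(scores):
    """Consensus score: mean of the quantile-normalized scores per compound."""
    return quantile_normalize(scores).mean(axis=1)

def roc_auc(consensus, is_active):
    """ROC AUC as the fraction of (active, decoy) pairs ranked correctly.

    An active is ranked correctly when its score is lower than the decoy's;
    ties count as half.
    """
    consensus = np.asarray(consensus, dtype=float)
    is_active = np.asarray(is_active, dtype=bool)
    actives = consensus[is_active][:, None]
    decoys = consensus[~is_active][None, :]
    return float(np.mean((actives < decoys) + 0.5 * (actives == decoys)))

def enrichment_factor(consensus, is_active, fraction=0.01):
    """Hit rate in the top-scoring fraction divided by the overall hit rate."""
    consensus = np.asarray(consensus, dtype=float)
    is_active = np.asarray(is_active, dtype=bool)
    n_top = max(1, int(round(fraction * len(consensus))))
    top = np.argsort(consensus)[:n_top]       # best (lowest) scores first
    return float(is_active[top].mean() / is_active.mean())

if __name__ == "__main__":
    # Toy data: 4 compounds scored by 2 hypothetical docking methods.
    toy_scores = [[-9.0, -50.0], [-7.0, -30.0], [-8.0, -40.0], [-5.0, -10.0]]
    toy_labels = [True, False, True, False]
    consensus = mean_consensus(toy_scores)
    print(consensus)
    print(roc_auc(consensus, toy_labels))
    print(enrichment_factor(consensus, toy_labels, fraction=0.5))
```

With `fraction=0.01` the last function corresponds to EF1. On a real benchmark the score matrix would hold one row per docked compound, with labels marking known actives against decoys.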