Optimal Decision-making in High-throughput Virtual Screening Pipelines

Overview
Journal Patterns (N Y)
Date 2023 Nov 30
PMID 38035191
Abstract

The need for efficient computational screening of molecular candidates that possess desired properties frequently arises in various scientific and engineering problems, including drug discovery and materials design. However, the enormous search space containing the candidates and the substantial computational cost of high-fidelity property prediction models make screening practically challenging. In this work, we propose a general framework for constructing and optimizing a high-throughput virtual screening (HTVS) pipeline that consists of multi-fidelity models. The central idea is to optimally allocate the computational resources to models with varying costs and accuracy to optimize the return on computational investment. Based on both simulated and real-world data, we demonstrate that the proposed optimal HTVS framework can significantly accelerate virtual screening without any degradation in terms of accuracy. Furthermore, it enables an adaptive operational strategy for HTVS, where one can trade accuracy for efficiency.
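The allocation idea in the abstract can be illustrated with a toy two-stage cascade. This is a minimal sketch and not the authors' framework or optimization procedure. It assumes a cheap, noisy surrogate that scores every candidate and an expensive, accurate model that scores only the candidates passing a stage threshold. The function name `screen_cascade`, the cost values, and the synthetic scores are all hypothetical choices made for illustration.

```python
import numpy as np


def screen_cascade(cheap_scores, expensive_scores, stage_threshold, target,
                   cheap_cost=1.0, expensive_cost=100.0):
    """Toy two-stage screening cascade (illustrative only).

    Returns the boolean selection mask, the total compute cost, and the
    recall relative to running the expensive model on every candidate.
    """
    cheap_scores = np.asarray(cheap_scores, dtype=float)
    expensive_scores = np.asarray(expensive_scores, dtype=float)

    # Stage 1: the cheap model scores every candidate; survivors meet the threshold.
    passed = cheap_scores >= stage_threshold

    # Stage 2: the expensive model is charged only for the survivors.
    selected = passed & (expensive_scores >= target)
    total_cost = cheap_cost * cheap_scores.size + expensive_cost * int(passed.sum())

    # Reference set: what the expensive model alone would have selected.
    n_reference = int((expensive_scores >= target).sum())
    recall = float(selected.sum()) / n_reference if n_reference else 1.0
    return selected, total_cost, recall


# Synthetic data (assumption): the cheap score is the true property plus noise.
rng = np.random.default_rng(0)
true_property = rng.normal(size=10_000)
cheap = true_property + rng.normal(scale=0.5, size=true_property.size)

# Sweeping the stage threshold exposes the cost/recall trade-off.
for threshold in (-np.inf, 0.0, 0.5, 1.0):
    _, cost, recall = screen_cascade(cheap, true_property, threshold, target=1.5)
    print(f"threshold={threshold:>5}: cost={cost:,.0f}, recall={recall:.3f}")
```

Raising the stage threshold sends fewer candidates to the expensive model, which lowers cost but can discard true hits. Choosing such thresholds jointly across stages is the kind of decision the abstract describes optimizing.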

Citing Articles

Multi-objective latent space optimization of generative molecular design models.

Abeer A, Urban N, Weil M, Alexander F, Yoon B Patterns (N Y). 2024; 5(10):101042.

PMID: 39569209 PMC: 11573897. DOI: 10.1016/j.patter.2024.101042.


Optimal decision-making in high-throughput virtual screening pipelines.

Woo H, Qian X, Tan L, Jha S, Alexander F, Dougherty E Patterns (N Y). 2023; 4(11):100875.

PMID: 38035191 PMC: 10682755. DOI: 10.1016/j.patter.2023.100875.


Optimal high-throughput virtual screening pipeline for efficient selection of redox-active organic materials.

Woo H, Allam O, Chen J, Jang S, Yoon B iScience. 2022; 26(1):105735.

PMID: 36582827 PMC: 9793274. DOI: 10.1016/j.isci.2022.105735.
