Demystifying Multitask Deep Neural Networks for Quantitative Structure-Activity Relationships
Deep neural networks (DNNs) are complex computational models that have found great success in many artificial intelligence applications, such as computer vision [1,2] and natural language processing [3,4]. In the past four years, DNNs have also generated promising results for quantitative structure-activity relationship (QSAR) tasks [5,6]. Previous work showed that DNNs can routinely make better predictions than traditional methods, such as random forests, on a diverse collection of QSAR data sets. It was also found that multitask DNN models (those trained on and predicting multiple QSAR properties simultaneously) outperform DNNs trained separately on the individual data sets in many, but not all, tasks. To date there has been no satisfactory explanation of why the QSAR model for one task embedded in a multitask DNN can borrow information from other, unrelated QSAR tasks. As a result, using multitask DNNs in a way that consistently provides a predictive advantage remains a challenge. In this work, we explored why multitask DNNs make a difference in predictive performance. Our results show that during prediction a multitask DNN does borrow "signal" from molecules with similar structures in the training sets of the other tasks. However, whether this borrowing leads to better or worse predictive performance depends on whether the activities are correlated. On the basis of this finding, we have developed a strategy for using multitask DNNs that incorporates prior domain knowledge to select training sets with correlated activities, and we demonstrate its effectiveness on several examples.
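To make the term "multitask DNN" concrete, the following is a minimal sketch of one common way to set up such a model: a hidden layer shared by all tasks, one linear output per QSAR task, and a loss that skips molecules with no measurement for a given task. It is not the architecture used in this work. The layer sizes, the ReLU activation, the use of NaN to mark missing activities, and the names `init_params`, `forward`, and `masked_mse` are all assumptions made for illustration.

```python
import numpy as np


def init_params(n_features, n_hidden, n_tasks, seed=0):
    """Randomly initialise a one-hidden-layer multitask network (illustrative sizes)."""
    rng = np.random.default_rng(seed)
    return {
        # Shared weights: every task's gradient updates these, which is
        # how one task can "borrow" signal from the others.
        "W_shared": rng.normal(0.0, 0.1, size=(n_features, n_hidden)),
        "b_shared": np.zeros(n_hidden),
        # One output column per QSAR task.
        "W_heads": rng.normal(0.0, 0.1, size=(n_hidden, n_tasks)),
        "b_heads": np.zeros(n_tasks),
    }


def forward(params, X):
    """Predict all task activities for a batch of molecular descriptors X."""
    hidden = np.maximum(0.0, X @ params["W_shared"] + params["b_shared"])  # ReLU
    return hidden @ params["W_heads"] + params["b_heads"]


def masked_mse(pred, Y):
    """Mean squared error over measured (non-NaN) molecule/task pairs only."""
    mask = ~np.isnan(Y)
    n_measured = mask.sum()
    if n_measured == 0:
        return 0.0
    # Replace NaN targets with 0 before subtracting, then zero out their error.
    sq_err = np.where(mask, (pred - np.where(mask, Y, 0.0)) ** 2, 0.0)
    return float(sq_err.sum() / n_measured)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.random((4, 8))                 # 4 molecules, 8 hypothetical descriptors
    Y = np.array([[5.1, np.nan, 6.0],      # NaN: molecule not tested in that task
                  [np.nan, 4.2, np.nan],
                  [6.3, 4.0, np.nan],
                  [np.nan, np.nan, 7.1]])
    params = init_params(n_features=8, n_hidden=16, n_tasks=3)
    print(masked_mse(forward(params, X), Y))
```

Because most molecules are measured in only some tasks, the mask keeps unmeasured entries from contributing to the loss, while the shared layer is still trained on every molecule from every task.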