
Large-Scale Modeling of Multispecies Acute Toxicity End Points Using Consensus of Multitask Deep Learning Methods

Overview
Date 2021 Feb 3
PMID 33533614
Citations 16
Abstract

Computational methods that predict molecular properties related to safety and toxicology offer alternative approaches to expedite drug development and screen environmental chemicals, and thus significantly reduce the associated time and costs. There is a strong need for, and interest in, computational methods that yield reliable predictions of toxicity, and many approaches, including the recently introduced deep neural networks, have been leveraged toward this goal. Herein, we report on the collection, curation, and integration of data from the public data sets that were the source of the ChemIDplus database for systemic acute toxicity. These efforts generated the largest publicly available data set of its kind, comprising > 80,000 compounds measured against a total of 59 acute systemic toxicity end points. These data were used to develop multiple single- and multitask models utilizing random forest, deep neural network, convolutional neural network, and graph convolutional neural network approaches. For the first time, we also report consensus models based on different multitask approaches. To the best of our knowledge, prediction models for 36 of the 59 end points have never been published before. Furthermore, our results demonstrate significantly better performance for the consensus model obtained from three multitask learning approaches, which in particular predicted the 29 smaller tasks (fewer than 300 compounds each) better than the other models developed in the study. The curated data set and the developed models have been made publicly available at https://github.com/ncats/ld50-multitask, https://predictor.ncats.io/, and https://cactus.nci.nih.gov/download/acute-toxicity-db (data set only) to support regulatory and research applications.
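
The abstract does not state how the consensus of the three multitask models is formed. As a minimal sketch, assuming the consensus is a plain arithmetic mean of the per-end-point predictions and that a model may have no prediction for some end points, the idea could look like the following. The function name `consensus`, the model names, and all values are hypothetical and are not taken from the paper or its repository.

```python
from statistics import fmean
from typing import List, Optional, Sequence

# None marks an end point for which a model produced no prediction (assumption).
Prediction = Optional[float]


def consensus(predictions: Sequence[Sequence[Prediction]]) -> List[Prediction]:
    """Average per-end-point predictions across models, skipping missing values."""
    n_tasks = len(predictions[0])
    if any(len(p) != n_tasks for p in predictions):
        raise ValueError("every model must report the same number of end points")
    merged: List[Prediction] = []
    # zip(*predictions) groups the values of all models for each end point.
    for task_values in zip(*predictions):
        available = [v for v in task_values if v is not None]
        merged.append(fmean(available) if available else None)
    return merged


# Hypothetical predictions from three multitask models for one compound
# across three end points (values are made up for illustration).
model_a = [2.0, 3.0, None]
model_b = [2.5, 3.5, None]
model_c = [3.0, None, None]

print(consensus([model_a, model_b, model_c]))  # [2.5, 3.25, None]
```

Averaging only the available values keeps an end point in the consensus even when one model cannot score it. A weighted mean or a median would be equally plausible choices under the same assumptions.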

Citing Articles

One size does not fit all: revising traditional paradigms for assessing accuracy of QSAR models used for virtual screening.

Wellnitz J, Jain S, Hochuli J, Maxfield T, Muratov E, Tropsha A. J Cheminform. 2025; 17(1):7.

PMID: 39819357 PMC: 11740363. DOI: 10.1186/s13321-025-00948-y.


Synergizing Chemical Structures and Bioassay Descriptions for Enhanced Molecular Property Prediction in Drug Discovery.

Schuh M, Boldini D, Sieber S. J Chem Inf Model. 2024; 64(12):4640-4650.

PMID: 38836773 PMC: 11200265. DOI: 10.1021/acs.jcim.4c00765.


Advancing Drug Safety in Drug Development: Bridging Computational Predictions for Enhanced Toxicity Prediction.

Amorim A, Piochi L, Gaspar A, Preto A, Rosario-Ferreira N, Moreira I. Chem Res Toxicol. 2024; 37(6):827-849.

PMID: 38758610 PMC: 11187637. DOI: 10.1021/acs.chemrestox.3c00352.


Expanding Predictive Capacities in Toxicology: Insights from Hackathon-Enhanced Data and Model Aggregation.

Shkil D, Muhamedzhanova A, Petrov P, Skorb E, Aliev T, Steshin I. Molecules. 2024; 29(8).

PMID: 38675645 PMC: 11055041. DOI: 10.3390/molecules29081826.


Computational models for predicting liver toxicity in the deep learning era.

Mostafa F, Chen M. Front Toxicol. 2024; 5:1340860.

PMID: 38312894 PMC: 10834666. DOI: 10.3389/ftox.2023.1340860.

