Comparing LD50/LC50 Machine Learning Models for Multiple Species
Overview
The lethal dose or concentration that kills 50% of the test animals (LD50 or LC50) is an important parameter for understanding the toxicity of chemicals in different scenarios. It can be used to make go/no-go decisions and ultimately to help choose the right personal protective equipment needed for containment. LD50 assessment has required the use of many animals, although modern methods have reduced the number of rats needed. Since a compound is usually considered highly toxic when its LD50 is lower than 25 mg/kg, such a classification provides potentially valuable safety information to synthetic chemists and other safety assessment scientists. Alternative approaches such as computational methods are needed to reduce animal use in this testing still further. We summarize our efforts to use public data to build LD50 or LC50 classification and regression machine learning models for several species (rat, mouse, fish and daphnia), and we report their 5-fold cross-validation statistics for different machine learning algorithms as well as results on an external curated test set for mouse LD50. These datasets comprise different molecule classes, may cover different activity ranges, and vary in size. One challenge in using such computational models is that their applicability domain must be understood before they can be relied on to make predictions for novel molecules. These machine learning models will also need to be backed up with experimental validation. Such models could nevertheless help bridge gaps in individual toxicity datasets. Making such models available also opens them up to potential misuse or dual use. We propose that the models could be used to score the millions of commercially available molecules, most of which likely have no known LD50 or, for that matter, any toxicity data.
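The classification setup described above can be illustrated with a minimal sketch. It assumes only the 25 mg/kg cutoff quoted in the abstract; the LD50 values are made up for illustration, the function names are hypothetical, and the majority-class "model" is a placeholder standing in for the machine learning algorithms compared in the paper, which the abstract does not name.

```python
import random
from statistics import mean

# Cutoff quoted in the abstract: LD50 below 25 mg/kg is "highly toxic".
HIGH_TOXICITY_CUTOFF_MG_PER_KG = 25.0


def is_highly_toxic(ld50_mg_per_kg):
    """Binarize a continuous LD50 value into a classification label."""
    return ld50_mg_per_kg < HIGH_TOXICITY_CUTOFF_MG_PER_KG


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        test_idx = folds[held_out]
        train_idx = [
            idx
            for fold_number, fold in enumerate(folds)
            if fold_number != held_out
            for idx in fold
        ]
        yield train_idx, test_idx


def majority_class_baseline(train_labels):
    """Placeholder model: always predict the most common training label."""
    return sum(train_labels) * 2 > len(train_labels)


def cross_validate(labels, k=5, seed=0):
    """Return the per-fold accuracy of the placeholder model."""
    accuracies = []
    for train_idx, test_idx in k_fold_indices(len(labels), k, seed):
        prediction = majority_class_baseline([labels[i] for i in train_idx])
        correct = sum(labels[i] == prediction for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return accuracies


# Hypothetical LD50 values in mg/kg, for illustration only (not from the paper).
example_ld50 = [5.0, 12.0, 24.9, 25.0, 40.0, 150.0, 300.0, 800.0, 2000.0, 5000.0]
labels = [is_highly_toxic(value) for value in example_ld50]
fold_accuracies = cross_validate(labels, k=5)
mean_accuracy = mean(fold_accuracies)
```

In practice the placeholder would be replaced by a real learner trained on molecular descriptors, and a stratified split would usually be preferable because highly toxic compounds tend to be the minority class.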