
MS2Tox Machine Learning Tool for Predicting the Ecotoxicity of Unidentified Chemicals in Water by Nontarget LC-HRMS

Overview
Date 2022 Oct 21
PMID 36269851
Abstract

To achieve water quality objectives of the zero pollution action plan in Europe, rapid methods are needed to identify the presence of toxic substances in complex water samples. However, only a small fraction of chemicals detected with nontarget high-resolution mass spectrometry can be identified, and fewer have ecotoxicological data available. We hypothesized that ecotoxicological data could be predicted for unknown molecular features in data-rich high-resolution mass spectrometry (HRMS) spectra, thereby circumventing time-consuming steps of molecular identification and rapidly flagging molecules of potentially high toxicity in complex samples. Here, we present MS2Tox, a machine learning method, to predict the toxicity of unidentified chemicals based on high-resolution accurate mass tandem mass spectra (MS2). The MS2Tox model for fish toxicity was trained and tested on 647 lethal concentration (LC50) values from the CompTox database and validated for 219 chemicals and 420 MS2 spectra from MassBank. The root mean square error (RMSE) of MS2Tox predictions was below 0.89 log-mM, while the experimental repeatability of LC50 values in CompTox was 0.44 log-mM. MS2Tox allowed accurate prediction of fish LC50 values for 22 chemicals detected in water samples, and empirical evidence suggested the right directionality for another 68 chemicals. Moreover, by incorporating structural information, e.g., the presence of carbonyl-benzene, amide moieties, or hydroxyl groups, MS2Tox outperforms baseline models that use only the exact mass or log KOW.
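The abstract reports model error as an RMSE in log-mM units but does not spell out the calculation. The sketch below shows one plausible reading, assuming "log-mM" means the base-10 logarithm of a concentration expressed in mM. The function name `rmse_log_mM` and the example concentrations are hypothetical illustrations, not values or code from the study.

```python
import math


def rmse_log_mM(predicted_mM, measured_mM):
    """RMSE between predicted and measured concentrations on a log10(mM) scale.

    Assumption: both inputs are positive concentrations in mM, paired by index.
    """
    if len(predicted_mM) != len(measured_mM) or not predicted_mM:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [
        (math.log10(p) - math.log10(m)) ** 2
        for p, m in zip(predicted_mM, measured_mM)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical concentrations in mM (not data from the paper).
predicted = [0.10, 1.0, 0.010]
measured = [0.010, 1.0, 0.10]

# Two predictions are off by one log unit and one is exact.
print(round(rmse_log_mM(predicted, measured), 3))
```

On this scale, an RMSE of 1.0 corresponds to predictions that are off by a factor of ten on average, so the reported bound of 0.89 log-mM would mean a typical error somewhat below one order of magnitude under the same assumption.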

Citing Articles

MLinvitroTox reloaded for high-throughput hazard-based prioritization of high-resolution mass spectrometry data.

Arturi K, Harris E, Gasser L, Escher B, Braun G, Bosshard R. J Cheminform. 2025; 17(1):14.

PMID: 39891244 PMC: 11786476. DOI: 10.1186/s13321-025-00950-4.


Communicating with Stakeholders to Identify High-Impact Research Directions for Non-Targeted Analysis.

Nason S, McCord J, Feng Y, Sobus J, Fisher C, Marfil-Vega R. Anal Chem. 2025; 97(5):2567-2578.

PMID: 39883652 PMC: 11886761. DOI: 10.1021/acs.analchem.4c04801.


Advances in environmental analysis of high molecular weight disinfection byproducts.

He G, Zhao J, Liu Y, Wang D, Sheng Z, Zhou Q. Anal Bioanal Chem. 2024; 417(3):513-534.

PMID: 39527292 DOI: 10.1007/s00216-024-05627-9.


Machine learning methods to predict cadmium (Cd) concentration in rice grain and support soil management at a regional scale.

Huang B, Lu Q, Tang Z, Tang Z, Chen H, Yang X. Fundam Res. 2024; 4(5):1196-1205.

PMID: 39431142 PMC: 11489518. DOI: 10.1016/j.fmre.2023.02.016.


Quantification Approaches in Non-Target LC/ESI/HRMS Analysis: An Interlaboratory Comparison.

Malm L, Liigand J, Aalizadeh R, Alygizakis N, Ng K, Fro Kjaer E. Anal Chem. 2024; 96(41):16215-16226.

PMID: 39353203 PMC: 11483430. DOI: 10.1021/acs.analchem.4c02902.

