In Silico MS/MS Spectra for Identifying Unknowns: a Critical Examination Using CFM-ID Algorithms and ENTACT Mixture Samples

Overview
Specialty Chemistry
Date 2020 Jan 23
PMID 31965249
Citations 17
Abstract

High-resolution mass spectrometry (HRMS) enables rapid chemical annotation via accurate mass measurements and matching of experimentally derived spectra with reference spectra. Reference libraries are generated from chemical standards and are therefore limited in size relative to known chemical space. To address this limitation, in silico spectra (i.e., MS/MS or MS2 spectra), predicted via Competitive Fragmentation Modeling-ID (CFM-ID) algorithms, were generated for compounds within the U.S. Environmental Protection Agency's (EPA) Distributed Structure-Searchable Toxicity (DSSTox) database (totaling, at the time of analysis, ~765,000 substances). Experimental spectra from EPA's Non-Targeted Analysis Collaborative Trial (ENTACT) mixtures (n = 10) were then used to evaluate the performance of the in silico spectra. Overall, MS2 spectra were acquired for 377 unique compounds from the ENTACT mixtures. Approximately 53% of these compounds were correctly identified using a commercial reference library, whereas up to 50% were correctly identified as the top hit using the in silico library. Together, the reference and in silico libraries were able to correctly identify 73% of the 377 ENTACT substances. When using the in silico spectra for candidate filtering, an examination of binary classifiers showed a true positive rate (TPR) of 0.90 associated with false positive rates (FPRs) of 0.10 to 0.85, depending on the sample and method of candidate filtering. Taken together, these findings show the abilities of in silico spectra to correctly identify true positives in complex samples (at rates comparable to those observed with reference spectra), and efficiently filter large numbers of potential false positives from further consideration.
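The TPR and FPR figures quoted in the abstract come from treating candidate filtering as a binary classification: each candidate structure is either retained or discarded, and is either the true compound or not. The Python sketch below illustrates how such rates could be computed at a single score cutoff. The `Candidate` record, the `rates_at_cutoff` function, the cutoff value, and all scores are hypothetical and made up for illustration; they are not the study's data, scoring function, or filtering method.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One candidate structure for an observed feature (hypothetical layout)."""
    name: str
    score: float   # assumed similarity between experimental and predicted spectra
    is_true: bool  # True if the candidate is the compound actually present


def rates_at_cutoff(candidates, cutoff):
    """Return (TPR, FPR) when candidates scoring >= cutoff are retained."""
    tp = sum(1 for c in candidates if c.is_true and c.score >= cutoff)
    fn = sum(1 for c in candidates if c.is_true and c.score < cutoff)
    fp = sum(1 for c in candidates if not c.is_true and c.score >= cutoff)
    tn = sum(1 for c in candidates if not c.is_true and c.score < cutoff)
    # Guard against empty classes so the function never divides by zero.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Made-up scores for illustration only; not data from the study.
candidates = [
    Candidate("A", 0.92, True),
    Candidate("B", 0.40, False),
    Candidate("C", 0.75, True),
    Candidate("D", 0.10, False),
    Candidate("E", 0.65, False),
    Candidate("F", 0.30, True),
]

tpr, fpr = rates_at_cutoff(candidates, cutoff=0.5)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

Lowering the cutoff retains more true compounds but also more false candidates. This is the kind of trade-off behind a fixed TPR of 0.90 being paired with FPRs anywhere from 0.10 to 0.85.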

Citing Articles

Automated QA/QC reporting for non-targeted analysis: a demonstration of "INTERPRET NTA" with de facto water reuse data.

Sobus J, Sayre-Smith N, Chao A, Ferland T, Minucci J, Carr E Anal Bioanal Chem. 2025.

PMID: 39953322 DOI: 10.1007/s00216-025-05771-w.


Development and application of a non-targeted analysis method using GC-MS and LC-MS for identifying chemical contaminants in drinking water via point-of-use filters.

Sloop J, Casey J, Liberatore H, Chao A, Isaacs K, Newton S Microchem J. 2025; 207.

PMID: 39877062 PMC: 11770584. DOI: 10.1016/j.microc.2024.112223.


Introducing "Identification Probability" for Automated and Transferable Assessment of Metabolite Identification Confidence in Metabolomics and Related Studies.

Metz T, Chang C, Gautam V, Anjum A, Tian S, Wang F Anal Chem. 2024; 97(1):1-11.

PMID: 39699939 PMC: 11740175. DOI: 10.1021/acs.analchem.4c04060.


Data Processing of Product Ion Spectra: Methods to Control False Discovery Rate in Compound Search Results for Untargeted Metabolomics.

Matsuda F Mass Spectrom (Tokyo). 2024; 13(1):A0155.

PMID: 39555379 PMC: 11565486. DOI: 10.5702/massspectrometry.A0155.


Introducing 'identification probability' for automated and transferable assessment of metabolite identification confidence in metabolomics and related studies.

Metz T, Chang C, Gautam V, Anjum A, Tian S, Wang F bioRxiv. 2024.

PMID: 39131324 PMC: 11312557. DOI: 10.1101/2024.07.30.605945.

