
Machine Learning Spectroscopy Using a 2-Stage, Generalized Constituent Contribution Protocol

Overview
Specialty: Biology
Date: 2023 Jun 8
PMID: 37287889
Abstract

A corrected group contribution (CGC)-molecule contribution (MC)-Bayesian neural network (BNN) protocol for accurate prediction of absorption spectra is presented. Combining BNN with CGC methods yields the full absorption spectra of various molecules accurately and efficiently, using only a small training dataset. With a small training sample (<100), the first stage of the protocol accurately predicts the maximum absorption wavelength of single molecules; by contrast, previously reported machine learning (ML) methods require >1,000 samples to ensure prediction accuracy. Furthermore, with <500 samples, the mean square error in the prediction of full ultraviolet spectra falls below 2%; for comparison, ML models trained on molecular SMILES require a much larger dataset (>2,000) to achieve comparable accuracy. Moreover, by employing an MC method designed specifically for CGC that properly interprets the mixing rule, the spectra of mixtures are obtained with high accuracy. The logical origins of the protocol's good performance are discussed in detail. Because such a constituent contribution protocol combines chemical principles with data-driven tools, it is likely to prove efficient for solving molecular-property-relevant problems in wider fields.
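As a rough illustration of what a mixing rule for spectra can look like, the sketch below applies the simplest ideal case, Beer-Lambert additivity, in which the absorbance of a mixture at each wavelength is the concentration-weighted sum of its components' molar absorptivities. This is an assumption chosen for illustration only; the abstract does not specify the MC method, and the protocol may treat mixing differently. The function name `mixture_absorbance` and the example numbers are hypothetical.

```python
import numpy as np


def mixture_absorbance(epsilons, concentrations, path_length_cm=1.0):
    """Ideal (Beer-Lambert) absorbance of a mixture at each wavelength.

    Assumption: components absorb independently, so absorbances add.
    This is NOT the paper's MC method, only the textbook ideal case.

    epsilons       -- shape (n_components, n_wavelengths),
                      molar absorptivity in L mol^-1 cm^-1
    concentrations -- shape (n_components,), in mol L^-1
    path_length_cm -- optical path length in cm
    """
    eps = np.asarray(epsilons, dtype=float)
    conc = np.asarray(concentrations, dtype=float)
    if eps.ndim != 2 or conc.shape != (eps.shape[0],):
        raise ValueError("expected one concentration per row of epsilons")
    # A(lambda) = l * sum_i c_i * eps_i(lambda)
    return path_length_cm * (conc @ eps)


# Hypothetical two-component mixture sampled at three wavelengths.
example = mixture_absorbance(
    [[100.0, 200.0, 50.0], [10.0, 0.0, 300.0]],
    [0.01, 0.02],
)
print(example)
```

Real mixtures can deviate from this additivity when components interact, which is one reason a dedicated treatment of the mixing rule, as the abstract describes, may be needed.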

Citing Articles

Group Contribution Method Supervised Neural Network for Precise Design of Organic Nonlinear Optical Materials.

Fan J, Yuan B, Qian C, Zhou S. Precis Chem. 2024; 2(6):263-272.

PMID: 39474201 PMC: 11504572. DOI: 10.1021/prechem.4c00015.


A Universal Framework for General Prediction of Physicochemical Properties: The Natural Growth Model.

Fan J, Qian C, Zhou S. Research (Wash D C). 2024; 7:0510.

PMID: 39445107 PMC: 11496607. DOI: 10.34133/research.0510.
