Learning to Predict Chemical Reactions

Overview
Date 2011 Aug 9
PMID 21819139
Citations 47
Abstract

Being able to predict the course of arbitrary chemical reactions is essential to the theory and applications of organic chemistry. Approaches to the reaction prediction problem can be organized around three poles corresponding to: (1) physical laws; (2) rule-based expert systems; and (3) inductive machine learning. Previous approaches at these poles, respectively, are not high throughput, are not generalizable or scalable, and lack sufficient data and structure to be implemented. We propose a new approach to reaction prediction utilizing elements from each pole. Using a physically inspired conceptualization, we describe single mechanistic reactions as interactions between coarse approximations of molecular orbitals (MOs) and use topological and physicochemical attributes as descriptors. Using an existing rule-based system (Reaction Explorer), we derive a restricted chemistry data set consisting of 1630 full multistep reactions with 2358 distinct starting materials and intermediates, associated with 2989 productive mechanistic steps and 6.14 million unproductive mechanistic steps. From machine learning, we pose the identification of productive mechanistic steps as a statistical ranking (information retrieval) problem: given a set of reactants and a description of conditions, learn a ranking model over potential filled-to-unfilled MO interactions such that the top-ranked mechanistic steps yield the major products. The machine learning implementation follows a two-stage approach, in which we first train atom-level reactivity filters to prune 94.00% of nonproductive reactions with a 0.01% error rate. Then, we train an ensemble of ranking models on pairs of interacting MOs to learn a relative productivity function over mechanistic steps in a given system. Without the use of explicit transformation patterns, the ensemble ranks the productive mechanism first 89.05% of the time, rising to 99.86% of the time when the top four are considered. Furthermore, the system is generalizable, making reasonable predictions over reactants and conditions which the rule-based expert system does not handle. A web interface to the machine learning based mechanistic reaction predictor is accessible through our chemoinformatics portal (http://cdb.ics.uci.edu) under the Toolkits section.
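The two-stage approach and the top-k figures quoted in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names rank_steps, top_k_hit_rate, is_reactive and score are hypothetical, the filter and scoring function are placeholders for trained models, and the toy data are invented purely for illustration.

```python
from typing import Callable, Hashable, List, Sequence, Tuple

# A candidate mechanistic step: (filled-MO source, unfilled-MO sink).
# The labels are hypothetical stand-ins for real orbital descriptors.
Candidate = Tuple[Hashable, Hashable]


def rank_steps(
    candidates: Sequence[Candidate],
    is_reactive: Callable[[Hashable], bool],
    score: Callable[[Candidate], float],
) -> List[Candidate]:
    """Filter candidates, then rank the survivors by descending score."""
    # Stage 1: drop any pair whose source or sink fails the reactivity filter.
    kept = [c for c in candidates if is_reactive(c[0]) and is_reactive(c[1])]
    # Stage 2: order the remaining pairs with the (assumed) ranking model.
    return sorted(kept, key=score, reverse=True)


def top_k_hit_rate(
    rankings: Sequence[Sequence[Candidate]],
    productive: Sequence[Candidate],
    k: int,
) -> float:
    """Fraction of systems whose productive step appears in the top k."""
    hits = sum(1 for ranked, truth in zip(rankings, productive) if truth in ranked[:k])
    return hits / len(rankings)


# Toy system with invented labels and scores (assumptions, not real data).
toy_candidates = [("a", "x"), ("b", "x"), ("c", "y")]
toy_scores = {("a", "x"): 0.2, ("b", "x"): 0.9}
ranked = rank_steps(
    toy_candidates,
    is_reactive=lambda site: site != "c",  # placeholder filter
    score=lambda cand: toy_scores[cand],   # placeholder ranking model
)
top1 = top_k_hit_rate([ranked], [("a", "x")], k=1)
top2 = top_k_hit_rate([ranked], [("a", "x")], k=2)
```

In this sketch the filter runs before scoring, so the ranking model only ever sees the candidates that survive stage one, which mirrors the pruning role the abstract assigns to the reactivity filters.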

Citing Articles

Chemically Informed Deep Learning for Interpretable Radical Reaction Prediction.

Tavakoli M, Chiu Y, Carlton A, Van Vranken D, Baldi P. J Chem Inf Model. 2025; 65(3):1228-1242.

PMID: 39871741 PMC: 11815866. DOI: 10.1021/acs.jcim.4c01901.


Integrating Machine Learning and Color Chemistry: Developing a High-School Curriculum toward Real-World Problem-Solving.

Jiang S, McClure J, Mao H, Chen J, Liu Y, Zhang Y. J Chem Educ. 2024; 101(2):675-681.

PMID: 38939529 PMC: 11210371. DOI: 10.1021/acs.jchemed.3c00589.


Deconvolution and Analysis of the ¹H NMR Spectra of Crude Reaction Mixtures.

Venetos M, Elkin M, Delaney C, Hartwig J, Persson K. J Chem Inf Model. 2024; 64(8):3008-3020.

PMID: 38573053 PMC: 11040730. DOI: 10.1021/acs.jcim.3c01864.


PMechDB: A Public Database of Elementary Polar Reaction Steps.

Tavakoli M, Miller R, Angel M, Pfeiffer M, Gutman E, Mood A. J Chem Inf Model. 2024; 64(6):1975-1983.

PMID: 38483315 PMC: 10966657. DOI: 10.1021/acs.jcim.3c01810.


Chemprop: A Machine Learning Package for Chemical Property Prediction.

Heid E, Greenman K, Chung Y, Li S, Graff D, Vermeire F. J Chem Inf Model. 2023; 64(1):9-17.

PMID: 38147829 PMC: 10777403. DOI: 10.1021/acs.jcim.3c01250.

