
Chemoenzymatic Multistep Retrosynthesis with Transformer Loops

Overview
Journal: Chem Sci
Specialty: Chemistry
Date: 2024 Oct 17
PMID: 39416295
Abstract

Integrating enzymatic reactions into computer-aided synthesis planning (CASP) should help devise more selective, economical, and greener synthetic routes. Herein we report the triple-transformer loop algorithm with biocatalysis (TTLAB) as a new CASP tool for chemoenzymatic multistep retrosynthesis. Single-step retrosyntheses are performed using two triple transformer loops (TTLs): one trained with chemical reactions from the US Patent Office (USPTO-TTL), and a second obtained by multitask transfer learning that combines the USPTO dataset with preparative biotransformations from the literature (ENZR-TTL). Each TTL performs single-step retrosynthesis independently by tagging potential reactive sites in the product, predicting possible starting materials (T1) and reagents or enzymes (T2) for each site, and validating the predictions with a forward transformer (T3). TTLAB combines the predictions from both TTLs to explore multistep sequences using a heuristic best-first tree search, and proposes short routes from commercial building blocks that include enantioselective biocatalytic steps. TTLAB can be used to assist chemoenzymatic route design.
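
The best-first tree search mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the TTLAB implementation: the molecule names, scores, the `single_step` stand-in for the merged USPTO-TTL and ENZR-TTL predictions, and the cost function (the sum of `1 - score` over steps) are all hypothetical assumptions chosen only to show the general search pattern.

```python
import heapq
from itertools import count

# Hypothetical toy data (not from the paper):
# product -> list of (confidence score, precursors, step type).
TOY_STEPS = {
    "target": [
        (0.9, ("intermediate", "bb1"), "chemical"),
        (0.6, ("bb2", "bb3"), "enzymatic"),
    ],
    "intermediate": [(0.8, ("bb2",), "enzymatic")],
}
# Hypothetical set of commercially available building blocks.
BUILDING_BLOCKS = {"bb1", "bb2", "bb3"}


def single_step(molecule):
    """Stand-in for the combined single-step predictions of both TTLs."""
    return TOY_STEPS.get(molecule, [])


def best_first_search(target, max_iterations=100):
    """Return the first route found whose leaves are all building blocks.

    Assumed heuristic: route cost is the sum of (1 - score) over its steps,
    and the lowest-cost partial route is always expanded next.
    """
    tie = count()  # tiebreaker so the heap never compares routes directly
    # Heap entries: (cost, tiebreaker, open molecules, route so far).
    heap = [(0.0, next(tie), (target,), ())]
    for _ in range(max_iterations):
        if not heap:
            break
        cost, _, open_mols, route = heapq.heappop(heap)
        unsolved = [m for m in open_mols if m not in BUILDING_BLOCKS]
        if not unsolved:
            return route  # every leaf is a building block
        mol, rest = unsolved[0], tuple(unsolved[1:])
        for score, precursors, step_type in single_step(mol):
            new_route = route + ((mol, precursors, step_type),)
            heapq.heappush(
                heap,
                (cost + (1.0 - score), next(tie), rest + precursors, new_route),
            )
    return None  # no route found within the iteration budget


route = best_first_search("target")
for product, precursors, step_type in route:
    print(f"{product} <= {' + '.join(precursors)} [{step_type}]")
```

With this toy data the search returns a two-step route, one chemical step followed by one enzymatic step, because its summed cost is lower than that of the single lower-confidence enzymatic step. A cost function that also penalizes route length would favor shorter routes instead.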
