Retrosynthetic Reaction Pathway Prediction Through Neural Machine Translation of Atomic Environments

Overview
Journal Nat Commun
Specialty Biology
Date 2022 Mar 5
PMID 35246540
Abstract

Designing efficient synthetic routes for a target molecule remains a major challenge in organic synthesis. Atom environments are ideal, stand-alone, chemically meaningful building blocks providing a high-resolution molecular representation. Our approach mimics chemical reasoning and predicts reactant candidates by learning the changes of atom environments associated with the chemical reaction. Through careful inspection of reactant candidates, we demonstrate atom environments as promising descriptors for studying reaction route prediction and discovery. Here, we present a new single-step retrosynthesis prediction method, RetroTRAE, which is free from all SMILES-based translation issues. It yields a top-1 accuracy of 58.3% on the USPTO test dataset, and its top-1 accuracy reaches 61.6% when highly similar analogs are counted as correct, outperforming other state-of-the-art neural machine translation-based methods. Our methodology introduces a novel scheme for fragmental and topological descriptors to be used as natural inputs for retrosynthetic prediction tasks.
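The two top-1 figures in the abstract differ only in what counts as a hit: an exact match, or a prediction that is highly similar to the reference. The following is a minimal sketch of that distinction, not the authors' evaluation code. It assumes each molecule is represented as a set of atom-environment tokens and that similarity is measured with the Tanimoto coefficient. The token names (`AE1`, `AE2`, ...), the helper names, and the 0.85 similarity cutoff are hypothetical and chosen for illustration only.

```python
from typing import FrozenSet, Sequence


def tanimoto(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fragment tokens."""
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)


def top1_accuracy(
    predictions: Sequence[FrozenSet[str]],
    references: Sequence[FrozenSet[str]],
    threshold: float = 1.0,
) -> float:
    """Fraction of top-1 predictions whose similarity to the reference
    is at least `threshold` (1.0 means exact set match)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    hits = sum(
        tanimoto(pred, ref) >= threshold
        for pred, ref in zip(predictions, references)
    )
    return hits / len(references)


def tokens(text: str) -> FrozenSet[str]:
    """Helper: build a token set from a whitespace-separated string."""
    return frozenset(text.split())


# Hypothetical toy data: three reference reactants and three top-1 predictions.
references = [
    tokens("AE1 AE2 AE3"),
    tokens("AE4 AE5 AE6 AE7 AE8 AE9 AE10"),
    tokens("AE11 AE12"),
]
predictions = [
    tokens("AE1 AE2 AE3"),              # exact match
    tokens("AE4 AE5 AE6 AE7 AE8 AE9"),  # one token missing, similarity 6/7
    tokens("AE11 AE13"),                # mostly wrong, similarity 1/3
]

exact = top1_accuracy(predictions, references)                   # exact matches only
relaxed = top1_accuracy(predictions, references, threshold=0.85)  # assumed cutoff
print(f"exact top-1: {exact:.3f}, relaxed top-1: {relaxed:.3f}")
```

In this toy data the second prediction is not an exact match but clears the assumed cutoff, so the relaxed score is higher than the exact one, which mirrors the gap between the two accuracies reported above.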

Citing Articles

Graph-Theory Algorithm for Prediction of Electrolyte Degradation Reactions in Lithium- and Sodium-Ion Batteries.

Borislavov L, Tadjer A, Stoyanova R Materials (Basel). 2025; 18(4).

PMID: 40004354 PMC: 11857540. DOI: 10.3390/ma18040832.


Single-step retrosynthesis prediction via multitask graph representation learning.

Zhao P, Wei X, Wang Q, Wang Q, Li J, Shang J Nat Commun. 2025; 16(1):814.

PMID: 39827189 PMC: 11742932. DOI: 10.1038/s41467-025-56062-y.


Sort & Slice: a simple and superior alternative to hash-based folding for extended-connectivity fingerprints.

Dablander M, Hanser T, Lambiotte R, Morris G J Cheminform. 2024; 16(1):135.

PMID: 39627861 PMC: 11616156. DOI: 10.1186/s13321-024-00932-y.


A systematic review of deep learning chemical language models in recent era.

Flores-Hernandez H, Martinez-Ledesma E J Cheminform. 2024; 16(1):129.

PMID: 39558376 PMC: 11571686. DOI: 10.1186/s13321-024-00916-y.


RetroCaptioner: beyond attention in end-to-end retrosynthesis transformer via contrastively captioned learnable graph representation.

Liu X, Ai C, Yang H, Dong R, Tang J, Zheng S Bioinformatics. 2024; 40(9).

PMID: 39342389 PMC: 11520410. DOI: 10.1093/bioinformatics/btae561.

