
One Chiral Fingerprint to Find Them All

Overview
Journal J Cheminform
Publisher Biomed Central
Specialty Chemistry
Date 2024 May 14
PMID 38741153
Abstract

Molecular fingerprints are indispensable tools in cheminformatics. However, stereochemistry is generally not considered, which is problematic for large molecules, almost all of which are chiral. Herein we report MAP4C, a chiral version of our previously reported fingerprint MAP4, which lists MinHashes computed from character strings containing the SMILES of all pairs of circular substructures up to a diameter of four bonds and the shortest topological distance between their central atoms. MAP4C includes the Cahn-Ingold-Prelog (CIP) annotation (R, S, r or s) whenever the chiral atom is the center of a circular substructure, a question mark for undefined stereocenters, and double bond cis-trans information if specified. MAP4C performs slightly better than the achiral MAP4, ECFP and AP fingerprints in non-stereoselective virtual screening benchmarks. Furthermore, MAP4C distinguishes between stereoisomers in chiral molecules, from small molecule drugs to large natural products and peptides comprising thousands of diastereomers, with a degree of distinction that is smaller than that between structural isomers and proportional to the number of chirality changes. Due to its excellent performance across diverse molecular classes and its ability to handle stereochemistry, MAP4C is recommended as a generally applicable chiral molecular fingerprint.

SCIENTIFIC CONTRIBUTION: The ability of our chiral fingerprint MAP4C to handle stereoisomers, from small molecules to large natural products and peptides, is unprecedented and opens the way for cheminformatics to include stereochemistry as an important molecular parameter across all fields of molecular design.
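The construction described in the abstract (pair strings built from two substructures and their topological distance, then MinHashed) can be illustrated with a minimal sketch. It is not the MAP4C implementation: the substructure strings are handwritten placeholders rather than SMILES extracted from a molecule, the `[R]`/`[S]` prefix is a hypothetical stand-in for the CIP annotation, and the helper names `shingle`, `minhash`, `jaccard` and `estimate` are assumptions made for this example. The hashing scheme (salted BLAKE2b from the Python standard library) is likewise only one possible choice.

```python
import hashlib


def shingle(sub_a, sub_b, distance):
    """Build one pair string: two substructures plus their topological distance.

    The two substructures are sorted so the pair is order-independent.
    The '|' separator is an assumption for this sketch.
    """
    first, second = sorted((sub_a, sub_b))
    return f"{first}|{distance}|{second}"


def minhash(shingles, n_perm=256):
    """Return a MinHash signature: for each salted hash, keep the minimum value."""
    signature = []
    for seed in range(n_perm):
        salt = seed.to_bytes(4, "little")  # one salt per simulated permutation
        signature.append(
            min(
                int.from_bytes(
                    hashlib.blake2b(s.encode(), digest_size=8, salt=salt).digest(),
                    "little",
                )
                for s in shingles
            )
        )
    return signature


def jaccard(set_a, set_b):
    """Exact Jaccard similarity between two shingle sets."""
    return len(set_a & set_b) / len(set_a | set_b)


def estimate(sig_a, sig_b):
    """MinHash estimate of the Jaccard similarity: fraction of matching positions."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


# Handwritten placeholder pairs that do not involve the stereocenter.
achiral_pairs = {
    shingle("CC", "NC", 2),
    shingle("CC", "C(=O)O", 2),
    shingle("NC", "C(=O)O", 2),
    shingle("CC", "O=C", 3),
}


def stereo_pairs(label):
    """Pairs whose first substructure is centered on the chiral atom.

    The '[R]' / '[S]' / '[?]' prefix is a hypothetical annotation format.
    """
    center = f"[{label}]C(C)(N)C"
    return {
        shingle(center, "CC", 1),
        shingle(center, "NC", 1),
        shingle(center, "C(=O)O", 1),
    }


r_set = achiral_pairs | stereo_pairs("R")
s_set = achiral_pairs | stereo_pairs("S")

sig_r = minhash(r_set)
sig_s = minhash(s_set)

print("exact Jaccard (R vs S):", jaccard(r_set, s_set))
print("MinHash estimate (R vs S):", estimate(sig_r, sig_s))
```

In this toy setup only the pairs that involve the annotated center differ between the two enantiomers, so the similarity drops below 1 without falling to 0. That mirrors the qualitative behavior the abstract reports for stereoisomers, though the numbers here carry no chemical meaning.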

Citing Articles

Navigating a 1E+60 Chemical Space of Peptide/Peptoid Oligomers.

Orsi M, Reymond J. Mol Inform. 2024; 44(1):e202400186.

PMID: 39390672 PMC: 11733718. DOI: 10.1002/minf.202400186.


AutoPeptideML: a study on how to build more trustworthy peptide bioactivity predictors.

Fernandez-Diaz R, Cossio-Perez R, Agoni C, Lam H, Lopez V, Shields D. Bioinformatics. 2024; 40(9).

PMID: 39292535 PMC: 11438549. DOI: 10.1093/bioinformatics/btae555.


Chemoinformatic Characterization of NAPROC-13: A Database for Natural Product 13C NMR Dereplication.

Avellaneda-Tamayo J, Agudo-Munoz N, Sanchez-Galan J, Lopez-Perez J, Medina-Franco J. J Nat Prod. 2024; 87(9):2216-2229.

PMID: 39269718 PMC: 11443490. DOI: 10.1021/acs.jnatprod.4c00530.


Can large language models predict antimicrobial peptide activity and toxicity?

Orsi M, Reymond J. RSC Med Chem. 2024; 15(6):2030-2036.

PMID: 38911166 PMC: 11187562. DOI: 10.1039/d4md00159a.
