Leveraging Molecular Structure and Bioactivity with Chemical Language Models for De Novo Drug Design

Overview
Journal: Nat Commun
Specialty: Biology
Date: 2023 Jan 7
PMID: 36611029
Abstract

Generative chemical language models (CLMs) can be used for de novo molecular structure generation by learning from a textual representation of molecules. Here, we show that hybrid CLMs can additionally leverage the bioactivity information available for the training compounds. To computationally design ligands of phosphoinositide 3-kinase gamma (PI3Kγ), a collection of virtual molecules was created with a generative CLM. This virtual compound library was refined using a CLM-based classifier for bioactivity prediction. This second hybrid CLM was pretrained with patented molecular structures and fine-tuned with known PI3Kγ ligands. Several of the computer-generated molecular designs were commercially available, enabling fast prescreening and preliminary experimental validation. A new PI3Kγ ligand with sub-micromolar activity was identified, highlighting the method's scaffold-hopping potential. Chemical synthesis and biochemical testing of two of the top-ranked de novo designed molecules and their derivatives corroborated the model's ability to generate PI3Kγ ligands with medium to low nanomolar activity for hit-to-lead expansion. The most potent compounds led to pronounced inhibition of PI3K-dependent Akt phosphorylation in a medulloblastoma cell model, demonstrating efficacy of PI3Kγ ligands in PI3K/Akt pathway repression in human tumor cells. The results positively advocate hybrid CLMs for virtual compound screening and activity-focused molecular design.
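The generate-then-refine workflow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `tokenize`, `refine_library`, and `toy_score` are hypothetical, the regular expression is a common simplified SMILES tokenizer, and `toy_score` is a placeholder standing in for a trained bioactivity classifier. The sketch assumes the textual representation is SMILES and only shows how candidate strings could be split into tokens, de-duplicated, scored, and ranked.

```python
import re
from typing import Callable, Iterable, List, Tuple

# Simplified SMILES token pattern (an assumption, not the paper's tokenizer):
# bracket atoms, two-letter halogens, two-digit ring closures, organic-subset
# atoms, bond/branch symbols, and single-digit ring closures.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|%\d{2}|[BCNOSPFIbcnosp]|[=#\-+\\/().:~@?*$]|\d)"
)


def tokenize(smiles: str) -> List[str]:
    """Split a SMILES string into tokens; raise if any character is unmatched."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"cannot tokenize SMILES string: {smiles!r}")
    return tokens


def refine_library(
    candidates: Iterable[str],
    score: Callable[[str], float],
    threshold: float,
) -> List[Tuple[str, float]]:
    """De-duplicate candidates, drop untokenizable strings, keep those scoring
    at or above the threshold, and return them ranked best first."""
    kept = []
    for smiles in dict.fromkeys(candidates):  # removes duplicates, keeps order
        try:
            tokenize(smiles)  # crude syntax check only, not chemical validity
        except ValueError:
            continue
        value = score(smiles)
        if value >= threshold:
            kept.append((smiles, value))
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


def toy_score(smiles: str) -> float:
    """Placeholder scorer (hypothetical): fraction of nitrogen tokens.
    A real pipeline would call a trained classifier here."""
    tokens = tokenize(smiles)
    return sum(t in ("N", "n") for t in tokens) / len(tokens)


if __name__ == "__main__":
    library = ["CCN", "CCO", "CCN", "NCCN", "C?X"]
    for smiles, value in refine_library(library, toy_score, threshold=0.3):
        print(f"{smiles}\t{value:.2f}")
```

Passing the scorer in as a callable keeps the filtering step independent of any particular model, so the placeholder could be swapped for a fine-tuned classifier without changing `refine_library`.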

Citing Articles

Accelerating discovery of bioactive ligands with pharmacophore-informed generative models.

Xie W, Zhang J, Xie Q, Gong C, Ren Y, Xie J. Nat Commun. 2025; 16(1):2391.

PMID: 40064886 PMC: 11894060. DOI: 10.1038/s41467-025-56349-0.


In silico discovery of a novel potential allosteric PI3Kα inhibitor incorporating 2-oxopropyl urea targeting head and neck squamous cell carcinoma.

Jia W, Li G, Cheng X, Zhang R, Ma Y. BMC Chem. 2025; 19(1):55.

PMID: 40022235 PMC: 11871742. DOI: 10.1186/s13065-025-01420-6.


Leveraging large language models for peptide antibiotic design.

Guan C, Fernandes F, Franco O, de la Fuente-Nunez C. Cell Rep Phys Sci. 2025; 6(1).

PMID: 39949833 PMC: 11823563. DOI: 10.1016/j.xcrp.2024.102359.


fragSMILES as a chemical string notation for advanced fragment and chirality representation.

Mastrolorito F, Ciriaco F, Togo M, Gambacorta N, Trisciuzzi D, Altomare C. Commun Chem. 2025; 8(1):26.

PMID: 39880917 PMC: 11779804. DOI: 10.1038/s42004-025-01423-3.


Artificial intelligence in drug development.

Zhang K, Yang X, Wang Y, Yu Y, Huang N, Li G. Nat Med. 2025; 31(1):45-59.

PMID: 39833407 DOI: 10.1038/s41591-024-03434-4.

