A Quantum Chemical Interaction Energy Dataset for Accurately Modeling Protein-ligand Interactions

Overview
Journal Sci Data
Specialty Science
Date 2023 Sep 12
PMID 37699937
Abstract

Fast and accurate calculation of intermolecular interaction energies is desirable for understanding many chemical and biological processes, including the binding of small molecules to proteins. The Splinter ["Symmetry-adapted perturbation theory (SAPT0) protein-ligand interaction"] dataset has been created to facilitate the development and improvement of methods for performing such calculations. Molecular fragments representing commonly found substructures in proteins and small-molecule ligands were paired into >9000 unique dimers, which were assembled into numerous configurations using an approach designed to adequately cover the breadth of the dimers' potential energy surfaces while enhancing sampling in favorable regions. About 1.5 million configurations of these dimers were randomly generated, and a structurally diverse subset of them was minimized to obtain an additional ~80,000 local and global minima. For all >1.6 million configurations, SAPT0 calculations were performed with two basis sets to complete the dataset. It is expected that Splinter will be a useful benchmark dataset for training and testing various methods for the calculation of intermolecular interaction energies.
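The abstract does not spell out how the random dimer configurations were built, so the following is only a minimal sketch of one generic way to do it, not the authors' procedure. It assumes rigid fragments given as lists of Cartesian coordinates in ångströms, a uniformly random orientation and direction, a uniformly sampled centroid separation, and a simple clash filter. The function names and the parameters `r_min`, `r_max`, and `clash` are hypothetical, and the sketch does not reproduce the enhanced sampling of favorable regions mentioned above.

```python
import math
import random
from typing import List, Tuple

Vec = Tuple[float, float, float]


def centered(coords: List[Vec]) -> List[Vec]:
    """Translate a fragment so that its centroid sits at the origin."""
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    return [(x - cx, y - cy, z - cz) for x, y, z in coords]


def random_unit(rng: random.Random, dim: int) -> List[float]:
    """Uniform random point on the unit sphere in `dim` dimensions."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return [c / norm for c in v]


def random_rotation(rng: random.Random):
    """Rotation matrix built from a uniformly random unit quaternion."""
    w, x, y, z = random_unit(rng, 4)
    return (
        (1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)),
        (2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)),
        (2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)),
    )


def random_dimer(frag_a, frag_b, rng, r_min=2.5, r_max=8.0, clash=1.5,
                 max_tries=1000):
    """Place frag_b at a random orientation and position around frag_a.

    r_min, r_max: assumed bounds on the centroid separation (angstrom).
    clash: assumed minimum allowed interatomic distance (angstrom).
    """
    a = centered(frag_a)
    b0 = centered(frag_b)
    for _ in range(max_tries):
        rot = random_rotation(rng)
        direction = random_unit(rng, 3)
        r = rng.uniform(r_min, r_max)
        shift = [r * d for d in direction]
        # Rotate each atom of fragment B, then displace the whole fragment.
        b = [
            tuple(sum(rot[i][j] * p[j] for j in range(3)) + shift[i]
                  for i in range(3))
            for p in b0
        ]
        # Reject configurations in which any two atoms overlap.
        if min(math.dist(p, q) for p in a for q in b) >= clash:
            return a, b
    raise RuntimeError("no clash-free configuration found")


# Illustrative geometries only (approximate water and methane).
water = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
methane = [(0.0, 0.0, 0.0), (0.63, 0.63, 0.63), (-0.63, -0.63, 0.63),
           (-0.63, 0.63, -0.63), (0.63, -0.63, -0.63)]
dimer_a, dimer_b = random_dimer(water, methane, random.Random(0))
```

Each accepted pair of coordinate lists would then be passed to a quantum chemistry package for the interaction energy calculation; that step is outside the scope of this sketch.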

Citing Articles

MORE-Q, a dataset for molecular olfactorial receptor engineering by quantum mechanics.

Chen L, Medrano Sandonas L, Traber P, Dianat A, Tverdokhleb N, Hurevich M. Sci Data. 2025; 12(1):324.

PMID: 39987132 PMC: 11846975. DOI: 10.1038/s41597-025-04616-6.


Quantification of Anisotropy in Exchange and Dispersion Interactions: A Simple Model for Physics-Based Force Fields.

Kriz K, van der Spoel D. J Phys Chem Lett. 2024; 15(39):9974-9978.

PMID: 39314113 PMC: 11457221. DOI: 10.1021/acs.jpclett.4c02034.


A physics-aware neural network for protein-ligand interactions with quantum chemical accuracy.

Glick Z, Metcalf D, Glick C, Spronk S, Koutsoukas A, Cheney D. Chem Sci. 2024; 15(33):13313-13324.

PMID: 39183910 PMC: 11339967. DOI: 10.1039/d4sc01029a.


Electron iso-density surfaces provide a thermodynamically consistent representation of atomic and molecular surfaces.

Alibakhshi A, Schafer L. Nat Commun. 2024; 15(1):6086.

PMID: 39030194 PMC: 11271626. DOI: 10.1038/s41467-024-50408-8.


The influence of model building schemes and molecular dynamics sampling on QM-cluster models: the chorismate mutase case study.

Agbaglo D, Summers T, Cheng Q, DeYonker N. Phys Chem Chem Phys. 2024; 26(16):12467-12482.

PMID: 38618904 PMC: 11090134. DOI: 10.1039/d3cp06100k.

