
CrysFormer: Protein Structure Determination Via Patterson Maps, Deep Learning, and Partial Structure Attention

Overview
Journal: Struct Dyn
Date: 2024 Aug 16
PMID: 39148510
Abstract

Determining the atomic-level structure of a protein has been a decades-long challenge. However, recent advances in transformers and related neural network architectures have enabled researchers to significantly improve solutions to this problem. These methods use large datasets of sequence information and corresponding known protein template structures, if available. Yet, such methods only focus on sequence information. Other available prior knowledge could also be utilized, such as constructs derived from x-ray crystallography experiments and the known structures of the most common conformations of amino acid residues, which we refer to as partial structures. To the best of our knowledge, we propose the first transformer-based model that directly utilizes experimental protein crystallographic data and partial structure information to calculate electron density maps of proteins. In particular, we use Patterson maps, which can be directly obtained from x-ray crystallography experimental data, thus bypassing the well-known crystallographic phase problem. We demonstrate that our method, CrysFormer, achieves precise predictions on two synthetic datasets of peptide fragments in crystalline forms, one with two residues per unit cell and the other with fifteen. These predictions can then be used to generate accurate atomic models using established crystallographic refinement programs.
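The abstract's point that Patterson maps bypass the phase problem can be illustrated with a small sketch. This is not the authors' code or the CrysFormer model. It is a toy 1D example, with hypothetical function names and made-up atom weights, showing that the inverse Fourier transform of the measured intensities |F|^2 (amplitudes only, no phases) reproduces the autocorrelation of the electron density, which is what a Patterson map represents.

```python
import cmath

def dft(values, inverse=False):
    """Naive discrete Fourier transform on a periodic 1D grid."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(v * cmath.exp(sign * 2j * cmath.pi * k * j / n)
                    for j, v in enumerate(values))
        out.append(total / n if inverse else total)
    return out

def patterson_from_intensities(intensities):
    """Inverse transform of |F|^2; uses amplitudes only, never phases."""
    return [p.real for p in dft(intensities, inverse=True)]

def autocorrelation(density):
    """Periodic autocorrelation: sum over x of rho(x) * rho(x + u)."""
    n = len(density)
    return [sum(density[x] * density[(x + u) % n] for x in range(n))
            for u in range(n)]

# Toy 1D "unit cell" with two point-like atoms (weights are assumptions).
density = [0.0] * 16
density[2] = 6.0
density[5] = 8.0

# Structure factors are complex, but an experiment records only |F|^2.
structure_factors = dft(density)
intensities = [abs(f) ** 2 for f in structure_factors]

patterson = patterson_from_intensities(intensities)
reference = autocorrelation(density)

# Peaks sit at the origin and at the interatomic vectors +3 and -3 (mod 16).
peaks = [u for u, p in enumerate(patterson) if p > 1e-6]
print(peaks)
```

The Patterson peaks mark interatomic vectors rather than atom positions, which is why recovering the density from such a map is still a hard inverse problem even though the map itself needs no phase information.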

Citing Articles

The physics-AI dialogue in drug design.

Vargas-Rosales P, Caflisch A. RSC Med Chem. 2025.

PMID: 39906313 PMC: 11788922. DOI: 10.1039/d4md00869c.
