Massive Compression for High Data Rate Macromolecular Crystallography (HDRMX): Impact on Diffraction Data and Subsequent Structural Analysis

Overview
Date 2025 Feb 6
PMID 39913307
Abstract

New integrating, large-area X-ray detectors with higher count rates and framing rates as high as 17400 images per second are beginning to become available. These will soon be used for specialized macromolecular crystallography experiments but will require optimal lossy compression algorithms to enable systems to keep up with data throughput. Some information may be lost. Can we minimize this loss while keeping the impact on structural information acceptable? To explore this question, we have considered several approaches: summing short sequences of images, binning to create the effect of larger pixels, JPEG-2000 lossy wavelet-based compression, and Hcompress, a Haar-wavelet-based lossy compression borrowed from astronomy. We also explore the effect of combining summing and binning with Hcompress or JPEG-2000. With both Hcompress and JPEG-2000, one can specify approximately how much the result should be compressed relative to the starting file size. These methods provide particularly effective lossy compression that retains the information essential for structure solution from Bragg reflections.
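As a rough illustration of the first two approaches named in the abstract, summing short sequences of images and binning pixels, the following is a minimal NumPy sketch. The function names `sum_frames` and `bin_pixels`, the array layout (frames, rows, columns), and the choice to drop incomplete trailing groups and crop ragged edges are assumptions made for this example; they are not taken from the authors' software. The wavelet stages (Hcompress, JPEG-2000) are not shown.

```python
import numpy as np


def sum_frames(stack, n):
    """Sum each run of n consecutive frames.

    stack: array of shape (frames, rows, cols).
    Trailing frames that do not fill a complete group are dropped
    (an assumption made for this sketch).
    """
    stack = np.asarray(stack, dtype=np.int64)  # widen to avoid overflow
    frames, rows, cols = stack.shape
    groups = frames // n
    usable = stack[: groups * n]
    return usable.reshape(groups, n, rows, cols).sum(axis=1)


def bin_pixels(image, b):
    """Sum b x b blocks of pixels, giving the effect of larger pixels.

    image: array whose last two axes are (rows, cols).
    Edges that do not fill a complete block are cropped
    (an assumption made for this sketch).
    """
    image = np.asarray(image, dtype=np.int64)
    rows, cols = image.shape[-2:]
    r, c = (rows // b) * b, (cols // b) * b
    cropped = image[..., :r, :c]
    blocked = cropped.reshape(cropped.shape[:-2] + (r // b, b, c // b, b))
    # Axis -3 is the row offset within a block, axis -1 the column offset.
    return blocked.sum(axis=(-3, -1))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical stack: 8 frames of 6 x 6 pixels with Poisson counts.
    stack = rng.poisson(lam=3.0, size=(8, 6, 6))
    summed = sum_frames(stack, 2)   # 8 frames -> 4 frames
    binned = bin_pixels(summed, 2)  # 6 x 6 pixels -> 3 x 3 pixels
    print(summed.shape, binned.shape)
    # Total counts are preserved when no frames or edges are discarded.
    print(stack.sum() == binned.sum())
```

In this sketch, summing pairs of frames and binning 2 x 2 pixels each reduce the number of stored values (by factors of 2 and 4, respectively) while preserving total counts; the cost is coarser sampling in rotation angle and in detector position.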
