
Experiences with Making Diffraction Image Data Available: What Metadata Do We Need to Archive?

Overview
Specialty Chemistry
Date 2014 Oct 8
PMID 25286836
Citations 20
Abstract

Recently, the IUCr (International Union of Crystallography) initiated the formation of a Diffraction Data Deposition Working Group with the aim of developing standards for the representation of raw diffraction data associated with the publication of structural papers. Archiving of raw data serves several goals: to improve the record of science, to verify reproducibility and allow detailed checks of scientific data, safeguarding against fraud, and to allow reanalysis with future improved techniques. One means of studying this issue is to submit exemplar publications with associated raw data and metadata. In a recent study of the binding of cisplatin and carboplatin to histidine in lysozyme crystals under several conditions, the possible effects of the equipment and X-ray diffraction data-processing software on the occupancies and B factors of the bound Pt compounds were compared. Initially, 35.3 GB of data were transferred from Manchester to Utrecht to be processed with EVAL. A detailed description and discussion of the availability of metadata was published in a paper that was linked to a local raw data archive at Utrecht University and also mirrored at the TARDIS raw diffraction data archive in Australia. Because these raw diffraction data sets were made available with the article, members of the diffraction community can make their own evaluation. This led one of the authors of XDS (K. Diederichs) to re-integrate the data from crystals that supposedly contained solely bound carboplatin, resulting in the analysis of partially occupied chlorine anomalous electron densities near the Pt-binding sites and the use of several criteria to assess the diffraction resolution limit more carefully. General arguments for archiving raw data, the possibilities of doing so and the resources required are discussed. The problems associated with a partially unknown experimental setup, whose details should preferably be available as metadata, are discussed. Current thoughts on data compression are summarized; compression could be a solution especially for pixel-device data sets with fine slicing that may otherwise present an unmanageable amount of data.

Citing Articles

Massive compression for high data rate macromolecular crystallography (HDRMX): impact on diffraction data and subsequent structural analysis.

Bernstein H, Soares A, Horvat K, Jakoncic J. J Synchrotron Radiat. 2025; 32(Pt 2):385-398.

PMID: 39913307 PMC: 11892891. DOI: 10.1107/S1600577525000396.


Making your raw data available to the macromolecular crystallography community.

Kroon-Batenburg L. Acta Crystallogr F Struct Biol Commun. 2023; 79(Pt 10):267-273.

PMID: 37815476 PMC: 10565795. DOI: 10.1107/S2053230X23007987.


IUCrData launches Raw Data Letters.

Kroon-Batenburg L, Helliwell J, Hester J. IUCrdata. 2022; 7(Pt 9):x220821.

PMID: 36337453 PMC: 9635430. DOI: 10.1107/S2414314622008215.


Raw diffraction data are our ground truth from which all subsequent workflows develop.

Helliwell J. Acta Crystallogr D Struct Biol. 2022; 78(Pt 6):683-689.

PMID: 35647915 PMC: 9159283. DOI: 10.1107/S2059798322003795.


Submission of structural biology data for review purposes.

Baker E, Bond C, Garman E, Newman J, Read R, van Raaij M. IUCrJ. 2022; 9(Pt 1):1-2.

PMID: 35059201 PMC: 8733878. DOI: 10.1107/S2052252521012999.

