PMID: 39934981

A Call for Action: Lessons Learned From a Pilot to Share a Complex, Linked COVID-19 Cohort Dataset for Open Science

Abstract

The COVID-19 pandemic showed how timely sharing of genomic sequences, together with early detection and surveillance of variants and characterization of their clinical impacts, helped to inform public health responses. However, (re)emerging infectious diseases and our global connectivity require interdisciplinary collaboration at local, national and international levels, as well as connected data to understand the linkages between all factors involved. Here, we describe experiences and lessons learned from a COVID-19 pilot study aimed at developing a model for storing and sharing linked laboratory and clinical-epidemiological data using European open science infrastructure. We provide insights into the barriers and complexities of internationally sharing linked, complex cohort datasets from opportunistic studies for connected data analyses. We include an analytical timeline of events, describing key actions and delays in the execution of the pilot, and a critical path defining the steps in the process of internationally sharing a linked cohort dataset. The pilot showed how building on existing infrastructure, previously developed within the European Nucleotide Archive at the European Molecular Biology Laboratory-European Bioinformatics Institute for pathogen genomics data sharing, allowed the rapid development of connected "data hubs." These data hubs were required to link human clinical-epidemiological data under controlled access with open high-dimensional laboratory data, under FAIR (Findable, Accessible, Interoperable, Reusable) principles. Based on our own experiences, we call for action and make recommendations to support and improve data sharing for outbreak preparedness and response.
