
Reproducible Acquisition, Management and Meta-analysis of Nucleotide Sequence (meta)data Using Q2-fondue

Overview
Journal Bioinformatics
Specialty Biology
Date 2022 Sep 21
PMID 36130056
Abstract

Motivation: The volume of public nucleotide sequence data has blossomed over the past two decades and is ripe for re- and meta-analyses to enable novel discoveries. However, reproducible re-use and management of sequence datasets and associated metadata remain critical challenges. We created the open source Python package q2-fondue to enable user-friendly acquisition, re-use and management of public sequence (meta)data while adhering to open data principles.

Results: q2-fondue allows fully provenance-tracked programmatic access to and management of data from the NCBI Sequence Read Archive (SRA). Unlike other packages allowing download of sequence data from the SRA, q2-fondue enables full data provenance tracking from data download to final visualization, integrates with the QIIME 2 ecosystem, prevents data loss upon space exhaustion and allows download of (meta)data given a publication library. To highlight its manifold capabilities, we present executable demonstrations using publicly available amplicon, whole genome and metagenome datasets.
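As an illustration of the kind of input such programmatic access typically starts from, the following minimal sketch prepares a tab-separated list of accession IDs with an `id` header column, the layout QIIME 2 uses for metadata files. The helper name `write_accession_file`, the accession patterns it accepts (run and BioProject IDs only) and the example IDs in the note below are assumptions made for illustration; they are not taken from the article, and the sketch does not call q2-fondue itself.

```python
import csv
import re
from pathlib import Path

# Assumption: accept only run IDs (SRR/ERR/DRR) and BioProject IDs
# (PRJNA/PRJEB/PRJDB). Other accession types are deliberately not covered here.
ACCESSION_RE = re.compile(r"^(?:[SED]RR\d+|PRJ(?:NA|EB|DB)\d+)$")


def write_accession_file(accessions, path):
    """Write unique, well-formed accession IDs to a TSV with an 'id' header.

    Hypothetical helper: returns a (kept, rejected) pair so that malformed
    entries can be inspected instead of being silently dropped.
    """
    kept, rejected, seen = [], [], set()
    for raw in accessions:
        acc = raw.strip()
        if not ACCESSION_RE.match(acc):
            rejected.append(raw)   # keep the original text for reporting
        elif acc not in seen:
            seen.add(acc)          # drop duplicates, preserve first-seen order
            kept.append(acc)

    with Path(path).open("w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["id"])    # identifier column header
        writer.writerows([acc] for acc in kept)
    return kept, rejected
```

Calling `write_accession_file(["SRR0000001", "not-an-id"], "accession_ids.tsv")` with these placeholder IDs would write one data row and report the malformed entry as rejected. The resulting file could then be supplied to q2-fondue as described in the usage tutorials in its repository.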

Availability and Implementation: q2-fondue is available as an open-source, BSD-3-licensed Python package at https://github.com/bokulich-lab/q2-fondue. Usage tutorials are available in the same repository. All Jupyter notebooks used in this article are available at https://github.com/bokulich-lab/q2-fondue-examples.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Citing Articles

MOSHPIT: accessible, reproducible metagenome data science on the QIIME 2 framework.

Ziemski M, Gehret L, Simard A, Dau S, Risch V, Grabocka D bioRxiv. 2025.

PMID: 39975011 PMC: 11838374. DOI: 10.1101/2025.01.27.635007.


Differential impact of infection on the microbiota of and .

Skickova S, Svobodova K, Maitre A, Wu-Chuang A, Abuin-Denis L, Piloto-Sardinas E Heliyon. 2024; 10(22):e39384.

PMID: 39624306 PMC: 11609247. DOI: 10.1016/j.heliyon.2024.e39384.


Food-breastmilk combinations alter the colonic microbiome of weaning infants: an study.

da Silva V, Smith N, Mullaney J, Wall C, Roy N, McNabb W mSystems. 2024; 9(9):e0057724.

PMID: 39191378 PMC: 11406890. DOI: 10.1128/msystems.00577-24.


Impact of methane mitigation strategies on the native ruminant microbiome: A protocol for a systematic review and meta-analysis.

Frazier A, Belk A, Beck M, Koziel J PLoS One. 2024; 19(8):e0308914.

PMID: 39172818 PMC: 11340963. DOI: 10.1371/journal.pone.0308914.


sp. nov., isolated from the nose, lung, and liver of rabbits.

Boutroux M, Favre-Rochex S, Gorgette O, Touak G, Muhle E, Bouchier C Int J Syst Evol Microbiol. 2024; 74(7).

PMID: 39023135 PMC: 11316581. DOI: 10.1099/ijsem.0.006460.

