MeSS and Assembly_finder: a Toolkit for in Silico Metagenomic Sample Generation

Overview
Journal Bioinformatics
Specialty Biology
Date 2024 Dec 31
PMID 39739308
Abstract

Summary: The intrinsic complexity of the microbiota, combined with technical variability, renders shotgun metagenomics challenging to analyze in routine clinical and research applications. In silico data generation offers a controlled environment in which to, for example, benchmark bioinformatics tools, optimize study design and statistical power, or validate targeted applications. Here, we propose assembly_finder and the Metagenomic Sequence Simulator (MeSS), two easy-to-use Bioconda packages, as part of a benchmarking toolkit to download genomes and simulate shotgun metagenomics samples, respectively. MeSS outperforms existing tools in speed while requiring less memory, and reproducibly generates accurate, complex communities from a list of taxonomic ranks and their abundance.
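
To make the idea of building a community from a list of taxa and abundances concrete, the following is a minimal Python sketch of one way to split a fixed read budget across taxa in proportion to their abundance. It is an illustration only and is not taken from MeSS: the function name `allocate_reads`, the example taxa, the abundance values, and the read total are all assumptions, and the largest-remainder rounding used here is simply one deterministic choice, not necessarily what the tool does.

```python
from math import floor


def allocate_reads(abundances, total_reads):
    """Split total_reads across taxa in proportion to their abundances.

    Hypothetical helper for illustration; not part of MeSS or assembly_finder.
    """
    total = sum(abundances.values())
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    # Ideal (fractional) share of reads for each taxon.
    exact = {taxon: total_reads * a / total for taxon, a in abundances.items()}
    counts = {taxon: floor(x) for taxon, x in exact.items()}
    # Hand out the reads lost to rounding, largest fractional remainder first.
    leftover = total_reads - sum(counts.values())
    by_remainder = sorted(exact, key=lambda t: exact[t] - counts[t], reverse=True)
    for taxon in by_remainder[:leftover]:
        counts[taxon] += 1
    return counts


# Assumed example community: relative abundances given as percentages.
community = {
    "Escherichia coli": 50,
    "Staphylococcus aureus": 30,
    "Bacteroides fragilis": 20,
}
reads = allocate_reads(community, 1001)
print(reads)
```

Because the allocation involves no random draw, the same input always yields the same per-taxon read counts, which is one simple way to obtain the kind of reproducibility the summary describes.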

Availability and Implementation: All code is released under the MIT License and is available at https://github.com/metagenlab/MeSS and https://github.com/metagenlab/assembly_finder.
