Lightweight Taxonomic Profiling of Long-read Metagenomic Datasets with Lemur and Magnet

Overview
Journal bioRxiv
Date 2024 Jun 19
PMID 38895276
Abstract

The advent of long-read sequencing of microbiomes necessitates the development of new taxonomic profilers tailored to long-read shotgun metagenomic datasets. Here, we introduce Lemur and Magnet, a pair of tools optimized for lightweight and accurate taxonomic profiling of long-read shotgun metagenomic datasets. Lemur is a marker-gene-based method that uses an expectation-maximization (EM) algorithm to reduce false positive calls while preserving true positives; Magnet is a whole-genome read-mapping-based method that provides detailed presence and absence calls for bacterial genomes. We demonstrate that Lemur and Magnet can run in minutes to hours on a laptop with 32 GB of RAM, even for large inputs, a crucial feature given the portability of long-read sequencing machines. Furthermore, the marker gene database used by Lemur is only 4 GB and contains information from over 300,000 RefSeq genomes. Lemur and Magnet are open-source and available at https://github.com/treangenlab/lemur and https://github.com/treangenlab/magnet.
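
To illustrate the general idea behind using EM to suppress false positive calls, the following is a minimal sketch of a textbook EM estimator for taxon abundances. It is not Lemur's implementation: the function name `em_abundance`, the toy likelihood matrix, and the three taxa A, B, and C are all hypothetical and chosen only for illustration. Each row holds an assumed likelihood of one read under each taxon; the E-step splits each read across taxa in proportion to current abundance times likelihood, and the M-step re-estimates abundances from those fractional assignments.

```python
def em_abundance(likelihoods, n_iter=100, tol=1e-9):
    """Generic EM for mixture proportions (illustrative, not Lemur's code).

    likelihoods[r][t] is an assumed P(read r | taxon t).
    Returns the estimated relative abundance of each taxon.
    """
    n_taxa = len(likelihoods[0])
    theta = [1.0 / n_taxa] * n_taxa  # start from a uniform abundance guess
    for _ in range(n_iter):
        new_theta = [0.0] * n_taxa
        for row in likelihoods:
            # E-step: fractional assignment of this read to each taxon
            weights = [theta[t] * row[t] for t in range(n_taxa)]
            total = sum(weights)
            if total == 0.0:
                continue  # read is not explained by any taxon; skip it
            for t in range(n_taxa):
                new_theta[t] += weights[t] / total
        norm = sum(new_theta)
        if norm == 0.0:
            return theta  # no read was assignable; keep the previous estimate
        # M-step: abundances are the normalized expected read counts
        new_theta = [x / norm for x in new_theta]
        delta = max(abs(a - b) for a, b in zip(theta, new_theta))
        theta = new_theta
        if delta < tol:
            break
    return theta


# Hypothetical toy input: columns are taxa A, B, C.
# Four reads match only A; two reads are ambiguous between A and C.
reads = [[1.0, 0.0, 0.0]] * 4 + [[0.5, 0.0, 0.5]] * 2
theta = em_abundance(reads)
```

In this toy input, taxon C is supported only by reads that A explains equally well, so its estimated abundance shrinks toward zero over the iterations while A absorbs the ambiguous reads. A profiler could then drop taxa below some abundance threshold; any such threshold would be a separate design choice not specified here.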
