
Mantis: Flexible and Consensus-driven Genome Annotation

Overview
Journal Gigascience
Specialties Biology, Genetics
Date 2021 Jun 2
PMID 34076241
Citations 16
Abstract

Background: The rapid development of the (meta-)omics fields has produced an unprecedented amount of high-resolution and high-fidelity data. These datasets allow us to infer the roles of previously functionally unannotated proteins from single organisms and consortia. In this context, protein function annotation can be described as the identification of regions of interest (i.e., domains) in protein sequences and the assignment of biological functions. Despite the existence of numerous tools, challenges remain in terms of speed, flexibility, and reproducibility. In the big data era, it is also increasingly important to stop limiting findings to a single reference and instead to coalesce knowledge from different data sources, thereby overcoming some of the limitations of relying too heavily on computationally generated data from single sources.

Results: We implemented a protein annotation tool, Mantis, which uses the intersection of database identifiers and text mining to integrate knowledge from multiple reference data sources into a single consensus-driven output. Mantis is flexible, allowing for the customization of reference data and execution parameters, and is reproducible across different research goals and user environments. We implemented a depth-first search algorithm for domain-specific annotation, which significantly improved annotation performance compared to sequence-wide annotation. The parallelized implementation of Mantis results in short runtimes while also producing high-coverage and high-quality protein function annotations.
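To illustrate what a depth-first search for domain-specific annotation might look like, the following is a minimal sketch and not Mantis's actual implementation. It assumes each hit is described by a reference identifier, an inclusive start and end position on the query protein, and a score where higher is better; the `Hit` type, the `best_combination` function, and the scoring scheme are all hypothetical names and choices introduced here for illustration. The search enumerates combinations of non-overlapping hits and keeps the combination with the highest total score.

```python
from typing import List, NamedTuple, Tuple


class Hit(NamedTuple):
    """Hypothetical representation of one reference hit on a query protein."""
    ref_id: str
    start: int    # inclusive start coordinate on the query
    end: int      # inclusive end coordinate on the query
    score: float  # assumed: higher is better


def overlaps(a: Hit, b: Hit) -> bool:
    """Return True if the two hits share at least one residue."""
    return a.start <= b.end and b.start <= a.end


def best_combination(hits: List[Hit]) -> Tuple[float, List[Hit]]:
    """Depth-first search over combinations of mutually non-overlapping hits.

    Returns the highest total score found and the hits that produce it.
    The search is exhaustive, so it is only practical for small hit lists.
    """
    ordered = sorted(hits, key=lambda h: h.start)
    best_score = 0.0
    best_combo: List[Hit] = []

    def dfs(index: int, chosen: List[Hit], score: float) -> None:
        nonlocal best_score, best_combo
        # Record the current partial combination if it beats the best so far.
        if score > best_score:
            best_score, best_combo = score, list(chosen)
        # Try to extend the combination with each later, compatible hit.
        for i in range(index, len(ordered)):
            candidate = ordered[i]
            if all(not overlaps(candidate, c) for c in chosen):
                chosen.append(candidate)
                dfs(i + 1, chosen, score + candidate.score)
                chosen.pop()  # backtrack

    dfs(0, [], 0.0)
    return best_score, best_combo


if __name__ == "__main__":
    example_hits = [
        Hit("D", 1, 200, 80.0),    # one sequence-wide hit
        Hit("A", 1, 100, 50.0),    # domain-level hit
        Hit("B", 90, 200, 60.0),   # overlaps both A and C
        Hit("C", 110, 200, 40.0),  # domain-level hit
    ]
    total, combo = best_combination(example_hits)
    print(total, [h.ref_id for h in combo])
```

In this toy input, the two domain-level hits A and C together outscore the single sequence-wide hit D, which is the kind of case where domain-specific annotation can recover more information than picking one best hit per sequence.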

Conclusions: Mantis is a protein function annotation tool that produces high-quality consensus-driven protein annotations. It is easy to set up, customize, and use, scaling from single genomes to large metagenomes. Mantis is available under the MIT license at https://github.com/PedroMTQ/mantis.

Citing Articles

Diversity and biogeography of the bacterial microbiome in glacier-fed streams.

Ezzat L, Peter H, Bourquin M, Busi S, Michoud G, Fodelianakis S. Nature. 2025; 637(8046):622-630.

PMID: 39743584 PMC: 11735386. DOI: 10.1038/s41586-024-08313-z.


Microbial communities reveal niche partitioning across the slope and bottom zones of the challenger deep.

Hu A, Zhao W, Wang J, Qi Q, Xiao X, Jing H. Environ Microbiol Rep. 2024; 16(4):e13314.

PMID: 39086173 PMC: 11291871. DOI: 10.1111/1758-2229.13314.


Functional prediction of proteins from the human gut archaeome.

Novikova P, Busi S, Probst A, May P, Wilmes P. ISME Commun. 2024; 4(1):ycad014.

PMID: 38486809 PMC: 10939349. DOI: 10.1093/ismeco/ycad014.


A toolbox of machine learning software to support microbiome analysis.

Marcos-Zambrano L, Lopez-Molina V, Bakir-Gungor B, Frohme M, Karaduzovic-Hadziabdic K, Klammsteiner T. Front Microbiol. 2023; 14:1250806.

PMID: 38075858 PMC: 10704913. DOI: 10.3389/fmicb.2023.1250806.


Forecasting the dynamics of a complex microbial community using integrated meta-omics.

Delogu F, Kunath B, Queiros P, Halder R, Lebrun L, Pope P. Nat Ecol Evol. 2023; 8(1):32-44.

PMID: 37957315 PMC: 10781640. DOI: 10.1038/s41559-023-02241-3.

