PMID: 36408900

ProGenomes3: Approaching One Million Accurately and Consistently Annotated High-quality Prokaryotic Genomes

Abstract

The interpretation of genomic, transcriptomic and other microbial 'omics data is highly dependent on the availability of well-annotated genomes. As the number of publicly available microbial genomes continues to increase exponentially, the need for quality control and consistent annotation is becoming critical. We present proGenomes3, a database of 907 388 high-quality genomes containing 4 billion genes that passed stringent criteria and have been consistently annotated using multiple functional and taxonomic databases including mobile genetic elements and biosynthetic gene clusters. proGenomes3 encompasses 41 171 species-level clusters, defined based on universal single copy marker genes, for which pan-genomes and contextual habitat annotations are provided. The database is available at http://progenomes.embl.de/.

Citing Articles

Priority effects, nutrition and milk glycan-metabolic potential drive subspecies dynamics in the infant gut microbiome.

Pucci N, Ujcic-Voortman J, Verhoeff A, Mende D. PeerJ. 2025; 13:e18602.

PMID: 39866568 PMC: 11758915. DOI: 10.7717/peerj.18602.


Analysis of exportins expression unveils their prognostic significance in colon adenocarcinoma: insights from public databases.

Kalia P, Nair R, Yadav S. Discov Oncol. 2025; 16(1):21.

PMID: 39776001 PMC: 11711428. DOI: 10.1007/s12672-025-01748-4.


Evolutionary Analysis of the hnRNP Interactomes and Their Functions in Eukaryotes.

Nishanth M, Jha S. Biochem Genet. 2024.

PMID: 39540958 DOI: 10.1007/s10528-024-10956-6.


Non-canonical start codons confer context-dependent advantages in carbohydrate utilization for commensal E. coli in the murine gut.

Cherrak Y, Salazar M, Napflin N, Malfertheiner L, Herzog M, Schubert C. Nat Microbiol. 2024; 9(10):2696-2709.

PMID: 39160293 PMC: 11445065. DOI: 10.1038/s41564-024-01775-x.


Genome-resolved metagenomics: a game changer for microbiome medicine.

Kim N, Ma J, Kim W, Kim J, Belenky P, Lee I. Exp Mol Med. 2024; 56(7):1501-1512.

PMID: 38945961 PMC: 11297344. DOI: 10.1038/s12276-024-01262-7.

