PMID: 38843834

Discovery of Antimicrobial Peptides in the Global Microbiome with Machine Learning

Abstract

Novel antibiotics are urgently needed to combat the antibiotic-resistance crisis. We present a machine-learning-based approach to predict antimicrobial peptides (AMPs) within the global microbiome and leverage a vast dataset of 63,410 metagenomes and 87,920 prokaryotic genomes from environmental and host-associated habitats to create the AMPSphere, a comprehensive catalog comprising 863,498 non-redundant peptides, few of which match existing databases. AMPSphere provides insights into the evolutionary origins of peptides, including by duplication or gene truncation of longer sequences, and we observed that AMP production varies by habitat. To validate our predictions, we synthesized and tested 100 AMPs against clinically relevant drug-resistant pathogens and human gut commensals both in vitro and in vivo. A total of 79 peptides were active, with 63 targeting pathogens. These active AMPs exhibited antibacterial activity by disrupting bacterial membranes. In conclusion, our approach identified nearly one million prokaryotic AMP sequences, an open-access resource for antibiotic discovery.

Citing Articles

Advances in medical devices using nanomaterials and nanotechnology: Innovation and regulatory science.

Lin C, Huang X, Xue Y, Jiang S, Chen C, Liu Y. Bioact Mater. 2025; 48:353-369.

PMID: 40060145 PMC: 11889687. DOI: 10.1016/j.bioactmat.2025.02.017.


Paving the way for new antimicrobial peptides through molecular de-extinction.

Osiro K, Gil-Ley A, Fernandes F, de Oliveira K, de la Fuente-Nunez C, Franco O. Microb Cell. 2025; 12:1-8.

PMID: 40012704 PMC: 11853161. DOI: 10.15698/mic2025.02.841.


Generative latent diffusion language modeling yields anti-infective synthetic peptides.

Torres M, Chen T, Wan F, Chatterjee P, de la Fuente-Nunez C. bioRxiv. 2025.

PMID: 39975107 PMC: 11838489. DOI: 10.1101/2025.01.31.636003.


Leveraging large language models for peptide antibiotic design.

Guan C, Fernandes F, Franco O, de la Fuente-Nunez C. Cell Rep Phys Sci. 2025; 6(1).

PMID: 39949833 PMC: 11823563. DOI: 10.1016/j.xcrp.2024.102359.


Microbial production systems and optimization strategies of antimicrobial peptides: a review.

Lou M, Ji S, Wu R, Zhu Y, Wu J, Zhang J. World J Microbiol Biotechnol. 2025; 41(2):66.

PMID: 39920500 DOI: 10.1007/s11274-025-04278-x.

