Ranking Microbial Metabolomic and Genomic Links in the NPLinker Framework Using Complementary Scoring Functions

Overview
Specialty Biology
Date 2021 May 4
PMID 33945539
Citations 23
Abstract

Specialised metabolites from microbial sources are well known for their wide range of biomedical applications, particularly as antibiotics. When mining paired genomic and metabolomic data sets for novel specialised metabolites, establishing links between Biosynthetic Gene Clusters (BGCs) and metabolites is a promising way of finding such novel chemistry. However, because detailed biosynthetic knowledge is lacking for the majority of predicted BGCs, and because the number of possible BGC-metabolite combinations is large, this is not a simple task. The problem is becoming ever more pressing with the increased availability of paired omics data sets. Current tools are not effective at identifying valid links automatically, and manual verification is a considerable bottleneck in natural product research. We demonstrate that using multiple link-scoring functions together makes it easier to prioritise true links relative to others. By standardising a commonly used score, we obtain a new, more effective score, and we introduce a further novel score based on an Input-Output Kernel Regression approach. Finally, we present NPLinker, a software framework to link genomic and metabolomic data. Results are verified using publicly available data sets that include validated links.
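The abstract does not say which score is standardised or how. As a rough illustration only, the sketch below assumes a strain co-occurrence score (a weighted count of strains in which a gene cluster family and a spectrum are both present, only one is present, or neither is present) and standardises it against a null model in which the two strain sets are drawn independently, so that their overlap is hypergeometric. The weights, function names and null model are assumptions made for this example and are not taken from the source or from the NPLinker code.

```python
from math import sqrt

# Illustrative weights for the four presence/absence cases (an assumption).
WEIGHTS = {"both": 10.0, "gcf_only": 0.0, "spec_only": -10.0, "neither": 1.0}


def raw_score(n_strains, gcf_strains, spec_strains, w=WEIGHTS):
    """Weighted co-occurrence count over all strains."""
    both = len(gcf_strains & spec_strains)
    gcf_only = len(gcf_strains) - both
    spec_only = len(spec_strains) - both
    neither = n_strains - both - gcf_only - spec_only
    return (w["both"] * both + w["gcf_only"] * gcf_only
            + w["spec_only"] * spec_only + w["neither"] * neither)


def standardised_score(n_strains, gcf_strains, spec_strains, w=WEIGHTS):
    """Z-score of the raw score under a hypergeometric null for the overlap."""
    n, a, b = n_strains, len(gcf_strains), len(spec_strains)
    if n < 2:
        return 0.0
    # Mean and variance of the overlap size o under the null.
    mean_o = a * b / n
    var_o = a * b * (n - a) * (n - b) / (n * n * (n - 1))
    # The raw score is linear in o: score = slope * o + const.
    slope = w["both"] - w["gcf_only"] - w["spec_only"] + w["neither"]
    const = w["gcf_only"] * a + w["spec_only"] * b + w["neither"] * (n - a - b)
    mean_s = slope * mean_o + const
    sd_s = abs(slope) * sqrt(var_o)
    if sd_s == 0:
        return 0.0  # degenerate case: the score cannot vary under the null
    return (raw_score(n, gcf_strains, spec_strains, w) - mean_s) / sd_s


# Ten strains; the gene cluster family occurs in strains 0-4.
gcf = {0, 1, 2, 3, 4}
z_high = standardised_score(10, gcf, {0, 1, 2, 3})  # spectrum fully inside
z_null = standardised_score(10, gcf, {0, 1, 5, 6})  # overlap as expected
```

The point of standardising is that a raw co-occurrence count depends on how many strains each object occurs in, so raw values are hard to compare across pairs. Expressing each score relative to its own null mean and spread puts all candidate links on one scale.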

Citing Articles

Pattern-Based Genome Mining Guides Discovery of the Antibiotic Indanopyrrole A from a Marine Streptomycete.

Sweeney D, Bogdanov A, Chase A, Castro-Falcon G, Trinidad-Javier A, Dahesh S J Nat Prod. 2024; 87(12):2768-2778.

PMID: 39575834 PMC: 11686505. DOI: 10.1021/acs.jnatprod.4c00934.


Pattern-based genome mining guides discovery of the antibiotic indanopyrrole A from a marine streptomycete.

Sweeney D, Bogdanov A, Chase A, Castro-Falcon G, Trinidad-Javier A, Dahesh S bioRxiv. 2024.

PMID: 39554111 PMC: 11565753. DOI: 10.1101/2024.10.29.620887.


Primed for Discovery.

Walker A, Clardy J Biochemistry. 2024; 63(21):2705-2713.

PMID: 39497571 PMC: 11542185. DOI: 10.1021/acs.biochem.4c00464.


Discovering type I cis-AT polyketides through computational mass spectrometry and genome mining with Seq2PKS.

Yan D, Zhou M, Adduri A, Zhuang Y, Guler M, Liu S Nat Commun. 2024; 15(1):5356.

PMID: 38918378 PMC: 11199612. DOI: 10.1038/s41467-024-49587-1.


Metabologenomics reveals strain-level genetic and chemical diversity of secondary metabolism.

Yancey C, Hart L, Hefferan S, Mohamed O, Newmister S, Tripathi A mSystems. 2024; 9(7):e0033424.

PMID: 38916306 PMC: 11264947. DOI: 10.1128/msystems.00334-24.

