
Amino Acid k-mer Feature Extraction for Quantitative Antimicrobial Resistance (AMR) Prediction by Machine Learning and Model Interpretation for Biological Insights

Overview
Journal Biology (Basel)
Publisher MDPI
Specialty Biology
Date 2020 Oct 31
PMID 33126516
Citations 18
Abstract

Machine learning algorithms can learn mechanisms of antimicrobial resistance from DNA sequence data without any a priori information. Interpreting a trained machine learning model can be used to validate the model and to obtain new information about resistance mechanisms. Different feature extraction methods, such as SNP calling and counting nucleotide k-mers, have been proposed for presenting DNA sequences to the model. However, there are trade-offs between interpretability, computational complexity and accuracy among these methods. In this study, we propose a new feature extraction method, counting amino acid k-mers (oligopeptides), which provides easier model interpretation than counting nucleotide k-mers and reaches the same or better accuracy compared with the other methods. Additionally, we trained machine learning algorithms using different feature extraction methods and compared the results in terms of accuracy, model interpretability and computational complexity. We built a new feature selection pipeline for extracting important features so that new AMR determinants can be discovered by analyzing these features. This pipeline allows the construction of models that use only a small number of features and still predict resistance accurately.
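To make the idea of counting amino acid k-mers concrete, the following is a minimal sketch in Python, not the authors' pipeline. It assumes each genome has already been translated into a list of protein sequences. The function names (`count_kmers`, `genome_features`, `feature_matrix`), the toy sequences, and the choice of k = 3 are all illustrative assumptions rather than values taken from the study.

```python
from collections import Counter


def count_kmers(protein, k):
    """Count overlapping amino acid k-mers (oligopeptides) in one protein.

    Sequences shorter than k simply yield no k-mers.
    """
    return Counter(protein[i:i + k] for i in range(len(protein) - k + 1))


def genome_features(proteins, k):
    """Sum k-mer counts over all proteins of one genome (hypothetical helper)."""
    total = Counter()
    for protein in proteins:
        total.update(count_kmers(protein, k))
    return total


def feature_matrix(genomes, k):
    """Build a genomes x k-mers count matrix over a shared, sorted vocabulary."""
    counts = [genome_features(proteins, k) for proteins in genomes]
    vocab = sorted(set().union(*counts))
    # Counter returns 0 for k-mers absent from a genome.
    matrix = [[c[kmer] for kmer in vocab] for c in counts]
    return vocab, matrix


# Toy input: two "genomes", each a list of made-up protein sequences.
toy_genomes = [["MKTAY", "MKT"], ["MKTAW"]]
vocab, matrix = feature_matrix(toy_genomes, k=3)
print(vocab)   # ['KTA', 'MKT', 'TAW', 'TAY']
print(matrix)  # [[1, 2, 0, 1], [1, 1, 1, 0]]
```

Each row of `matrix` could then be passed to a machine learning model as the feature vector for one genome. Because every column corresponds to a specific oligopeptide, a feature that the model weights heavily can be traced back to the proteins that contain it, which is the interpretability benefit the abstract describes.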

Citing Articles

Unraveling diversity by isolating peptide sequences specific to distinct taxonomic groups.

Bochalis E, Patsakis M, Chantzi N, Mouratidis I, Chartoumpekis D, Georgakopoulos-Soares I bioRxiv. 2025.

PMID: 39975352 PMC: 11839104. DOI: 10.1101/2025.02.05.636664.


The role of artificial intelligence and machine learning in predicting and combating antimicrobial resistance.

Bilal H, Khan M, Khan S, Shafiq M, Fang W, Khan R Comput Struct Biotechnol J. 2025; 27:423-439.

PMID: 39906157 PMC: 11791014. DOI: 10.1016/j.csbj.2025.01.006.


A comparison of various feature extraction and machine learning methods for antimicrobial resistance prediction in .

Kaya D, Ulgen E, Kocagoz A, Sezerman O Front Antibiot. 2025; 2:1126468.

PMID: 39816648 PMC: 11731958. DOI: 10.3389/frabi.2023.1126468.


A machine learning-based strategy to elucidate the identification of antibiotic resistance in bacteria.

Parthasarathi K, Gaikwad K, Rajesh S, Rana S, Pandey A, Singh H Front Antibiot. 2025; 3:1405296.

PMID: 39816256 PMC: 11732175. DOI: 10.3389/frabi.2024.1405296.


Advancing microbial diagnostics: a universal phylogeny guided computational algorithm to find unique sequences for precise microorganism detection.

Sharma G, Sharma R, Joshi K, Qureshi S, Mathur S, Sinha S Brief Bioinform. 2024; 25(6).

PMID: 39441245 PMC: 11497845. DOI: 10.1093/bib/bbae545.

