
Benchmarking of Five NGS Mapping Tools for the Reference Alignment of Bacterial Outer Membrane Vesicles-associated Small RNAs

Overview
Journal Front Microbiol
Specialty Microbiology
Date 2024 Aug 5
PMID 39101033
Abstract

Advances in small RNA (sRNA) research have posed a challenge for NGS bioinformatics, especially regarding the correct mapping of sRNAs. Depending on the algorithms and scoring matrices on which they are based, aligners are influenced by the characteristics of the dataset and the reference genome. These influences have been studied mainly in eukaryotes and to some extent in prokaryotes. In bacteria, however, the choice of aligner as a function of sRNA-seq data associated with outer membrane vesicles (OMVs) and of the features of the corresponding bacterial reference genome has not yet been investigated. We selected five aligners known for their generally good performance (BBmap, Bowtie2, BWA, Minimap2 and Segemehl) and tested them in mapping OMV-associated sRNAs to the corresponding bacterial reference genome. Significant differences in the performance of the five aligners were observed, resulting in differential recognition of OMV-associated sRNA biotypes. Our results suggest that aligners should not be selected arbitrarily for this task, as is often done, because this can be detrimental to the biological interpretation of NGS analysis results. Since each aligner has specific advantages and disadvantages, these need to be weighed against the characteristics of the input OMV sRNA dataset and the corresponding bacterial reference genome to improve the detection of existing, biologically important OMV sRNAs. Until more is known about these dependencies, we recommend using at least two, preferably three, aligners that have good metrics for the given dataset and bacterial reference genome. The overlapping results should be considered trustworthy, but the differences between aligners should not be dismissed lightly; they should be examined carefully so that no biologically important OMV sRNA is overlooked. This can be achieved by applying the intersect-then-combine approach.
These recommendations apply to mapping OMV-associated sRNAs to a reference genome organized into two circular chromosomes and one circular plasmid, containing copies of sequences with rRNA- and tRNA-related features and no copies of sequences with protein-encoding features. For this case, if the aligners are used with their default parameters, we advise avoiding Segemehl and recommend using the intersect-then-combine approach with BBmap, BWA and Minimap2 to improve the potential for discovery of biologically important OMV-associated sRNAs.
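
The abstract does not spell out how the intersect-then-combine approach is carried out, so the following is only a minimal sketch of one plausible reading: sRNA features reported by every aligner form a trusted consensus set, and the remaining features are kept for manual review together with the aligners that reported them. The function name `intersect_then_combine` and the feature IDs are hypothetical and do not come from the paper.

```python
def intersect_then_combine(calls_by_aligner):
    """Split detected sRNA features into a consensus set and a review set.

    calls_by_aligner: dict mapping an aligner name to the set of feature IDs
    it detected (hypothetical input format, assumed for illustration).
    Returns (consensus, partial):
      consensus - IDs reported by every aligner
      partial   - dict mapping each remaining ID to the sorted list of
                  aligners that reported it
    """
    if not calls_by_aligner:
        return set(), {}

    sets = {name: set(ids) for name, ids in calls_by_aligner.items()}

    # Intersect: features every aligner agrees on.
    consensus = set.intersection(*sets.values())

    # Combine: keep the non-consensus features instead of discarding them,
    # recording which aligners support each one.
    partial = {}
    for name, ids in sets.items():
        for feature in ids - consensus:
            partial.setdefault(feature, []).append(name)
    partial = {feature: sorted(names) for feature, names in partial.items()}

    return consensus, partial


# Hypothetical feature IDs, for illustration only.
calls = {
    "BBmap": {"sRNA_1", "sRNA_2", "tRNA_7"},
    "BWA": {"sRNA_1", "sRNA_2", "rRNA_3"},
    "Minimap2": {"sRNA_1", "tRNA_7", "rRNA_3"},
}
consensus, partial = intersect_then_combine(calls)
```

In this toy input, only `sRNA_1` lands in the consensus set, while the other three features are retained with two supporting aligners each, which mirrors the advice to treat the differences carefully rather than dropping them.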
