Computational Performance Assessment of K-mer Counting Algorithms

Overview
Journal: J Comput Biol
Date: 2016 Mar 17
PMID: 26982880
Citations: 8
Abstract

This article assesses several k-mer counting tools in order to give bioinformatics researchers a reference framework for identifying the computational requirements, parallelization, advantages, disadvantages, and bottlenecks of the algorithm behind each tool. The k-mer counters evaluated were BFCounter, DSK, Jellyfish, KAnalyze, KHMer, KMC2, MSPKmerCounter, Tallymer, and Turtle. The parameters measured were RAM usage, processing time, parallelization, and disk read and write access. The dataset consisted of 36,504,800 reads from human chromosome 14. The assessment was performed for two k-mer lengths, 31 and 55. The results were as follows. Tools based purely on Bloom filters and tools using disk-partitioning techniques used less RAM. The tools with the lowest execution times were those using disk-partitioning techniques. The tools that achieved the greatest parallelization were those using disk partitioning, lock-free hash tables, or multiple hash tables.
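For readers unfamiliar with the task being benchmarked, the following is a minimal sketch of k-mer counting with a single in-memory hash table. It is not taken from the article or from any of the evaluated tools; the function names `canonical` and `count_kmers` are hypothetical, and counting canonical k-mers (a k-mer and its reverse complement treated as one) is an assumed convention, not something the abstract specifies. An approach this naive keeps every distinct k-mer in RAM, which is the cost that Bloom filters and disk partitioning are designed to reduce.

```python
from collections import Counter

# Assumption: reads are plain DNA strings over A, C, G, T (N marks an unknown base).
_BASES = frozenset("ACGT")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k: int) -> Counter:
    """Count canonical k-mers across all reads using one in-memory hash table."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        # Slide a window of width k over the read; short reads yield no windows.
        for start in range(len(read) - k + 1):
            kmer = read[start:start + k]
            # Skip windows containing ambiguous bases such as N.
            if _BASES.issuperset(kmer):
                counts[canonical(kmer)] += 1
    return counts


if __name__ == "__main__":
    example = count_kmers(["ACGTA", "ACNGT"], k=3)
    for kmer, n in sorted(example.items()):
        print(kmer, n)
```

In this sketch memory grows with the number of distinct k-mers, so it is only practical for small inputs; it serves as a baseline for understanding what the benchmarked tools optimize.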

Citing Articles

PanKA: Leveraging population pangenome to predict antibiotic resistance.

Do V, Nguyen V, Nguyen S, Le D, Nguyen T, Nguyen C iScience. 2024; 27(9):110623.

PMID: 39228791 PMC: 11369404. DOI: 10.1016/j.isci.2024.110623.


A survey of k-mer methods and applications in bioinformatics.

Moeckel C, Mareboina M, Konnaris M, Chan C, Mouratidis I, Montgomery A Comput Struct Biotechnol J. 2024; 23:2289-2303.

PMID: 38840832 PMC: 11152613. DOI: 10.1016/j.csbj.2024.05.025.


Chromosome-scale assembly and high-density genetic map of the yellow drum, Nibea albiflora.

Xu D, Zhang W, Chen R, Song H, Tian L, Tan P Sci Data. 2021; 8(1):268.

PMID: 34654820 PMC: 8521588. DOI: 10.1038/s41597-021-01045-z.


Estimating the k-mer Coverage Frequencies in Genomic Datasets: A Comparative Assessment of the State-of-the-art.

Manekar S, Sathe S Curr Genomics. 2019; 20(1):2-15.

PMID: 31015787 PMC: 6446480. DOI: 10.2174/1389202919666181026101326.


A benchmark study of k-mer counting methods for high-throughput sequencing.

Manekar S, Sathe S Gigascience. 2018; 7(12).

PMID: 30346548 PMC: 6280066. DOI: 10.1093/gigascience/giy125.