
Comparing Ease of Programming in C++, Go, and Java for Implementing a Next-Generation Sequencing Tool

Overview
Publisher Sage Publications
Specialty Biology
Date 2019 Aug 28
PMID 31452597
Citations 2
Abstract

elPrep is an extensible multithreaded software framework for efficiently processing Sequence Alignment/Map (SAM) and Binary Alignment/Map (BAM) files in next-generation sequencing pipelines. As with other SAM/BAM tools, a key challenge in elPrep is memory management, because such programs must manipulate large amounts of data. We therefore investigated 3 programming languages with support for assisted or automated memory management as candidates for implementing elPrep: C++, Go, and Java. We implemented a nontrivial subset of elPrep in all 3 languages and benchmarked their runtime performance and memory use to determine the best language in terms of computational performance. In a previous article, we explained why, based on these results, we eventually selected Go as our implementation language. In this article, we discuss how difficult it is to achieve the best performance in each language, in terms of programming language constructs and standard library support. Benchmarks are easy to measure and evaluate objectively, whereas ease of programming is harder to assess. However, because we expect elPrep to be regularly modified and extended, ease of programming is an equally important aspect. We present representative examples of the challenges in all 3 languages and explain why we think that Go is a reasonable choice in this light as well.

Citing Articles

A software package for efficient patient trajectory analysis applied to analyzing bladder cancer development.

Herzeel C, D'Hondt E, Vandeweerd V, Botermans W, Akand M, Van der Aa F. PLOS Digit Health. 2023; 2(11):e0000384.

PMID: 37992021 PMC: 10664923. DOI: 10.1371/journal.pdig.0000384.


Gonomics: uniting high performance and readability for genomics with Go.

Au E, Fauci C, Luo Y, Mangan R, Snellings D, Shoben C. Bioinformatics. 2023; 39(8).

PMID: 37624924 PMC: 10466080. DOI: 10.1093/bioinformatics/btad516.

References
1. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009; 25(16):2078-9. PMC: 2723002. DOI: 10.1093/bioinformatics/btp352.

2. Herzeel C, Costanza P, Decap D, Fostier J, Reumers J. elPrep: High-Performance Preparation of Sequence Alignment/Map Files for Variant Calling. PLoS One. 2015; 10(7):e0132868. PMC: 4504710. DOI: 10.1371/journal.pone.0132868.

3. Herzeel C, Costanza P, Decap D, Fostier J, Verachtert W. elPrep 4: A multithreaded framework for sequence analysis. PLoS One. 2019; 14(2):e0209523. PMC: 6373927. DOI: 10.1371/journal.pone.0209523.

4. Costanza P, Herzeel C, Verachtert W. A comparison of three programming languages for a full-fledged next-generation sequencing tool. BMC Bioinformatics. 2019; 20(1):301. PMC: 6547519. DOI: 10.1186/s12859-019-2903-5.