Scaling Read Aligners to Hundreds of Threads on General-purpose Processors

Overview
Journal: Bioinformatics
Specialty: Biology
Date: 2018 Jul 19
PMID: 30020410
Citations: 341
Abstract

Motivation: General-purpose processors can now contain many dozens of processor cores and support hundreds of simultaneous threads of execution. To make best use of these threads, genomics software must contend with new and subtle computer architecture issues. We discuss some of these and propose methods for improving thread scaling in tools that analyze each read independently, such as read aligners.

Results: We implement these methods in new versions of Bowtie, Bowtie 2 and HISAT. We greatly improve thread scaling in many scenarios, including on the recent Intel Xeon Phi architecture. We also highlight how bottlenecks are exacerbated by variable-record-length file formats like FASTQ and suggest changes that enable superior scaling.
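
To illustrate why variable-record-length input can become a bottleneck, the sketch below contrasts two ways of handing a batch of reads to a worker thread. It is not the implementation used in Bowtie, Bowtie 2 or HISAT. The fixed-size layout (each record padded with spaces to `record_bytes`) and the helper names `read_variable_batch`, `pad_record` and `read_fixed_batch` are assumptions made only for this example.

```python
import io


def read_variable_batch(handle, n_reads):
    """Read up to n_reads FASTQ records (4 lines each) from a binary stream.

    Record boundaries are found only by scanning for newlines, so a reader
    shared between threads has to do this scan while it holds the input lock.
    """
    records = []
    for _ in range(n_reads):
        lines = [handle.readline() for _ in range(4)]
        if not lines[0]:  # end of input
            break
        records.append(tuple(line.rstrip(b"\n") for line in lines))
    return records


def pad_record(name, seq, qual, record_bytes):
    """Hypothetical layout: one FASTQ record padded with spaces to a fixed size.

    A space (ASCII 32) is never a valid Phred+33 quality character, so the
    padding cannot be confused with the quality string.
    """
    raw = b"@" + name + b"\n" + seq + b"\n+\n" + qual + b"\n"
    if len(raw) > record_bytes:
        raise ValueError("record does not fit in record_bytes")
    return raw.ljust(record_bytes, b" ")


def read_fixed_batch(buf, batch_index, n_reads, record_bytes):
    """Return batch number batch_index from a buffer of fixed-size records.

    The byte range of the batch is plain arithmetic, so a thread can claim a
    batch by incrementing a counter and parse it outside any lock.
    """
    start = batch_index * n_reads * record_bytes
    chunk = buf[start:start + n_reads * record_bytes]
    records = []
    for off in range(0, len(chunk), record_bytes):
        record = chunk[off:off + record_bytes].rstrip(b" ")
        records.append(tuple(record.split(b"\n")[:4]))
    return records


if __name__ == "__main__":
    reads = [
        (b"r1", b"ACGT", b"IIII"),
        (b"r2", b"ACGTACGT", b"IIIIIIII"),
        (b"r3", b"AC", b"II"),
        (b"r4", b"ACGTAC", b"IIIIII"),
    ]
    fixed = b"".join(pad_record(n, s, q, 32) for n, s, q in reads)
    # The second batch is located without parsing the first one.
    print(read_fixed_batch(fixed, batch_index=1, n_reads=2, record_bytes=32))
```

With the variable-length reader, the second batch cannot be located until the first has been scanned. With the padded layout, any batch can be addressed directly, at the cost of a larger input file.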

Availability and implementation: Experiments for this study: https://github.com/BenLangmead/bowtie-scaling.

Bowtie: http://bowtie-bio.sourceforge.net.

Bowtie 2: http://bowtie-bio.sourceforge.net/bowtie2.

HISAT: http://www.ccb.jhu.edu/software/hisat.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Citing Articles

Heart rate variability, daily cortisol indices and their association with psychometric characteristics and gut microbiota composition in an Italian community sample.

Ravenda S, Mancabelli L, Gambetta S, Barbetti M, Turroni F, Carnevali L Sci Rep. 2025; 15(1):8584.

PMID: 40074815 PMC: 11903775. DOI: 10.1038/s41598-025-93137-8.


EBAX-1/ZSWIM8 destabilizes miRNAs, resulting in transgenerational inheritance of a predatory trait.

Quiobe S, Kalirad A, Roseler W, Witte H, Wang Y, Rodelsperger C Sci Adv. 2025; 11(11):eadu0875.

PMID: 40073139 PMC: 11900880. DOI: 10.1126/sciadv.adu0875.


Estropausal gut microbiota transplant improves measures of ovarian function in adult mice.

Kim M, Wang J, Pilley S, Lu R, Xu A, Kim Y bioRxiv. 2025.

PMID: 40060387 PMC: 11888174. DOI: 10.1101/2024.05.03.592475.


Effect of fieldwork-friendly coffee blender-based extraction methods and leaf tissue storage on the transcriptome of non-model plants.

Dagva S, Galipon J J Plant Res. 2025.

PMID: 40053276 DOI: 10.1007/s10265-025-01624-w.


Genome-wide CRISPR guide RNA design and specificity analysis with GuideScan2.

Schmidt H, Zhang M, Chakarov D, Bansal V, Mourelatos H, Sanchez-Rivera F Genome Biol. 2025; 26(1):41.

PMID: 40011959 PMC: 11863968. DOI: 10.1186/s13059-025-03488-8.

