Fast and Accurate Long-read Alignment with Burrows-Wheeler Transform

Overview
Journal Bioinformatics
Specialty Biology
Date 2010 Jan 19
PMID 20080505
Citations 6280
Abstract

Motivation: Many programs for aligning short sequencing reads to a reference genome have been developed in the last 2 years. Most of them are very efficient for short reads but inefficient or not applicable for reads >200 bp because the algorithms are heavily and specifically tuned for short queries with low sequencing error rate. However, some sequencing platforms already produce longer reads and others are expected to become available soon. For longer reads, hashing-based software such as BLAT and SSAHA2 remains the only choice. Nonetheless, these methods are substantially slower than short-read aligners in terms of aligned bases per unit time.

Results: We designed and implemented a new algorithm, Burrows-Wheeler Aligner's Smith-Waterman Alignment (BWA-SW), to align long sequences up to 1 Mb against a large sequence database (e.g. the human genome) with a few gigabytes of memory. The algorithm is as accurate as SSAHA2, more accurate than BLAT, and is several to tens of times faster than both.
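The abstract names the Burrows-Wheeler transform but does not describe how an index built on it is queried. As a rough illustration only, the sketch below builds a toy BWT-based (FM-index style) structure and performs exact-match backward search. It is not BWA-SW: it handles no mismatches or gaps, uses a naive suffix sort that would not scale to a genome, and every function name in it (`bwt`, `build_fm_index`, `backward_search`, `find_exact`) is hypothetical rather than taken from the BWA source.

```python
def bwt(text):
    """Return (BWT string, suffix array) for text plus a '$' sentinel.

    Assumes '$' does not occur in text. The suffix sort is naive and
    only suitable for short toy inputs.
    """
    s = text + "$"
    sa = sorted(range(len(s)), key=lambda i: s[i:])
    # The BWT character for each sorted suffix is the one preceding it.
    return "".join(s[i - 1] for i in sa), sa


def build_fm_index(bwt_str):
    """Build the two tables used by backward search.

    C[c]      : number of characters in the text strictly smaller than c
    occ[c][i] : number of occurrences of c in bwt_str[:i]
    """
    alphabet = sorted(set(bwt_str))
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += bwt_str.count(c)
    occ = {c: [0] * (len(bwt_str) + 1) for c in alphabet}
    for i, ch in enumerate(bwt_str):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return C, occ


def backward_search(pattern, C, occ, n):
    """Return the half-open suffix-array interval [lo, hi) matching pattern."""
    lo, hi = 0, n
    for ch in reversed(pattern):  # extend the match one character leftwards
        if ch not in C:
            return 0, 0
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return 0, 0
    return lo, hi


def find_exact(text, pattern):
    """Return sorted 0-based start positions of pattern in text."""
    b, sa = bwt(text)
    C, occ = build_fm_index(b)
    lo, hi = backward_search(pattern, C, occ, len(b))
    return sorted(sa[k] for k in range(lo, hi))


if __name__ == "__main__":
    print(find_exact("GATTACAGATTACA", "TTA"))
```

In this toy version the number of search steps depends on the pattern length rather than on the text length, which hints at why BWT-based indexes are attractive for large references; an aligner meant for long, error-containing reads would additionally need inexact matching on top of this.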

Availability: http://bio-bwa.sourceforge.net

Citing Articles

Structural basis of thymidine-rich DNA recognition by Drosophila P75 PWWP domain.

Jin Z, Meng Z, Liu Y, Li C, Zhang X, Yin Y. Commun Biol. 2025; 8(1):445.

PMID: 40089621 DOI: 10.1038/s42003-025-07895-2.


Proteogenomic characterization of molecular and cellular targets for treatment-resistant subtypes in locally advanced cervical cancers.

Hyeon D, Nam D, Shin H, Jeong J, Jung E, Cho S. Mol Cancer. 2025; 24(1):77.

PMID: 40087745 DOI: 10.1186/s12943-025-02256-3.


Contributions of interspecific hybrids to genetic variability in Glycyrrhiza uralensis and G. glabra.

Kim J, Lee J, Kang J, Shim H, Kang D, Lee S. Sci Rep. 2025; 15(1):8764.

PMID: 40082484 PMC: 11906797. DOI: 10.1038/s41598-025-92115-4.


Chromosome-level genome assembly of the spangled emperor, Lethrinus nebulosus (Forsskål 1775).

Parata L, Anstiss L, de Jong E, Doran A, Edwards R, Newman S. Sci Data. 2025; 12(1):435.

PMID: 40082478 PMC: 11906739. DOI: 10.1038/s41597-025-04690-w.


Flax domestication processes as inferred from genome-wide SNP data.

Fu Y. Sci Rep. 2025; 15(1):8731.

PMID: 40082459 PMC: 11906640. DOI: 10.1038/s41598-025-89498-9.

