Efficient Computation of Spaced Seeds
Background: The most frequently used tools in bioinformatics are those that search for similarities, or local alignments, between biological sequences. Because the exact dynamic programming algorithm takes quadratic time, linear-time heuristics such as BLAST are used instead. Spaced seeds are much more sensitive than the consecutive seed of BLAST, and using several seeds represents the current state of the art in approximate search for biological sequences. The most important aspect is computing highly sensitive seeds. Since the problem seems hard, heuristic algorithms are used. The leading software in the common Bernoulli model is the SpEED program.
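To make the contrast between a consecutive seed and a spaced seed concrete, here is a minimal sketch. It assumes the usual encoding (not spelled out in this abstract) in which an alignment is a 0/1 string of mismatches and matches, and a seed is a 0/1 string whose 1s are positions that must match and whose 0s are "don't care" positions. The function name `seed_hits` and the toy strings are hypothetical, chosen only for illustration.

```python
def seed_hits(seed: str, alignment: str) -> bool:
    """Return True if `seed` hits `alignment` at some offset.

    seed:      '1' = position must match, '0' = don't care (assumed encoding)
    alignment: '1' = match, '0' = mismatch (assumed encoding)
    """
    required = [i for i, c in enumerate(seed) if c == "1"]
    # Slide the seed along the alignment and test only the required positions.
    for start in range(len(alignment) - len(seed) + 1):
        if all(alignment[start + i] == "1" for i in required):
            return True
    return False


consecutive = "111"   # contiguous seed of weight 3
spaced = "1101"       # spaced seed of the same weight
region = "1101011"    # toy alignment with no run of three matches

print(seed_hits(consecutive, region))  # False
print(seed_hits(spaced, region))       # True
```

Both seeds have weight 3, but the toy region contains no run of three consecutive matches, so only the spaced seed finds it. This is a single hand-picked case, not a measurement of sensitivity.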
Findings: SpEED uses a hill climbing method based on the overlap complexity heuristic. We propose a new algorithm for this heuristic that improves its speed by more than one order of magnitude. We use the new implementation to compute improved seeds for several software programs. We also compute multiple seeds of the same weight as the MegaBLAST seed that greatly improve its sensitivity.
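The abstract does not define overlap complexity or describe the hill climbing moves, so the following is only a naive sketch under stated assumptions. It assumes the definition commonly given in the spaced-seed literature: for two seeds, sum 2^σ over all relative shifts, where σ is the number of aligned pairs of 1s at that shift, and for a set of seeds, sum this over all unordered pairs including each seed with itself. The hill climbing step assumed here swaps a 1 with a 0 inside one seed and keeps the swap when the score drops. The function names are hypothetical, and this is not the faster algorithm the paper proposes.

```python
from itertools import combinations_with_replacement


def pair_overlap(s1: str, s2: str) -> int:
    """Assumed definition: sum over shifts of 2**(number of aligned 1-1 pairs)."""
    total = 0
    for shift in range(1 - len(s2), len(s1)):
        sigma = sum(
            1
            for j, c in enumerate(s2)
            if c == "1" and 0 <= shift + j < len(s1) and s1[shift + j] == "1"
        )
        total += 2 ** sigma
    return total


def overlap_complexity(seeds: list[str]) -> int:
    """Sum pair_overlap over all unordered pairs, each seed with itself included."""
    return sum(pair_overlap(a, b) for a, b in combinations_with_replacement(seeds, 2))


def hill_climb(seeds: list[str]) -> tuple[list[str], int]:
    """Swap a 1 and a 0 inside one seed while that lowers the overlap complexity.

    The first and last positions stay fixed, so weight and length are preserved.
    """
    seeds = list(seeds)
    best = overlap_complexity(seeds)
    improved = True
    while improved:
        improved = False
        for k in range(len(seeds)):
            n = len(seeds[k])
            for i in range(1, n - 1):
                for j in range(i + 1, n - 1):
                    seed = seeds[k]
                    if seed[i] == seed[j]:
                        continue  # swapping equal symbols changes nothing
                    chars = list(seed)
                    chars[i], chars[j] = chars[j], chars[i]
                    candidate = seeds[:k] + ["".join(chars)] + seeds[k + 1:]
                    score = overlap_complexity(candidate)
                    if score < best:
                        seeds, best, improved = candidate, score, True
    return seeds, best


start = ["1110101", "1110101"]
result, score = hill_climb(start)
print(overlap_complexity(start), "->", score, result)
```

The search stops because the score is a positive integer that strictly decreases with every accepted swap. Each candidate is rescored from scratch here, which is the kind of repeated cost a faster algorithm for the heuristic would aim to reduce.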
Conclusion: Multiple spaced seeds are being used successfully in bioinformatics software. Enabling researchers to compute high-quality seeds very quickly will help expand the range of their applications.