PMS6MC: A Multicore Algorithm for Motif Discovery
Overview
Abstract
We develop an efficient multicore algorithm, PMS6MC, for the (ℓ, d)-motif discovery problem, in which we are to find all strings of length ℓ that appear in every string of a given set of strings with at most d mismatches. PMS6MC is based on PMS6, which is currently the fastest single-core algorithm for motif discovery in large instances. On an Intel 6-core system, the speedup attained by our multicore algorithm relative to PMS6 ranges from a high of 6.62 for the (17,6) challenging instances to a low of 2.75 for the (13,4) challenging instances. We estimate that PMS6MC is 2 to 4 times faster than other parallel algorithms for motif search on large instances.
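To make the problem statement concrete, the following is a minimal brute-force sketch of the (ℓ, d)-motif definition given above. It is not PMS6 or PMS6MC and makes no attempt at their efficiency; the function names (`hamming`, `occurs_within`, `brute_force_motifs`) and the DNA alphabet `"ACGT"` are assumptions chosen for illustration only.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(motif, seq, d):
    """True if some length-len(motif) window of seq is within d mismatches of motif."""
    l = len(motif)
    return any(
        hamming(motif, seq[i:i + l]) <= d
        for i in range(len(seq) - l + 1)
    )


def brute_force_motifs(seqs, l, d, alphabet="ACGT"):
    """Enumerate every length-l string over the alphabet and keep those that
    appear in every input string with at most d mismatches.

    Illustrative only: the candidate space grows as len(alphabet) ** l,
    so this is usable only for very small l.
    """
    candidates = ("".join(p) for p in product(alphabet, repeat=l))
    return sorted(
        m for m in candidates
        if all(occurs_within(m, s, d) for s in seqs)
    )


if __name__ == "__main__":
    # Hypothetical toy input, not data from the paper.
    seqs = ["ACGT", "ACGA", "TCGT"]
    print(brute_force_motifs(seqs, 2, 0))
```

Exhaustive enumeration of this kind is impractical for instances such as (17,6), which is why exact algorithms like PMS6 and its multicore variant prune the candidate space instead.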