An Exact Method for Finding Short Motifs in Sequences, with Application to the Ribosome Binding Site Problem
This paper investigates methods for finding short motifs that occur in only a fraction of the input sequences. Unlike local search techniques, which may not reach a global optimum, the method proposed here is guaranteed to produce the motifs with the greatest z-scores. The method is illustrated on the Ribosome Binding Site Problem, which is to identify the short mRNA 5' untranslated sequence that is recognized by the ribosome during initiation of protein synthesis. Experiments were performed to solve this problem for each of fourteen sequenced prokaryotes by applying the method to the full complement of genes from each. One interesting result of these experiments is evidence that the recognized sequence of the thermophilic archaea A. fulgidus, M. jannaschii, M. thermoautotrophicum, and P. horikoshii may be somewhat different from the well-known Shine-Dalgarno sequence.
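To make the contrast with local search concrete, the following is a minimal sketch of exhaustive motif scoring: it enumerates every k-mer over the DNA alphabet, so the highest-scoring motif cannot be missed. The abstract does not say how the z-score is computed, so the scoring here is an assumption: it counts the sequences that contain the k-mer at least once and compares that count with its expectation under an independent-base background estimated from the input. The names `motif_z_scores` and `top_motifs` are hypothetical, and this is not the authors' implementation.

```python
from itertools import product
from math import sqrt


def motif_z_scores(seqs, k):
    """Return a z-score for every DNA k-mer (assumed scoring, see lead-in)."""
    # Background base frequencies estimated from the input itself (assumption).
    total = sum(len(s) for s in seqs)
    freq = {b: sum(s.count(b) for s in seqs) / total for b in "ACGT"}

    # Distinct k-mers present in each sequence.
    present = [{s[i:i + k] for i in range(len(s) - k + 1)} for s in seqs]

    scores = {}
    for kmer in map("".join, product("ACGT", repeat=k)):
        # Observed: number of sequences containing the k-mer at least once.
        observed = sum(kmer in p for p in present)

        # Probability of the k-mer at one fixed position under the background.
        p_site = 1.0
        for b in kmer:
            p_site *= freq[b]

        # Per-sequence probability of at least one occurrence, treating
        # positions as independent (a simplifying assumption).
        probs = [1 - (1 - p_site) ** max(len(s) - k + 1, 0) for s in seqs]
        mean = sum(probs)
        var = sum(q * (1 - q) for q in probs)
        scores[kmer] = (observed - mean) / sqrt(var) if var > 0 else 0.0
    return scores


def top_motifs(seqs, k, n_top=5):
    """Return the n_top k-mers with the greatest z-scores."""
    scores = motif_z_scores(seqs, k)
    return sorted(scores, key=scores.get, reverse=True)[:n_top]


# Toy input: three of the five sequences carry the planted motif GGAGG.
example = ["ACGGAGGTCA", "TTGGAGGACG", "CAAGGAGGCT", "ACGTACGTAC", "TTTTCCCCAA"]
print(top_motifs(example, 5, n_top=1))
```

Because all 4^k candidates are scored, the cost grows exponentially in k, which is why an approach of this kind is practical only for short motifs.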