
An Intuitionistic Approach to Scoring DNA Sequences Against Transcription Factor Binding Site Motifs

Overview
Publisher: BioMed Central
Specialty: Biology
Date: 2010 Nov 10
PMID: 21059262
Citations: 2
Abstract

Background: Transcription factors (TFs) control transcription by binding to specific regions of DNA called transcription factor binding sites (TFBSs). Identifying TFBSs is a crucial problem in computational biology and includes the subtask of predicting the location of known TFBS motifs in a given DNA sequence. Previous work has shown that scoring matches to known TFBS motifs should take into account interdependencies between positions within a motif. This remains challenging, however, because sequences similar to known TFBSs can occur by chance with relatively high frequency. Here we present a new method for matching sequences to TFBS motifs based on intuitionistic fuzzy sets (IFS) theory, an approach that has been shown to be particularly appropriate for problems that embody a high degree of uncertainty.

Results: We propose SCintuit, a new scoring method for measuring sequence-motif affinity based on IFS theory. Unlike existing methods that consider dependencies between positions, SCintuit is designed to prevent overestimation of less conserved positions of TFBSs. For a given pair of bases, SCintuit is computed as a function not only of their combined probability of occurrence but also of the individual importance of each base at its corresponding position. We used SCintuit to identify known TFBSs in DNA sequences. The method performed well on both synthetic and real data, outperforming two existing methods in both sensitivity and specificity in all the experiments we performed.

Conclusions: The results show that SCintuit improves on the prediction quality of existing approaches for TFs without compromising sensitivity. In addition, we show how SCintuit can be successfully applied to real research problems. The study demonstrates the reliability of IFS theory for motif discovery tasks.
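
The motif-matching subtask named in the Background (locating a known TFBS motif in a DNA sequence) can be illustrated with a conventional position-independent log-odds scan. This is a minimal sketch of that baseline only, not SCintuit: the abstract does not give the SCintuit formula, and this sketch ignores the dependencies between positions that SCintuit is designed to handle. The motif values, the uniform background frequency, and the function names are all assumptions made for illustration.

```python
import math

# Hypothetical 4-position motif: per-position base probabilities.
# These values are illustrative only and do not come from the article.
MOTIF = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform background frequency for each base


def log_odds_score(window, motif=MOTIF, background=BACKGROUND):
    """Sum log2(p / background) over positions, treating positions as independent."""
    return sum(math.log2(col[base] / background) for col, base in zip(motif, window))


def scan(sequence, motif=MOTIF, background=BACKGROUND):
    """Return (start, score) for every motif-length window of the sequence.

    Only the bases A, C, G and T are supported; any other character raises KeyError.
    """
    width = len(motif)
    return [
        (start, log_odds_score(sequence[start:start + width], motif, background))
        for start in range(len(sequence) - width + 1)
    ]


def best_hit(sequence, motif=MOTIF, background=BACKGROUND):
    """Return the (start, score) pair with the highest log-odds score."""
    return max(scan(sequence, motif, background), key=lambda hit: hit[1])
```

Calling best_hit("TTAGCTGG") scores each of the five 4-base windows and returns the start position of the window that best matches the hypothetical motif. Because each position contributes independently, a weakly conserved position can still add to the score of a chance match, which is the kind of overestimation the Results say SCintuit is designed to prevent.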

Citing Articles

Evaluating tools for transcription factor binding site prediction.

Jayaram N, Usvyat D, Martin A. BMC Bioinformatics. 2016; 17(1):547.

PMID: 27806697. PMC: 6889335. DOI: 10.1186/s12859-016-1298-9.


Effect of dietary n-3 polyunsaturated fatty acids on transcription factor regulation in the bovine endometrium.

Waters S, Coyne G, Kenny D, Morris D. Mol Biol Rep. 2014; 41(5):2745-55.

PMID: 24449365. DOI: 10.1007/s11033-014-3129-2.
