
BestCRM: An Exhaustive Search for Optimal Cis-Regulatory Modules in Promoters Accelerated by the Multidimensional Hash Function

Overview
Journal Int J Mol Sci
Publisher MDPI
Date 2024 Feb 10
PMID 38339181
Abstract

The concept of cis-regulatory modules located in gene promoters represents today's vision of the organization of gene transcriptional regulation. Such modules are combinations of two or more single, short DNA motifs. The bioinformatic identification of such modules is an NP-hard problem of extreme computational complexity; therefore, simplifications, assumptions, and heuristics are usually deployed to tackle it. In practice, this has two consequences: first, many parameters must be set before the search, and second, the search identifies only locally optimal results. Here, a novel method is presented, aimed at identifying the cis-regulatory elements in gene promoters based on an exhaustive search of all feasible module configurations. All required parameters are automatically estimated using positive and negative datasets. To be computationally efficient, the search is accelerated using a multidimensional hash function, allowing it to complete in a few hours on a regular laptop (for example, an Intel i7 CPU at 3.2 GHz with 32 GB RAM). Tests on an established benchmark and on real data show better performance of BestCRM compared to the available methods according to several metrics, including specificity, sensitivity, and AUC. A great practical advantage of the method is its minimal number of input parameters: apart from the positive and negative promoters, only a desired level of module presence in promoters is required.
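The abstract does not describe the hash function, the motif model, or the scoring used by BestCRM, so the following is only a toy sketch of the general idea, not the authors' implementation. It assumes exact-match motifs, modules made of two motifs within a maximum distance, and a score equal to the difference between the fractions of positive and negative promoters containing the module. Each candidate module is stored under a multi-part key `(motif_a, motif_b, max_distance)` in a hash table, so every configuration is counted once per promoter and then looked up in constant time. All function and parameter names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def find_sites(seq, motif):
    """Return start positions of exact matches of motif in seq (overlaps allowed)."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def min_distance(sites_a, sites_b):
    """Smallest distance between any two start positions, or None if either list is empty."""
    return min((abs(i - j) for i in sites_a for j in sites_b), default=None)


def count_modules(promoters, motifs, distances):
    """Count, for every (motif_a, motif_b, max_distance) key, how many promoters contain it."""
    counts = defaultdict(int)
    for seq in promoters:
        sites = {m: find_sites(seq, m) for m in motifs}
        for a, b in combinations(sorted(motifs), 2):
            d_min = min_distance(sites[a], sites[b])
            if d_min is None:
                continue
            for d in distances:
                if d_min <= d:
                    counts[(a, b, d)] += 1  # tuple key acts as the multi-part hash key
    return counts


def best_module(positives, negatives, motifs, distances, min_presence):
    """Exhaustively score all counted modules; return the best (key, score) pair."""
    pos = count_modules(positives, motifs, distances)
    neg = count_modules(negatives, motifs, distances)
    best_key, best_score = None, float("-inf")
    for key, n_pos in sorted(pos.items()):  # sorted for a deterministic tie-break
        frac_pos = n_pos / len(positives)
        if frac_pos < min_presence:  # assumed role of the "desired level of module presence"
            continue
        frac_neg = neg.get(key, 0) / len(negatives)
        score = frac_pos - frac_neg  # assumed toy score, not from the paper
        if score > best_score:
            best_key, best_score = key, score
    return best_key, best_score


# Made-up toy data for illustration only.
positives = ["AAATGACGTTTGGCCAAA", "CCTGACGAAGGCCTT", "TGACGGGCC"]
negatives = ["TGACG" + "A" * 10 + "GGCC", "AAAAAAAA", "GGCCAAAA"]
motifs = ["TGACG", "GGCC", "TATA"]
distances = [5, 10, 20]

result = best_module(positives, negatives, motifs, distances, min_presence=0.8)
print(result)
```

In this sketch the only user-facing threshold is `min_presence`, mirroring the single input parameter mentioned in the abstract; the motif list and distance grid are supplied by hand here, whereas the abstract states that the real method estimates its parameters from the positive and negative datasets.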
