
Combining Phylogenetic Data with Co-regulated Genes to Identify Regulatory Motifs

Overview
Journal: Bioinformatics
Specialty: Biology
Date: 2003 Dec 12
PMID: 14668220
Citations: 131
Abstract

Motivation: Discovery of regulatory motifs in unaligned DNA sequences remains a fundamental problem in computational biology. Two categories of algorithms have been developed to identify common motifs from a set of DNA sequences. The first can be called a 'multiple genes, single species' approach. It proposes that a degenerate motif is embedded in some or all of the otherwise unrelated input sequences and tries to describe a consensus motif and identify its occurrences. It is often used for co-regulated genes identified through experimental approaches. The second approach can be called 'single gene, multiple species'. It requires orthologous input sequences and tries to identify unusually well conserved regions by phylogenetic footprinting. Both approaches perform well, but each has some limitations. It is tempting to combine the knowledge of co-regulation among different genes and conservation among orthologous genes to improve our ability to identify motifs.

Results: Based on the Consensus algorithm previously established by our group, we introduce a new algorithm called PhyloCon (Phylogenetic Consensus) that takes into account both conservation among orthologous genes and co-regulation of genes within a species. This algorithm first aligns conserved regions of orthologous sequences into multiple sequence alignments, or profiles, then compares profiles representing non-orthologous sequences. Motifs emerge as common regions in these profiles. Here we present a novel statistic to compare profiles of DNA sequences and a greedy approach to search for common subprofiles. We demonstrate that PhyloCon performs well on both synthetic and biological data.
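The profile-comparison idea in the Results paragraph can be illustrated with a minimal sketch. It is not the PhyloCon implementation: the abstract does not define the statistic or the greedy search, so the column score below is a log-likelihood-ratio style stand-in chosen for illustration, the background is assumed uniform, the pseudocount is arbitrary, and the search is an exhaustive scan over ungapped window pairs between two profiles only. All function names and sequences are hypothetical.

```python
import math

BASES = "ACGT"
BACKGROUND = {b: 0.25 for b in BASES}  # assumption: uniform background


def build_profile(aligned_seqs, pseudocount=0.5):
    """Convert equal-length, already aligned orthologous sequences
    into a list of (counts, frequencies) pairs, one per column."""
    profile = []
    for i in range(len(aligned_seqs[0])):
        counts = {b: 0 for b in BASES}
        for seq in aligned_seqs:
            counts[seq[i]] += 1
        total = len(aligned_seqs) + pseudocount * len(BASES)
        freqs = {b: (counts[b] + pseudocount) / total for b in BASES}
        profile.append((counts, freqs))
    return profile


def column_score(col_a, col_b):
    """Illustrative stand-in score: how well each column's frequencies
    explain the other column's counts, relative to background."""
    counts_a, freqs_a = col_a
    counts_b, freqs_b = col_b
    num = sum(counts_a[b] * math.log(freqs_b[b] / BACKGROUND[b]) for b in BASES)
    num += sum(counts_b[b] * math.log(freqs_a[b] / BACKGROUND[b]) for b in BASES)
    return num / (sum(counts_a.values()) + sum(counts_b.values()))


def best_common_subprofile(profile_a, profile_b, width):
    """Scan every ungapped pair of windows of the given width and
    return (score, start_a, start_b) for the best-scoring pair."""
    best = None
    for i in range(len(profile_a) - width + 1):
        for j in range(len(profile_b) - width + 1):
            score = sum(
                column_score(profile_a[i + k], profile_b[j + k])
                for k in range(width)
            )
            if best is None or score > best[0]:
                best = (score, i, j)
    return best


# Two made-up sets of aligned orthologous sequences (one set per gene);
# both share the conserved block TGACTCA at different offsets.
gene_a = ["ACGTTGACTCAGGA", "TCGATGACTCATGC", "GGCTTGACTCACAT"]
gene_b = ["CTGACTCAATCGGC", "ATGACTCACGATTA", "GTGACTCAGCTAAG"]

score, start_a, start_b = best_common_subprofile(
    build_profile(gene_a), build_profile(gene_b), width=7
)
print(start_a, start_b)  # prints "4 1"
```

In this toy input the conserved block is the only place where seven fully conserved columns line up, so the scan recovers offsets 4 and 1. Extending the pairwise comparison to many genes is where a greedy strategy, such as the one the abstract mentions, would come in; that step is not reproduced here.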

Availability: Software available upon request from the authors. http://ural.wustl.edu/softwares.html
