A Network-Based Approach for Improving Annotation of Transcription Factor Functions and Binding Sites in Arabidopsis thaliana

Overview
Journal Genes (Basel)
Publisher MDPI
Date 2023 Feb 25
PMID 36833209
Abstract

Transcription factors are an integral component of the cellular machinery responsible for regulating many biological processes, and they recognize distinct DNA sequence patterns as well as internal/external signals to mediate target gene expression. The functional roles of an individual transcription factor can be traced back to the functions of its target genes. While such functional associations can be inferred from binding evidence produced by the high-throughput sequencing technologies available today, including chromatin immunoprecipitation sequencing, such experiments can be resource-consuming. On the other hand, exploratory analysis driven by computational techniques can alleviate this burden by narrowing the search scope, but the results are often deemed low-quality or non-specific by biologists. In this paper, we introduce a data-driven, statistics-based strategy to predict novel functional associations for transcription factors in the model plant Arabidopsis thaliana. To achieve this, we leverage one of the largest available gene expression compendia to build a genome-wide transcriptional regulatory network and infer regulatory relationships among transcription factors and their targets. We then use this network to build a pool of likely downstream targets for each transcription factor and query each target pool for functionally enriched gene ontology terms. The results exhibited sufficient statistical significance to annotate most of the transcription factors in Arabidopsis with highly specific biological processes. We also perform DNA binding motif discovery for transcription factors based on their target pool. We show that the predicted functions and motifs strongly agree with curated databases constructed from experimental evidence. In addition, statistical analysis of the network revealed interesting patterns and connections between network topology and system-level transcriptional regulation properties. We believe that the methods demonstrated in this work can be extended to other species to improve the annotation of transcription factors and understand transcriptional regulation on a system level.
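
The step of querying a transcription factor's target pool for enriched gene ontology terms can be illustrated with a small sketch. The abstract does not state which enrichment test or multiple-testing correction the authors used, so the one-sided hypergeometric test and Bonferroni correction below are assumptions, as are the function names (`hypergeom_sf`, `enriched_terms`), the gene and term identifiers, and the toy network edges, all of which are hypothetical.

```python
from collections import defaultdict
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when drawing N genes from M, of which n carry the term."""
    upper = min(n, N)
    total = comb(M, N)
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1)) / total


def enriched_terms(target_pool, annotations, background, alpha=0.05):
    """Return (term, overlap, adjusted p-value) for terms over-represented in the pool.

    Assumption: Bonferroni correction over all terms in `annotations`;
    the original work may have used a different procedure.
    """
    background = set(background)
    pool = set(target_pool) & background
    hits = []
    for term, genes in annotations.items():
        annotated = set(genes) & background
        overlap = len(pool & annotated)
        if overlap == 0:
            continue
        p = hypergeom_sf(overlap, len(background), len(annotated), len(pool))
        p_adj = min(1.0, p * len(annotations))
        if p_adj < alpha:
            hits.append((term, overlap, p_adj))
    return sorted(hits, key=lambda hit: hit[2])


# Hypothetical toy data: 20 background genes and one transcription factor.
background = [f"g{i}" for i in range(20)]
edges = [("TF1", f"g{i}") for i in range(5)]  # inferred TF -> target edges
annotations = {
    "toy_term_A": ["g0", "g1", "g2", "g3", "g10"],
    "toy_term_B": ["g4", "g11", "g12", "g13", "g14", "g15"],
}

# Build the pool of likely downstream targets for each transcription factor.
pools = defaultdict(set)
for tf, target in edges:
    pools[tf].add(target)

for tf, pool in pools.items():
    for term, overlap, p_adj in enriched_terms(pool, annotations, background):
        print(tf, term, overlap, f"{p_adj:.4g}")
```

In this toy setting four of the five targets of `TF1` carry `toy_term_A`, so that term passes the threshold, while `toy_term_B` overlaps the pool by a single gene and does not. On a genome-wide network the same loop would run once per transcription factor, with the background set to all genes present in the expression compendium.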
