
Genome-Wide Prediction of Transcription Start Sites in Conifers

Overview
Journal: Int J Mol Sci
Publisher: MDPI
Date: 2022 Feb 15
PMID: 35163661
Abstract

The identification of promoters is an essential step in the genome annotation process, providing a framework for studying gene regulatory networks and their role in transcriptional regulation. Despite considerable advances in the high-throughput determination of transcription start sites (TSSs) and transcription factor binding sites (TFBSs), experimental methods are still time-consuming and expensive. As an alternative, several computational approaches have been developed to provide fast and reliable means of predicting the location of TSSs and regulatory motifs on a genome-wide scale. Numerous studies have examined the regulatory elements of mammalian genomes, but plant promoters, especially those of gymnosperms, have received far less attention and remain poorly investigated. The aim of this study was to enhance and expand the existing genome annotations using computational approaches for genome-wide prediction of TSSs in four conifer species: loblolly pine, white spruce, Norway spruce, and Siberian larch. Our pipeline will be useful for TSS prediction in other genomes, especially draft assemblies, for which reliable TSS predictions are not usually available. We also explored some features of the nucleotide composition of the predicted promoters and compared the GC properties of conifer genes with those of model monocot and dicot plants. Here, we demonstrate that even incomplete genome assemblies and partial annotations can be a reliable starting point for TSS annotation. The results of the TSS prediction in the four conifer species have been deposited in the Persephone genome browser, which allows smooth visualization and is optimized for large data sets. This work provides an initial basis for future experimental validation and for the study of regulatory regions to understand gene regulation in gymnosperms.
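As an illustration of the kind of nucleotide-composition features mentioned in the abstract, the following is a minimal Python sketch that computes GC content and GC skew in sliding windows along a sequence, such as a region around a predicted TSS. It is not the authors' pipeline: the function names, the window and step sizes, and the use of the common skew definition (G - C) / (G + C) are all assumptions made for this example.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    total = g + c + seq.count("A") + seq.count("T")
    # Ambiguous bases such as N are ignored; an empty window yields 0.0.
    return (g + c) / total if total else 0.0


def gc_skew(seq: str) -> float:
    """GC skew, assumed here to be (G - C) / (G + C)."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def window_profile(seq: str, window: int = 50, step: int = 10):
    """Return (start, gc_content, gc_skew) for each sliding window.

    The window and step defaults are illustrative, not values from the study.
    """
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        profile.append((start, gc_content(chunk), gc_skew(chunk)))
    return profile


if __name__ == "__main__":
    # Hypothetical toy sequence, not real conifer data.
    toy = "ATATATATGGGCGCGCGGCCATATAT" * 4
    for start, gc, skew in window_profile(toy, window=20, step=20):
        print(start, round(gc, 2), round(skew, 2))
```

Plotting such a profile for many predicted promoters aligned at their TSS positions is one simple way to look for compositional trends, though any real analysis would need to account for assembly gaps and annotation quality.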

Citing Articles

Classification of Promoter Sequences from Human Genome.

Zaytsev K, Fedorov A, Korotkov E Int J Mol Sci. 2023; 24(16).

PMID: 37628742 PMC: 10454140. DOI: 10.3390/ijms241612561.


The Complete Chloroplast Genome Sequence of (Sieb. et Zucc.) Wedd. and Comparative Analysis with Its Congeneric Species.

Zhang H, Miao Y, Zhang X, Zhang G, Sun X, Zhang M Genes (Basel). 2022; 13(12).

PMID: 36553498 PMC: 9778553. DOI: 10.3390/genes13122230.


Database of Potential Promoter Sequences in the Genome.

Rudenko V, Korotkov E Biology (Basel). 2022; 11(8).

PMID: 35892972 PMC: 9332048. DOI: 10.3390/biology11081117.


Plant Biology and Biotechnology: Focus on Genomics and Bioinformatics.

Orlov Y, Ivanisenko V, Dobrovolskaya O, Chen M Int J Mol Sci. 2022; 23(12).

PMID: 35743200 PMC: 9223720. DOI: 10.3390/ijms23126759.
