
Predmoter: Cross-species Prediction of Plant Promoter and Enhancer Regions

Overview
Journal Bioinform Adv
Specialty Biology
Date 2024 Jun 6
PMID 38841126
Abstract

Motivation: Identifying cis-regulatory elements (CREs) is crucial for analyzing gene regulatory networks. Next-generation sequencing methods were developed to identify CREs, but they represent a considerable expenditure for targeted analysis of a few genomic loci. Thus, predicting the outputs of these methods would significantly cut costs and time investment.

Results: We present Predmoter, a deep neural network that predicts base-wise Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) and histone chromatin immunoprecipitation DNA-sequencing (ChIP-seq) read coverage for plant genomes. Predmoter uses only the DNA sequence as input. We trained our final model on 21 species; ATAC-seq data were publicly available for 13 of them and ChIP-seq data for 17. We evaluated our models on Arabidopsis thaliana and Oryza sativa. Our best models showed accurate predictions in peak position and pattern for ATAC- and histone ChIP-seq. Annotating putatively accessible chromatin regions provides valuable input for the identification of CREs. In conjunction with other data, this can significantly reduce the search space for experimentally verifiable DNA-protein interaction pairs.

Availability and Implementation: The source code for Predmoter is available at https://github.com/weberlab-hhu/Predmoter. Predmoter takes a FASTA file as input and outputs h5 files, and optionally bigWig and bedGraph files.
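
To make the relationship between base-wise coverage and a bedGraph track concrete, the following is a minimal sketch of how a per-base coverage array could be collapsed into bedGraph intervals. It is not Predmoter's own export code: the function name `coverage_to_bedgraph`, the sequence name `Chr1`, and the coverage values are hypothetical and chosen only for illustration. The sketch relies on the bedGraph convention of zero-based, half-open coordinates and merges runs of identical values into one line each.

```python
from itertools import groupby


def coverage_to_bedgraph(chrom, coverage, start=0):
    """Collapse base-wise coverage into bedGraph lines.

    Coordinates are zero-based and half-open, as in the bedGraph format.
    Consecutive positions with the same value are merged into one interval.
    """
    lines = []
    pos = start
    for value, run in groupby(coverage):
        length = sum(1 for _ in run)  # number of bases in this run
        lines.append(f"{chrom}\t{pos}\t{pos + length}\t{value}")
        pos += length
    return lines


# Hypothetical predicted coverage for a 10 bp stretch (illustrative values only).
predicted = [0, 0, 0, 3, 3, 7, 7, 7, 1, 0]
for line in coverage_to_bedgraph("Chr1", predicted):
    print(line)
```

Run-length merging keeps the file small for long stretches of constant (often zero) coverage, which is typical outside of peaks.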
