
GNNMF: a Multi-view Graph Neural Network for ATAC-seq Motif Finding

Overview
Journal BMC Genomics
Publisher BioMed Central
Specialty Genetics
Date 2024 Mar 22
PMID 38515040
Abstract

Background: The Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) uses the Tn5 transposase to probe open chromatin, which reveals multiple transcription factor binding sites (TFBSs) simultaneously, in contrast to traditional technologies. Deep learning (DL) methods, including convolutional neural networks (CNNs), have successfully found motifs in ATAC-seq data. Because the width of a convolutional kernel is fixed, existing models find only motifs of fixed length. A graph neural network (GNN) can operate on non-Euclidean data and therefore has the potential to find ATAC-seq motifs of different lengths. However, existing GNN models ignore the relationships among ATAC-seq sequences, and their parameter settings leave room for improvement.
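The fixed-length limitation mentioned in the Background can be illustrated with a minimal sketch. It is not code from GNNMF or from any of the CNN models referred to; the function names `one_hot` and `scan`, the example sequence, and the width-4 kernel are all hypothetical. The sketch slides one convolution-style filter along a one-hot encoded sequence, so every scored window has exactly the kernel's width.

```python
# Illustrative sketch only: one convolution-style filter scanning a DNA string.
# All names and values here are hypothetical, not taken from the paper.
BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a list of 4-element one-hot rows (A, C, G, T)."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq]

def scan(seq, kernel):
    """Slide a fixed-width kernel along seq and return one score per window."""
    width = len(kernel)
    encoded = one_hot(seq)
    scores = []
    for start in range(len(seq) - width + 1):
        window = encoded[start:start + width]
        # Element-wise product summed over positions and bases.
        score = sum(
            window[i][j] * kernel[i][j]
            for i in range(width)
            for j in range(4)
        )
        scores.append(score)
    return scores

# Hypothetical width-4 kernel that rewards the pattern "TGAC".
kernel = one_hot("TGAC")
scores = scan("AATGACTCATT", kernel)
best = max(range(len(scores)), key=scores.__getitem__)
print(best, scores[best])
```

Whatever the kernel learns, the best-scoring window is always `len(kernel)` bases long, so a motif that is shorter or longer than the kernel cannot be reported at its true length without extra post-processing.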

Results: In this study, we propose a novel GNN model named GNNMF, which finds ATAC-seq motifs via a GNN and background coexisting probability. Experiments on 200 human datasets and 80 mouse datasets showed that GNNMF improved the radar-chart area over eight metrics by 4.92% and 6.81%, respectively, and found more motifs than existing models.
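The abstract does not define how the radar area is computed. One common convention, assumed here, places the n metrics on equally spaced axes and takes the area of the resulting polygon, which is (1/2)·sin(2π/n)·Σ r_i·r_{i+1} with the last axis wrapping around to the first. The sketch below uses that formula; the function names and the input values are placeholders, not results from the paper.

```python
import math

def radar_area(values):
    """Area of a radar-chart polygon whose axes are equally spaced.

    Assumed convention: adjacent axes are 2*pi/n apart, so each pair of
    neighbouring values spans a triangle of area 0.5 * r_i * r_j * sin(2*pi/n).
    """
    n = len(values)
    if n < 3:
        raise ValueError("a radar polygon needs at least three axes")
    angle = 2 * math.pi / n
    pair_sum = sum(values[i] * values[(i + 1) % n] for i in range(n))
    return 0.5 * math.sin(angle) * pair_sum

def relative_improvement(new_values, old_values):
    """Fractional change in radar area from old_values to new_values."""
    old_area = radar_area(old_values)
    return (radar_area(new_values) - old_area) / old_area

# Placeholder scores for eight metrics (made up for illustration).
baseline = [1.0] * 8
improved = [1.1] * 8
print(radar_area(baseline), relative_improvement(improved, baseline))
```

Under this convention the area depends on the order in which the metrics are placed around the chart, so any comparison between models should keep the axis order fixed.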

Conclusions: In this study, we developed a novel model named GNNMF for finding multiple ATAC-seq motifs. GNNMF builds a multi-view heterogeneous graph from ATAC-seq sequences and uses background coexisting probability and the iterloss to find ATAC-seq motifs of different lengths and to optimize the parameter sets. Compared with existing models, GNNMF achieved the best performance on TFBS prediction and ATAC-seq motif finding, which demonstrates that our improvements are effective for ATAC-seq motif finding.
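The abstract does not specify how the heterogeneous graph or the coexisting probability is constructed, so the following is only a rough sketch of the general idea under stated assumptions. It assumes two node types (sequences and k-mers), a sequence-to-k-mer edge whenever the k-mer occurs in the sequence, and a k-mer-to-k-mer edge weighted by the fraction of sequences that contain both k-mers. The names `kmers` and `build_graph`, the choice of k, and the toy sequences are hypothetical and should not be read as GNNMF's actual definitions.

```python
from collections import defaultdict
from itertools import combinations

def kmers(seq, k):
    """Return the set of distinct k-mers occurring in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def build_graph(sequences, k):
    """Build two edge sets over sequence nodes and k-mer nodes.

    Assumption for illustration: the k-mer/k-mer weight is the fraction of
    input sequences in which both k-mers occur (a simple co-occurrence rate).
    """
    seq_kmer_edges = []            # (sequence index, k-mer) membership edges
    containing = defaultdict(set)  # k-mer -> indices of sequences containing it
    for idx, seq in enumerate(sequences):
        for km in sorted(kmers(seq, k)):
            seq_kmer_edges.append((idx, km))
            containing[km].add(idx)

    total = len(sequences)
    kmer_kmer_edges = {}
    for a, b in combinations(sorted(containing), 2):
        shared = len(containing[a] & containing[b])
        if shared:
            kmer_kmer_edges[(a, b)] = shared / total
    return seq_kmer_edges, kmer_kmer_edges

# Toy input, made up for illustration.
membership, cooccurrence = build_graph(["ACGT", "ACGA", "TTTT"], k=3)
print(membership)
print(cooccurrence)
```

A GNN would then pass messages over both edge sets; that step, the background correction, and the loss used to tune parameters are left out because the abstract gives no detail on them.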
