APA-Scan: Detection and Visualization of 3'-UTR Alternative Polyadenylation with RNA-seq and 3'-end-seq Data
Background: The eukaryotic genome can produce multiple isoforms from a single gene through alternative polyadenylation (APA) during pre-mRNA processing. APA in the 3'-untranslated region (3'-UTR) of mRNA produces transcripts with shorter or longer 3'-UTRs. The 3'-UTR often serves as a binding platform for microRNAs and RNA-binding proteins, which affect the fate of the mRNA transcript. Thus, 3'-UTR APA is known to modulate translation and provides a means to regulate gene expression at the post-transcriptional level. Current bioinformatics pipelines have limited capability in profiling 3'-UTR APA events because of incomplete annotations and low analytical resolution: widely available pipelines do not reference actionable polyadenylation (cleavage) sites but instead infer 3'-UTR APA from RNA-seq read coverage alone, which leads to false-positive identifications. To overcome these limitations, we developed APA-Scan, a robust program that identifies 3'-UTR APA events and visualizes the RNA-seq short-read coverage together with gene annotations.
Methods: APA-Scan uses either predicted or experimentally validated actionable polyadenylation signals as a reference for polyadenylation sites and quantifies the long and short 3'-UTR transcripts in the RNA-seq data. APA-Scan works in three major steps: (i) calculate the read coverage of the 3'-UTR regions of genes; (ii) identify the potential APA sites and evaluate the significance of the events between two biological conditions; (iii) generate a graphical representation of a user-specified event with 3'-UTR annotation and read coverage on the 3'-UTR regions. APA-Scan is implemented in Python 3. Source code and a comprehensive user's manual are freely available at https://github.com/compbiolabucf/APA-Scan.
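Steps (i) and (ii) can be illustrated with a minimal sketch. This is not the APA-Scan source code: the function names, the plus-strand coordinate convention, and the choice of a Pearson chi-squared test on a 2x2 table of mean coverages are all assumptions made for illustration, since the abstract does not state which statistical test is used. Coverage is given as a per-base depth list indexed from the start of the 3'-UTR.

```python
import math


def mean_coverage(coverage, start, end):
    """Mean per-base read depth over the half-open interval [start, end)."""
    if end <= start:
        raise ValueError("empty interval")
    return sum(coverage[start:end]) / (end - start)


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (1 degree of freedom) on [[a, b], [c, d]].

    Returns (statistic, p_value). For 1 d.o.f. the survival function of
    the chi-squared distribution is erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        return 0.0, 1.0  # degenerate table: no evidence of a difference
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    return stat, math.erfc(math.sqrt(stat / 2.0))


def score_site(cov_cond1, cov_cond2, utr_start, site, utr_end):
    """Score one candidate cleavage site (hypothetical helper).

    Assumes a plus-strand gene: [utr_start, site) is shared by the short
    and long isoforms, while [site, utr_end) is covered only by the long one.
    """
    table = []
    for cov in (cov_cond1, cov_cond2):
        shared = mean_coverage(cov, utr_start, site)
        extended = mean_coverage(cov, site, utr_end)
        table.append((shared, extended))
    (a, b), (c, d) = table
    return chi2_2x2(a, b, c, d)


# Toy data: condition 2 loses coverage downstream of position 100,
# which is what 3'-UTR shortening would look like.
cond1 = [40] * 100 + [30] * 100
cond2 = [40] * 100 + [5] * 100
stat, p_value = score_site(cond1, cond2, 0, 100, 200)
```

In this toy case the drop in downstream coverage in the second condition yields a large statistic and a small p-value, whereas two identical coverage profiles give a statistic of 0 and a p-value of 1.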
Results: APA-Scan was applied to both simulated and real RNA-seq datasets and compared with two widely used baselines, DaPars and APAtrap. In simulation, APA-Scan significantly improved the accuracy of 3'-UTR APA identification compared with both baselines. The performance of APA-Scan was also validated with 3'-end-seq data and qPCR on mouse embryonic fibroblast cells. The experiments confirm that APA-Scan can detect unannotated 3'-UTR APA events and improve genome annotation.
Conclusion: APA-Scan is a comprehensive computational pipeline for detecting transcriptome-wide 3'-UTR APA events. The pipeline integrates both RNA-seq and 3'-end-seq data and can efficiently identify significant events, presenting them in high-resolution short-read coverage plots.