
SQANTI-reads: a Tool for the Quality Assessment of Long Read Data in Multi-sample LrRNA-seq Experiments

Overview
Journal bioRxiv
Date 2024 Sep 4
PMID 39229095
Abstract

SQANTI-reads leverages SQANTI3, a tool for assessing the quality of transcript models, to provide a read-level quality control framework for replicated long-read RNA-seq experiments. The number and distribution of reads, as well as the number and distribution of unique junction chains (transcript splicing patterns), across SQANTI3 structural categories are informative of raw data quality. Multi-sample visualizations of QC metrics are presented by experimental design factors to identify outliers. We introduce new metrics for 1) identifying potentially under-annotated genes and putative novel transcripts and 2) quantifying variation in junction donors and acceptors. We applied SQANTI-reads to two datasets, a developmental experiment and a multi-platform dataset from the LRGASP project, and demonstrate that the tool effectively reveals the impact of read coverage on data quality and readily identifies strong and weak splice sites. SQANTI-reads is open source and available for download on GitHub.
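
To make the read-level metrics described in the abstract concrete, the following is a minimal sketch, not the SQANTI-reads implementation. It assumes each read has already been assigned a sample, a structural category, and a junction chain string (None for reads without splice junctions). The function name `summarize_reads`, the tuple layout, the category labels, and the chain encoding are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def summarize_reads(reads):
    """Summarize reads per sample and structural category.

    `reads` is an iterable of (sample, structural_category, junction_chain)
    tuples. This layout is an assumption for illustration, not the
    SQANTI-reads input format. `junction_chain` is None for reads
    without splice junctions.
    """
    read_counts = defaultdict(Counter)                # sample -> category -> read count
    chains = defaultdict(lambda: defaultdict(set))    # sample -> category -> distinct chains

    for sample, category, chain in reads:
        read_counts[sample][category] += 1
        if chain is not None:
            chains[sample][category].add(chain)

    summary = {}
    for sample, counts in read_counts.items():
        total = sum(counts.values())
        summary[sample] = {
            category: {
                "reads": n,
                "read_fraction": n / total,
                "unique_junction_chains": len(chains[sample][category]),
            }
            for category, n in counts.items()
        }
    return summary


# Toy input; category labels and coordinates are illustrative only.
example_reads = [
    ("s1", "full-splice_match", "chr1:100-200,300-400"),
    ("s1", "full-splice_match", "chr1:100-200,300-400"),
    ("s1", "novel_in_catalog", "chr1:100-200,350-400"),
    ("s1", "full-splice_match", "chr2:10-50"),
    ("s2", "full-splice_match", "chr1:100-200,300-400"),
    ("s2", "intergenic", None),
]
example_summary = summarize_reads(example_reads)
```

Comparing the per-category read fractions and unique junction chain counts across samples, grouped by experimental design factor, is one way such a summary could be used to spot outlier samples.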
