
sRNAfrag: a Pipeline and Suite of Tools to Analyze Fragmentation in Small RNA Sequencing Data

Overview
Journal Brief Bioinform
Specialty Biology
Date 2024 Jan 20
PMID 38243693
Abstract

Fragments derived from small RNAs, such as small nucleolar RNAs (snoRNAs), are biologically relevant but remain poorly understood. To address this gap, we developed sRNAfrag, a modular and interoperable tool designed to standardize the quantification and analysis of small RNA fragmentation across biotypes. The tool outputs a set of tables that form a relational database, allowing in-depth exploration of biologically complex events such as multi-mapping and RNA fragment stability across cell types. In a benchmark test, sRNAfrag identified established loci of mature microRNAs based solely on sequencing data. Furthermore, the 5' seed sequence could be rediscovered using a visualization approach primarily applied to multiple sequence alignments. Using the relational database outputs, we detected 1411 snoRNA fragment conservation events between two out of four eukaryotic species, providing an opportunity to explore motifs through evolutionary time and conserved fragmentation patterns. Additionally, the tool's interoperability with other bioinformatics tools such as ViennaRNA extends its utility for customized analyses. We also introduce a novel loci-level variance score, which provides insight into the noise around peaks and demonstrates biological relevance by distinctly separating breast cancer and neuroblastoma cell lines after dimension reduction when applied to small nucleolar RNAs. Overall, sRNAfrag serves as a versatile foundation for advancing our understanding of small RNA fragments and for further small RNA research. Availability: https://github.com/kenminsoo/sRNAfrag.
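
The seed-sequence rediscovery mentioned in the abstract relies on a sequence-logo style view of fragments. As a rough illustration only, the sketch below computes the per-position information content (in bits) that such a logo displays, for fragments aligned at their 5' ends. It is not taken from sRNAfrag: the function name `position_information` and the toy fragment sequences are hypothetical, and no small-sample correction is applied.

```python
import math
from collections import Counter

ALPHABET = "ACGU"  # RNA alphabet; other symbols (e.g. N) are ignored


def position_information(fragments, length):
    """Return information content (bits) for each of the first `length`
    positions of fragments aligned at their 5' ends.

    Hypothetical helper for illustration; not part of sRNAfrag.
    """
    # Keep only fragments long enough to cover the requested window.
    usable = [f[:length] for f in fragments if len(f) >= length]
    if not usable:
        raise ValueError("no fragment is long enough")

    info = []
    for i in range(length):
        counts = Counter(seq[i] for seq in usable)
        total = sum(counts[b] for b in ALPHABET)
        if total == 0:
            info.append(0.0)  # no informative bases at this position
            continue
        # Shannon entropy of the base distribution at this position.
        entropy = 0.0
        for b in ALPHABET:
            p = counts[b] / total
            if p > 0:
                entropy -= p * math.log2(p)
        # Four bases give a maximum of log2(4) = 2 bits.
        info.append(2.0 - entropy)
    return info


# Invented toy fragments sharing an identical 5' region (assumption).
toy_fragments = [
    "UGAGGUAGUAGG",
    "UGAGGUAGUUGU",
    "UGAGGUAGGAGC",
    "UGAGGUAGCAGA",
]
info = position_information(toy_fragments, 12)
for pos, bits in enumerate(info, start=1):
    print(f"position {pos:2d}: {bits:.2f} bits")
```

Positions where all fragments agree reach the 2-bit maximum, so a conserved 5' region stands out against variable 3' ends. A plotting package such as ggseqlogo would render the same quantity as letter heights.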
