RefShannon: A Genome-guided Transcriptome Assembler Using Sparse Flow Decomposition
High-throughput sequencing of RNA (RNA-Seq) has become a staple of modern molecular biology, with applications not only in quantifying gene expression but also in isoform-level analysis of RNA transcripts. To enable isoform-level analysis, a transcriptome assembly algorithm stitches the observed short reads together into the corresponding transcripts. The task is complicated by alternative splicing, a mechanism by which the same gene may generate multiple distinct RNA transcripts. We develop a novel genome-guided transcriptome assembler, RefShannon, that exploits the varying abundances of the different transcripts to reconstruct them accurately. Our evaluation shows that RefShannon improves sensitivity by up to 22% at a given specificity in comparison with other state-of-the-art assemblers. RefShannon is written in Python and is available from GitHub (https://github.com/shunfumao/RefShannon).
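To illustrate why differing abundances help separate isoforms, the following toy sketch decomposes edge flows on a small splice graph into a few weighted paths. It is not RefShannon's algorithm: it uses a simple greedy widest-path heuristic as a stand-in for sparse flow decomposition, and the function names, exon labels, and abundance values are all hypothetical.

```python
from collections import defaultdict


def widest_path(residual, source, sink):
    """Return (path, width) for the source-to-sink path with the largest
    bottleneck flow, or None if no path with positive flow remains.
    Assumes the graph is a DAG (as splice graphs are)."""
    succ = defaultdict(list)
    for (u, v), f in residual.items():
        if f > 0:
            succ[u].append(v)

    memo = {}

    def best_from(u):
        if u == sink:
            return (float("inf"), [sink])
        if u in memo:
            return memo[u]
        best = (0, None)
        for v in succ[u]:
            width_v, path_v = best_from(v)
            width = min(residual[(u, v)], width_v)
            if path_v is not None and width > best[0]:
                best = (width, [u] + path_v)
        memo[u] = best
        return best

    width, path = best_from(source)
    return None if path is None else (path, width)


def greedy_flow_decomposition(edge_flow, source, sink):
    """Greedy heuristic (illustrative only): repeatedly peel off the widest
    path and subtract its bottleneck from the residual flow."""
    residual = dict(edge_flow)
    paths = []
    while True:
        found = widest_path(residual, source, sink)
        if found is None:
            break
        path, width = found
        for u, v in zip(path, path[1:]):
            residual[(u, v)] -= width
        paths.append((path, width))
    return paths


# Hypothetical gene with two alternative-splicing events. The edge flows are
# the summed abundances of two isoforms (10 and 3); four exon paths are
# topologically possible, but two suffice to explain the flow.
flow = {
    ("e1", "e2a"): 10, ("e2a", "e3"): 10,
    ("e1", "e2b"): 3,  ("e2b", "e3"): 3,
    ("e3", "e4a"): 10, ("e4a", "e5"): 10,
    ("e3", "e4b"): 3,  ("e4b", "e5"): 3,
}
paths = greedy_flow_decomposition(flow, "e1", "e5")
for path, abundance in paths:
    print("-".join(path), abundance)
```

In this example the graph alone cannot tell which variant of the second exon pairs with which variant of the fourth, but the matching abundances (10 with 10, 3 with 3) single out a two-path explanation. A greedy heuristic of this kind is not guaranteed to find the sparsest decomposition in general.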