
Scalable and Unsupervised Discovery from Raw Sequencing Reads Using SPLASH2

Overview
Journal Nat Biotechnol
Specialty Biotechnology
Date 2024 Sep 23
PMID 39313645
Abstract

We introduce SPLASH2, a fast, scalable implementation of SPLASH based on an efficient k-mer counting approach for regulated sequence variation detection in massive datasets from a wide range of sequencing technologies and biological contexts. We demonstrate biological discovery by SPLASH2 in single-cell RNA sequencing (RNA-seq) data and in bulk RNA-seq data from the Cancer Cell Line Encyclopedia, including unannotated alternative splicing in cancer transcriptomes and sensitive detection of circular RNA.
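The abstract names k-mer counting as the basis of the approach. As a minimal illustration of what k-mer counting means, and not a description of how SPLASH2 itself is implemented, the sketch below tallies every length-k substring across a handful of reads. The function name `count_kmers`, the toy reads, and the choice to skip k-mers containing the ambiguous base `N` are all assumptions made for this example.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring (k-mer) across an iterable of reads.

    Hypothetical helper for illustration only; k-mers containing the
    ambiguous base 'N' are skipped, which is an assumption of this sketch.
    """
    counts = Counter()
    for read in reads:
        read = read.upper()
        # Slide a window of width k along the read; reads shorter
        # than k produce an empty range and contribute nothing.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:
                counts[kmer] += 1
    return counts


# Toy input: two short reads, counted with k = 4.
reads = ["ACGTACGT", "CGTACGNA"]
counts = count_kmers(reads, 4)
for kmer, n in sorted(counts.items()):
    print(kmer, n)
```

An in-memory dictionary like this is only practical for small inputs; counting at the scale of massive sequencing datasets generally calls for a dedicated, disk-aware k-mer counter.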

Citing Articles

sc-SPLASH provides ultra-efficient reference-free discovery in barcoded single-cell sequencing.

Dehghannasiri R, Kokot M, Starr A, Maziarz J, Gordon T, Tan S. bioRxiv. 2025.

PMID: 39763839 PMC: 11703226. DOI: 10.1101/2024.12.24.630263.


A reference-free algorithm discovers regulation in the plant transcriptome.

Meyer E, Saldivar E, Kokot M, Xue B, Deorowicz S, Rhee S. bioRxiv. 2024.

PMID: 38826472 PMC: 11142198. DOI: 10.1101/2024.05.23.595613.


SPLASH: A statistical, reference-free genomic algorithm unifies biological discovery.

Chaung K, Baharav T, Henderson G, Zheludev I, Wang P, Salzman J. Cell. 2023; 186(25):5440-5456.e26.

PMID: 38065078 PMC: 10861363. DOI: 10.1016/j.cell.2023.10.028.


OASIS: An interpretable, finite-sample valid alternative to Pearson's X² for scientific discovery.

Baharav T, Tse D, Salzman J. bioRxiv. 2023.

PMID: 37961606 PMC: 10634974. DOI: 10.1101/2023.03.16.533008.


SPLASH: a statistical, reference-free genomic algorithm unifies biological discovery.

Chaung K, Baharav T, Henderson G, Zheludev I, Wang P, Salzman J. bioRxiv. 2022.

PMID: 35794890 PMC: 9258296. DOI: 10.1101/2022.06.24.497555.
