PMID: 23941359

Barnacle: Detecting and Characterizing Tandem Duplications and Fusions in Transcriptome Assemblies

Abstract

Background: Chimeric transcripts, including partial and internal tandem duplications (PTDs, ITDs) and gene fusions, are important in the detection, prognosis, and treatment of human cancers.

Results: We describe Barnacle, a production-grade analysis tool that detects such chimeras in de novo assemblies of RNA-seq data and supports prioritizing them for review and validation by reporting the relative coverage of co-occurring chimeric and wild-type transcripts. We demonstrate its application to large-scale disease studies by identifying PTDs in MLL, ITDs in FLT3, and reciprocal fusions between PML and RARA in two deeply sequenced acute myeloid leukemia (AML) RNA-seq datasets.

Conclusions: Our analyses of real and simulated datasets show that, with appropriate filter settings, Barnacle makes highly specific predictions for three types of chimeric transcripts that are important in a range of cancers: PTDs, ITDs, and fusions. High specificity makes manual review and validation efficient, which is necessary in large-scale disease studies. Characterizing an extended range of chimera types will help generate insights into progression, treatment, and outcomes for complex diseases.

Citing Articles

MINTIE: identifying novel structural and splice variants in transcriptomes using RNA-seq data.

Cmero M, Schmidt B, Majewski I, Ekert P, Oshlack A, Davidson N. Genome Biol. 2021; 22(1):296.

PMID: 34686194 PMC: 8532352. DOI: 10.1186/s13059-021-02507-8.


SQUID: transcriptomic structural variation detection from RNA-seq.

Ma C, Shao M, Kingsford C. Genome Biol. 2018; 19(1):52.

PMID: 29650026 PMC: 5896115. DOI: 10.1186/s13059-018-1421-5.


Allelic decomposition and exact genotyping of highly polymorphic and structurally variant genes.

Numanagic I, Malikic S, Ford M, Qin X, Toji L, Radovich M. Nat Commun. 2018; 9(1):828.

PMID: 29483503 PMC: 5826927. DOI: 10.1038/s41467-018-03273-1.


Computational identification of micro-structural variations and their proteogenomic consequences in cancer.

Lin Y, Gawronski A, Hach F, Li S, Numanagic I, Sarrafi I. Bioinformatics. 2017; 34(10):1672-1681.

PMID: 29267878 PMC: 5946953. DOI: 10.1093/bioinformatics/btx807.


Direct Transcriptional Consequences of Somatic Mutation in Breast Cancer.

Shlien A, Raine K, Fuligni F, Arnold R, Nik-Zainal S, Dronov S. Cell Rep. 2016; 16(7):2032-46.

PMID: 27498871 PMC: 4987284. DOI: 10.1016/j.celrep.2016.07.028.
