
VaDiR: an Integrated Approach to Variant Detection in RNA

Overview
Journal: Gigascience
Specialties: Biology, Genetics
Date: 2017 Dec 22
PMID: 29267927
Citations: 14
Abstract

Background: Advances in next-generation DNA sequencing technologies now enable detailed characterization of sequence variations in cancer genomes. Whole-genome sequencing can discover variations in both coding and non-coding sequences, but its cost currently limits its general use in research. Whole-exome sequencing characterizes sequence variations in coding regions, but the cost of capture reagents and biases in capture rate limit its full use in research. A further limitation is the uncertainty in assigning functional significance to mutations that are observed in non-coding regions or in genes that are not expressed in cancer tissue.

Results: We investigated the feasibility of uncovering mutations in expressed genes from RNA sequencing datasets using a method called Variant Detection in RNA (VaDiR), which integrates 3 variant callers: SNPiR, RVBoost, and MuTect2. The combination of all 3 methods, which we called Tier 1 variants, produced the highest precision, with true positive mutations from RNA-seq that could be validated at the DNA level. We also found that integrating Tier 1 variants with those called by MuTect2 and SNPiR produced the highest recall with acceptable precision. Finally, we observed a higher rate of mutation discovery in genes that are expressed at higher levels.
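The tiering idea in the Results can be illustrated with plain set operations. The following is a minimal sketch, not the VaDiR implementation: it assumes each caller's output has already been parsed into a set of (chromosome, position, ref, alt) records, it reads "the combination of all 3 methods" as the intersection of the three call sets, and it reads "those called by MuTect2 and SNPiR" as variants reported by both of those callers (a union reading is also possible). The names `Variant`, `called_by_all`, and `precision_recall` are hypothetical helpers introduced for illustration.

```python
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record; real callers emit richer VCF rows."""
    chrom: str
    pos: int
    ref: str
    alt: str


def called_by_all(calls: Dict[str, Set[Variant]],
                  callers: Iterable[str]) -> Set[Variant]:
    """Return variants reported by every caller named in `callers`.

    Assumption: agreement between callers is modelled as set intersection.
    """
    names = list(callers)
    if not names:
        return set()
    result = set(calls[names[0]])
    for name in names[1:]:
        result &= calls[name]
    return result


def precision_recall(called: Set[Variant],
                     dna_validated: Set[Variant]) -> Tuple[float, float]:
    """Score RNA-derived calls against a DNA-level truth set.

    precision = true positives / all calls
    recall    = true positives / all DNA-validated variants
    Empty denominators yield 0.0 rather than raising.
    """
    true_positives = len(called & dna_validated)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(dna_validated) if dna_validated else 0.0
    return precision, recall


def tier1(calls: Dict[str, Set[Variant]]) -> Set[Variant]:
    """Assumed Tier 1 definition: variants called by all three methods."""
    return called_by_all(calls, ["SNPiR", "RVBoost", "MuTect2"])


def tier1_plus_mutect2_snpir(calls: Dict[str, Set[Variant]]) -> Set[Variant]:
    """Assumed higher-recall set: Tier 1 plus variants that both
    MuTect2 and SNPiR report (one reading of the abstract)."""
    return tier1(calls) | called_by_all(calls, ["MuTect2", "SNPiR"])
```

Under this reading, Tier 1 is by construction a subset of the MuTect2/SNPiR agreement set, so adding the latter can only raise recall, at the possible cost of some precision. That matches the trade-off the abstract describes, although the exact tier definitions should be taken from the paper itself.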

Conclusions: Our method, VaDiR, makes it possible to uncover mutations from RNA sequencing datasets that could be useful in further functional analysis. In addition, our approach allows orthogonal validation of DNA-based mutation discovery by providing complementary sequence variation analysis from paired RNA/DNA sequencing datasets.

Citing Articles

Characterisation of APOBEC3B-Mediated RNA editing in breast cancer cells reveals regulatory roles of NEAT1 and MALAT1 lncRNAs.

Zhang C, Lu Y, Wang M, Chen B, Xiong F, Mitsopoulos C Oncogene. 2024; 43(46):3366-3377.

PMID: 39322638 PMC: 11554567. DOI: 10.1038/s41388-024-03171-5.


FVC as an adaptive and accurate method for filtering variants from popular NGS analysis pipelines.

Ren Y, Kong Y, Zhou X, Genchev G, Zhou C, Zhao H Commun Biol. 2022; 5(1):975.

PMID: 36114280 PMC: 9481582. DOI: 10.1038/s42003-022-03397-7.


RNA-SSNV: A Reliable Somatic Single Nucleotide Variant Identification Framework for Bulk RNA-Seq Data.

Long Q, Yuan Y, Li M Front Genet. 2022; 13:865313.

PMID: 35846154 PMC: 9279659. DOI: 10.3389/fgene.2022.865313.


Whole-Transcriptome Analysis by RNA Sequencing for Genetic Diagnosis of Mendelian Skin Disorders in the Context of Consanguinity.

Youssefian L, Saeidian A, Palizban F, Bagherieh A, Abdollahimajd F, Sotoudeh S Clin Chem. 2021; 67(6):876-888.

PMID: 33969388 PMC: 8167339. DOI: 10.1093/clinchem/hvab042.


Structure-guided engineering of adenine base editor with minimized RNA off-targeting activity.

Li J, Yu W, Huang S, Wu S, Li L, Zhou J Nat Commun. 2021; 12(1):2287.

PMID: 33863894 PMC: 8052359. DOI: 10.1038/s41467-021-22519-z.

