
Development of Strategies for SNP Detection in RNA-seq Data: Application to Lymphoblastoid Cell Lines and Evaluation Using 1000 Genomes Data

Overview
Journal PLoS One
Date 2013 Apr 5
PMID 23555596
Citations 71
Abstract

Next-generation RNA sequencing (RNA-seq) maps and analyzes transcriptomes and generates data on sequence variation in expressed genes. Few studies have reported analysis strategies that maximize the yield of high-quality RNA-seq SNP data. We evaluated the performance of different SNP-calling methods following alignment to both the genome and the transcriptome by applying them to RNA-seq data from a HapMap lymphoblastoid cell line sample and comparing the results with sequence variation data from 1000 Genomes. We determined that the best method to achieve high specificity and sensitivity, and the greatest number of SNP calls, is to remove duplicate sequence reads after alignment to the genome and to call SNPs using SAMtools. The accuracy of SNP calls depends on the available sequence coverage. In terms of specificity, 89% of RNA-seq SNP calls were true variants where coverage was >10X. In terms of sensitivity, at >10X coverage 92% of all expected SNPs in expressed exons could be detected. Overall, the results indicate that RNA-seq SNP data are a very useful by-product of sequence-based transcriptome analysis. If RNA-seq is applied to disease tissue samples, and assuming that genes carrying mutations relevant to disease biology are expressed, a very high proportion of these mutations can be detected.
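The two accuracy figures in the abstract can be illustrated with a small calculation. The sketch below is not the authors' code. It assumes that "specificity" means the fraction of RNA-seq SNP calls that also appear in a reference set such as 1000 Genomes (what is often called precision), and that "sensitivity" means the fraction of reference SNPs at sufficiently covered sites that were called. The function name `evaluate_calls`, the `(chromosome, position)` site representation, and the toy data are all hypothetical.

```python
from typing import Dict, Set, Tuple

Site = Tuple[str, int]  # (chromosome, 1-based position); an assumed representation


def evaluate_calls(
    called: Set[Site],
    truth: Set[Site],
    depth: Dict[Site, int],
    min_depth: int = 10,
) -> Tuple[float, float]:
    """Return (specificity, sensitivity) over sites with depth > min_depth.

    specificity: fraction of calls that are present in the truth set
    sensitivity: fraction of truth-set sites that were called
    """
    # Keep only sites whose read depth exceeds the threshold (">10X").
    covered = {site for site, d in depth.items() if d > min_depth}
    calls = called & covered
    expected = truth & covered
    true_pos = calls & expected
    specificity = len(true_pos) / len(calls) if calls else float("nan")
    sensitivity = len(true_pos) / len(expected) if expected else float("nan")
    return specificity, sensitivity


# Toy data, invented purely for illustration.
depth = {
    ("chr1", 100): 25,
    ("chr1", 200): 12,
    ("chr1", 300): 40,
    ("chr1", 400): 4,   # below the coverage threshold, so ignored
    ("chr2", 50): 18,
}
called = {("chr1", 100), ("chr1", 200), ("chr1", 400), ("chr2", 50)}
truth = {("chr1", 100), ("chr1", 300), ("chr1", 400), ("chr2", 50)}

spec, sens = evaluate_calls(called, truth, depth, min_depth=10)
print(f"specificity={spec:.2f} sensitivity={sens:.2f}")
```

In this toy example, three calls and three reference SNPs fall at covered sites, and two of them overlap, so both ratios are 2/3. In practice the called and reference sites would be read from variant files restricted to expressed exons, which this sketch does not attempt.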

Citing Articles

LILRB3 genetic variation is associated with kidney transplant failure in African American recipients.

Sun Z, Yi Z, Wei C, Wang W, Ren T, Cravedi P Nat Med. 2025.

PMID: 40065170 DOI: 10.1038/s41591-025-03568-z.


Molecular targets and strategies in the development of nucleic acid cancer vaccines: from shared to personalized antigens.

Chi W, Hu Y, Huang H, Kuo H, Lin S, Kuo C J Biomed Sci. 2024; 31(1):94.

PMID: 39379923 PMC: 11463125. DOI: 10.1186/s12929-024-01082-x.


Genetic variants in androgenetic alopecia: insights from scalp RNA sequencing data.

Premanand A, Shanmuga Priya M, Reena Rajkumari B Arch Dermatol Res. 2024; 316(8):590.

PMID: 39215850 DOI: 10.1007/s00403-024-03351-z.


BBmix: a Bayesian beta-binomial mixture model for accurate genotyping from RNA-sequencing.

Vigorito E, Barton A, Pitzalis C, Lewis M, Wallace C Bioinformatics. 2023; 39(7).

PMID: 37338536 PMC: 10318392. DOI: 10.1093/bioinformatics/btad393.


RNA-Seq-Pop: Exploiting the sequence in RNA sequencing-A Snakemake workflow reveals patterns of insecticide resistance in the malaria vector Anopheles gambiae.

Nagi S, Oruni A, Weetman D, Donnelly M Mol Ecol Resour. 2023; 23(4):946-961.

PMID: 36695302 PMC: 10568660. DOI: 10.1111/1755-0998.13759.

