
Robust Identification of Regulatory Variants (eQTLs) Using a Differential Expression Framework Developed for RNA-sequencing

Overview
Publisher Biomed Central
Date 2023 May 4
PMID 37143150
Abstract

Background: A gap currently exists between genetic variants and the underlying cell and tissue biology of a trait, and expression quantitative trait loci (eQTL) studies provide important information to help close that gap. However, two concerns arise in eQTL analyses that use RNA-sequencing data: the normalization of data across samples, and the fact that the data do not follow a normal distribution. Multiple pipelines have been suggested to address these concerns. For instance, the most recent analysis of the human and farm Genotype-Tissue Expression (GTEx) project proposes using the trimmed mean of M-values (TMM) to normalize the data, followed by an inverse normal transformation.
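To make the transformation step concrete, the following is a minimal sketch of a rank-based inverse normal transformation applied to one gene's normalized expression values across samples. The function name `inverse_normal_transform`, the Blom offset `c = 3/8`, and the averaging of tied ranks are assumptions chosen for illustration; the abstract does not state which variant of the transformation was used.

```python
from statistics import NormalDist


def inverse_normal_transform(values, c=3.0 / 8.0):
    """Map values to standard-normal quantiles by rank.

    Assumption: Blom offset (c = 3/8) and average ranks for ties.
    """
    n = len(values)
    # Sample indices sorted by expression value.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # Convert each rank to a probability strictly inside (0, 1), then to a z-score.
    return [std_normal.inv_cdf((r - c) / (n - 2.0 * c + 1.0)) for r in ranks]


# Hypothetical normalized expression values for one gene in five samples.
expression = [10.0, 200.0, 3000.0, 40.0, 5.0]
z_scores = inverse_normal_transform(expression)
```

Because only ranks survive the transformation, the very large value 3000.0 ends up no more extreme than the top rank allows. This is the loss of magnitude information that the count-based framework described in the Results avoids.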

Results: In this study, we reasoned that eQTL analysis could be carried out using the same framework used for differential gene expression (DGE), which relies on a negative binomial model suited to count data. Using the GTEx framework, we identified 35 significant eQTLs (P < 5 × 10) following the ANOVA model and 39 significant eQTLs (P < 5 × 10) following the additive model. Using a differential gene expression framework, we identified 930 and 6 significant eQTLs (P < 5 × 10) following an analytical framework equivalent to the ANOVA and additive models, respectively. When we compared the two approaches, no significant eQTLs were shared between the two frameworks. Because we defined specific contrasts, we identified trans eQTLs that more closely resembled what we expect from genetic variants showing complete dominance between alleles. Yet, these were not identified by the GTEx framework.
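The difference between the additive model, the ANOVA model, and a dominance contrast comes down to how genotypes are coded as predictors. The sketch below illustrates one common coding for a biallelic variant; the allele labels `A` and `B`, the function names, and the exact contrast are hypothetical and are not taken from the study.

```python
def additive_code(genotype, alt="B"):
    """Additive model: number of copies of the alternate allele (0, 1, or 2)."""
    return genotype.count(alt)


def anova_code(genotype, alt="B"):
    """ANOVA model: genotype as a factor with the reference homozygote as baseline.

    Returns two indicator columns: (heterozygote, alternate homozygote).
    """
    copies = genotype.count(alt)
    return (int(copies == 1), int(copies == 2))


def dominant_contrast(genotype, alt="B"):
    """Complete dominance of the alternate allele: carriers versus non-carriers."""
    return int(genotype.count(alt) >= 1)


# Coding of the three possible genotypes under each scheme.
design = {
    g: {
        "additive": additive_code(g),
        "anova": anova_code(g),
        "dominant": dominant_contrast(g),
    }
    for g in ["AA", "AB", "BB"]
}
```

Under the additive coding, the heterozygote is forced to sit halfway between the homozygotes, whereas the dominance contrast groups `AB` with `BB`. A variant with complete dominance can therefore fit a specific contrast well while fitting the additive model poorly.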

Conclusions: Our results show that transforming RNA-sequencing data to fit a normal distribution prior to eQTL analysis is not required when the DGE framework is employed. Our proposed approach detected biologically relevant variants that otherwise would not have been identified due to data transformation to fit a normal distribution.
