
Investigation of Prediction Accuracy and the Impact of Sample Size, Ancestry, and Tissue in Transcriptome-wide Association Studies

Overview
Journal: Genet Epidemiol
Specialties: Genetics, Public Health
Date: 2020 Mar 20
PMID: 32190932
Citations: 18
Abstract

In transcriptome-wide association studies (TWAS), gene expression values are predicted using genotype data and tested for association with a phenotype. The power of this approach to detect associations relies, at least in part, on the accuracy of the prediction. Here we compare the prediction accuracy of six different methods (LASSO, Ridge regression, Elastic net, Best Linear Unbiased Predictor, Bayesian Sparse Linear Mixed Model, and Random Forests) by performing cross-validation using data from the Geuvadis Project. We also examine prediction accuracy (a) at different sample sizes, (b) when the ancestry of the prediction model training and testing populations is different, and (c) when the tissue used to train the model is different from the tissue to be predicted. We find that, for most genes, the expression cannot be accurately predicted, but in general sparse statistical models tend to outperform polygenic models at prediction. Average prediction accuracy is reduced when the model training set size is reduced or when predicting across ancestries and is marginally reduced when predicting across tissues. We conclude that using sparse statistical models and the development of large reference panels across multiple ethnicities and tissues will lead to better prediction of gene expression, and thus may improve TWAS power.
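
The cross-validated comparison of prediction methods described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the genotypes and expression values are simulated rather than taken from Geuvadis, only three of the six methods appear (LASSO, elastic net, and ridge), and all three are fit by one hand-written coordinate-descent routine. Every name and value here (`fit_elastic_net`, `lam`, `l1_ratio`, the causal SNP effects, the fold count) is an assumption chosen for illustration. Accuracy is scored as the squared correlation between observed and predicted expression in held-out samples.

```python
import random

def soft_threshold(z, t):
    """Shrink z toward zero by t (the LASSO proximal step)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def fit_elastic_net(X, y, lam, l1_ratio, n_iter=50):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam*l1_ratio*|b|_1
    + 0.5*lam*(1 - l1_ratio)*||b||^2, with predictors and outcome centered."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    cols = [[row[j] - x_mean[j] for row in X] for j in range(p)]
    col_ss = [sum(v * v for v in col) / n for col in cols]
    beta = [0.0] * p
    resid = [yi - y_mean for yi in y]
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0.0:  # monomorphic SNP carries no information
                continue
            col = cols[j]
            # Covariance of SNP j with the residual, adding back its own fit
            rho = sum(c * r for c, r in zip(col, resid)) / n + col_ss[j] * beta[j]
            new = soft_threshold(rho, lam * l1_ratio) / (col_ss[j] + lam * (1 - l1_ratio))
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * c for r, c in zip(resid, col)]
                beta[j] = new
    intercept = y_mean - sum(b * m for b, m in zip(beta, x_mean))
    return intercept, beta

def predict(model, X):
    intercept, beta = model
    return [intercept + sum(b * x for b, x in zip(beta, row)) for row in X]

def r_squared(obs, pred):
    """Squared Pearson correlation; 0.0 if either vector is constant."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    cov = sum((o - mo) * (q - mp) for o, q in zip(obs, pred))
    vo = sum((o - mo) ** 2 for o in obs)
    vp = sum((q - mp) ** 2 for q in pred)
    if vo == 0.0 or vp == 0.0:
        return 0.0
    return cov * cov / (vo * vp)

def cross_validate(X, y, lam, l1_ratio, k=5):
    """Mean held-out R^2 over k interleaved folds."""
    n = len(X)
    scores = []
    for fold in range(k):
        test = list(range(fold, n, k))
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]
        model = fit_elastic_net([X[i] for i in train], [y[i] for i in train], lam, l1_ratio)
        scores.append(r_squared([y[i] for i in test], predict(model, [X[i] for i in test])))
    return sum(scores) / k

# Simulated data (assumption): 100 individuals, 30 cis-SNPs, 2 causal SNPs.
rng = random.Random(0)
n_samples, n_snps = 100, 30
causal = {3: 1.0, 17: -0.8}  # hypothetical SNP index -> effect size
X = [[float(rng.choice((0, 1, 2))) for _ in range(n_snps)] for _ in range(n_samples)]
y = [sum(b * row[j] for j, b in causal.items()) + rng.gauss(0, 0.5) for row in X]

methods = {"lasso": 1.0, "elastic_net": 0.5, "ridge": 0.0}
results = {name: cross_validate(X, y, lam=0.1, l1_ratio=r) for name, r in methods.items()}
for name, score in results.items():
    print(f"{name}: mean cross-validated R^2 = {score:.3f}")
```

Setting `l1_ratio` to 1 gives a sparse (LASSO) fit and setting it to 0 gives a dense ridge fit, so one routine spans the sparse-to-polygenic range that the abstract contrasts. In practice the penalty `lam` would itself be tuned by nested cross-validation rather than fixed, and the ranking seen on this toy simulation says nothing about the ranking on real expression data.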

Citing Articles

Transferability of Single- and Cross-Tissue Transcriptome Imputation Models Across Ancestry Groups.

Pagnuco I, Eyre S, Rattray M, Morris A. Genet Epidemiol. 2025; 49(1):e22611.

PMID: 39812501 PMC: 11734644. DOI: 10.1002/gepi.22611.


Integrating Gene Expression Data into Single-Step Method (ssBLUP) Improves Genomic Prediction Accuracy for Complex Traits of Duroc × Erhualian F Pig Population.

Xu F, Che Z, Qiao J, Han P, Miao N, Dai X. Curr Issues Mol Biol. 2024; 46(12):13713-13724.

PMID: 39727947 PMC: 11727526. DOI: 10.3390/cimb46120819.


Unraveling the Complexity of Chikungunya Virus Infection Immunological and Genetic Insights in Acute and Chronic Patients.

Fritsch H, Giovanetti M, Clemente L, da Rocha Fernandes G, Fonseca V, de Lima M. Genes (Basel). 2024; 15(11).

PMID: 39596565 PMC: 11593632. DOI: 10.3390/genes15111365.


A bootstrap model comparison test for identifying genes with context-specific patterns of genetic regulation.

Malakhov M, Dai B, Shen X, Pan W. Ann Appl Stat. 2024; 18(3):1840-1857.

PMID: 39421855 PMC: 11484521. DOI: 10.1214/23-aoas1859.


Multivariate adaptive shrinkage improves cross-population transcriptome prediction and association studies in underrepresented populations.

S Araujo D, Nguyen C, Hu X, Mikhaylova A, Gignoux C, Ardlie K. HGG Adv. 2023; 4(4):100216.

PMID: 37869564 PMC: 10589725. DOI: 10.1016/j.xhgg.2023.100216.

