MetaPhat: Detecting and Decomposing Multivariate Associations From Univariate Genome-Wide Association Statistics
Overview
Background: Multivariate testing tools that integrate multiple genome-wide association studies (GWAS) have become important as the number of phenotypes gathered from study cohorts and biobanks has increased. While these tools have been shown to boost statistical power considerably over univariate tests, an important remaining challenge is to interpret which traits are driving the multivariate association and which traits are just passengers with minor contributions to the genotype-phenotype association statistic.
Results: We introduce MetaPhat, a novel bioinformatics tool to conduct GWAS of multiple correlated traits using univariate GWAS results and to decompose multivariate associations into sets of central traits based on intuitive trace plots that visualize Bayesian Information Criterion (BIC) and P-value statistics of multivariate association models. We validate MetaPhat with Global Lipids Genetics Consortium GWAS results, and we apply MetaPhat to univariate GWAS results for 21 heritable and correlated polyunsaturated lipid species from 2,045 Finnish samples, detecting seven independent loci associated with a cluster of lipid species. In most cases, we are able to decompose these multivariate associations to only three to five central traits out of the 21 traits included in the analyses. We release MetaPhat as an open-source tool written in Python with built-in support for multi-processing, quality control, clumping, and intuitive visualizations using the R software.
Conclusion: MetaPhat efficiently decomposes associations between multivariate phenotypes and genetic variants into smaller sets of central traits and improves the interpretation and specificity of genome-phenome associations. MetaPhat is freely available under the MIT license at: https://sourceforge.net/projects/meta-pheno-association-tracer.
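The idea of tracing a multivariate association down to a few central traits can be pictured with a small, dependency-free sketch. This is not MetaPhat's implementation, which the abstract does not detail. As a simplified stand-in, it assumes the joint evidence at one variant is summarized by the chi-square-type statistic z' R^-1 z, where z holds the univariate z-scores and R is the trait correlation matrix. It then drops, step by step, the trait whose removal costs the least statistic. The function names (`solve`, `joint_stat`, `backward_trace`) and the toy numbers are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def joint_stat(z, corr, idx):
    """Assumed stand-in statistic: z' R^-1 z restricted to the traits in idx."""
    zs = [z[i] for i in idx]
    rs = [[corr[i][j] for j in idx] for i in idx]
    return sum(a * b for a, b in zip(zs, solve(rs, zs)))


def backward_trace(z, corr, names):
    """Repeatedly drop the trait whose removal leaves the largest statistic.

    Returns a list of (remaining trait names, statistic) pairs, one per step.
    """
    idx = list(range(len(z)))
    trace = [([names[i] for i in idx], joint_stat(z, corr, idx))]
    while len(idx) > 1:
        drop = max(
            idx,
            key=lambda d: joint_stat(z, corr, [i for i in idx if i != d]),
        )
        idx.remove(drop)
        trace.append(([names[i] for i in idx], joint_stat(z, corr, idx)))
    return trace


if __name__ == "__main__":
    # Toy inputs (made up): three traits, one variant.
    z_scores = [5.0, 4.2, 0.6]
    trait_corr = [
        [1.0, 0.3, 0.1],
        [0.3, 1.0, 0.2],
        [0.1, 0.2, 1.0],
    ]
    for subset, stat in backward_trace(z_scores, trait_corr, ["A", "B", "C"]):
        print(subset, round(stat, 3))
```

In such a trace, traits that can be dropped with little loss of statistic behave like passengers, while a sharp fall marks the removal of a driver. A criterion such as BIC or a P-value threshold, as used in the abstract's trace plots, would then decide where to stop.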