
An Extended Application of the Fast Multi-Locus Ridge Regression Algorithm in Genome-Wide Association Studies of Categorical Phenotypes

Overview
Journal: Plants (Basel)
Date: 2024 Sep 14
PMID: 39274004
Abstract

Categorical quantitative traits, whether binary or ordinal, are widely used in plants to record counts and resistance levels. Compared with continuous traits, categorical traits often carry less detailed information about genetic variation and have a more complex underlying genetic architecture, which poses additional challenges for genome-wide association studies. Moreover, ordinal traits are commonly analyzed with methods designed for binary or continuous phenotypes, which discards part of the original phenotype information and reduces the power to detect quantitative trait nucleotides (QTNs). To address these issues, this study applies fast multi-locus ridge regression (FastRR), originally designed for continuous traits, directly to binary and ordinal traits. FastRR comprises three stages: continuous transformation, variable reduction, and parameter estimation. It handles categorical phenotype data computationally, without introducing link functions or misapplying methods intended for other trait types. A series of simulation studies shows that FastRR outperforms four other approaches designed for continuous, binary, or ordinal traits (logistic regression, FarmCPU, FaST-LMM, and POLMM) in the detection of small-effect QTNs, the accuracy of estimated effects, and computation speed. Applied to 14 binary or ordinal phenotypes in a real dataset, FastRR identified 479 significant loci and 76 known genes, at least seven times as many as the other algorithms detected. These findings underscore the potential of FastRR as a useful tool for genome-wide association studies and novel gene mining of binary and ordinal traits.
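The three stages named in the abstract can be illustrated with a toy sketch. This is not the FastRR algorithm: the abstract does not specify how each stage is carried out, so the code below assumes simple stand-ins (standardizing the ordinal codes as the "continuous transformation", marginal-correlation screening as the "variable reduction", and a closed-form ridge fit as the "parameter estimation"). The function names `simulate_ordinal` and `three_stage_sketch`, the marker counts, and the penalty `lam` are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_ordinal(n=300, p=500, causal=(10, 200, 400), seed=0):
    """Simulate genotypes (0/1/2) and a three-level ordinal phenotype.

    Assumption: the ordinal trait arises from thresholding a latent
    continuous liability; this is a common toy model, not the paper's design.
    """
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 3, size=(n, p)).astype(float)
    beta = np.zeros(p)
    beta[list(causal)] = 1.0
    liability = X @ beta + rng.normal(size=n)
    cuts = np.quantile(liability, [1 / 3, 2 / 3])
    y = np.digitize(liability, cuts)  # categories coded 0, 1, 2
    return X, y


def three_stage_sketch(X, y, keep=20, lam=1.0):
    """Illustrative three-stage pipeline (assumed stand-ins, not FastRR)."""
    # Stage 1 (assumed): treat the ordinal codes as numeric and standardize.
    z = (y - y.mean()) / y.std()
    Xc = X - X.mean(axis=0)
    sd = Xc.std(axis=0)
    sd[sd == 0] = 1.0  # guard against monomorphic markers
    Xs = Xc / sd

    # Stage 2 (assumed): keep the markers with the largest absolute
    # marginal correlation with the transformed phenotype.
    score = np.abs(Xs.T @ z) / len(z)
    kept = np.argsort(score)[::-1][:keep]

    # Stage 3 (assumed): multi-locus ridge estimate on the retained markers,
    # beta = (X'X + lam * I)^{-1} X'z.
    Xk = Xs[:, kept]
    coef = np.linalg.solve(Xk.T @ Xk + lam * np.eye(len(kept)), Xk.T @ z)
    return kept, coef


if __name__ == "__main__":
    X, y = simulate_ordinal()
    kept, coef = three_stage_sketch(X, y)
    top = kept[np.argsort(np.abs(coef))[::-1][:5]]
    print("markers with the largest ridge effects:", sorted(top.tolist()))
```

The screening step keeps the ridge system small (here 20 by 20), which is the general reason a variable-reduction stage makes multi-locus fitting cheap; the actual reduction and estimation procedures used by FastRR should be taken from the original publication.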
