$R^{2}$s for Correlated Data: Phylogenetic Models, LMMs, and GLMMs

Overview

Journal: Syst Biol

Publisher: Oxford University Press

Specialty: Biology

Date: 2018 Sep 22

PMID: 30239975

Citations: 57

Authors

Anthony R Ives

Abstract

Many researchers want to report an $R^{2}$ to measure the variance explained by a model. When the model includes correlation among data, as in phylogenetic models and mixed models, defining an $R^{2}$ faces two conceptual problems. (i) It is unclear how to measure the variance explained by predictor (independent) variables when the model contains covariances. (ii) Researchers may want the $R^{2}$ to include the variance explained by the covariances by asking questions such as "How much of the data is explained by phylogeny?" Here, I investigated three $R^{2}$s for phylogenetic and mixed models. $R^{2}_{resid}$ is an extension of the ordinary least-squares $R^{2}$ that weights residuals by variances and covariances estimated by the model; it is closely related to $R^{2}_{glmm}$ presented by Nakagawa and Schielzeth (2013. A general and simple method for obtaining $R^{2}$ from generalized linear mixed-effects models. Methods Ecol. Evol. 4:133-142). $R^{2}_{pred}$ is based on predicting each residual from the fitted model and computing the variance between observed and predicted values. $R^{2}_{lik}$ is based on the likelihood of fitted models and therefore reflects the amount of information that the models contain. These three $R^{2}$s are formulated as partial $R^{2}$s, making it possible to compare the contributions of predictor variables and variance components (phylogenetic signal and random effects) to the fit of models. Because partial $R^{2}$s compare a full model with a reduced model that lacks some components of the full model, they are distinct from marginal $R^{2}$s that partition additive components of the variance. I assessed the properties of the $R^{2}$s for phylogenetic models using simulations for continuous and binary response data (phylogenetic generalized least squares and phylogenetic logistic regression).

Because the $R^{2}$s are designed broadly for any model for correlated data, I also compared $R^{2}$s for linear mixed models and generalized linear mixed models. $R^{2}_{resid}$, $R^{2}_{pred}$, and $R^{2}_{lik}$ all have similar performance in describing the variance explained by different components of models. However, $R^{2}_{pred}$ gives the most direct answer to the question of how much variance in the data is explained by a model. $R^{2}_{resid}$ is most appropriate for comparing models fit to different data sets, because it does not depend on sample sizes. Finally, $R^{2}_{lik}$ is most appropriate for assessing the importance of different components within the same model applied to the same data, because it is most closely associated with statistical significance tests.
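
The abstract does not state formulas, so the following is only a rough illustration of the partial-$R^{2}$ idea (comparing a full model with a reduced model) in its likelihood-based flavor. It assumes the common likelihood-ratio form $1 - \exp\left(-\tfrac{2}{n}(\ell_{full} - \ell_{reduced})\right)$, which may differ in detail from the paper's $R^{2}_{lik}$, and it uses ordinary least-squares fits on simulated, uncorrelated data rather than a phylogenetic or mixed model. The function names `gaussian_loglik` and `partial_r2_lik` are hypothetical.

```python
import numpy as np

def gaussian_loglik(y, X):
    """Maximized Gaussian log-likelihood of a least-squares fit of y on X."""
    n = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    sigma2 = rss / n  # maximum-likelihood estimate of the residual variance
    return -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)

def partial_r2_lik(ll_full, ll_reduced, n):
    """Assumed likelihood-ratio form of a partial R^2 (full vs. reduced model)."""
    return 1.0 - np.exp(-2.0 / n * (ll_full - ll_reduced))

# Simulated data (illustrative only; no correlation structure among observations).
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.8 * x1 + 0.3 * x2 + rng.normal(size=n)

X_full = np.column_stack([np.ones(n), x1, x2])   # intercept + x1 + x2
X_no_x2 = np.column_stack([np.ones(n), x1])      # reduced: x2 dropped
X_null = np.ones((n, 1))                         # reduced: intercept only

ll_full = gaussian_loglik(y, X_full)
ll_no_x2 = gaussian_loglik(y, X_no_x2)
ll_null = gaussian_loglik(y, X_null)

r2_total = partial_r2_lik(ll_full, ll_null, n)   # full model vs. intercept only
r2_x2 = partial_r2_lik(ll_full, ll_no_x2, n)     # contribution of x2 alone

print(f"total R2_lik-style value: {r2_total:.3f}")
print(f"partial value for x2:     {r2_x2:.3f}")
```

For Gaussian least-squares fits with an intercept-only reduced model, this form reduces to the ordinary $R^{2}$, which gives a convenient sanity check. For correlated data, the same comparison would use log-likelihoods from models that include the covariance structure.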

Citing Articles

Negative global-scale association between genetic diversity and speciation rates in mammals.

Afonso Silva A, Maliet O, Aristide L, Nogues-Bravo D, Upham N, Jetz W. Nat Commun. 2025; 16(1):1796.

PMID: 39979262 PMC: 11842793. DOI: 10.1038/s41467-025-56820-y.

Metabolic rate of angiosperm seeds: effects of allometry, phylogeny and bioclimate.

Dalziell E, Tomlinson S, Merritt D, Lewandrowski W, Turner S, Withers P. Proc Biol Sci. 2025; 292(2041):20242683.

PMID: 39968610 PMC: 11836704. DOI: 10.1098/rspb.2024.2683.

Does metabolic rate influence genome-wide amino acid composition in the course of animal evolution?

Wang W, Zhang D. Evol Lett. 2025; 9(1):137-149.

PMID: 39906584 PMC: 11790228. DOI: 10.1093/evlett/qrae061.

A phylogenetic approach to comparative genomics.

Dewar A, Belcher L, West S. Nat Rev Genet. 2025; .

PMID: 39779997 PMC: 7617348. DOI: 10.1038/s41576-024-00803-0.

Shell Constraints on Evolutionary Body Size-Limb Size Allometry Can Explain Morphological Conservatism in the Turtle Body Plan.

Hermanson G, Evers S. Ecol Evol. 2024; 14(11):e70504.

PMID: 39539674 PMC: 11557996. DOI: 10.1002/ece3.70504.