Phylogenetic Analysis That Models Compositional Heterogeneity over the Tree
Overview
Molecular sequences in a phylogenetic analysis can differ in composition, which indicates that the process of evolution can change over time. However, the models of evolution in common use are homogeneous over the tree, and when they are applied to datasets whose composition is heterogeneous over the tree they can recover incorrect trees. The Node-Discrete Compositional Heterogeneity (NDCH) model can model such data by accommodating differences in composition over the tree. The usage, problems, and limitations of this model are discussed, and a modification, the NDCH2 model, is described that can ameliorate some of these problems and limitations. Using these models can greatly improve the fit of the model to the data and can find better tree topologies. The models and various statistical tests are illustrated using a bacterial SSU rRNA dataset. The models are implemented in the software P4, and files for the analyses described here are made available.