Compositional Bias May Affect Both DNA-based and Protein-based Phylogenetic Reconstructions

Overview
Journal: J Mol Evol
Specialty: Biochemistry
Date: 1999 Mar 27
PMID: 10093217
Citations: 89
Abstract

It is now well-established that compositional bias in DNA sequences can adversely affect phylogenetic analysis based on those sequences. Phylogenetic analyses based on protein sequences are generally considered to be more reliable than those derived from the corresponding DNA sequences because it is believed that the use of encoded protein sequences circumvents the problems caused by nucleotide compositional biases in the DNA sequences. There exists, however, a correlation between AT/GC bias at the nucleotide level and content of AT- and GC-rich codons and their corresponding amino acids. Consequently, protein sequences can also be affected secondarily by nucleotide compositional bias. Here, we report that DNA bias not only may affect phylogenetic analysis based on DNA sequences, but also drives a protein bias which may affect analyses based on protein sequences. We present a striking example where common phylogenetic tools fail to recover the correct tree from complete animal mitochondrial protein-coding sequences. The data set is very extensive, containing several thousand sites per sequence, and the incorrect phylogenetic trees are statistically very well supported. Additionally, neither the use of the LogDet/paralinear transform nor removal of positions in the protein alignment with AT- or GC-rich codons allowed recovery of the correct tree. Two taxa with a large compositional bias continually group together in these analyses, despite a lack of close biological relatedness. We conclude that even protein-based phylogenetic trees may be misleading, and we advise caution in phylogenetic reconstruction using protein sequences, especially those that are compositionally biased.
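The link the abstract draws between nucleotide bias and amino acid content can be illustrated with a minimal Python sketch. It computes the GC fraction of a coding sequence and the share of residues falling into two amino acid groups. The grouping used here (G, A, R, P as encoded by GC-rich codons; F, Y, M, I, N, K as encoded by AT-rich codons) is an assumption borrowed from common practice in the compositional-bias literature and is not stated in the abstract; the function names are hypothetical as well.

```python
# Assumed amino acid groups (not given in the abstract): residues whose
# codons are GC-rich ("GARP") and residues whose codons are AT-rich ("FYMINK").
GC_RICH_AA = set("GARP")
AT_RICH_AA = set("FYMINK")


def gc_content(dna: str) -> float:
    """Return the fraction of G and C among the unambiguous bases A, C, G, T."""
    bases = [b for b in dna.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def aa_bias(protein: str) -> dict:
    """Return the fraction of residues in each assumed amino acid group."""
    residues = [r for r in protein.upper() if r.isalpha()]
    if not residues:
        return {"garp": 0.0, "fymink": 0.0}
    n = len(residues)
    return {
        "garp": sum(r in GC_RICH_AA for r in residues) / n,
        "fymink": sum(r in AT_RICH_AA for r in residues) / n,
    }


if __name__ == "__main__":
    # Toy sequences for illustration only; they are not data from the paper.
    print(gc_content("ATGGCGCGTCCG"))
    print(aa_bias("MARPFYINK"))
```

Comparing these two summaries across taxa is one simple way to check whether a pair of sequences that group together also share an unusual composition, which is the pattern the abstract warns about.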

Citing Articles

Unraveling myriapod evolution: sealion, a novel quartet-based approach for evaluating phylogenetic uncertainty.

Kuck P, Wilkinson M, Romahn J, Seidel N, Meusemann K, Wagele J. NAR Genom Bioinform. 2025; 7(1):lqaf018.

PMID: 40060371 PMC: 11886814. DOI: 10.1093/nargab/lqaf018.


Characterization and organelle genome sequencing of Pyropia species from Myanmar.

San M, Kawamura Y, Kimura K, Witharana E, Shimogiri T, Aye S. Sci Rep. 2023; 13(1):15677.

PMID: 37735516 PMC: 10514050. DOI: 10.1038/s41598-023-42262-3.


Early Divergence and Gene Exchange Highways in the Evolutionary History of Mesoaciditogales.

Farrell A, Nesbo C, Zhaxybayeva O. Genome Biol Evol. 2023; 15(9).

PMID: 37616556 PMC: 10476701. DOI: 10.1093/gbe/evad156.


DNA Sequences Are as Useful as Protein Sequences for Inferring Deep Phylogenies.

Kapli P, Kotari I, Telford M, Goldman N, Yang Z. Syst Biol. 2023; 72(5):1119-1135.

PMID: 37366056 PMC: 10627555. DOI: 10.1093/sysbio/syad036.


Mitochondrial genome comparison reveals the evolution of cnidarians.

Feng H, Lv S, Li R, Shi J, Wang J, Cao P. Ecol Evol. 2023; 13(6):e10157.

PMID: 37325715 PMC: 10261974. DOI: 10.1002/ece3.10157.