
The Impact of Outgroup Choice and Missing Data on Major Seed Plant Phylogenetics Using Genome-wide EST Data

Overview
Journal PLoS One
Date 2009 Jun 9
PMID 19503618
Citations 21
Abstract

Background: Genome-level analyses have enhanced our view of phylogenetics in many areas of the tree of life. With whole-genome DNA sequences now available for hundreds of organisms, together with large-scale EST databases, a large number of candidate genes have become available for inclusion in phylogenetic analysis. In this work, we exploit the burgeoning genomic data being generated for plants to address one of the more important questions in plant phylogenetics: the hierarchical relationships among the major seed plant lineages (angiosperms, Cycadales, Ginkgoales, Gnetales, and Coniferales). This question remains a work in progress despite numerous studies using single, few or several genes and morphological datasets. Although most recent studies support the notion that gymnosperms and angiosperms are monophyletic sister groups, they differ on the topological arrangements within each major group.

Methodology: We exploited the EST database to construct a supermatrix of DNA sequences (over 1,200 concatenated orthologous gene partitions for 17 taxa) to examine non-flowering seed plant relationships. This analysis employed programs that offer rapid and robust orthology determination of novel, short sequences from plant ESTs based on reference seed plant genomes. Our phylogenetic analysis retrieved an unbiased (with respect to gene choice), well-resolved and highly supported phylogenetic hypothesis that was robust to various outgroup combinations.
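
As a rough illustration of the supermatrix idea described above, the following Python sketch concatenates per-gene alignments into one matrix and fills genes absent for a taxon with a missing-data symbol. It is not the pipeline used in the study: the function names (`build_supermatrix`, `missing_fraction`), the `?` missing-data symbol, and the toy sequences are all assumptions made for illustration, and orthology determination and alignment are taken as already done.

```python
from typing import Dict, List, Tuple

# One gene partition: taxon name -> aligned sequence (hypothetical toy format).
Alignment = Dict[str, str]


def build_supermatrix(
    partitions: List[Alignment], taxa: List[str], missing: str = "?"
) -> Tuple[Dict[str, str], List[Tuple[int, int]]]:
    """Concatenate gene partitions; pad taxa lacking a gene with `missing`."""
    rows: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    bounds: List[Tuple[int, int]] = []  # half-open column range of each partition
    start = 0
    for alignment in partitions:
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError("each partition needs sequences of one aligned length")
        length = lengths.pop()
        for taxon in taxa:
            # A taxon with no sequence for this gene gets a block of missing data.
            rows[taxon].append(alignment.get(taxon, missing * length))
        bounds.append((start, start + length))
        start += length
    matrix = {taxon: "".join(parts) for taxon, parts in rows.items()}
    return matrix, bounds


def missing_fraction(matrix: Dict[str, str], missing: str = "?") -> float:
    """Share of cells in the supermatrix that are coded as missing."""
    total = sum(len(seq) for seq in matrix.values())
    gaps = sum(seq.count(missing) for seq in matrix.values())
    return gaps / total if total else 0.0


if __name__ == "__main__":
    # Toy data: taxon C lacks gene 1, taxon B lacks gene 2.
    genes = [{"A": "ACGT", "B": "AC-T"}, {"A": "GG", "C": "GA"}]
    matrix, bounds = build_supermatrix(genes, ["A", "B", "C"])
    for taxon, seq in matrix.items():
        print(taxon, seq)
    print(bounds, round(missing_fraction(matrix), 3))
```

Keeping the partition boundaries alongside the concatenated matrix is what allows later per-gene analyses, such as comparing partitioning schemes or varying the order in which genes are added.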

Conclusions: We evaluated character support and the relative contribution of numerous variables (e.g., gene number, missing data, partitioning schemes, taxon sampling and outgroup choice) to tree topology, stability and support metrics. Our results indicate that while missing characters and the order in which genes are added to an analysis do not influence branch support, inadequate taxon sampling and a limited choice of outgroup(s) can lead to spurious phylogenetic inference when dealing with phylogenomic-scale data sets. As expected, support and resolution increase significantly as more informative characters are added, until reaching a threshold beyond which support metrics stabilize and the effect of adding conflicting characters is minimized.

Citing Articles

Possible effect of mutations on serological detection of Borrelia burgdorferi sensu stricto ospC major groups: An in-silico study.

Mechai S, Coatsworth H, Ogden N PLoS One. 2023; 18(10):e0292741.

PMID: 37815990 PMC: 10564231. DOI: 10.1371/journal.pone.0292741.


Phylogenetic analysis of classical swine fever virus isolates from China.

Zhu X, Liu M, Wu X, Ma W, Zhao X Arch Virol. 2021; 166(8):2255-2261.

PMID: 34003359 DOI: 10.1007/s00705-021-05084-0.


Large Phylogenomic Data sets Reveal Deep Relationships and Trait Evolution in Chlorophyte Green Algae.

Li X, Hou Z, Xu C, Shi X, Yang L, Lewis L Genome Biol Evol. 2021; 13(7).

PMID: 33950183 PMC: 8271138. DOI: 10.1093/gbe/evab101.


Paralogs and off-target sequences improve phylogenetic resolution in a densely-sampled study of the breadfruit genus (Artocarpus, Moraceae).

Gardner E, Johnson M, Pereira J, Ahmad Puad A, Arifiani D, Wickett N Syst Biol. 2020; .

PMID: 32970819 PMC: 8048387. DOI: 10.1093/sysbio/syaa073.


The study of inter-specific relationships of Bromus genus based on SCoT and ISSR molecular markers.

Safari H, Zebarjadi A, Kahrizi D, Ashraf Jafari A Mol Biol Rep. 2019; 46(5):5209-5223.

PMID: 31313131 DOI: 10.1007/s11033-019-04978-2.

