
ASTRAL-MP: Scaling ASTRAL to Very Large Datasets Using Randomization and Parallelization

Overview
Journal Bioinformatics
Specialty Biology
Date 2019 Mar 24
PMID 30903685
Citations 33
Abstract

Motivation: Evolutionary histories can change from one part of the genome to another. The potential for discordance among gene trees has motivated the development of summary methods that reconstruct a species tree from an input collection of gene trees. ASTRAL is a widely used summary method and has been able to scale to relatively large datasets. However, the size of genomic datasets is growing quickly. Despite its relative efficiency, the current single-threaded implementation of ASTRAL is falling behind data growth trends and is not able to analyze the largest available datasets in a reasonable time.

Results: ASTRAL uses dynamic programming and is not trivially parallel. In this paper, we introduce ASTRAL-MP, the first version of ASTRAL that can exploit parallelism; it also uses randomization techniques to speed up some of its steps. Importantly, ASTRAL-MP can take advantage of not just multiple CPU cores but also one or several graphics processing units (GPUs). The ASTRAL-MP code scales very well with increasing CPU cores, and its GPU version, implemented in OpenCL, can achieve speedups of up to 158× compared to ASTRAL-III. Using GPUs and multiple cores, ASTRAL-MP is able to analyze datasets with 10 000 species or datasets with more than 100 000 genes in less than 2 days.

Availability And Implementation: ASTRAL-MP is available at https://github.com/smirarab/ASTRAL/tree/MP.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Citing Articles

Sperm competition intensity shapes divergence in both sperm morphology and reproductive genes across murine rodents.

Kopania E, Thomas G, Hutter C, Mortimer S, Callahan C, Roycroft E. Evolution. 2024; 79(1):11-27.

PMID: 39392918 PMC: 11663510. DOI: 10.1093/evolut/qpae146.


Selection Across the Three-Dimensional Structure of Venom Proteins from North American Scolopendromorph Centipedes.

Ellsworth S, Rautsaw R, Ward M, Holding M, Rokyta D. J Mol Evol. 2024; 92(4):505-524.

PMID: 39026042 DOI: 10.1007/s00239-024-10191-y.


Single-fly genome assemblies fill major phylogenomic gaps across the Drosophilidae Tree of Life.

Kim B, Gellert H, Church S, Suvorov A, Anderson S, Barmina O. PLoS Biol. 2024; 22(7):e3002697.

PMID: 39024225 PMC: 11257246. DOI: 10.1371/journal.pbio.3002697.


genome assembly and population genomics of a shrub tree (Hance) krass provide insights into the adaptive color variations.

Huang W, Xu B, Guo W, Huang Z, Li Y, Wu W. Front Plant Sci. 2024; 15:1365686.

PMID: 38751846 PMC: 11094225. DOI: 10.3389/fpls.2024.1365686.


Phylogenomic Discordance is Driven by Wide-Spread Introgression and Incomplete Lineage Sorting During Rapid Species Diversification Within Rattlesnakes (Viperidae: Crotalus and Sistrurus).

Myers E, Rautsaw R, Borja M, Jones J, Grunwald C, Holding M. Syst Biol. 2024; 73(4):722-741.

PMID: 38695290 PMC: 11906154. DOI: 10.1093/sysbio/syae018.