RAxML-Light: a Tool for Computing Terabyte Phylogenies

Overview
Journal: Bioinformatics
Specialty: Biology
Date: 2012 May 26
PMID: 22628519
Citations: 65
Abstract

Motivation: Due to advances in molecular sequencing and the increasingly rapid collection of molecular data, the field of phyloinformatics is transforming into a computational science. Therefore, new tools are required that can be deployed in supercomputing environments and that scale to hundreds or thousands of cores.

Results: We describe RAxML-Light, a tool for large-scale phylogenetic inference on supercomputers under maximum likelihood. It implements a light-weight checkpointing mechanism, deploys 128-bit (SSE3) and 256-bit (AVX) vector intrinsics, offers two orthogonal memory-saving techniques, and provides a fine-grained, production-level Message Passing Interface (MPI) parallelization of the likelihood function. To demonstrate the scalability and robustness of the code, we inferred a phylogeny on a simulated DNA alignment (1481 taxa, 20 000 000 bp) using 672 cores. This dataset requires one terabyte of RAM to compute the likelihood score on a single tree.

Code availability: https://github.com/stamatak/RAxML-Light-1.0.5

Data availability: http://www.exelixis-lab.org/onLineMaterial.tar.bz2

Contact: alexandros.stamatakis@h-its.org

Supplementary Information: Supplementary data are available at Bioinformatics online.
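
The one-terabyte figure quoted in the Results can be roughly reproduced with a back-of-the-envelope estimate. The sketch below is not taken from the paper. It assumes that memory is dominated by one conditional likelihood vector per inner node of an unrooted binary tree, that each vector stores 4 DNA states per site in double precision (8 bytes), and that a single rate category is used per site. The function name `clv_memory_bytes` and all of its parameters are hypothetical and are introduced only for illustration.

```python
def clv_memory_bytes(taxa, sites, states=4, rate_categories=1, bytes_per_value=8):
    """Rough size of the conditional likelihood vectors for one tree.

    Assumptions (not stated in the abstract): one vector per inner node,
    `states` values per site and rate category, double-precision storage.
    """
    inner_nodes = taxa - 2  # an unrooted binary tree with n tips has n - 2 inner nodes
    return inner_nodes * sites * states * rate_categories * bytes_per_value


# Dataset from the abstract: 1481 taxa, 20 000 000 bp.
single_rate = clv_memory_bytes(1481, 20_000_000)
four_rates = clv_memory_bytes(1481, 20_000_000, rate_categories=4)

print(f"one rate category per site: {single_rate / 1e12:.2f} TB")   # 0.95 TB
print(f"four rate categories:       {four_rates / 1e12:.2f} TB")    # 3.79 TB
```

Under these assumptions, the single-rate estimate lands close to the stated one terabyte, while four discrete rate categories would need roughly four times as much. This suggests, but does not confirm, which kind of rate model the quoted figure corresponds to.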

Citing Articles

Insights into global antimicrobial resistance dynamics through the sequencing of enteric bacteria from U.S. international travelers.

Sridhar S, Worby C, Worby C, Bronson R, Turbett S, Oliver E bioRxiv. 2025; .

PMID: 39974885 PMC: 11838388. DOI: 10.1101/2025.01.27.635056.


Whole-genome sequencing-based genetic diversity, transmission dynamics, and drug-resistant mutations in Mycobacterium tuberculosis isolated from extrapulmonary tuberculosis patients in western Ethiopia.

Chekesa B, Singh H, Gonzalez-Juarbe N, Vashee S, Wiscovitch-Russo R, Dupont C Front Public Health. 2024; 12:1399731.

PMID: 39185123 PMC: 11341482. DOI: 10.3389/fpubh.2024.1399731.


Hierarchical Heuristic Species Delimitation Under the Multispecies Coalescent Model with Migration.

Kornai D, Jiao X, Ji J, Flouri T, Yang Z Syst Biol. 2024; 73(6):1015-1037.

PMID: 39180155 PMC: 11637770. DOI: 10.1093/sysbio/syae050.


Genomic insights of Salmonella isolated from dry fermented sausage production chains in Spain and France.

Ferrer-Bustins N, Yvon C, Martin B, Leclerc V, Leblanc J, Corominas L Sci Rep. 2024; 14(1):11660.

PMID: 38777847 PMC: 11111747. DOI: 10.1038/s41598-024-62141-9.


Genomes of nine biofilm-forming filamentous strains of Cyanobacteria (genera , and gen. nov.) isolated from mangrove habitats of Guadeloupe (Lesser Antilles).

Halary S, Duval C, Marie B, Bernard C, Piquet B, Gros O FEMS Microbes. 2024; 5:xtad024.

PMID: 38213393 PMC: 10781437. DOI: 10.1093/femsmc/xtad024.

