
Twelve Years of SAMtools and BCFtools

Overview
Journal: Gigascience
Specialties: Biology, Genetics
Date: 2021 Feb 16
PMID: 33590861
Citations: 4014
Abstract

Background: SAMtools and BCFtools are widely used programs for processing and analysing high-throughput sequencing data. They include tools for file format conversion and manipulation, sorting, querying, statistics, variant calling, and effect analysis amongst other methods.

Findings: The first version appeared online 12 years ago and has been maintained and further developed ever since, with many new features and improvements added over the years. The SAMtools and BCFtools packages represent a unique collection of tools that have been used in numerous other software projects and countless genomic pipelines.

Conclusion: Both SAMtools and BCFtools are freely available on GitHub under the permissive MIT licence, free for both non-commercial and commercial use. Both packages have been installed >1 million times via Bioconda. The source code and documentation are available from https://www.htslib.org.

Citing Articles

Next-generation sequencing-based population genetics unravels the evolutionary history of Rhodomyrtus tomentosa in China.

Xu X, Liao B, Liao S, Qin Q, He C, Ding X. BMC Plant Biol. 2025; 25(1):338.

PMID: 40089704 DOI: 10.1186/s12870-025-06364-6.


Direct measurement of the male germline mutation rate in individuals using sequential sperm samples.

Shoag J, Srinivasa A, Loh C, Liu M, Lassen E, Melanaphy S. Nat Commun. 2025; 16(1):2546.

PMID: 40089484 DOI: 10.1038/s41467-025-57507-0.


Benchmarking long-read structural variant calling tools and combinations for detecting somatic variants in cancer genomes.

Aydin S, Yilmaz K, Acar A. Sci Rep. 2025; 15(1):8707.

PMID: 40082509 PMC: 11906795. DOI: 10.1038/s41598-025-92750-x.


Integration of therapeutic cargo into the human genome with programmable type V-K CAST.

Liu J, Aliaga Goltsman D, Alexander L, Khayi K, Hong J, Dunham D. Nat Commun. 2025; 16(1):2427.

PMID: 40082411 PMC: 11906591. DOI: 10.1038/s41467-025-57416-2.


The genome sequence of a flea beetle, (Marsham, 1802).

Geiser M, Sims I. Wellcome Open Res. 2025; 10:62.

PMID: 40078959 PMC: 11897695. DOI: 10.12688/wellcomeopenres.23697.1.

