
Computational Pan-genomics: Status, Promises and Challenges

Overview
Journal Brief Bioinform
Specialty Biology
Date 2016 Oct 23
PMID 27769991
Citations 131
Abstract

Many disciplines, from human genetics and oncology to plant breeding, microbiology and virology, commonly face the challenge of analyzing rapidly increasing numbers of genomes. In the case of Homo sapiens, the number of sequenced genomes will approach hundreds of thousands in the next few years. Simply scaling up established bioinformatics pipelines will not be sufficient for leveraging the full potential of such rich genomic data sets. Instead, novel, qualitatively different computational methods and paradigms are needed. We will witness the rapid extension of computational pan-genomics, a new sub-area of research in computational biology. In this article, we generalize existing definitions and understand a pan-genome as any collection of genomic sequences to be analyzed jointly or to be used as a reference. We examine already available approaches to construct and use pan-genomes, discuss the potential benefits of future technologies and methodologies and review open challenges from the vantage point of the above-mentioned biological disciplines. As a prominent example of a computational paradigm shift, we particularly highlight the transition from the representation of reference genomes as strings to representations as graphs. We outline how this and other challenges from different application domains translate into common computational problems, point out relevant bioinformatics techniques and identify open problems in computer science. With this review, we aim to increase awareness that a joint approach to computational pan-genomics can help address many of the problems currently faced in various domains.
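
To make the string-to-graph transition mentioned in the abstract concrete, the following is a minimal Python sketch and not the method of the article. It assumes a deliberately restricted graph: a linear reference plus single-nucleotide variants, stored as a chain of shared segments and two-way "bubbles". The function names `build_graph` and `haplotypes` and the `{position: alt_base}` input format are hypothetical, chosen only for illustration.

```python
from itertools import product


def build_graph(reference, snvs):
    """Turn a linear reference plus SNVs into a chain of sites.

    `snvs` maps a 0-based position to an alternative base (assumed,
    illustrative input format). Each site is a list of node labels:
    one label for a segment shared by all genomes, two labels
    (reference base, alternative base) for a variant bubble.
    """
    sites = []
    start = 0
    for pos in sorted(snvs):
        if start < pos:
            sites.append([reference[start:pos]])   # shared segment
        sites.append([reference[pos], snvs[pos]])  # bubble: ref vs alt
        start = pos + 1
    if start < len(reference):
        sites.append([reference[start:]])          # trailing shared segment
    return sites


def haplotypes(sites):
    """Spell out every path through the chain, one node per site."""
    return ["".join(path) for path in product(*sites)]


graph = build_graph("ACGTAC", {2: "T", 4: "G"})
paths = haplotypes(graph)
```

Here the single string `ACGTAC` becomes five sites, two of which are bubbles, and the four paths through them include the reference itself alongside the three variant combinations. Real pan-genome graphs also have to represent insertions, deletions and rearrangements, which this sketch does not attempt.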

Citing Articles

Combining genotyping approaches improves resolution for association mapping: a case study in tropical maize under water stress conditions.

de Pontes F, Machado I, Silveira M, Lobo A, Sabadin F, Fritsche-Neto R Front Plant Sci. 2025; 15:1442008.

PMID: 39917602 PMC: 11798985. DOI: 10.3389/fpls.2024.1442008.


Pan-genomics: Insight into the Functional Genome, Applications, Advancements, and Challenges.

Sarawad A, Hosagoudar S, Parvatikar P Curr Genomics. 2025; 26(1):2-14.

PMID: 39911277 PMC: 11793047. DOI: 10.2174/0113892029311541240627111506.


Advancements in omics technologies: Molecular mechanisms of acute lung injury and acute respiratory distress syndrome (Review).

Zheng Z, Qiao X, Yin J, Kong J, Han W, Qin J Int J Mol Med. 2025; 55(3.

PMID: 39749711 PMC: 11722059. DOI: 10.3892/ijmm.2024.5479.


b-move: Faster Lossless Approximate Pattern Matching in a Run-Length Compressed Index.

Depuydt L, Renders L, Van de Vyver S, Veys L, Gagie T, Fostier J Res Sq. 2024.

PMID: 39606487 PMC: 11601852. DOI: 10.21203/rs.3.rs-5367343/v1.


A gentle introduction to pangenomics.

Matthews C, Watson-Haigh N, Burton R, Sheppard A Brief Bioinform. 2024; 25(6).

PMID: 39552065 PMC: 11570541. DOI: 10.1093/bib/bbae588.

