ComHapDet: a Spatial Community Detection Algorithm for Haplotype Assembly

Overview
Journal BMC Genomics
Publisher BioMed Central
Specialty Genetics
Date 2020 Sep 9
PMID 32900369
Citations 5
Abstract

Background: Haplotypes, the ordered lists of single nucleotide variations that distinguish chromosomal sequences from their homologous pairs, may reveal an individual's susceptibility to hereditary and complex diseases and affect how our bodies respond to therapeutic drugs. Reconstructing the haplotypes of an individual from short sequencing reads is an NP-hard problem that becomes even more challenging in the case of polyploids. While increasing the lengths of sequencing reads and insert sizes helps improve the accuracy of reconstruction, it also exacerbates the computational complexity of the haplotype assembly task. This has motivated the pursuit of algorithmic frameworks capable of accurate yet efficient assembly of haplotypes from high-throughput sequencing data.

Results: We propose a novel graphical representation of sequencing reads and pose the haplotype assembly problem as an instance of community detection on a spatial random graph. To this end, we construct a graph where each read is a node with an unknown community label associating the read with the haplotype it samples. Haplotype reconstruction can then be thought of as a two-step procedure: first, one recovers the community labels on the nodes (i.e., the reads), and then one uses the estimated labels to assemble the haplotypes. Based on this observation, we propose ComHapDet, a novel assembly algorithm for diploid and polyploid haplotypes that allows both biallelic and multi-allelic variants.
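
The two-step view described above (label the reads, then assemble from the labels) can be illustrated with a toy sketch. This is not the ComHapDet algorithm itself, whose details are not given in this abstract; it swaps in a simple greedy label assignment on a read graph whose edge weights are assumed to be agreements minus disagreements over shared variant positions, followed by a per-community majority vote. The read encoding (a dict from SNP position to allele) and the function names `edge_weight`, `detect_communities`, and `assemble_haplotypes` are hypothetical choices made for illustration.

```python
from collections import Counter

def edge_weight(r1, r2):
    """Agreements minus disagreements over shared SNP positions (0 if no overlap).
    Assumed weighting for illustration only."""
    shared = r1.keys() & r2.keys()
    return sum(1 if r1[p] == r2[p] else -1 for p in shared)

def detect_communities(reads, k, passes=5):
    """Step 1: assign each read (node) one of k community labels.
    Greedy seeding plus local refinement; a stand-in for a real
    community detection method, not the published one."""
    n = len(reads)
    w = [[edge_weight(reads[i], reads[j]) if i != j else 0 for j in range(n)]
         for i in range(n)]
    labels = [None] * n

    def score(i, c):
        # Total edge weight between read i and reads currently labelled c.
        return sum(w[i][j] for j in range(n) if labels[j] == c)

    used = 0  # number of communities opened so far
    for i in range(n):
        best = max(range(used), key=lambda c: score(i, c), default=None)
        if best is not None and score(i, best) > 0:
            labels[i] = best
        elif used < k:
            labels[i] = used  # open a new community
            used += 1
        else:
            labels[i] = best
    for _ in range(passes):
        changed = False
        for i in range(n):
            best = max(range(used), key=lambda c: score(i, c))
            if score(i, best) > score(i, labels[i]):
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels

def assemble_haplotypes(reads, labels, k):
    """Step 2: per community, take the majority allele at each covered position."""
    haplotypes = []
    for c in range(k):
        votes = {}
        for read, lab in zip(reads, labels):
            if lab == c:
                for pos, allele in read.items():
                    votes.setdefault(pos, Counter())[allele] += 1
        haplotypes.append({pos: cnt.most_common(1)[0][0]
                           for pos, cnt in sorted(votes.items())})
    return haplotypes

# Toy diploid example: error-free reads sampled from 01010 and 10101.
reads = [
    {0: 0, 1: 1, 2: 0},
    {0: 1, 1: 0, 2: 1},
    {1: 1, 2: 0, 3: 1},
    {2: 1, 3: 0, 4: 1},
    {2: 0, 3: 1, 4: 0},
    {1: 0, 2: 1, 3: 0},
]
labels = detect_communities(reads, k=2)
haplotypes = assemble_haplotypes(reads, labels, k=2)
```

Because alleles are stored as arbitrary values rather than bits, the same sketch accepts multi-allelic reads, and setting `k` above 2 corresponds to the polyploid case; how well such a greedy scheme copes with sequencing errors is not addressed here.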

Conclusions: Performance of the proposed algorithm is benchmarked on simulated data as well as experimental data obtained by sequencing Chromosome 5 of tetraploid biallelic Solanum tuberosum (potato). The results demonstrate the efficacy of the proposed method and show that it compares favorably with existing techniques.

Citing Articles

GCphase: an SNP phasing method using a graph partition and error correction algorithm.

Luo J, Wang J, Zhai H, Wang J BMC Bioinformatics. 2024; 25(1):267.

PMID: 39160480 PMC: 11331634. DOI: 10.1186/s12859-024-05901-8.


Pairwise comparative analysis of six haplotype assembly methods based on users' experience.

Sun S, Cheng F, Han D, Wei S, Zhong A, Massoudian S BMC Genom Data. 2023; 24(1):35.

PMID: 37386408 PMC: 10311811. DOI: 10.1186/s12863-023-01134-5.


flopp: Extremely Fast Long-Read Polyploid Haplotype Phasing by Uniform Tree Partitioning.

Shaw J, Yu Y J Comput Biol. 2022; 29(2):195-211.

PMID: 35041529 PMC: 8892958. DOI: 10.1089/cmb.2021.0436.


Genomics and functional genomics in Leishmania and Trypanosoma cruzi: statuses, challenges and perspectives.

Bartholomeu D, Teixeira S, Kaysel Cruz A Mem Inst Oswaldo Cruz. 2021; 116:e200634.

PMID: 33787768 PMC: 8011669. DOI: 10.1590/0074-02760200634.


Selected Research Articles from the 2019 International Workshop on Computational Network Biology: Modeling, Analysis, and Control (CNB-MAC).

Yoon B, Qian X, Kahveci T, Pal R BMC Genomics. 2020; 21(Suppl 9):584.

PMID: 32900374 PMC: 7487676. DOI: 10.1186/s12864-020-06934-y.
