
Merqury: Reference-free Quality, Completeness, and Phasing Assessment for Genome Assemblies

Overview
Journal: Genome Biol
Specialties: Biology, Genetics
Date: 2020 Sep 15
PMID: 32928274
Citations: 1065
Abstract

Recent long-read assemblies often exceed the quality and completeness of available reference genomes, making validation challenging. Here we present Merqury, a novel tool for reference-free assembly evaluation based on efficient k-mer set operations. By comparing k-mers in a de novo assembly to those found in unassembled high-accuracy reads, Merqury estimates base-level accuracy and completeness. For trios, Merqury can also evaluate haplotype-specific accuracy, completeness, phase block continuity, and switch errors. Multiple visualizations, such as k-mer spectrum plots, can be generated for evaluation. We demonstrate on both human and plant genomes that Merqury is a fast and robust method for assembly validation.
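The k-mer comparison described in the abstract can be illustrated with a toy sketch. This is not Merqury's implementation: the function names (`canonical`, `kmers`, `evaluate`) are hypothetical, the sets are held in memory rather than in an efficient k-mer database, and every read k-mer is treated as reliable (a real workflow would first filter out low-count k-mers that probably stem from sequencing errors). The quality value formula used here, which converts the fraction of assembly-only k-mers into a per-base error rate, is stated as an assumption about how such a score can be derived and should be checked against the paper's methods before being relied on.

```python
import math

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def kmers(seq, k):
    """Yield canonical k-mers of seq, skipping windows with non-ACGT characters."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if set(window) <= set("ACGT"):
            yield canonical(window)


def evaluate(assembly, reads, k):
    """Estimate base-level QV and k-mer completeness for a toy assembly.

    Assumption: all read k-mers are trusted; no error-k-mer filtering is done.
    """
    read_set = set()
    for read in reads:
        read_set.update(kmers(read, k))
    asm_kmers = list(kmers(assembly, k))  # kept with multiplicity
    if not read_set or not asm_kmers:
        raise ValueError("need at least one k-mer in both reads and assembly")

    total = len(asm_kmers)
    # k-mers present in the assembly but never seen in the reads are
    # treated as evidence of assembly errors.
    asm_only = sum(1 for km in asm_kmers if km not in read_set)

    # Assumed model: a k-mer is correct only if all k of its bases are
    # correct, so per-base accuracy is the k-th root of the fraction of
    # assembly k-mers that are supported by the reads.
    base_accuracy = (1 - asm_only / total) ** (1 / k)
    error_rate = 1 - base_accuracy
    qv = math.inf if error_rate == 0 else -10 * math.log10(error_rate)

    # Completeness: fraction of distinct read k-mers recovered in the assembly.
    completeness = len(read_set & set(asm_kmers)) / len(read_set)
    return {"total": total, "asm_only": asm_only,
            "qv": qv, "completeness": completeness}


if __name__ == "__main__":
    reads = ["ACGGTCAAG"]
    print(evaluate("ACGGTCAAG", reads, k=4))  # assembly identical to the read
    print(evaluate("ACGGTCTAG", reads, k=4))  # one substituted base
```

A single substitution disturbs up to k overlapping k-mers, which is why the assembly-only count is converted through a k-th root rather than read directly as an error count.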

Citing Articles

A near-complete genome assembly of Fragaria iinumae.

Du H, He Y, Chen M, Zheng X, Gui D, Tang J BMC Genomics. 2025; 26(1):253.

PMID: 40087556 DOI: 10.1186/s12864-025-11440-0.


Evaluating long-read assemblers to assemble several aphididae genomes.

Burger N, Nicolis V, Botha A Brief Bioinform. 2025; 26(2).

PMID: 40079265 PMC: 11904405. DOI: 10.1093/bib/bbaf105.


The genome sequence of a flea beetle, (Marsham, 1802).

Geiser M, Sims I Wellcome Open Res. 2025; 10:62.

PMID: 40078959 PMC: 11897695. DOI: 10.12688/wellcomeopenres.23697.1.


The genome sequence of the Coppice Mining Bee, (Linnaeus, 1758).

Falk S, Monks J Wellcome Open Res. 2025; 10:102.

PMID: 40078958 PMC: 11897692. DOI: 10.12688/wellcomeopenres.23746.1.


The genome sequence of the Dotted Footman moth, (Hufnagel, 1767).

Fletcher C, Lees D Wellcome Open Res. 2025; 10:106.

PMID: 40078957 PMC: 11897693. DOI: 10.12688/wellcomeopenres.23766.1.

