
The MaSuRCA Genome Assembler

Overview
Journal Bioinformatics
Specialty Biology
Date 2013 Aug 31
PMID 23990416
Citations 701
Abstract

Motivation: Second-generation sequencing technologies produce high coverage of the genome by short reads at a low cost, which has prompted development of new assembly methods. In particular, multiple algorithms based on de Bruijn graphs have been shown to be effective for the assembly problem. In this article, we describe a new hybrid approach that has the computational efficiency of de Bruijn graph methods and the flexibility of overlap-based assembly strategies, and which allows variable read lengths while tolerating a significant level of sequencing error. Our method transforms large numbers of paired-end reads into a much smaller number of longer 'super-reads'. The use of super-reads allows us to assemble combinations of Illumina reads of differing lengths together with longer reads from 454 and Sanger sequencing technologies, making it one of the few assemblers capable of handling such mixtures. We call our system the Maryland Super-Read Celera Assembler (abbreviated MaSuRCA and pronounced 'mazurka').
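The super-read idea can be illustrated with a toy sketch. It assumes a simplified rule: extend each read one base at a time for as long as the k-mer table built from all reads offers exactly one continuation, so that reads covering the same unambiguous region collapse into one longer sequence. The function names (`kmer_set`, `extend_right`, `extend_left`, `super_read`) are hypothetical, and the sketch ignores reverse complements, sequencing errors, and mate pairs; it is not the MaSuRCA implementation.

```python
BASES = "ACGT"

def kmer_set(reads, k):
    """Collect every k-mer that occurs in the reads (no error filtering)."""
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    return kmers

def _kmers_of(seq, k):
    """All k-mers of one sequence, used to guard against cycles."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def extend_right(seq, kmers, k):
    """Append bases while exactly one following k-mer exists in the table."""
    seen = _kmers_of(seq, k)
    while True:
        suffix = seq[-(k - 1):]
        candidates = [b for b in BASES if suffix + b in kmers]
        if len(candidates) != 1:
            return seq  # dead end or branch: stop extending
        nxt = suffix + candidates[0]
        if nxt in seen:
            return seq  # k-mer already used: stop to avoid looping forever
        seen.add(nxt)
        seq += candidates[0]

def extend_left(seq, kmers, k):
    """Prepend bases while exactly one preceding k-mer exists in the table."""
    seen = _kmers_of(seq, k)
    while True:
        prefix = seq[:k - 1]
        candidates = [b for b in BASES if b + prefix in kmers]
        if len(candidates) != 1:
            return seq
        nxt = candidates[0] + prefix
        if nxt in seen:
            return seq
        seen.add(nxt)
        seq = candidates[0] + seq

def super_read(read, kmers, k):
    """Extend a read (length >= k assumed) in both directions."""
    return extend_left(extend_right(read, kmers, k), kmers, k)

# Toy data: three overlapping reads drawn from one short made-up sequence.
genome = "ATGGCGTACTTAGC"
reads = [genome[0:8], genome[3:11], genome[6:14]]
table = kmer_set(reads, 4)
super_reads = {super_read(r, table, 4) for r in reads}
print(len(reads), "reads ->", len(super_reads), "super-read(s)")
```

In this toy case all three reads extend to the same sequence, which shows how many short reads can be replaced by far fewer, longer ones before overlap-based assembly.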

Results: We evaluate the performance of MaSuRCA against two of the most widely used assemblers for Illumina data, Allpaths-LG and SOAPdenovo2, on two datasets from organisms for which high-quality assemblies are available: the bacterium Rhodobacter sphaeroides and chromosome 16 of the mouse genome. We show that MaSuRCA performs on par with or better than Allpaths-LG and significantly better than SOAPdenovo2 on these data, when evaluated against the finished sequence. We then show that MaSuRCA can significantly improve its assemblies when the original data are augmented with long reads.

Availability: MaSuRCA is available as open-source code at ftp://ftp.genome.umd.edu/pub/MaSuRCA/. Previous (pre-publication) releases have been publicly available for over a year.

Contact: alekseyz@ipst.umd.edu.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Citing Articles

The pan-genome of Spodoptera frugiperda provides new insights into genome evolution and horizontal gene transfer.

Huang Y, Rao H, Su B, Lv J, Lin J, Wang X Commun Biol. 2025; 8(1):407.

PMID: 40069391 PMC: 11897360. DOI: 10.1038/s42003-025-07707-7.


Advantages of Mutant Generation by Genome Rearrangements of Non-Conventional Yeast via Direct Nuclease Transfection.

Oda A, Yasukawa T, Tamura M, Sano A, Masuo N, Ohta K Genes Cells. 2025; 30(2):e70010.

PMID: 40065658 PMC: 11894362. DOI: 10.1111/gtc.70010.


Draft genome of the endangered Visayan spotted deer, a Philippine endemic species.

Javier M, Noblezada A, Sienes P, Guino-O R, Palomar-Abesamis N, Malay M GigaByte. 2025; 2025:gigabyte150.

PMID: 40041424 PMC: 11876970. DOI: 10.46471/gigabyte.150.


The genome of strain melanoliber.

Pradeep P, Kolipakala R, Nagarajan D Microbiol Resour Announc. 2025; 14(3):e0080924.

PMID: 39964164 PMC: 11895477. DOI: 10.1128/mra.00809-24.


De novo genome hybrid assembly and annotation of the endangered and euryhaline fish Aphanius iberus (Valenciennes, 1846) with identification of genes potentially involved in salinity adaptation.

Lopez-Solano A, Doadrio I, Nester T, Perea S BMC Genomics. 2025; 26(1):136.

PMID: 39939939 PMC: 11817801. DOI: 10.1186/s12864-025-11327-0.

