
Pebble and Rock Band: Heuristic Resolution of Repeats and Scaffolding in the Velvet Short-read De Novo Assembler

Overview
Journal: PLoS One
Date: 2009 Dec 23
PMID: 20027311
Citations: 117
Abstract

Background: Despite the short length of their reads, micro-read sequencing technologies have shown their usefulness for de novo sequencing. However, especially in eukaryotic genomes, complex repeat patterns are an obstacle to large assemblies.

Principal Findings: We present a novel heuristic algorithm, Pebble, which uses paired-end read information to resolve repeats and scaffold contigs to produce large-scale assemblies. In simulations, we can achieve weighted median scaffold lengths (N50) of above 1 Mbp in bacteria and above 100 kbp in more complex organisms. Using real datasets, we obtained a 96 kbp N50 in Pseudomonas syringae and a unique 147 kbp scaffold of a ferret BAC clone. We also present an efficient algorithm called Rock Band for the resolution of repeats in the case of mixed-length assemblies, where different sequencing platforms are combined to obtain a cost-effective assembly.

Conclusions: These algorithms extend the utility of short-read-only assemblies into large, complex genomes. They have been implemented and made available within the open-source Velvet short-read de novo assembler.
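
The abstract reports results as N50, the weighted median scaffold length. As a minimal sketch, assuming the common definition (the largest length L such that scaffolds of length L or more cover at least half of the total assembled bases), it can be computed as below. The function name `n50` and the example lengths are hypothetical and are not taken from Velvet or from the paper's data.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths.

    Assumed definition: sort lengths from longest to shortest and return
    the length at which the running sum first reaches half of the total.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50() requires at least one length")

    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length

    # Unreachable for non-empty input: the full sum always reaches the total.
    raise AssertionError("running sum never reached half of the total")


if __name__ == "__main__":
    # Hypothetical scaffold lengths in kbp, for illustration only.
    example = [80, 70, 50, 40, 30, 20, 10]
    print(n50(example))
```

Because N50 is weighted by length, a few long scaffolds dominate it: many short contigs added to an assembly change the plain median far more than they change the N50.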

Citing Articles

Comparative genomic analysis and optimization of astaxanthin production of Rhodotorula paludigena TL35-5 and Rhodotorula sampaioana PL61-2.

Hoondee P, Phuengjayaem S, Kingkaew E, Rojsitthisak P, Sritularak B, Thompho S PLoS One. 2024; 19(7):e0304699.

PMID: 38995888 PMC: 11244826. DOI: 10.1371/journal.pone.0304699.


Genomic and biological insights of bacteriophages JNUWH1 and JNUWD in the arms race against bacterial resistance.

Zhang H, You J, Pan X, Hu Y, Zhang Z, Zhang X Front Microbiol. 2024; 15:1407039.

PMID: 38989022 PMC: 11233448. DOI: 10.3389/fmicb.2024.1407039.


Isolation and identification of specific phage C-3 and G21-7 against Avian pathogenic and its application to one-day-old geese.

Wang T, Zhang L, Zhang Y, Tong P, Ma W, Wang Y Front Microbiol. 2024; 15:1385860.

PMID: 38962142 PMC: 11221357. DOI: 10.3389/fmicb.2024.1385860.


Applications of de Bruijn graphs in microbiome research.

Dufault-Thompson K, Jiang X Imeta. 2024; 1(1):e4.

PMID: 38867733 PMC: 10989854. DOI: 10.1002/imt2.4.


Draft genome sequence and morphological data of PLACP1, a thermophilic chloramphenicol-resistant bacterium isolated from thermophilic sludge.

Tseng H, Matsutani M, Fujimoto N, Ohnishi A Data Brief. 2024; 54:110447.

PMID: 38708301 PMC: 11068547. DOI: 10.1016/j.dib.2024.110447.

