
MetaVelvet: an Extension of Velvet Assembler to De Novo Metagenome Assembly from Short Sequence Reads

Overview
Specialty: Biochemistry
Date: 2012 Jul 24
PMID: 22821567
Citations: 267
Abstract

An important step in metagenomic analysis is the assembly of multiple genomes from mixed sequence reads of the multiple species in a microbial community. Most conventional pipelines use a single-genome assembler with carefully optimized parameters. A limitation of single-genome assemblers for de novo metagenome assembly is that sequences of highly abundant species are likely to be misidentified as repeats within a single genome, resulting in many small, fragmented scaffolds. We extended Velvet, a single-genome assembler for short reads, to metagenome assembly of mixed short reads from multiple species, and named the result MetaVelvet. The fundamental concept is first to decompose a de Bruijn graph constructed from mixed short reads into individual sub-graphs, and then to build scaffolds from each decomposed sub-graph as if it were the genome of an isolated species. Two features, the coverage (abundance) difference and graph connectivity, are used to decompose the de Bruijn graph. On simulated datasets, MetaVelvet generated significantly higher N50 scores than any of the single-genome assemblers. MetaVelvet also reconstructed relatively low-coverage genome sequences as scaffolds. On real datasets of human gut microbial reads, MetaVelvet produced longer scaffolds and increased the number of predicted genes.
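The decomposition idea in the abstract can be illustrated with a toy sketch. This is not MetaVelvet's actual algorithm: the fixed coverage cutoff, the function names, and the two made-up reads are all assumptions chosen for illustration, and reverse complements and sequencing errors are ignored. The sketch counts k-mer coverage, splits the k-mers into a high-coverage and a low-coverage set, and then groups each set into connected components of the de Bruijn graph.

```python
from collections import Counter, defaultdict


def kmer_coverage(reads, k):
    """Count how often each k-mer occurs across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def split_by_coverage(counts, threshold):
    """Partition k-mers with a fixed cutoff (a hypothetical simplification)."""
    high = {kmer for kmer, count in counts.items() if count >= threshold}
    low = set(counts) - high
    return high, low


def connected_components(kmers):
    """Group k-mers linked by (k-1)-base overlaps, ignoring edge direction."""
    by_prefix = defaultdict(set)
    for kmer in kmers:
        by_prefix[kmer[:-1]].add(kmer)

    adjacency = defaultdict(set)
    for kmer in kmers:
        # An edge exists when the suffix of one k-mer is the prefix of another.
        for successor in by_prefix.get(kmer[1:], ()):
            adjacency[kmer].add(successor)
            adjacency[successor].add(kmer)

    seen, components = set(), []
    for start in sorted(kmers):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(adjacency[node] - seen)
        components.append(component)
    return components


# Toy data (assumed): an abundant "species" sampled 5 times and a rare one
# sampled once. The two sequences share the 4-mer TGCA, so the mixed graph
# is a single connected component.
reads = ["ACGTTGCA"] * 5 + ["TGCAGGAT"]
counts = kmer_coverage(reads, k=4)
high, low = split_by_coverage(counts, threshold=3)

mixed_components = connected_components(set(counts))
high_components = connected_components(high)
low_components = connected_components(low)
```

In this toy case the mixed graph forms one component, while the coverage split yields one high-coverage sub-graph and one low-coverage sub-graph that could each be assembled separately. The shared k-mer lands in the high-coverage set here simply because its counts add up; how a real tool treats such shared nodes is outside the scope of this sketch.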

Citing Articles

The mycobiome in human cancer: analytical challenges, molecular mechanisms, and therapeutic implications.

Ding T, Liu C, Li Z. Mol Cancer. 2025; 24(1):18.

PMID: 39815314 PMC: 11734361. DOI: 10.1186/s12943-025-02227-8.


Decontamination of DNA sequences from a Streptomyces genome for optimal genome mining.

de Oliveira R, Garrido L, Padilla G. Braz J Microbiol. 2025; 56(1):79-89.

PMID: 39812972 PMC: 11885714. DOI: 10.1007/s42770-024-01598-2.


Virseqimprover: an integrated pipeline for viral contig error correction, extension, and annotation.

Song H, Tithi S, Brown C, Aylward F, Jensen R, Zhang L. PeerJ. 2025; 13:e18515.

PMID: 39807156 PMC: 11727651. DOI: 10.7717/peerj.18515.


kMetaShot: a fast and reliable taxonomy classifier for metagenome-assembled genomes.

Defazio G, Tangaro M, Pesole G, Fosso B. Brief Bioinform. 2025; 26(1).

PMID: 39749666 PMC: 11695915. DOI: 10.1093/bib/bbae680.


Targeted protein evolution in the gut microbiome by diversity-generating retroelements.

Macadangdang B, Wang Y, Woodward C, Revilla J, Shaw B, Sasaninia K. bioRxiv. 2024.

PMID: 39605476 PMC: 11601372. DOI: 10.1101/2024.11.15.621889.

