
PCAP: a Whole-genome Assembly Program

Overview
Journal: Genome Res
Specialty: Genetics
Date: 2003 Sep 4
PMID: 12952883
Citations: 109
Abstract

We describe a whole-genome assembly program named PCAP for processing tens of millions of reads. The PCAP program has several features to address efficiency and accuracy issues in assembly. Multiple processors are used to perform most time-consuming computations in assembly. A more sensitive method is used to avoid missing overlaps caused by sequencing errors. Repetitive regions of reads are detected on the basis of many overlaps with other reads, instead of many shorter word matches with other reads. Contaminated end regions of reads are identified and removed. Generation of a consensus sequence for a contig is based on an alignment of reads in the contig, in which both base quality values and coverage information are used to determine every consensus base. The PCAP program was tested on a mouse whole-genome data set of 30 million reads and a human Chromosome 20 data set of 1.7 million reads. The program is freely available for academic use.
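The abstract says each consensus base is determined from both base quality values and coverage, but it does not give the scoring rule. The following is a minimal sketch of one common approach (quality-weighted voting per alignment column), not PCAP's actual method. The names `consensus_base` and `consensus`, the `(base, quality)` column layout, and the use of `N` for uncovered columns are all assumptions made for illustration.

```python
from collections import defaultdict

GAP = "-"  # assumed symbol for an alignment gap


def consensus_base(column):
    """Pick a consensus base for one alignment column.

    `column` is a list of (base, quality) pairs, one per read covering
    the position. Summing qualities per base means coverage counts too:
    several low-quality reads can outvote one high-quality read.
    Returns (base, score); score is the winner's summed quality minus
    the summed quality of all disagreeing reads, floored at 0.
    """
    if not column:
        return "N", 0  # no coverage: base is unknown (assumed convention)
    weights = defaultdict(int)
    for base, qual in column:
        weights[base.upper()] += qual
    # Iterate in sorted order so ties are broken deterministically.
    best = max(sorted(weights), key=lambda b: weights[b])
    others = sum(w for b, w in weights.items() if b != best)
    return best, max(weights[best] - others, 0)


def consensus(columns):
    """Build a consensus string, dropping columns where a gap wins."""
    bases = []
    for column in columns:
        base, _ = consensus_base(column)
        if base != GAP:
            bases.append(base)
    return "".join(bases)


if __name__ == "__main__":
    columns = [
        [("A", 30), ("A", 20), ("C", 40)],  # two A reads outvote one C
        [("G", 10)],                        # single read
        [("-", 30), ("T", 20)],             # gap wins, column dropped
        [],                                 # no coverage
    ]
    print(consensus(columns))
```

In the first column the single C read has the highest individual quality, yet the two A reads win on summed quality, which is how coverage enters the decision in this sketch.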

Citing Articles

Accurate assembly of full-length consensus for viral quasispecies.

Tian J, Gao Z, Li M, Bao E, Zhao J. BMC Bioinformatics. 2025; 26(1):36.

PMID: 39893441 PMC: 11787740. DOI: 10.1186/s12859-025-06045-z.


Identification and functional analyses of drought stress resistance genes by transcriptomics of the Mongolian grassland plant Chloris virgata.

Namuunaa G, Bujin B, Yamagami A, Bolortuya B, Kawabata S, Ogawa H. BMC Plant Biol. 2025; 25(1):44.

PMID: 39794690 PMC: 11724609. DOI: 10.1186/s12870-025-06046-3.


Phylomitogenomics bolsters the high-level classification of Demospongiae (phylum Porifera).

Lavrov D, Diaz M, Maldonado M, Morrow C, Perez T, Pomponi S. PLoS One. 2023; 18(12):e0287281.

PMID: 38048310 PMC: 10695373. DOI: 10.1371/journal.pone.0287281.


Lightweight Pattern Matching Method for DNA Sequencing in Internet of Medical Things.

Rexie J, Raimond K, Murugaaboopathy M, Brindha D, Mulugeta H. Comput Intell Neurosci. 2022; 2022:6980335.

PMID: 36120669 PMC: 9477578. DOI: 10.1155/2022/6980335.


Transcriptome Analysis of Chloris virgata, Which Shows the Fastest Germination and Growth in the Major Mongolian Grassland Plant.

Bolortuya B, Kawabata S, Yamagami A, Davaapurev B, Takahashi F, Inoue K. Front Plant Sci. 2021; 12:684987.

PMID: 34262584 PMC: 8275185. DOI: 10.3389/fpls.2021.684987.

