
De Novo Repeat Classification and Fragment Assembly

Overview
Journal: Genome Res
Specialty: Genetics
Date: 2004 Sep 3
PMID: 15342561
Citations: 81
Abstract

Repetitive sequences make up a significant fraction of almost any genome, and an important and still open question in bioinformatics is how to represent all repeats in DNA sequences. We propose a new approach to repeat classification that represents all repeats in a genome as a mosaic of sub-repeats. Our key algorithmic idea also leads to new approaches to multiple alignment and fragment assembly. In particular, we show that our FragmentGluer assembler improves on Phrap and ARACHNE in assembly of BACs and bacterial genomes.
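
As a toy illustration of the abstract's notion of repeats built from shared sub-repeats, the sketch below marks stretches of a sequence whose k-mers occur more than once. It is not the paper's algorithm and not FragmentGluer; the function names (`kmer_positions`, `repeat_segments`) and the use of exact k-mer matching are assumptions made here for illustration only.

```python
from collections import defaultdict
from itertools import groupby


def kmer_positions(seq, k):
    """Map each k-mer to the list of start positions where it occurs."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return positions


def repeat_segments(seq, k):
    """Return (start, end, multiplicity) for runs of repeated k-mers.

    A run is a maximal stretch of consecutive k-mer start positions that
    share the same multiplicity (number of occurrences in seq), kept only
    when that multiplicity is greater than 1. `end` is exclusive.
    Hypothetical helper: exact matching only, no sequencing errors handled.
    """
    positions = kmer_positions(seq, k)
    mult = [len(positions[seq[i:i + k]]) for i in range(len(seq) - k + 1)]
    segments = []
    for m, group in groupby(enumerate(mult), key=lambda pair: pair[1]):
        starts = [index for index, _ in group]
        if m > 1:
            # The last k-mer of the run covers k bases past its start.
            segments.append((starts[0], starts[-1] + k, m))
    return segments


if __name__ == "__main__":
    print(repeat_segments("AAACCCGGGTTTAAACCC", 4))
```

On the sequence `AAACCCGGGTTTAAACCC` with `k = 4`, the two copies of `AAACCC` are reported as segments with multiplicity 2. Neighbouring segments with different multiplicities would hint at sub-repeats shared by different numbers of repeat copies.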

Citing Articles

GenomeDecoder: inferring segmental duplications in highly repetitive genomic regions.

Zhang Z, Gupta I, Pevzner P. Bioinformatics. 2025; 41(2).

PMID: 39908455 PMC: 11842051. DOI: 10.1093/bioinformatics/btaf058.


Maximum-scoring path sets on pangenome graphs of constant treewidth.

Brejova B, Gagie T, Herencsarova E, Vinar T. Front Bioinform. 2024; 4:1391086.

PMID: 39011297 PMC: 11246863. DOI: 10.3389/fbinf.2024.1391086.


KaMRaT: a C++ toolkit for k-mer count matrix dimension reduction.

Xue H, Gallopin M, Marchet C, Nguyen H, Wang Y, Laine A. Bioinformatics. 2024; 40(3).

PMID: 38444086 PMC: 10942800. DOI: 10.1093/bioinformatics/btae090.


Phables: from fragmented assemblies to high-quality bacteriophage genomes.

Mallawaarachchi V, Roach M, Decewicz P, Papudeshi B, Giles S, Grigson S. Bioinformatics. 2023; 39(10).

PMID: 37738590 PMC: 10563150. DOI: 10.1093/bioinformatics/btad586.


Fully automated annotation of mitochondrial genomes using a cluster-based approach with de Bruijn graphs.

Fiedler L, Middendorf M, Bernt M. Front Genet. 2023; 14:1250907.

PMID: 37636259 PMC: 10448254. DOI: 10.3389/fgene.2023.1250907.

