
Noise-cancelling Repeat Finder: Uncovering Tandem Repeats in Error-prone Long-read Sequencing Data

Overview
Journal Bioinformatics
Specialty Biology
Date 2019 Jul 11
PMID 31290946
Citations 29
Abstract

Summary: Tandem DNA repeats can be sequenced with long-read technologies, but cannot be accurately deciphered due to the lack of computational tools taking high error rates of these technologies into account. Here we introduce Noise-Cancelling Repeat Finder (NCRF) to uncover putative tandem repeats of specified motifs in noisy long reads produced by Pacific Biosciences and Oxford Nanopore sequencers. Using simulations, we validated the use of NCRF to locate tandem repeats with motifs of various lengths and demonstrated its superior performance as compared to two alternative tools. Using real human whole-genome sequencing data, NCRF identified long arrays of the (AATGG)n repeat involved in heat shock stress response.
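To make the idea of locating a specified motif in a noisy read concrete, the following is a minimal Python sketch. It is an illustration only and not NCRF's implementation: it scores a read segment by the fewest substitutions, insertions and deletions needed to turn it into a tandem repeat of the motif, using plain edit-distance dynamic programming. The function names `edit_distance_to_prefix` and `distance_to_tandem_repeat`, and the example reads, are hypothetical and chosen for this sketch.

```python
def edit_distance_to_prefix(read: str, ref: str) -> int:
    """Minimum edit distance between `read` and any prefix of `ref`."""
    # Row for the empty read: matching ref[:j] costs j missing bases.
    prev = list(range(len(ref) + 1))
    for i, base in enumerate(read, start=1):
        curr = [i]  # read[:i] against an empty reference prefix
        for j, ref_base in enumerate(ref, start=1):
            curr.append(min(
                prev[j - 1] + (base != ref_base),  # match or substitution
                prev[j] + 1,                       # extra base in the read
                curr[j - 1] + 1,                   # base missing from the read
            ))
        prev = curr
    # The whole read is consumed; the reference prefix may end anywhere.
    return min(prev)


def distance_to_tandem_repeat(read: str, motif: str) -> int:
    """Fewest edits turning `read` into a tandem repeat of `motif`, in any phase."""
    # A reference longer than twice the read is always long enough,
    # because the distance can never exceed the read length.
    copies = 2 * len(read) // len(motif) + 2
    best = len(read)
    for shift in range(len(motif)):
        rotated = motif[shift:] + motif[:shift]  # try every starting phase
        best = min(best, edit_distance_to_prefix(read, rotated * copies))
    return best


if __name__ == "__main__":
    motif = "AATGG"
    perfect = "AATGG" * 4
    # One substitution in the second copy and one deleted base in the last copy.
    noisy = "AATGG" + "AGTGG" + "AATGG" + "ATGG"
    print(distance_to_tandem_repeat(perfect, motif))  # 0
    print(distance_to_tandem_repeat(noisy, motif))    # 2
```

Dividing the returned distance by the read length gives a rough per-base error estimate for a candidate segment. A practical tool has to scan whole reads for such segments and use an error model suited to each sequencing platform, which this toy example does not attempt.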

Availability And Implementation: NCRF is implemented in C, supported by several Python scripts, and is available in bioconda and at https://github.com/makovalab-psu/NoiseCancellingRepeatFinder.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Citing Articles

Navigating triplet repeats sequencing: concepts, methodological challenges and perspective for Huntington's disease.

Maestri S, Scalzo D, Damaggio G, Zobel M, Besusso D, Cattaneo E Nucleic Acids Res. 2024; 53(1.

PMID: 39676657 PMC: 11724279. DOI: 10.1093/nar/gkae1155.


De novo assembly and characterization of a highly degenerated ZW sex chromosome in the fish Megaleporinus macrocephalus.

Souza-Borges C, Utsunomia R, Varani A, Uliano-Silva M, Lira L, Butzge A Gigascience. 2024; 13.

PMID: 39589439 PMC: 11590113. DOI: 10.1093/gigascience/giae085.


High resolution long-read telomere sequencing reveals dynamic mechanisms in aging and cancer.

Schmidt T, Tyer C, Rughani P, Haggblom C, Jones J, Dai X Nat Commun. 2024; 15(1):5149.

PMID: 38890299 PMC: 11189484. DOI: 10.1038/s41467-024-48917-7.


Evolution of ancient satellite DNAs in extant alligators and caimans (Crocodylia, Reptilia).

Sales-Oliveira V, Dos Santos R, Goes C, Calegari R, Garrido-Ramos M, Altmanova M BMC Biol. 2024; 22(1):47.

PMID: 38413947 PMC: 10900743. DOI: 10.1186/s12915-024-01847-8.


Mdwgan-gp: data augmentation for gene expression data based on multiple discriminator WGAN-GP.

Li R, Wu J, Li G, Liu J, Xuan J, Zhu Q BMC Bioinformatics. 2023; 24(1):427.

PMID: 37957576 PMC: 10644641. DOI: 10.1186/s12859-023-05558-9.

