
Perm-seq: Mapping Protein-DNA Interactions in Segmental Duplication and Highly Repetitive Regions of Genomes with Prior-Enhanced Read Mapping

Overview
Specialty Biology
Date 2015 Oct 21
PMID 26484757
Citations 8
Abstract

Segmental duplications and other highly repetitive regions of genomes contribute significantly to cells' regulatory programs. Advances in next-generation sequencing have enabled genome-wide profiling of protein-DNA interactions by chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq). However, interactions in highly repetitive regions have proven difficult to map because short reads of 50-100 base pairs (bp) from these regions map to multiple locations in reference genomes. Standard analytical methods discard such multi-mapping reads, and the few that can accommodate them are prone to high false positive and false negative rates. We developed Perm-seq, a prior-enhanced read allocation method for ChIP-seq experiments that can allocate multi-mapping reads in highly repetitive regions of genomes with high accuracy. We comprehensively evaluated Perm-seq and found that our prior-enhanced approach significantly improves multi-read allocation accuracy over approaches that do not utilize additional data types. The statistical formalism underlying our approach facilitates supervising multi-read allocation with a variety of data sources, including histone ChIP-seq. We applied Perm-seq to 64 ENCODE ChIP-seq datasets from GM12878 and K562 cells and identified many novel protein-DNA interactions in segmental duplication regions. Our analysis reveals that although protein-DNA interaction sites are evolutionarily less conserved in repetitive regions, they share the overall sequence characteristics of protein-DNA interactions in non-repetitive regions.
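To make the idea of prior-enhanced multi-read allocation concrete, the following is a minimal sketch and not the Perm-seq algorithm itself, whose model is not specified in this abstract. It assumes a simple EM-style scheme in which each multi-mapping read is split fractionally among its candidate loci in proportion to local coverage multiplied by a per-locus prior weight. The function name `allocate_multireads`, the `prior_weight` input (imagined here as a signal derived from an auxiliary data source), and the pseudocount are all hypothetical choices made for illustration.

```python
def allocate_multireads(read_to_loci, uniq_counts, prior_weight,
                        n_iter=50, pseudo=1.0):
    """Fractionally allocate multi-mapping reads among candidate loci.

    read_to_loci : dict, read id -> list of distinct candidate loci
    uniq_counts  : dict, locus -> number of uniquely mapping reads
    prior_weight : dict, locus -> positive weight from auxiliary data
                   (hypothetical input; loci not listed default to 1.0)
    Returns a dict, read id -> {locus: allocation probability}.
    """
    loci = {l for cands in read_to_loci.values() for l in cands}

    # Start from a uniform split of every multi-read over its candidates.
    alloc = {r: {l: 1.0 / len(cands) for l in cands}
             for r, cands in read_to_loci.items()}

    for _ in range(n_iter):
        # Expected coverage per locus: unique reads, a pseudocount,
        # and the current fractional allocations of multi-reads.
        cov = {l: uniq_counts.get(l, 0) + pseudo for l in loci}
        for fractions in alloc.values():
            for l, p in fractions.items():
                cov[l] += p

        # Re-allocate each read in proportion to coverage times prior.
        for r, cands in read_to_loci.items():
            scores = {l: cov[l] * prior_weight.get(l, 1.0) for l in cands}
            total = sum(scores.values())
            alloc[r] = {l: s / total for l, s in scores.items()}

    return alloc


# Toy usage: one read maps equally well to two duplicated loci.
reads = {"read1": ["locusA", "locusB"]}
unique = {"locusA": 5, "locusB": 5}
with_prior = allocate_multireads(reads, unique,
                                 {"locusA": 4.0, "locusB": 1.0})
no_prior = allocate_multireads(reads, unique, {})
```

In this toy case the two loci have identical unique-read support, so without a prior the read is split evenly. With the prior favoring `locusA`, most of the read's weight moves there, which is the kind of tie-breaking that additional data types are meant to provide.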

Citing Articles

Accurate allocation of multimapped reads enables regulatory element analysis at repeats.

Morrissey A, Shi J, James D, Mahony S Genome Res. 2024; 34(6):937-951.

PMID: 38986578 PMC: 11293539. DOI: 10.1101/gr.278638.123.


Taming transposable elements in livestock and poultry: a review of their roles and applications.

Zhao P, Peng C, Fang L, Wang Z, Liu G Genet Sel Evol. 2023; 55(1):50.

PMID: 37479995 PMC: 10362595. DOI: 10.1186/s12711-023-00821-2.


Segmentation and genome annotation algorithms for identifying chromatin state and other genomic patterns.

Libbrecht M, Chan R, Hoffman M PLoS Comput Biol. 2021; 17(10):e1009423.

PMID: 34648491 PMC: 8516206. DOI: 10.1371/journal.pcbi.1009423.


Sequence deeper without sequencing more: Bayesian resolution of ambiguously mapped reads.

Shah R, Ruthenburg A PLoS Comput Biol. 2021; 17(4):e1008926.

PMID: 33872311 PMC: 8084338. DOI: 10.1371/journal.pcbi.1008926.


Mobile genomics: tools and techniques for tackling transposons.

O'Neill K, Brocks D, Gale Hammell M Philos Trans R Soc Lond B Biol Sci. 2020; 375(1795):20190345.

PMID: 32075565 PMC: 7061981. DOI: 10.1098/rstb.2019.0345.

