
A Gibbs Sampling Strategy Applied to the Mapping of Ambiguous Short-sequence Tags

Overview
Journal Bioinformatics
Specialty Biology
Date 2010 Sep 28
PMID 20871106
Citations 27
Abstract

Motivation: Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is widely used in biological research. ChIP-seq experiments yield many ambiguous tags that can be mapped with equal probability to multiple genomic sites. Such ambiguous tags are typically eliminated from consideration, resulting in a potential loss of important biological information.

Results: We have developed a Gibbs sampling-based algorithm for the genomic mapping of ambiguous sequence tags. Our algorithm relies on the local genomic tag context to guide the mapping of ambiguous tags. The Gibbs sampling procedure we use simultaneously maps ambiguous tags and updates the probabilities used to infer correct tag map positions. We show that our algorithm is able to correctly map more ambiguous tags than existing mapping methods. Our approach is also able to uncover mapped genomic sites from highly repetitive sequences that cannot be detected based on unique tags alone, including transposable elements, segmental duplications and peri-centromeric regions. This mapping approach should prove useful for increasing biological knowledge of the too-often-neglected repetitive genomic regions.
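
The abstract does not spell out the probability model, so the following Python sketch is only an illustration of the general idea, not the published implementation. It assumes a single integer coordinate system, a fixed window that defines the "local tag context", and a pseudocount so that sites without nearby tags keep a nonzero probability. The names gibbs_map, local_count, window, burn_in and pseudocount are hypothetical. Each ambiguous tag is held out in turn, its candidate sites are weighted by the number of tags currently mapped nearby, and a new site is sampled from those weights, so that the mapping and the weights are updated together.

```python
import random
from collections import Counter


def local_count(counts, site, window):
    """Number of currently mapped tags within +/- window of a site."""
    return sum(counts.get(p, 0) for p in range(site - window, site + window + 1))


def gibbs_map(unique_counts, ambiguous, window=50, n_iter=200, burn_in=50,
              pseudocount=1.0, seed=0):
    """Assign each ambiguous tag to one of its candidate sites.

    unique_counts: dict of position -> number of uniquely mapped tags.
    ambiguous: list of candidate-position lists, one list per ambiguous tag.
    Returns the most frequently sampled site for each tag after burn-in.
    All modelling choices here are assumptions made for illustration.
    """
    rng = random.Random(seed)
    counts = Counter(unique_counts)

    # Start from a random assignment of every ambiguous tag.
    current = [rng.choice(sites) for sites in ambiguous]
    for site in current:
        counts[site] += 1

    tallies = [Counter() for _ in ambiguous]
    for it in range(n_iter):
        for i, sites in enumerate(ambiguous):
            # Hold the tag out so it does not vote for its own position.
            counts[current[i]] -= 1
            # Weight each candidate by its local tag context.
            weights = [pseudocount + local_count(counts, s, window) for s in sites]
            current[i] = rng.choices(sites, weights=weights, k=1)[0]
            counts[current[i]] += 1
            # Record samples only after the burn-in period.
            if it >= burn_in:
                tallies[i][current[i]] += 1

    return [t.most_common(1)[0][0] for t in tallies]


if __name__ == "__main__":
    # Toy data: unique tags cluster near position 1000; nothing near 5005 or 9000.
    unique = {1000: 20, 1010: 15}
    tags = [[1005, 5005], [1008, 9000]]
    print(gibbs_map(unique, tags))
```

In this toy input, both ambiguous tags have one candidate site inside the cluster of unique tags and one in an empty region, so the sampler should settle on the sites near position 1000. A real application would also need to handle chromosomes, strands and a principled choice of window size, none of which is modelled here.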

Availability: http://esbg.gatech.edu/jordan/software/map

Contact: king.jordan@biology.gatech.edu

Supplementary Information: Supplementary data are available at Bioinformatics online.

Citing Articles

Human Endogenous Retroviruses in Glioblastoma Multiforme.

Yuan Z, Yang Y, Zhang N, Soto C, Jiang X, An Z. Microorganisms. 2021; 9(4).

PMID: 33917421 PMC: 8067472. DOI: 10.3390/microorganisms9040764.


Computational Modeling and Analysis to Predict Intracellular Parasite Epitope Characteristics Using Random Forest Technique.

Javadi A, Khamesipour A, Monajemi F, Ghazisaeedi M. Iran J Public Health. 2020; 49(1):125-133.

PMID: 32309231 PMC: 7152625.


Mobile genomics: tools and techniques for tackling transposons.

O'Neill K, Brocks D, Gale Hammell M. Philos Trans R Soc Lond B Biol Sci. 2020; 375(1795):20190345.

PMID: 32075565 PMC: 7061981. DOI: 10.1098/rstb.2019.0345.


Rapid, Paralog-Sensitive CNV Analysis of 2457 Human Genomes Using QuicK-mer2.

Shen F, Kidd J. Genes (Basel). 2020; 11(2).

PMID: 32013076 PMC: 7073954. DOI: 10.3390/genes11020141.


Is it time to change the reference genome?

Ballouz S, Dobin A, Gillis J. Genome Biol. 2019; 20(1):159.

PMID: 31399121 PMC: 6688217. DOI: 10.1186/s13059-019-1774-4.

