Syntenator: Multiple Gene Order Alignments with a Gene-specific Scoring Function

Overview
Publisher Biomed Central
Date 2008 Nov 8
PMID 18990215
Citations 8
Abstract

Background: Identification of homologous regions, or conserved syntenies, across genomes is a crucial step in comparative genomics. This task is usually performed with genome alignment software such as WABA or BLASTZ. Conserved syntenies are defined as regions of conserved gene order. At the gene order level, homologous regions can be found even between distantly related genomes that do not align at the nucleotide sequence level.

Results: We present a novel approach to identify regions of conserved synteny across multiple genomes. Syntenator represents genomes and alignments thereof as partial order graphs (POGs). These POGs are aligned by a dynamic programming approach employing a gene-specific scoring function. The scoring function reflects the level of protein sequence similarity for each possible gene pair. Our method consistently defines larger homologous regions in pairwise gene order alignments than nucleotide-level comparisons do. It is also superior to methods that work on predefined homology gene sets (as implemented in Blockfinder). Syntenator successfully reproduces 80% of the EnsEMBL human-mouse conserved syntenic blocks. The full potential of our method becomes visible when comparing remotely related genomes and multiple genomes. Gene order alignments potentially resolve up to 75% of the EnsEMBL 1:many orthology relations and 27% of the many:many orthology relations.
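
The idea of aligning gene orders with a gene-specific score can be illustrated with a minimal sketch. This is not Syntenator's implementation: it assumes each genome is a simple linear chain of genes (the degenerate case of a partial order graph), uses a Smith-Waterman-style local alignment, and takes the per-pair similarity scores as given. The function name `align_gene_orders`, the gene identifiers, the score values, and the gap and mismatch penalties are all hypothetical.

```python
from typing import Dict, List, Tuple

def align_gene_orders(
    a: List[str],
    b: List[str],
    sim: Dict[Tuple[str, str], float],
    gap: float = -1.0,
    mismatch: float = -1.0,
) -> Tuple[float, List[Tuple[str, str]]]:
    """Local alignment of two linear gene orders (illustrative sketch only).

    sim maps a gene pair to a similarity score (for example, derived from
    protein sequence similarity); pairs missing from sim get `mismatch`.
    """
    n, m = len(a), len(b)
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0.0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = sim.get((a[i - 1], b[j - 1]), mismatch)
            H[i][j] = max(
                0.0,                    # start a new local alignment
                H[i - 1][j - 1] + s,    # align the two genes
                H[i - 1][j] + gap,      # skip a gene in a
                H[i][j - 1] + gap,      # skip a gene in b
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell to recover the aligned gene pairs.
    pairs: List[Tuple[str, str]] = []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        s = sim.get((a[i - 1], b[j - 1]), mismatch)
        if H[i][j] == H[i - 1][j - 1] + s:
            pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best, pairs

# Made-up example: gene m9 is inserted in the second genome.
genome_a = ["h1", "h2", "h3", "h4"]
genome_b = ["m1", "m9", "m2", "m3", "m4"]
scores = {("h1", "m1"): 2.0, ("h2", "m2"): 1.5,
          ("h3", "m3"): 2.0, ("h4", "m4"): 1.0}
score, pairs = align_gene_orders(genome_a, genome_b, scores)
print(score, pairs)
```

Because the score is looked up per gene pair rather than drawn from a fixed match/mismatch scheme, strongly similar pairs can carry an alignment across an inserted or weakly conserved gene, as `m9` does in this toy input. Extending the recurrence from linear chains to partial order graphs would require taking the maximum over all predecessor nodes instead of only `i - 1` and `j - 1`.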

Conclusion: We propose Syntenator as a software solution to reliably infer conserved syntenies among distantly related genomes. The software is available from http://www2.tuebingen.mpg.de/abt4/plone.

Citing Articles

Finding approximate gene clusters with Gecko 3.

Winter S, Jahn K, Wehner S, Kuchenbecker L, Marz M, Stoye J Nucleic Acids Res. 2016; 44(20):9600-9610.

PMID: 27679480 PMC: 5175365. DOI: 10.1093/nar/gkw843.


PhylDiag: identifying complex synteny blocks that include tandem duplications using phylogenetic gene trees.

Lucas J, Muffato M, Crollius H BMC Bioinformatics. 2014; 15:268.

PMID: 25103980 PMC: 4155083. DOI: 10.1186/1471-2105-15-268.


Kerfuffle: a web tool for multi-species gene colocalization analysis.

Aboukhalil R, Fendler B, Atwal G BMC Bioinformatics. 2013; 14:22.

PMID: 23327649 PMC: 3598493. DOI: 10.1186/1471-2105-14-22.


Controversies in modern evolutionary biology: the imperative for error detection and quality control.

Prosdocimi F, Linard B, Pontarotti P, Poch O, Thompson J BMC Genomics. 2012; 13:5.

PMID: 22217008 PMC: 3311146. DOI: 10.1186/1471-2164-13-5.


i-ADHoRe 3.0--fast and sensitive detection of genomic homology in extremely large data sets.

Proost S, Fostier J, De Witte D, Dhoedt B, Demeester P, Van de Peer Y Nucleic Acids Res. 2011; 40(2):e11.

PMID: 22102584 PMC: 3258164. DOI: 10.1093/nar/gkr955.

