Syntenator: Multiple Gene Order Alignments with a Gene-specific Scoring Function
Background: Identifying homologous regions, or conserved syntenies, across genomes is a crucial step in comparative genomics. This task is usually performed by genome alignment software such as WABA or blastz. In the case of conserved syntenies, such regions are defined as conserved gene orders. At the gene order level, homologous regions can be found even between distantly related genomes that do not align at the nucleotide sequence level.
Results: We present a novel approach to identify regions of conserved synteny across multiple genomes. Syntenator represents genomes and alignments thereof as partial order graphs (POGs). These POGs are aligned by a dynamic programming approach employing a gene-specific scoring function. The scoring function reflects the level of protein sequence similarity for each possible gene pair. Our method consistently defines larger homologous regions in pairwise gene order alignments than nucleotide-level comparisons. Our method is superior to methods that work on predefined homology gene sets (as implemented in Blockfinder). Syntenator successfully reproduces 80% of the EnsEMBL man-mouse conserved syntenic blocks. The full potential of our method becomes visible by comparing remotely related genomes and multiple genomes. Gene order alignments potentially resolve up to 75% of the EnsEMBL 1:many orthology relations and 27% of the many:many orthology relations.
Conclusion: We propose Syntenator as a software solution to reliably infer conserved syntenies among distantly related genomes. The software is available from http://www2.tuebingen.mpg.de/abt4/plone.
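The Results describe aligning gene orders by dynamic programming with a scoring function that depends on the specific gene pair. The following is a minimal sketch of that idea only, not Syntenator's implementation: it performs a Smith-Waterman-style local alignment of two linear gene lists, whereas Syntenator aligns partial order graphs. The function name `align_gene_orders`, the gene identifiers, the similarity values, and the `gap` and `mismatch` penalties are all hypothetical and chosen for illustration.

```python
import math
from typing import Dict, List, Tuple


def align_gene_orders(
    a: List[str],
    b: List[str],
    sim: Dict[Tuple[str, str], float],
    gap: float = -1.0,
    mismatch: float = -1.0,
) -> Tuple[float, List[Tuple[str, str]]]:
    """Local alignment of two gene orders (simplified, linear case).

    `sim` maps a gene pair to a similarity score (assumed here to stand in
    for protein sequence similarity); pairs absent from `sim` receive the
    `mismatch` penalty. All parameter values are illustrative assumptions.
    """
    n, m = len(a), len(b)
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0.0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = sim.get((a[i - 1], b[j - 1]), mismatch)
            H[i][j] = max(
                0.0,
                H[i - 1][j - 1] + s,  # pair gene a[i-1] with gene b[j-1]
                H[i - 1][j] + gap,    # skip a gene in the first genome
                H[i][j - 1] + gap,    # skip a gene in the second genome
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell to recover the aligned gene pairs.
    pairs: List[Tuple[str, str]] = []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        s = sim.get((a[i - 1], b[j - 1]), mismatch)
        if math.isclose(H[i][j], H[i - 1][j - 1] + s):
            pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif math.isclose(H[i][j], H[i - 1][j] + gap):
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best, pairs


# Hypothetical toy data: gene identifiers and scores are made up.
genome_1 = ["hA", "hB", "hC", "hD"]
genome_2 = ["mX", "mA", "mB", "mD"]
similarity = {("hA", "mA"): 2.0, ("hB", "mB"): 1.5, ("hD", "mD"): 2.0}

score, block = align_gene_orders(genome_1, genome_2, similarity)
print(score, block)
```

On this toy input the sketch reports a score of 4.5 and the block `[("hA", "mA"), ("hB", "mB"), ("hD", "mD")]`: the gene `hC`, which has no similar partner, is skipped at the cost of one gap penalty, so the conserved gene order is still recovered as a single block.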