
SGP-1: Prediction and Validation of Homologous Genes Based on Sequence Alignments

Overview
Journal Genome Res
Specialty Genetics
Date 2001 Sep 7
PMID 11544202
Citations 37
Abstract

Conventional methods of gene prediction rely on the recognition of DNA-sequence signals, the coding potential, or the comparison of a genomic sequence with a cDNA, EST, or protein database. Their accuracy is limited in many circumstances by species-specific training and by the incompleteness of reference databases. Lately, comparative genome analysis has attracted increasing attention, and several analysis tools based on human/mouse comparisons are already available. Here, we present a program for the prediction of protein-coding genes, termed SGP-1 (Syntenic Gene Prediction), which is based on the similarity of homologous genomic sequences. In contrast to most existing tools, the accuracy of SGP-1 depends little on species-specific properties such as codon usage or the nucleotide distribution. SGP-1 may therefore be applied to nonstandard model organisms in vertebrates as well as in plants, without the need for extensive parameter training. In addition to predicting genes in large-scale genomic sequences, the program may be useful for validating gene structure annotations from databases. To this end, SGP-1 output also contains comparisons between predicted and annotated gene structures in HTML format. The program can be accessed via a Web server at http://soft.ice.mpg.de/sgp-1. The source code, written in ANSI C, is available on request from the authors.
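The abstract mentions comparing predicted gene structures with annotated ones. How SGP-1 performs that comparison is not described here, so the following is only a minimal sketch of one common way to do it: an exon-level comparison of coordinate pairs. The function names, the `(start, end)` exon representation, and the example coordinates are all hypothetical and are not taken from SGP-1.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: an exon is a (start, end) pair, inclusive.
Exon = Tuple[int, int]


def overlaps(a: Exon, b: Exon) -> bool:
    """Return True if the two inclusive intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def compare_exons(predicted: List[Exon], annotated: List[Exon]) -> Dict[str, object]:
    """Classify exons as exact matches, partial matches, or missed.

    This is an illustrative exon-level comparison, not SGP-1's own method.
    """
    pred, annot = set(predicted), set(annotated)
    # Exons whose both boundaries agree with the annotation.
    exact = pred & annot
    # Predicted exons that overlap an annotated exon without matching exactly.
    partial = sorted(
        p for p in pred - exact if any(overlaps(p, a) for a in annot)
    )
    # Annotated exons that no predicted exon overlaps at all.
    missed = sorted(a for a in annot if not any(overlaps(a, p) for p in pred))
    return {
        "exact": sorted(exact),
        "partial": partial,
        "missed": missed,
        # Fraction of annotated exons predicted exactly.
        "sensitivity": len(exact) / len(annot) if annot else 0.0,
        # Fraction of predicted exons that are exactly correct.
        "precision": len(exact) / len(pred) if pred else 0.0,
    }


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    predicted = [(100, 200), (300, 400), (500, 650)]
    annotated = [(100, 200), (300, 420), (800, 900)]
    for key, value in compare_exons(predicted, annotated).items():
        print(key, value)
```

Treating only boundary-exact exons as correct is a strict criterion; reporting partial overlaps separately makes it easier to see where a prediction and an annotation disagree on a splice site rather than on the presence of an exon.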

Citing Articles

Comparative complete chloroplast genome of Geum japonicum: evolution and phylogenetic analysis.

Xie J, Miao Y, Zhang X, Zhang G, Guo B, Luo G. J Plant Res. 2023; 137(1):37-48.

PMID: 37917204 DOI: 10.1007/s10265-023-01502-3.


Integrated entropy-based approach for analyzing exons and introns in DNA sequences.

Li J, Zhang L, Li H, Ping Y, Xu Q, Wang R. BMC Bioinformatics. 2019; 20(Suppl 8):283.

PMID: 31182012 PMC: 6557737. DOI: 10.1186/s12859-019-2772-y.


Whole-Genome Alignment and Comparative Annotation.

Armstrong J, Fiddes I, Diekhans M, Paten B. Annu Rev Anim Biosci. 2018; 7:41-64.

PMID: 30379572 PMC: 6450745. DOI: 10.1146/annurev-animal-020518-115005.


OMGene: mutual improvement of gene models through optimisation of evolutionary conservation.

Dunne M, Kelly S. BMC Genomics. 2018; 19(1):307.

PMID: 29703150 PMC: 5923031. DOI: 10.1186/s12864-018-4704-z.


An optimized approach for annotation of large eukaryotic genomic sequences using genetic algorithm.

Chowdhury B, Garai A, Garai G. BMC Bioinformatics. 2017; 18(1):460.

PMID: 29065853 PMC: 5655831. DOI: 10.1186/s12859-017-1874-7.


References
1. Bafna V, Huson D. The conserved exon method for gene finding. Proc Int Conf Intell Syst Mol Biol. 2000; 8:3-12.

2. Wiehe T, Guigo R, Miller W. Genome sequence comparisons: hurdles in the fast lane to functional genomics. Brief Bioinform. 2001; 1(4):381-8. DOI: 10.1093/bib/1.4.381.

3. Gelfand M, Mironov A, Pevzner P. Gene recognition via spliced sequence alignment. Proc Natl Acad Sci U S A. 1996; 93(17):9061-6. PMC: 38595. DOI: 10.1073/pnas.93.17.9061.

4. Claverie J. Computational methods for exon detection. Mol Biotechnol. 1998; 10(1):27-48. DOI: 10.1007/BF02745861.

5. Tarchini R, Biddle P, Wineland R, Tingey S, Rafalski A. The complete sequence of 340 kb of DNA around the rice Adh1-adh2 region reveals interrupted colinearity with maize chromosome 4. Plant Cell. 2000; 12(3):381-91. PMC: 139838. DOI: 10.1105/tpc.12.3.381.