
CONTRAST: a Discriminative, Phylogeny-free Approach to Multiple Informant De Novo Gene Prediction

Overview
Journal: Genome Biol
Specialties: Biology, Genetics
Date: 2007 Dec 22
PMID: 18096039
Citations: 37
Abstract

We describe CONTRAST, a gene predictor that directly incorporates information from multiple alignments rather than employing phylogenetic models. This is accomplished through discriminative machine learning techniques, including a novel training algorithm. We use a two-stage approach in which a set of binary classifiers designed to recognize coding region boundaries is combined with a global model of gene structure. CONTRAST predicts exact coding region structures for 65% more human genes than the previous state-of-the-art method, misses 46% fewer exons, and displays comparable gains in specificity.
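
The two-stage idea in the abstract can be illustrated with a toy sketch. This is not CONTRAST's implementation: the abstract does not specify the classifiers, the features, or the global model, so the sketch below substitutes a hand-weighted linear classifier for stage one and a two-state Viterbi decoder for stage two. All names, features, weights, and scores (`boundary_log_odds`, `decode`, the "fraction of informants" feature) are hypothetical and chosen only for illustration.

```python
def boundary_log_odds(features, weights, bias):
    """Stage 1 (stand-in): a linear binary classifier returning a log-odds score
    for whether a candidate position is a coding region boundary."""
    return bias + sum(w * f for w, f in zip(weights, features))


def decode(coding_scores, start_scores, stop_scores):
    """Stage 2 (stand-in): Viterbi over two states, 'N' (non-coding) and 'C' (coding).

    coding_scores[i]: score for labelling position i as coding (non-coding scores 0).
    start_scores[i]:  stage-1 score for an N->C boundary entering position i.
    stop_scores[i]:   stage-1 score for a C->N boundary entering position i.
    Position 0 is forced to 'N', so coding_scores[0] is ignored.
    """
    n = len(coding_scores)
    if n == 0:
        return ""
    score = {"N": 0.0, "C": float("-inf")}
    back = []  # back[i - 1] maps each state at position i to its best predecessor
    for i in range(1, n):
        stay_n, leave_c = score["N"], score["C"] + stop_scores[i]
        stay_c, enter_c = score["C"], score["N"] + start_scores[i]
        back.append({
            "N": "N" if stay_n >= leave_c else "C",
            "C": "C" if stay_c >= enter_c else "N",
        })
        score = {
            "N": max(stay_n, leave_c),
            "C": max(stay_c, enter_c) + coding_scores[i],
        }
    # Trace back from the better final state.
    state = "N" if score["N"] >= score["C"] else "C"
    labels = [state]
    for pointers in reversed(back):
        state = pointers[state]
        labels.append(state)
    return "".join(reversed(labels))


# Hypothetical features per position: [motif present in target genome,
# fraction of informant genomes whose aligned bases preserve the motif].
weights, bias = [2.0, 3.0], -3.5
absent, present = [0.0, 0.0], [1.0, 1.0]

start_features = [absent, absent, present, absent, absent, absent, absent]
stop_features = [absent, absent, absent, absent, absent, present, absent]

start_scores = [boundary_log_odds(f, weights, bias) for f in start_features]
stop_scores = [boundary_log_odds(f, weights, bias) for f in stop_features]
coding_scores = [-1.0, -1.0, 2.0, 2.0, 2.0, -1.0, -1.0]

labels = decode(coding_scores, start_scores, stop_scores)
print(labels)
```

In this sketch the boundary classifiers only score candidate positions; the decoder decides which boundaries to accept by choosing the labelling with the highest total score, which is one simple way local boundary evidence and a global structural constraint can be combined.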

Citing Articles

Comparative Genome Annotation.

Nachtweide S, Romoth L, Stanke M. Methods Mol Biol. 2024; 2802:165-187.

PMID: 38819560 DOI: 10.1007/978-1-0716-3838-5_7.


Translation and natural selection of micropeptides from long non-canonical RNAs.

Patraquim P, Magny E, Pueyo J, Platero A, Couso J. Nat Commun. 2022; 13(1):6515.

PMID: 36316320 PMC: 9622821. DOI: 10.1038/s41467-022-34094-y.


Machine learning in postgenomic biology and personalized medicine.

Ray A. Wiley Interdiscip Rev Data Min Knowl Discov. 2022; 12(2).

PMID: 35966173 PMC: 9371441. DOI: 10.1002/widm.1451.


Building the Chordata Olfactory Receptor Database using more than 400,000 receptors annotated by Genome2OR.

Han W, Wu Y, Zeng L, Zhao S. Sci China Life Sci. 2022; 65(12):2539-2551.

PMID: 35696018 DOI: 10.1007/s11427-021-2081-6.


Ancient evolutionary signals of protein-coding sequences allow the discovery of new genes in the Drosophila melanogaster genome.

Casimiro-Soriguer C, Rubio A, Jimenez J, Perez-Pulido A. BMC Genomics. 2020; 21(1):210.

PMID: 32138644 PMC: 7059364. DOI: 10.1186/s12864-020-6632-y.

