
Unifying Generative and Discriminative Learning Principles

Overview
Publisher: BioMed Central
Specialty: Biology
Date: 2010 Feb 24
PMID: 20175896
Citations: 2
Abstract

Background: The recognition of functional binding sites in genomic DNA remains one of the fundamental challenges of genome research. Over the last decades, a plethora of different and well-adapted models has been developed, but little attention has been paid to the development of different and similarly well-adapted learning principles. Only recently has it been noticed that discriminative learning principles can be superior to generative ones in diverse bioinformatics applications, too.

Results: Here, we propose a generalization of generative and discriminative learning principles containing the maximum likelihood, maximum a posteriori, maximum conditional likelihood, maximum supervised posterior, generative-discriminative trade-off, and penalized generative-discriminative trade-off learning principles as special cases, and we illustrate its efficacy for the recognition of vertebrate transcription factor binding sites.
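
The abstract does not spell out the objective function, so the following is only a minimal sketch of one way such a family could be parameterized: a weighted sum of the conditional log-likelihood, the joint log-likelihood, and a log-prior, with non-negative weights that sum to one. The function name `unified_objective`, the weight tuple `beta`, and the weight settings in `SPECIAL_CASES` are assumptions made for illustration; they are not taken from the abstract or from the Jstacs API.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def unified_objective(log_joint, labels, log_prior, beta):
    """Weighted mix of discriminative, generative, and prior terms.

    log_joint[n][c] is log P(c, x_n | params) for example n and class c.
    labels[n] is the index of the true class of example n.
    log_prior is the log prior density of the current parameters.
    beta = (b_cond, b_joint, b_prior): hypothetical weights on the simplex.
    """
    b_cond, b_joint, b_prior = beta
    if min(beta) < 0 or abs(sum(beta) - 1.0) > 1e-9:
        raise ValueError("beta must be non-negative and sum to 1")
    # Generative term: log-likelihood of sequences together with their labels.
    joint = sum(row[c] for row, c in zip(log_joint, labels))
    # Discriminative term: log-likelihood of the labels given the sequences.
    cond = sum(row[c] - log_sum_exp(row) for row, c in zip(log_joint, labels))
    return b_cond * cond + b_joint * joint + b_prior * log_prior


# Assumed weight settings that would recover four of the named principles.
SPECIAL_CASES = {
    "maximum likelihood": (0.0, 1.0, 0.0),
    "maximum a posteriori": (0.0, 0.5, 0.5),
    "maximum conditional likelihood": (1.0, 0.0, 0.0),
    "maximum supervised posterior": (0.5, 0.0, 0.5),
}
```

Under this assumed parameterization, a generative-discriminative trade-off would correspond to moving weight between the first two components, and a penalized variant to keeping some weight on the prior term as well.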

Conclusions: We find that the proposed learning principle helps to improve the recognition of transcription factor binding sites, enabling better computational approaches for extracting as much information as possible from valuable wet-lab data. We make all implementations available in the open-source library Jstacs so that this learning principle can be easily applied to other classification problems in the field of genome and epigenome analysis.

Citing Articles

Varying levels of complexity in transcription factor binding motifs.

Keilwagen J, Grau J. Nucleic Acids Res. 2015; 43(18):e119.

PMID: 26116565 PMC: 4605289. DOI: 10.1093/nar/gkv577.


Effective automated feature construction and selection for classification of biological sequences.

Kamath U, De Jong K, Shehu A. PLoS One. 2014; 9(7):e99982.

PMID: 25033270 PMC: 4102475. DOI: 10.1371/journal.pone.0099982.
