Fuzzy Association Rules for Biological Data Analysis: a Case Study on Yeast

Overview

Journal BMC Bioinformatics

Publisher BioMed Central

Specialty Biology

Date 2008 Feb 21

PMID 18284669

Citations 6

Authors

Francisco J Lopez

Armando Blanco

Fernando Garcia

Carlos Cano

Antonio Marin


Abstract

Background: In recent years, the mapping of diverse genomes has generated huge amounts of biological data, which are currently dispersed across many databases. Integrating the information available in these databases is required to unveil possible associations among already known data. Biological data are often imprecise and noisy. Fuzzy set theory is especially suitable for modelling imprecise data, while association rules are well suited to integrating heterogeneous data.

Results: In this work we propose a novel methodology for biological knowledge extraction based on fuzzy association rule mining. We apply this methodology to a yeast genome dataset containing heterogeneous information on structural and functional genome features. A number of association rules were found, many of them in agreement with previous research in the area. In addition, a comparison between crisp and fuzzy results proves the fuzzy associations to be more reliable than the crisp ones.

Conclusion: An integrative approach such as the one carried out in this work can unveil significant knowledge that is currently hidden and dispersed across the existing biological databases. Fuzzy association rules are shown to model this knowledge intuitively, using linguistic labels and a few easily understandable parameters.
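To make the idea of a fuzzy association rule concrete, the following is a minimal sketch, not the authors' implementation. It assumes trapezoidal membership functions, the minimum t-norm for combining items, and sigma-count support and confidence; the abstract does not say which measures the paper uses. The gene records, the labels `high_gc` and `long_orf`, and all thresholds are hypothetical values chosen for illustration.

```python
from typing import Callable, Dict, List

Membership = Callable[[float], float]


def trapezoid(a: float, b: float, c: float, d: float) -> Membership:
    """Trapezoidal membership function with support [a, d] and core [b, c]."""
    def mu(x: float) -> float:
        if b <= x <= c:          # inside the core: full membership
            return 1.0
        if x <= a or x >= d:     # outside the support: no membership
            return 0.0
        if x < b:                # rising edge between a and b
            return (x - a) / (b - a)
        return (d - x) / (d - c)  # falling edge between c and d
    return mu


def fuzzy_support(rows: List[Dict[str, float]],
                  items: Dict[str, Membership]) -> float:
    """Mean over rows of the minimum membership across all items (sigma-count)."""
    total = 0.0
    for row in rows:
        total += min(mu(row[attr]) for attr, mu in items.items())
    return total / len(rows)


def fuzzy_confidence(rows: List[Dict[str, float]],
                     antecedent: Dict[str, Membership],
                     consequent: Dict[str, Membership]) -> float:
    """Support of antecedent and consequent together, divided by antecedent support."""
    joint = fuzzy_support(rows, {**antecedent, **consequent})
    return joint / fuzzy_support(rows, antecedent)


# Hypothetical toy data: G+C content in percent and ORF length in base pairs.
genes = [
    {"gc": 36, "length": 600},
    {"gc": 40, "length": 1500},
    {"gc": 44, "length": 2400},
    {"gc": 42, "length": 900},
]

# Hypothetical linguistic labels; the thresholds are illustrative assumptions.
high_gc = {"gc": trapezoid(38, 42, 100, 100)}
long_orf = {"length": trapezoid(1000, 2000, 1e9, 1e9)}

support = fuzzy_support(genes, {**high_gc, **long_orf})
confidence = fuzzy_confidence(genes, high_gc, long_orf)
print(f"support={support:.3f} confidence={confidence:.3f}")
```

In this sketch a gene with 40% G+C content counts as "high G+C" to degree 0.5 rather than being cut off by a hard threshold, which is the practical difference between a fuzzy rule and a crisp one on borderline values.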

Citing Articles

Data- and expert-driven rule induction and filtering framework for functional interpretation and description of gene sets.

Gruca A, Sikora M. J Biomed Semantics. 2017; 8(1):23.

PMID: 28651634 PMC: 5483958. DOI: 10.1186/s13326-017-0129-x.

Mining Association Rules among Gene Functions in Clusters of Similar Gene Expression Maps.

An L, Obradovic Z, Smith D, Bodenreider O, Megalooikonomou V. IEEE Int Conf Bioinform Biomed Workshops. 2015; 2009:254-259.

PMID: 25635265 PMC: 4307020. DOI: 10.1109/BIBMW.2009.5332104.

CisMiner: genome-wide in-silico cis-regulatory module prediction by fuzzy itemset mining.

Navarro C, Lopez F, Cano C, Garcia-Alcalde F, Blanco A. PLoS One. 2014; 9(9):e108065.

PMID: 25268582 PMC: 4182448. DOI: 10.1371/journal.pone.0108065.

A primer to frequent itemset mining for bioinformatics.

Naulaerts S, Meysman P, Bittremieux W, Vu T, Vanden Berghe W, Goethals B. Brief Bioinform. 2013; 16(2):216-31.

PMID: 24162173 PMC: 4364064. DOI: 10.1093/bib/bbt074.

Biomedical application of fuzzy association rules for identifying breast cancer biomarkers.

Lopez F, Cuadros M, Cano C, Concha A, Blanco A. Med Biol Eng Comput. 2012; 50(9):981-90.

PMID: 22622817 DOI: 10.1007/s11517-012-0914-8.

References

Castrillo J, Oliver S. Yeast as a touchstone in post-genomic research: strategies for integrative analysis in functional genomics. J Biochem Mol Biol. 2004; 37(1):93-106. DOI: 10.5483/bmbrep.2004.37.1.093.

Prelic A, Bleuler S, Zimmermann P, Wille A, Buhlmann P, Gruissem W. A systematic comparison and evaluation of biclustering methods for gene expression data. Bioinformatics. 2006; 22(9):1122-9. DOI: 10.1093/bioinformatics/btl060.

Pederson D, Morse R. Effect of transcription of yeast chromatin on DNA topology in vivo. EMBO J. 1990; 9(6):1873-81. PMC: 551893. DOI: 10.1002/j.1460-2075.1990.tb08313.x.

Marin A, Gallardo M, Kato Y, Shirahige K, Gutierrez G, Ohta K. Relationship between G+C content, ORF-length and mRNA concentration in Saccharomyces cerevisiae. Yeast. 2003; 20(8):703-11. DOI: 10.1002/yea.992.

Kanehisa M, Bork P. Bioinformatics in the post-sequence era. Nat Genet. 2003; 33 Suppl:305-10. DOI: 10.1038/ng1109.

Warringer J, Blomberg A. Evolutionary constraints on yeast protein size. BMC Evol Biol. 2006; 6:61. PMC: 1560397. DOI: 10.1186/1471-2148-6-61.

Swift S, Tucker A, Vinciotti V, Martin N, Orengo C, Liu X. Consensus clustering and functional interpretation of gene-expression data. Genome Biol. 2004; 5(11):R94. PMC: 545785. DOI: 10.1186/gb-2004-5-11-r94.

Zhong W, Sternberg P. Automated data integration for developmental biological research. Development. 2007; 134(18):3227-38. DOI: 10.1242/dev.001073.

Narayanan A, Keedwell E, Olsson B. Artificial intelligence techniques for bioinformatics. Appl Bioinformatics. 2004; 1(4):191-222.

Eisen M, Spellman P, Brown P, Botstein D. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A. 1998; 95(25):14863-8. PMC: 24541. DOI: 10.1073/pnas.95.25.14863.

Cho R, Campbell M, Winzeler E, Steinmetz L, Conway A, Wodicka L. A genome-wide transcriptional analysis of the mitotic cell cycle. Mol Cell. 1998; 2(1):65-73. DOI: 10.1016/s1097-2765(00)80114-8.

Lee M, Garrard W. Positive DNA supercoiling generates a chromatin conformation characteristic of highly active genes. Proc Natl Acad Sci U S A. 1991; 88(21):9675-9. PMC: 52781. DOI: 10.1073/pnas.88.21.9675.

Peck L, Wang J. Transcriptional block caused by a negative supercoiling induced structural change in an alternating CG sequence. Cell. 1985; 40(1):129-37. DOI: 10.1016/0092-8674(85)90316-2.

Huh W, Falvo J, Gerke L, Carroll A, Howson R, Weissman J. Global analysis of protein localization in budding yeast. Nature. 2003; 425(6959):686-91. DOI: 10.1038/nature02026.

Bhaskar H, Hoyle D, Singh S. Machine learning in bioinformatics: a brief survey and recommendations for practitioners. Comput Biol Med. 2005; 36(10):1104-25. DOI: 10.1016/j.compbiomed.2005.09.002.

Al-Shahrour F, Minguez P, Tarraga J, Medina I, Alloza E, Montaner D. FatiGO +: a functional profiling tool for genomic data. Integration of functional annotation, regulatory motifs and interaction data with microarray experiments. Nucleic Acids Res. 2007; 35(Web Server issue):W91-6. PMC: 1933151. DOI: 10.1093/nar/gkm260.

Marin A, Wang M, Gutierrez G. Short-range compositional correlation in the yeast genome depends on transcriptional orientation. Gene. 2004; 333:151-5. DOI: 10.1016/j.gene.2004.02.016.

Perez-Ortin J, Alepuz P, Moreno J. Genomics and gene transcription kinetics in yeast. Trends Genet. 2007; 23(5):250-7. DOI: 10.1016/j.tig.2007.03.006.

Joyce A, Palsson B. The model organism as a system: integrating 'omics' data sets. Nat Rev Mol Cell Biol. 2006; 7(3):198-210. DOI: 10.1038/nrm1857.

Creighton C, Hanash S. Mining gene expression databases for association rules. Bioinformatics. 2002; 19(1):79-86. DOI: 10.1093/bioinformatics/19.1.79.