CorrelaGenes: a New Tool for the Interpretation of the Human Transcriptome

Overview
Publisher Biomed Central
Specialty Biology
Date 2014 Feb 26
PMID 24564370
Citations 4
Abstract

Background: The amount of gene expression data available in public repositories has grown exponentially in recent years, and new data mining tools are now required to transform these data into information easily accessible to biologists.

Results: Using expression data publicly available in the Gene Expression Omnibus (GEO) database, we developed a new bioinformatics tool that identifies genes whose expression appears simultaneously altered in different experimental conditions, suggesting co-regulation or coordinated action in the same biological process. To this end, we used the 978 human GEO Curated DataSets and manually selected 2,109 pair-wise comparisons based on their biological rationale. The lists of differentially expressed genes obtained from the selected comparisons were stored in a PostgreSQL database and used as the data source for the CorrelaGenes tool. Our application uses a customized Association Rule Mining (ARM) algorithm to identify sets of genes showing expression profiles correlated with a gene of interest. The significance of the correlation is measured by coupling the Lift, a well-known standard ARM index, with the χ² p-value. The manually curated selection of the comparisons and the developed algorithm constitute a new approach in the field of gene expression profiling studies. A simulation performed on 100 randomly selected target genes allowed us to evaluate the efficiency of the procedure and to obtain preliminary data demonstrating the consistency of the results.
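To make the two indices concrete, the following is a minimal sketch of how the Lift and a χ² p-value could be computed for one gene pair. It is not the authors' customized ARM algorithm, which the abstract does not specify. It assumes each comparison is represented as a set of differentially expressed gene symbols, uses a plain Pearson χ² test on the 2×2 co-occurrence table (1 degree of freedom, no continuity correction), and the function name `lift_and_chi2` and the toy gene symbols are hypothetical.

```python
import math


def lift_and_chi2(comparisons, target, candidate):
    """Return (lift, chi2, p_value) for the co-occurrence of two genes.

    comparisons: list of sets, each holding the genes called differentially
    expressed in one pair-wise comparison (an assumed representation).
    """
    n = len(comparisons)
    # 2x2 contingency table of presence/absence across comparisons.
    both = sum(1 for c in comparisons if target in c and candidate in c)
    only_target = sum(1 for c in comparisons if target in c and candidate not in c)
    only_cand = sum(1 for c in comparisons if candidate in c and target not in c)
    neither = n - both - only_target - only_cand

    n_target = both + only_target
    n_cand = both + only_cand
    if n_target == 0 or n_cand == 0:
        raise ValueError("each gene must appear in at least one comparison")

    # Lift = P(target and candidate) / (P(target) * P(candidate)).
    lift = both * n / (n_target * n_cand)

    # Pearson chi-square statistic for a 2x2 table.
    denom = n_target * (only_cand + neither) * n_cand * (only_target + neither)
    chi2 = n * (both * neither - only_target * only_cand) ** 2 / denom if denom else 0.0

    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return lift, chi2, p_value


if __name__ == "__main__":
    # Toy data: four comparisons, hypothetical gene symbols.
    toy = [{"G1", "G2"}, {"G1", "G2"}, {"G3"}, {"G3"}]
    print(lift_and_chi2(toy, "G1", "G2"))
```

A Lift above 1 indicates that the two genes are altered together more often than expected under independence; pairing it with the p-value guards against high Lift values that arise from very few comparisons.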

Conclusions: The preliminary results of the simulation show how CorrelaGenes could contribute to the characterization of molecular pathways and biological processes, integrating data obtained from other applications and data available in public repositories.

Citing Articles

Synergistic horizontal transfer of antibiotic resistance genes and transposons in the infant gut microbial genome.

Ding Y, Jiang X, Wu J, Wang Y, Zhao L, Pan Y mSphere. 2023; 9(1):e0060823.

PMID: 38112433 PMC: 10826358. DOI: 10.1128/msphere.00608-23.


An Association Rule Mining Approach to Discover lncRNAs Expression Patterns in Cancer Datasets.

Cremaschi P, Carriero R, Astrologo S, Coli C, Lisa A, Parolo S Biomed Res Int. 2015; 2015:146250.

PMID: 26273587 PMC: 4530207. DOI: 10.1155/2015/146250.


NETTAB 2012 on "Integrated Bio-Search".

Romano P, Lisacek F, Masseroli M BMC Bioinformatics. 2014; 15 Suppl 1:S1.

PMID: 24564635 PMC: 4015131. DOI: 10.1186/1471-2105-15-S1-S1.


CorrelaGenes: a new tool for the interpretation of the human transcriptome.

Cremaschi P, Rovida S, Sacchi L, Lisa A, Calvi F, Montecucco A BMC Bioinformatics. 2014; 15 Suppl 1:S6.

PMID: 24564370 PMC: 4016313. DOI: 10.1186/1471-2105-15-S1-S6.
