Semantically Enabled and Statistically Supported Biological Hypothesis Testing with Tissue Microarray Databases
Background: Although many biological databases are adopting semantic web technologies, meaningful biological hypothesis testing is still not easily achieved. Database-driven, high-throughput genomic hypothesis testing requires two capabilities: retrieving semantically relevant experimental data and performing appropriate statistical tests on the retrieved data. Tissue microarray (TMA) data are semantically rich and contain many biologically important hypotheses awaiting high-throughput testing.
Methods: An application-specific ontology was developed for managing TMA and DNA microarray databases with semantic web technologies. Data were represented in the Resource Description Framework (RDF) according to the structure of the ontology. An application for hypothesis testing on TMA data (Xperanto-RDF) was designed and implemented by (1) formulating the syntactic and semantic structures of hypotheses derived from TMA experiments, (2) formulating SPARQL queries that reflect the semantic structures of the hypotheses, and (3) performing statistical tests on the result sets returned by the SPARQL queries.
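The three steps above can be sketched in Python as follows. This is a minimal illustration, not the Xperanto-RDF implementation: the `tma:` namespace, the property names (`stainedFor`, `expression`), the `Hypothesis` fields, and the choice of Fisher's exact test are all assumptions made for the example, since the abstract does not specify the ontology terms or the statistical tests used.

```python
from dataclasses import dataclass
from math import comb


@dataclass
class Hypothesis:
    """Hypothetical structure: 'expression of <marker> is associated with <outcome>'."""
    marker: str   # stained protein, e.g. "ER" (illustrative)
    outcome: str  # clinical variable used as a property name (illustrative)


def build_sparql(h: Hypothesis) -> str:
    """Step 2: turn the hypothesis structure into a SPARQL query string.

    The prefix and property names are placeholders, not the real ontology.
    """
    return f"""PREFIX tma: <http://example.org/tma#>
SELECT ?core ?expr ?outcome WHERE {{
  ?core tma:stainedFor "{h.marker}" ;
        tma:expression ?expr ;
        tma:{h.outcome} ?outcome .
}}"""


def contingency(rows):
    """Collapse (expression_high, outcome_positive) pairs into a 2x2 table."""
    a = sum(1 for e, o in rows if e and o)
    b = sum(1 for e, o in rows if e and not o)
    c = sum(1 for e, o in rows if not e and o)
    d = sum(1 for e, o in rows if not e and not o)
    return a, b, c, d


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Step 3: two-sided Fisher's exact test on the table [[a, b], [c, d]]."""
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Sum all tables that are at least as extreme as the observed one
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
```

In a real deployment the query string would be sent to a SPARQL endpoint and the returned bindings, after dichotomizing expression and outcome, would be passed to `contingency` and then to the test function.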
Results: When a user designs a hypothesis in Xperanto-RDF and submits it, the hypothesis is tested against the TMA experimental data stored in Xperanto-RDF. As an illustration, we evaluated four previously validated hypotheses, and Xperanto-RDF supported all four.
Conclusions: We demonstrated the utility of high-throughput biological hypothesis testing. We believe this approach can benefit preliminary investigation before highly controlled experiments are performed.