Semantic Web Repositories for Genomics Data Using the eXframe Platform
Background: With the advent of inexpensive assay technologies, there has been unprecedented growth in genomics data and in the number of databases that store it. In these databases, sample annotation using ontologies and controlled vocabularies is becoming more common. However, the annotation is rarely available as Linked Data, in a machine-readable format, or for standardized queries using SPARQL. This makes large-scale reuse and integration with other knowledge bases very difficult.
Methods: To address this challenge, we have developed the second generation of our eXframe platform, a reusable framework for creating online repositories of genomics experiments. This second generation now publishes Semantic Web data. To accomplish this, we created an experiment model that covers provenance, citations, external links, assays, biomaterials used in the experiment, and the data collected during the process. The elements of our model are mapped to classes and properties from various established biomedical ontologies. Resource Description Framework (RDF) data is automatically produced using these mappings and indexed in an RDF store with a built-in SPARQL Protocol and RDF Query Language (SPARQL) endpoint.
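To make the mapping step concrete, the following is a minimal sketch of how fields of an experiment record could be turned into RDF triples in N-Triples syntax. It is not the eXframe implementation: the function names, the `example.org` URIs, and the sample record are all hypothetical, and the mapping table stands in for the ontology mappings the abstract describes without listing. Only the `rdf:type` and Dublin Core Terms URIs are real vocabulary terms.

```python
# Illustrative only: the actual eXframe mappings are not given in the text.
# The rdf:type and Dublin Core URIs are real; example.org URIs are placeholders.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EXPERIMENT_CLASS = "http://example.org/vocab/Experiment"  # placeholder class

# Hypothetical mapping from model fields to ontology properties.
FIELD_TO_PROPERTY = {
    "title": "http://purl.org/dc/terms/title",
    "creator": "http://purl.org/dc/terms/creator",
    "organism": "http://example.org/vocab/organism",  # placeholder property
}


def escape_literal(value: str) -> str:
    """Escape characters that cannot appear raw in an N-Triples string literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def experiment_to_ntriples(subject_uri: str, record: dict) -> list:
    """Serialize one experiment record as a list of N-Triples lines."""
    # First triple types the resource as an experiment.
    lines = [f"<{subject_uri}> <{RDF_TYPE}> <{EXPERIMENT_CLASS}> ."]
    for field, value in record.items():
        prop = FIELD_TO_PROPERTY.get(field)
        if prop is None:
            continue  # fields without a mapping are not published
        literal = escape_literal(str(value))
        lines.append(f'<{subject_uri}> <{prop}> "{literal}" .')
    return lines


# Made-up sample record; "internal_id" has no mapping and is skipped.
triples = experiment_to_ntriples(
    "http://example.org/experiment/1",
    {"title": 'Example "time course" assay', "organism": "Mus musculus", "internal_id": 42},
)
print("\n".join(triples))
```

Keeping the mapping in a table separate from the serialization code means a repository could change which ontology terms it uses without touching the export logic. The resulting lines could then be loaded into any RDF store that accepts N-Triples and queried through its SPARQL endpoint.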
Conclusions: Using the open-source eXframe software, institutions and laboratories can create Semantic Web repositories of their experiments, integrate them with heterogeneous resources, and make them interoperable with the vast Semantic Web of biomedical knowledge.