
Curating Gene Sets: Challenges and Opportunities for Integrative Analysis

Overview
Specialty: Biology
Date: 2019 Mar 20
PMID: 30888410
Citations: 3
Abstract

Genomic data interpretation often requires analyses that move from a gene-by-gene focus to a focus on sets of genes that are associated with biological phenomena such as molecular processes, phenotypes, diseases, drug interactions or environmental conditions. Unique challenges exist in the curation of gene sets beyond the challenges in curation of individual genes. Here we highlight a literature curation workflow whereby gene sets are curated from peer-reviewed published data into GeneWeaver (GW), a data repository and analysis platform. We describe the system features that allow for a flexible yet precise curation procedure. We illustrate the value of curation by gene sets through analysis of independently curated sets that relate to the integrated stress response, showing that sets curated from independent sources all share significant Jaccard similarity. A suite of reproducible analysis tools is provided in GW as services to carry out interactive functional investigation of user-submitted gene sets within the context of over 150 000 gene sets constructed from publicly available resources and published gene lists. A curation interface supports the ability of users to design and maintain curation workflows of gene sets, including assigning, reviewing and releasing gene sets within a curation project context.
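The Jaccard similarity mentioned in the abstract is the size of the intersection of two gene sets divided by the size of their union. The following is a minimal sketch of that pairwise comparison, not GeneWeaver's implementation: the function names, set names and gene symbols are illustrative assumptions rather than data from the article, and the sketch does not assess statistical significance.

```python
from itertools import combinations


def jaccard(a, b):
    """Return |A intersect B| / |A union B| for two collections of gene identifiers."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Two empty sets: define the similarity as 0.0 to avoid dividing by zero.
        return 0.0
    return len(a & b) / len(union)


def pairwise_jaccard(gene_sets):
    """Compute the Jaccard similarity for every unordered pair of named gene sets."""
    return {
        (name1, name2): jaccard(gene_sets[name1], gene_sets[name2])
        for name1, name2 in combinations(sorted(gene_sets), 2)
    }


# Hypothetical gene sets, for illustration only (not taken from the article).
example_sets = {
    "set_a": {"ATF4", "DDIT3", "ASNS", "TRIB3"},
    "set_b": {"ATF4", "DDIT3", "PPP1R15A"},
    "set_c": {"ATF4", "ASNS", "TRIB3", "XBP1"},
}

scores = pairwise_jaccard(example_sets)
for pair, score in scores.items():
    print(pair, round(score, 3))
```

A value of 1 means the two sets are identical and 0 means they share no genes. Deciding whether an observed overlap is significant requires a separate statistical test, which is not shown here.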

Citing Articles

Integration of evidence across human and model organism studies: A meeting report.

Palmer R, Johnson E, Won H, Polimanti R, Kapoor M, Chitre A. Genes Brain Behav. 2021; :e12738.

PMID: 33893716 PMC: 8365690. DOI: 10.1111/gbb.12738.


Genetic analysis of amyotrophic lateral sclerosis identifies contributing pathways and cell types.

Saez-Atienzar S, Bandres-Ciga S, Langston R, Kim J, Choi S, Reynolds R. Sci Adv. 2021; 7(3).

PMID: 33523907 PMC: 7810371. DOI: 10.1126/sciadv.abd9036.


mitch: multi-contrast pathway enrichment for multi-omics and single-cell profiling data.

Kaspi A, Ziemann M. BMC Genomics. 2020; 21(1):447.

PMID: 32600408 PMC: 7325150. DOI: 10.1186/s12864-020-06856-9.
