GOgetter: A Pipeline for Summarizing and Visualizing GO Slim Annotations for Plant Genetic Data

Overview
Journal Appl Plant Sci
Date 2023 Aug 21
PMID 37601315
Abstract

Premise: The functional annotation of genes is a crucial component of genomic analyses. A common way to summarize functional annotations is with hierarchical gene ontologies, such as the Gene Ontology (GO) Resource. GO includes information about the cellular location and molecular function(s) of gene products and about the processes in which genes are involved. For a set of genes, it is often desirable to summarize GO annotations using predefined, higher-order terms (GO slims) in order to characterize the overall function of the data set, but doing this manually is impractical.
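To make the idea of summarizing with higher-order terms concrete, the following is a minimal sketch of how a specific term can be rolled up to slim terms by walking parent links in a hierarchy. It is not GOgetter's code. The term names, the `PARENTS` table, the `SLIM` set, and the `slim_ancestors` function are all hypothetical toy values chosen for illustration; a real analysis would load the ontology and a slim subset from the GO Resource. Note also that this sketch returns every slim ancestor, whereas production slim mappers may keep only the most specific ones.

```python
from collections import deque

# Toy hierarchy: child term -> direct parent terms.
# Hypothetical and simplified; NOT real GO data.
PARENTS = {
    "translation": ["biosynthetic process"],
    "biosynthetic process": ["metabolic process"],
    "metabolic process": ["biological_process"],
    "DNA binding": ["binding"],
    "binding": ["molecular_function"],
}

# Hypothetical slim: the higher-order terms we want to report.
SLIM = {"metabolic process", "binding"}


def slim_ancestors(term, parents=PARENTS, slim=SLIM):
    """Return all slim terms reachable from `term` (itself included)
    by following parent links upward, breadth first."""
    found = set()
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        if current in slim:
            found.add(current)
        for parent in parents.get(current, []):
            if parent not in seen:  # avoid revisiting shared ancestors
                seen.add(parent)
                queue.append(parent)
    return found


if __name__ == "__main__":
    for t in ("translation", "DNA binding", "unlisted term"):
        print(t, "->", sorted(slim_ancestors(t)))
```

A term that is absent from the hierarchy simply maps to an empty set, which makes unannotated or obsolete terms easy to spot downstream.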

Methods and Results: The GOgetter pipeline consists of bash and Python scripts. From an input FASTA file of nucleotide gene sequences, it outputs text and image files that list (1) the best hit for each input gene in a set of reference gene models, (2) all GO terms and annotations associated with those hits, and (3) a summary and visualization of GO slim categories for the data set. These output files can be queried further and analyzed statistically, depending on the downstream need(s).
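The three outputs described above can be pictured as a chain of lookups: gene to best hit, best hit to GO slim categories, and categories to counts. The sketch below illustrates only the final counting step under stated assumptions; it does not reproduce GOgetter's scripts or file formats. The function name `summarize_slims`, the two input dictionaries, and all gene, hit, and category labels are hypothetical.

```python
from collections import Counter


def summarize_slims(gene_to_hit, hit_to_slims):
    """Count how many input genes fall into each slim category.

    gene_to_hit:  input gene ID -> best-hit reference gene ID (or None)
    hit_to_slims: reference gene ID -> iterable of slim category names
    Each gene contributes at most once to any given category.
    """
    counts = Counter()
    for gene, hit in gene_to_hit.items():
        # set() guards against a category listed twice for the same hit
        for category in set(hit_to_slims.get(hit, ())):
            counts[category] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical toy inputs, standing in for parsed pipeline output.
    gene_to_hit = {"geneA": "ref1", "geneB": "ref2", "geneC": None}
    hit_to_slims = {
        "ref1": ["metabolic process", "binding"],
        "ref2": ["binding", "binding"],
    }
    for category, n in summarize_slims(gene_to_hit, hit_to_slims).most_common():
        print(f"{category}\t{n}")
```

Counting each gene once per category keeps the summary interpretable as "number of genes per category", which is one reasonable choice for later statistical comparison between gene sets.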

Conclusions: GO annotations are a widely used "universal language" for describing gene functions and products. GOgetter is a fast and easy-to-implement pipeline for obtaining, summarizing, and visualizing GO slim categories associated with a set of genes.

Citing Articles

GOgetter: A pipeline for summarizing and visualizing GO slim annotations for plant genetic data.

Sessa E, Masalia R, Arrigo N, Barker M, Pelosi J. Appl Plant Sci. 2023; 11(4):e11536.

PMID: 37601315 PMC: 10439822. DOI: 10.1002/aps3.11536.
