VOGDB-Database of Virus Orthologous Groups

Overview
Journal Viruses
Publisher MDPI
Specialty Microbiology
Date 2024 Aug 29
PMID 39205165
Abstract

Computational models of homologous protein groups are essential in sequence bioinformatics. Due to the diversity and rapid evolution of viruses, grouping protein sequences from virus genomes is particularly challenging. The low sequence similarity of homologous genes in viruses requires specific approaches for sequence- and structure-based clustering. Furthermore, the annotation of virus genomes in public databases is not as consistent and up to date as for many cellular genomes. To tackle these problems, we have developed VOGDB, a database of virus orthologous groups. VOGDB is a multi-layer database that progressively clusters viral genes into groups connected by increasingly remote similarity. The first layer is based on pairwise sequence similarities, the second layer is based on sequence profile alignments, and the third layer uses predicted protein structures to find the most remote similarities. VOGDB groups allow for more sensitive homology searches of novel genes and increase the chance of predicting annotations or inferring phylogeny. VOGDB uses all virus genomes from RefSeq and partially reannotates them. VOGDB is updated with every RefSeq release. The unique feature of VOGDB is the inclusion of both prokaryotic and eukaryotic viruses in the same clustering process, which makes it possible to explore ancient evolutionary relationships between the two groups. VOGDB is freely available at vogdb.org under the CC BY 4.0 license.
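
The layered grouping described in the abstract can be illustrated with a small sketch. This is not VOGDB's actual pipeline: it assumes that each layer can be reduced to a list of pairwise links between proteins, and it uses plain connected-component clustering (union-find) as a stand-in for the real clustering methods. The function names (`cluster`, `build_layers`), the protein identifiers, and the link lists are all hypothetical toy values chosen for illustration.

```python
from collections import defaultdict


def cluster(items, links):
    """Group items into connected components with union-find.

    This is a generic stand-in for clustering, not the VOGDB method.
    """
    parent = {x: x for x in items}

    def find(x):
        # Walk to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for x in items:
        groups[find(x)].add(x)
    # Sort groups by their member names for a stable order.
    return sorted((frozenset(g) for g in groups.values()), key=sorted)


def build_layers(proteins, seq_links, profile_links, structure_links):
    """Return three nested groupings; each layer adds more remote links."""
    layer1 = cluster(proteins, seq_links)
    layer2 = cluster(proteins, seq_links + profile_links)
    layer3 = cluster(proteins, seq_links + profile_links + structure_links)
    return layer1, layer2, layer3


# Hypothetical toy input: five proteins and one link per evidence type.
proteins = ["p1", "p2", "p3", "p4", "p5"]
seq_links = [("p1", "p2")]        # assumed pairwise sequence hit
profile_links = [("p2", "p3")]    # assumed profile-profile hit
structure_links = [("p4", "p5")]  # assumed structural similarity

layer1, layer2, layer3 = build_layers(
    proteins, seq_links, profile_links, structure_links
)
print(len(layer1), len(layer2), len(layer3))  # 4 3 2
```

Because every layer reuses the links of the previous one, each group in a lower layer is contained in exactly one group of the next layer, which mirrors the idea of groups connected by increasingly remote similarity.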

Citing Articles

Towards a unifying phylogenomic framework for tailed phages.

Weinheimer A, Ha A, Aylward F. PLoS Genet. 2025; 21(2):e1011595.

PMID: 39908317 PMC: 11835377. DOI: 10.1371/journal.pgen.1011595.


zol and fai: large-scale targeted detection and evolutionary investigation of gene clusters.

Salamzade R, Tran P, Martin C, Manson A, Gilmore M, Earl A. Nucleic Acids Res. 2025; 53(3).

PMID: 39907107 PMC: 11795205. DOI: 10.1093/nar/gkaf045.


Screening great ape museum specimens for DNA viruses.

Hammerle M, Guellil M, Trgovec-Greif L, Cheronet O, Sawyer S, Ruiz-Gartzia I. Sci Rep. 2024; 14(1):29806.

PMID: 39616255 PMC: 11608371. DOI: 10.1038/s41598-024-80780-w.


Tailless and filamentous prophages are predominant in marine Vibrio.

Steensen K, Seneca J, Bartlau N, Yu X, Hussain F, Polz M. ISME J. 2024; 18(1).

PMID: 39423289 PMC: 11630473. DOI: 10.1093/ismejo/wrae202.
