Using Sequence Similarity Networks for Visualization of Relationships Across Diverse Protein Superfamilies

Overview
Journal PLoS One
Date 2009 Feb 5
PMID 19190775
Citations 261
Abstract

The dramatic increase in heterogeneous types of biological data--in particular, the abundance of new protein sequences--requires fast and user-friendly methods for organizing this information in a way that enables functional inference. The most widely used strategy to link sequence or structure to function, homology-based function prediction, relies on the fundamental assumption that sequence or structural similarity implies functional similarity. New tools that extend this approach are still urgently needed to associate sequence data with biological information in ways that accommodate the real complexity of the problem, while being accessible to experimental as well as computational biologists. To address this, we have examined the application of sequence similarity networks for visualizing functional trends across protein superfamilies from the context of sequence similarity. Using three large groups of homologous proteins of varying types of structural and functional diversity--GPCRs and kinases from humans, and the crotonase superfamily of enzymes--we show that overlaying networks with orthogonal information is a powerful approach for observing functional themes and revealing outliers. In comparison to other primary methods, networks provide both a good representation of group-wise sequence similarity relationships and a strong visual and quantitative correlation with phylogenetic trees, while enabling analysis and visualization of much larger sets of sequences than trees or multiple sequence alignments can easily accommodate. We also define important limitations and caveats in the application of these networks. As a broadly accessible and effective tool for the exploration of protein superfamilies, sequence similarity networks show great potential for generating testable hypotheses about protein structure-function relationships.
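The abstract does not spell out how a sequence similarity network is built, so the following is only a minimal sketch under stated assumptions. It uses the common formulation in which each protein is a node and an edge joins two proteins whose pairwise similarity score meets a chosen threshold. The protein names, scores, family labels, and the helper names `build_network`, `connected_components`, and `mixed_clusters` are all hypothetical and invented for illustration; they are not the authors' data or code. The last helper illustrates the idea of overlaying orthogonal information (here, a family label) to spot clusters that mix annotations.

```python
# Hypothetical pairwise similarity scores (higher = more similar).
# In practice these would come from an all-vs-all sequence comparison.
PAIR_SCORES = {
    ("kinA", "kinB"): 120.0,
    ("kinB", "kinC"): 95.0,
    ("kinA", "kinC"): 40.0,
    ("gpcrX", "gpcrY"): 110.0,
    ("kinC", "gpcrX"): 12.0,
}

# Hypothetical orthogonal annotation to overlay on the network.
FAMILY = {
    "kinA": "kinase",
    "kinB": "kinase",
    "kinC": "kinase",
    "gpcrX": "GPCR",
    "gpcrY": "GPCR",
}


def build_network(pair_scores, threshold):
    """Return an adjacency map keeping only edges with score >= threshold."""
    adjacency = {}
    for (a, b), score in pair_scores.items():
        # Register both nodes even if the edge itself is filtered out.
        adjacency.setdefault(a, set())
        adjacency.setdefault(b, set())
        if score >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def connected_components(adjacency):
    """Group nodes into clusters using a depth-first traversal."""
    seen = set()
    clusters = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


def mixed_clusters(clusters, annotation):
    """Return clusters whose members carry more than one annotation label."""
    return [c for c in clusters if len({annotation[n] for n in c}) > 1]


if __name__ == "__main__":
    for threshold in (50.0, 10.0):
        clusters = connected_components(build_network(PAIR_SCORES, threshold))
        print(threshold, clusters, mixed_clusters(clusters, FAMILY))
```

In this toy data, a stringent threshold separates the two hypothetical families into distinct clusters, while a permissive threshold merges them into one cluster with mixed labels. Real analyses would typically explore several thresholds in this way before interpreting any grouping.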

Citing Articles

Insights into putative alginate lyases from epipelagic and mesopelagic communities of the global ocean.

Lozada M, Dionisi H Sci Rep. 2025; 15(1):8111.

PMID: 40057569 PMC: 11890756. DOI: 10.1038/s41598-025-92960-3.


Characterization of Tetrathionate Hydrolase from Acidothermophilic Sulfur-Oxidizing Archaeon Ar-4.

Wang P, Li L, Liu L, Qin Y, Li X, Yin H Int J Mol Sci. 2025; 26(3).

PMID: 39941105 PMC: 11818568. DOI: 10.3390/ijms26031338.


Metatranscriptomes-based sequence similarity networks uncover genetic signatures within parasitic freshwater microbial eukaryotes.

Monjot A, Rousseau J, Bittner L, Lepere C Microbiome. 2025; 13(1):43.

PMID: 39915863 PMC: 11800578. DOI: 10.1186/s40168-024-02027-0.


Structural insights into the enzymatic breakdown of azomycin-derived antibiotics by 2-nitroimidazole hydrolase (NnhA).

Ahmed F, Liu J, Royan S, Warden A, Esquirol L, Pandey G Commun Biol. 2024; 7(1):1676.

PMID: 39702827 PMC: 11659421. DOI: 10.1038/s42003-024-07336-6.


Photosynthetic directed endosymbiosis to investigate the role of bioenergetics in chloroplast function and evolution.

De B, Cournoyer J, Cournoyer J, Gao Y, Wallace C, Bram S Nat Commun. 2024; 15(1):10622.

PMID: 39658562 PMC: 11632070. DOI: 10.1038/s41467-024-54051-1.

