Comprehensive Repertoire of Foldable Regions Within Whole Genomes

Overview
Specialty Biology
Date 2013 Nov 9
PMID 24204229
Citations 21
Abstract

In order to obtain a comprehensive repertoire of foldable domains within whole proteomes, including orphan domains, we developed a novel procedure called SEG-HCA. Using only the information of a single amino acid sequence, SEG-HCA automatically delineates segments with a high density of hydrophobic clusters, as defined by Hydrophobic Cluster Analysis (HCA). These hydrophobic clusters mainly correspond to regular secondary structures, which together form structured or foldable regions. Genome-wide analyses revealed that SEG-HCA predictions are essentially the opposite of those made by disorder predictors, as the two approaches address distinct structural states. Interestingly, the two predictions nevertheless overlap, notably for small segments of disordered sequences that undergo coupled folding and binding. SEG-HCA thus gives access to these specific domains, which are generally poorly represented in domain databases. Comparison of the whole set of SEG-HCA predictions with the Conserved Domain Database (CDD) also highlighted a large proportion of long predicted segments (length >50 amino acids) that are CDD orphans. These orphan sequences may either correspond to highly divergent members of already known families or belong to new families of domains. Their comprehensive description thus opens new avenues to investigate functional and/or structural features that have so far remained undiscovered. Altogether, the data described here provide new insights into protein architecture and organization throughout the three kingdoms of life.
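
The idea of delineating segments that are dense in hydrophobic clusters can be illustrated with a minimal Python sketch. This is not the published SEG-HCA implementation: the hydrophobic alphabet (VILFMYW), the cluster-breaking rule (a proline, or a run of at least four non-hydrophobic residues), and the parameters `gap`, `max_linker` and `min_clusters` are all assumptions chosen for illustration, and the function names are hypothetical.

```python
# Illustrative sketch only; NOT the published SEG-HCA procedure.
# Assumption: strong hydrophobic alphabet commonly used in HCA.
HYDROPHOBIC = set("VILFMYW")


def hydrophobic_clusters(seq, gap=4):
    """Return half-open (start, end) spans of hydrophobic clusters.

    Assumed rule: a cluster is broken by a proline or by a run of at
    least `gap` consecutive non-hydrophobic residues.
    """
    clusters = []
    start = None  # index of the first hydrophobic residue of the open cluster
    last = None   # index of the most recent hydrophobic residue
    for i, aa in enumerate(seq.upper()):
        if aa == "P":
            # Proline closes any open cluster.
            if start is not None:
                clusters.append((start, last + 1))
                start = None
            continue
        if aa in HYDROPHOBIC:
            if start is None:
                start = i
            elif i - last > gap:
                # At least `gap` non-hydrophobic residues since the last one.
                clusters.append((start, last + 1))
                start = i
            last = i
    if start is not None:
        clusters.append((start, last + 1))
    return clusters


def foldable_segments(seq, max_linker=10, min_clusters=2):
    """Merge neighbouring clusters into candidate foldable segments.

    Hypothetical rule: clusters separated by at most `max_linker` residues
    are merged; merged segments with fewer than `min_clusters` clusters
    are discarded.
    """
    segments = []  # each entry: [start, end, number_of_clusters]
    for start, end in hydrophobic_clusters(seq):
        if segments and start - segments[-1][1] <= max_linker:
            segments[-1][1] = end
            segments[-1][2] += 1
        else:
            segments.append([start, end, 1])
    return [(s, e) for s, e, n in segments if n >= min_clusters]
```

With these assumed settings, `foldable_segments("LLVVGGGGGGIIFF")` merges the two clusters into a single segment spanning the whole sequence, whereas two clusters separated by a 20-residue linker are each discarded as isolated. A real analysis would need the thresholds described in the original paper rather than the placeholder values used here.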

Citing Articles

The ribosome profiling landscape of yeast reveals a high diversity in pervasive translation.

Papadopoulos C, Arbes H, Cornu D, Chevrollier N, Blanchet S, Roginski P. Genome Biol. 2024; 25(1):268.

PMID: 39402662 PMC: 11472626. DOI: 10.1186/s13059-024-03403-7.


Human paraneoplastic antigen Ma2 (PNMA2) forms icosahedral capsids that can be engineered for mRNA delivery.

Madigan V, Zhang Y, Raghavan R, Wilkinson M, Faure G, Puccio E. Proc Natl Acad Sci U S A. 2024; 121(11):e2307812120.

PMID: 38437549 PMC: 10945824. DOI: 10.1073/pnas.2307812120.


Digging into the 3D Structure Predictions of AlphaFold2 with Low Confidence: Disorder and Beyond.

Bruley A, Mornon J, Duprat E, Callebaut I. Biomolecules. 2022; 12(10).

PMID: 36291675 PMC: 9599455. DOI: 10.3390/biom12101467.


Exploring the Peptide Potential of Genomes.

Papadopoulos C, Chevrollier N, Lopes A. Methods Mol Biol. 2022; 2405:63-82.

PMID: 35298808 DOI: 10.1007/978-1-0716-1855-4_3.


New Genomic Signals Underlying the Emergence of Human Proto-Genes.

Grandchamp A, Berk K, Dohmen E, Bornberg-Bauer E. Genes (Basel). 2022; 13(2).

PMID: 35205330 PMC: 8871994. DOI: 10.3390/genes13020284.

