
FISH Amyloid - a New Method for Finding Amyloidogenic Segments in Proteins Based on Site Specific Co-occurrence of Aminoacids

Overview
Publisher Biomed Central
Specialty Biology
Date 2014 Feb 26
PMID 24564523
Citations 32
Abstract

Background: Amyloids are proteins capable of forming fibrils whose intramolecular contact sites assume a densely packed zipper pattern. Their oligomers can underlie serious diseases, e.g. Alzheimer's and Parkinson's diseases. Recent studies show that short segments of amino acids can be responsible for the amyloidogenic properties of a protein. A few hundred such peptides have been found experimentally, but experimental testing of all candidates is currently not feasible. Here we propose an original machine learning method for classification of amino acid sequences, based on discovering a segment with a discriminative pattern of site-specific co-occurrences between sequence elements. The pattern is based on the positions of residues with correlated occurrence over a sliding window of a specified length. The algorithm first recognizes the most relevant training segment in each positive training instance. Classification is then based on the maximal distances between the co-occurrence matrix of the relevant segments in positive training sequences and the matrix from negative training segments. The method was applied to study amino acid sequences with regard to their amyloidogenic properties.
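
The idea of site-specific co-occurrence over a sliding window can be illustrated with a minimal Python sketch. This is not the authors' implementation: the scoring rule (summing the difference between positive and negative pair frequencies), the function names, and the toy peptide strings are all assumptions made for illustration, and the toy strings are invented, not experimental data.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_freqs(segments, length):
    """Relative frequency of each (i, j, residue_i, residue_j) event
    over a set of equal-length segments (hypothetical helper)."""
    if not segments:
        raise ValueError("at least one segment is required")
    counts = Counter()
    for seg in segments:
        if len(seg) != length:
            raise ValueError(f"segment {seg!r} is not {length} residues long")
        # Record which residues occur together at each pair of positions.
        for i, j in combinations(range(length), 2):
            counts[(i, j, seg[i], seg[j])] += 1
    n = len(segments)
    return {key: c / n for key, c in counts.items()}


def score_window(window, pos_freqs, neg_freqs):
    """Assumed scoring rule: sum of (positive - negative) frequencies
    for every residue pair observed in the window."""
    total = 0.0
    for i, j in combinations(range(len(window)), 2):
        key = (i, j, window[i], window[j])
        total += pos_freqs.get(key, 0.0) - neg_freqs.get(key, 0.0)
    return total


def best_window(sequence, length, pos_freqs, neg_freqs):
    """Slide a window along the sequence; return (start, score) of the
    highest-scoring window."""
    if len(sequence) < length:
        raise ValueError("sequence is shorter than the window")
    scored = [
        (score_window(sequence[s:s + length], pos_freqs, neg_freqs), s)
        for s in range(len(sequence) - length + 1)
    ]
    score, start = max(scored)
    return start, score


# Toy strings invented for illustration only (not real training data).
toy_positives = ["VVVVVV", "VIVIVI"]
toy_negatives = ["KKKKKK", "KEKEKE"]
pos_freqs = cooccurrence_freqs(toy_positives, 6)
neg_freqs = cooccurrence_freqs(toy_negatives, 6)
print(best_window("KKKKVVVVVVKKKK", 6, pos_freqs, neg_freqs))
```

Keying the table on position pairs as well as residue pairs is what makes the pattern site-specific: the same two residues contribute differently depending on where in the window they occur.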

Results: Our method was first trained on available datasets of hexapeptides with amyloidogenic classification, using 5- or 6-residue sliding windows. Depending on the choice of training and testing datasets, the area under the ROC curve reached up to 0.80 for experimental datasets and 0.95 for datasets generated computationally with the 3D profile method. Importantly, the results on 5-residue segments were not significantly worse, although the classification required that the algorithm first recognize the most relevant training segments. A dataset of long sequences, such as the sup35 prion and a few other amyloid proteins, was used to test the method and gave encouraging results. Our web tool, FISH Amyloid, was trained on all available experimental data 4-10 residues long and offers prediction of amyloidogenic segments in protein sequences.
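
For readers unfamiliar with the metric, the area under the ROC curve equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it directly from that definition; the function name and the toy labels and scores are illustrative assumptions and are unrelated to the datasets reported above.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and
    negative scores (ties count as 0.5). Labels are 1 (positive) or 0."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy values invented for illustration only.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(toy_labels, toy_scores))  # 3 of 4 positive/negative pairs ranked correctly -> 0.75
```

The pairwise form is quadratic in the number of examples, which is acceptable for a few hundred peptides; a rank-based computation would be preferable for much larger sets.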

Conclusions: We proposed a new classification method that recognizes co-occurrence patterns in sequences. The method reveals the characteristic classification pattern of the data and finds the segments where its scoring is strongest, including in long training sequences. Applied to the problem of recognizing amyloidogenic segments, it showed good potential for classification problems in bioinformatics.

Citing Articles

Proteomic Evidence for Amyloidogenic Cross-Seeding in Fibrinaloid Microclots.

Kell D, Pretorius E Int J Mol Sci. 2024; 25(19).

PMID: 39409138 PMC: 11476703. DOI: 10.3390/ijms251910809.


Enhancing protein aggregation prediction: a unified analysis leveraging graph convolutional networks and active learning.

Sun J, Song J, Kim J, Kang S, Park E, Seo S RSC Adv. 2024; 14(43):31439-31450.

PMID: 39363998 PMC: 11447823. DOI: 10.1039/d4ra06285j.


Amyloidogenic regions in beta-strands II and III modulate the aggregation and toxicity of SOD1 in living cells.

McAlary L, Nan J, Shyu C, Sher M, Plotkin S, Cashman N Open Biol. 2024; 14(6):230418.

PMID: 38835240 PMC: 11285818. DOI: 10.1098/rsob.230418.


AggreProt: a web server for predicting and engineering aggregation prone regions in proteins.

Planas-Iglesias J, Borko S, Swiatkowski J, Elias M, Havlasek M, Salamon O Nucleic Acids Res. 2024; 52(W1):W159-W169.

PMID: 38801076 PMC: 11223854. DOI: 10.1093/nar/gkae420.


Aggrescan4D: structure-informed analysis of pH-dependent protein aggregation.

Barcenas O, Kuriata A, Zalewski M, Iglesias V, Pintado-Grima C, Firlik G Nucleic Acids Res. 2024; 52(W1):W170-W175.

PMID: 38738618 PMC: 11223845. DOI: 10.1093/nar/gkae382.

