The Use of E-scores to Determine the Quality of Protein Alignments
Toxicology
In 2001, the FAO/WHO suggested a procedure for performing FASTA or BLAST searches, together with a threshold of greater than 35% identity over 80 or more amino acids, to identify potential allergenic cross-reactivity of transgene-encoded proteins in genetically enhanced crops. Transgene-encoded proteins meeting or exceeding this threshold would require additional in vitro evaluation for allergy safety. The work described here proposes a method for calculating an E-score threshold that uses the full capability of bioinformatics to identify potential cross-reactive allergens accurately. The threshold E-score, 3.9E-07, was derived from a test dataset of 7695 corn protein sequences. Each sequence was searched against the FARRP 7 allergen database with FASTA, using both a conventional full-length comparison and an 80 amino acid sliding window comparison, and the resulting E-score distribution was then evaluated. The results show that this E-score threshold identifies known corn allergens and yields a false positive rate for known allergens comparable to that obtained with the 2001 FAO/WHO guidance. Furthermore, the threshold is stringent enough to reject the majority of false positive, composition-based anomalies, and it is 100% effective at identifying Bet v 1 cross-reactive allergens.
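The two screening criteria described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Hit` record, the function names, and the handling of sequences shorter than 80 residues are hypothetical, and the sketch assumes that a hit is flagged when its E-score is at or below the threshold, a boundary convention the abstract does not specify. The FASTA search itself is not reproduced; only window generation and threshold checks on already-computed hits are shown.

```python
from dataclasses import dataclass
from typing import Iterator, Tuple

E_SCORE_THRESHOLD = 3.9e-07  # threshold E-score reported in the abstract
WINDOW = 80                  # window length from the 2001 FAO/WHO criterion
MIN_IDENTITY = 35.0          # percent identity that must be exceeded


@dataclass
class Hit:
    """Hypothetical record for one alignment returned by a search tool."""
    identity_pct: float   # percent identity over the aligned region
    aligned_length: int   # number of aligned amino acids
    e_score: float        # E-score reported for the alignment


def sliding_windows(sequence: str, size: int = WINDOW) -> Iterator[Tuple[int, str]]:
    """Yield (start, segment) for every overlapping window of `size` residues.

    Assumption: a sequence no longer than `size` is returned as one window.
    """
    if len(sequence) <= size:
        yield 0, sequence
        return
    for start in range(len(sequence) - size + 1):
        yield start, sequence[start:start + size]


def meets_fao_who(hit: Hit) -> bool:
    """Greater than 35% identity over 80 or more amino acids."""
    return hit.identity_pct > MIN_IDENTITY and hit.aligned_length >= WINDOW


def meets_e_score(hit: Hit, threshold: float = E_SCORE_THRESHOLD) -> bool:
    """Assumed convention: flag hits whose E-score is at or below the threshold."""
    return hit.e_score <= threshold
```

Each window produced by `sliding_windows` would be submitted as a separate query, and every resulting hit could then be checked with `meets_fao_who` and `meets_e_score` to compare how the two criteria classify the same alignment.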