A Comparison of Scoring Functions for Protein Sequence Profile Alignment

Overview
Journal: Bioinformatics
Specialty: Biology
Date: 2004 Feb 14
PMID: 14962936
Citations: 46
Abstract

Motivation: In recent years, several methods have been proposed for aligning two protein sequence profiles, with reported improvements in alignment accuracy and homolog discrimination over sequence-sequence methods (e.g. BLAST) and profile-sequence methods (e.g. PSI-BLAST). Profile-profile alignment is also the iterated step in progressive multiple sequence alignment algorithms such as CLUSTALW. However, little is known about the relative performance of different profile-profile scoring functions. In this work, we evaluate the alignment accuracy of 23 different profile-profile scoring functions by comparing alignments of 488 pairs of sequences with sequence identity ≤30% against structural alignments. We optimize parameters for all scoring functions on the same training set and use profiles built from both PSI-BLAST and SAM-T99 alignments. Structural alignments are constructed from a consensus between the FSSP database and the CE structural aligner. We compare the results with sequence-sequence and profile-sequence methods, including BLAST and PSI-BLAST.
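To make the notion of a profile-profile scoring function concrete, the sketch below scores a pair of profile columns in two generic ways: a dot product of the residue frequency vectors and the Jensen-Shannon divergence between them. This is an illustration only. The abstract does not list the 23 functions evaluated, so these two are assumed examples rather than the authors' definitions, and the function names and the toy four-letter alphabet are hypothetical (a real profile column has 20 amino acid frequencies).

```python
import math


def dot_product_score(p, q):
    """Similarity of two profile columns as the dot product of their
    residue frequency vectors (higher means more similar)."""
    return sum(a * b for a, b in zip(p, q))


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two profile columns.
    Ranges from 0 (identical columns) to 1 (no residues in common)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Toy columns over a hypothetical four-letter alphabet.
col_a = [0.7, 0.1, 0.1, 0.1]
col_b = [0.6, 0.2, 0.1, 0.1]
print(dot_product_score(col_a, col_b))
print(jensen_shannon_divergence(col_a, col_b))
```

A dynamic programming aligner would call a function like these for every pair of columns, which is why the choice of scoring function and its tuned parameters (for example gap penalties) can affect the resulting alignment.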

Results: We find that, averaged over our test set, profile-profile alignment gives an improvement of typically 2-3% over profile-sequence alignment and approximately 40% over sequence-sequence alignment. No statistically significant difference is seen in the relative performance of most of the scoring functions tested. Significantly better results are obtained with profiles constructed from SAM-T99 alignments than from PSI-BLAST alignments.
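Comparing a computed alignment against a structural reference requires some accuracy measure. The abstract does not say which measure was used, so the sketch below assumes one common choice: the fraction of residue pairs in the reference alignment that the test alignment reproduces. The function name and the representation of an alignment as a list of (i, j) index pairs are hypothetical.

```python
def fraction_pairs_recovered(test_pairs, reference_pairs):
    """Fraction of reference-aligned residue pairs that also appear
    in the test alignment. Each pair (i, j) means residue i of the
    first sequence is aligned to residue j of the second."""
    reference = set(reference_pairs)
    if not reference:
        raise ValueError("reference alignment has no aligned pairs")
    return len(reference & set(test_pairs)) / len(reference)


# Toy example: the test alignment recovers 3 of 4 reference pairs.
reference = [(0, 0), (1, 1), (2, 2), (3, 4)]
test = [(0, 0), (1, 1), (2, 3), (3, 4)]
print(fraction_pairs_recovered(test, reference))  # 0.75
```

Averaging such a score over all sequence pairs in a test set gives one number per method, which is the kind of quantity that relative improvements between methods can be computed from.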

Availability: Source code, reference alignments and more detailed results are freely available at http://phylogenomics.berkeley.edu/profilealignment/
