Aligning Amino Acid Sequences: Comparison of Commonly Used Methods

Overview
Journal: J Mol Evol
Specialty: Biochemistry
Date: 1984 Jan 1
PMID: 6100188
Citations: 124
Abstract

We examined two extensive families of protein sequences using four different alignment schemes that employ various degrees of "weighting" in order to determine which approach is most sensitive in establishing relationships. All alignments used a similarity approach based on a general algorithm devised by Needleman and Wunsch. The approaches included a simple program, UM (unitary matrix), whereby only identities are scored; a scheme in which the genetic code is used as a basis for weighting (GC); another that employs a matrix based on structural similarity of amino acids taken together with the genetic basis of mutation (SG); and a fourth that uses the empirical log-odds matrix (LOM) developed by Dayhoff on the basis of observed amino acid replacements. The two sequence families examined were (a) nine different globins and (b) nine different tyrosine kinase-like proteins. It was assumed a priori that all members of a family share common ancestry. In cases where two sequences were more than 30% identical, alignments by all four methods were almost always the same. In cases where the percentage identity was less than 20%, however, there were often significant differences in the alignments. On the average, the Dayhoff LOM approach was the most effective in verifying distant relationships, as judged by an empirical "jumbling test." This was not universally the case, however, and in some instances the simple UM was actually as good or better. Trees constructed on the basis of the various alignments differed with regard to their limb lengths, but had essentially the same branching orders. We suggest some reasons for the different effectivenesses of the four approaches in the two different sequence settings, and offer some rules of thumb for assessing the significance of sequence relationships.
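The simplest of the four schemes (UM) and the "jumbling test" can be illustrated with a minimal sketch. It assumes a linear gap penalty of -1, 100 shuffles, and shuffling of only the second sequence; none of these details is given in the abstract, and the function names `nw_score` and `jumbling_test` are hypothetical. A weighted scheme such as GC, SG or LOM would, under the same assumptions, be obtained by passing a different `score` function.

```python
import random
import statistics


def unitary(x, y):
    """Unitary matrix (UM): only identities score."""
    return 1 if x == y else 0


def nw_score(a, b, score=unitary, gap=-1):
    """Best global alignment score of a and b (Needleman-Wunsch recurrence).

    The linear gap penalty is an assumption of this sketch.
    """
    # prev[j] holds the best score for aligning a[:i-1] with b[:j]
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            curr.append(max(
                prev[j - 1] + score(a[i - 1], b[j - 1]),  # pair the two residues
                prev[j] + gap,                            # residue of a against a gap
                curr[j - 1] + gap,                        # residue of b against a gap
            ))
        prev = curr
    return prev[-1]


def jumbling_test(a, b, trials=100, seed=0, score=unitary, gap=-1):
    """Compare the real score with scores against shuffled copies of b.

    Returns (real_score, z), where z is the number of standard deviations
    by which the real score exceeds the mean shuffled score.
    """
    rng = random.Random(seed)
    real = nw_score(a, b, score, gap)
    residues = list(b)
    shuffled = []
    for _ in range(trials):
        rng.shuffle(residues)  # same composition, order destroyed
        shuffled.append(nw_score(a, "".join(residues), score, gap))
    mean = statistics.mean(shuffled)
    sd = statistics.pstdev(shuffled)
    z = (real - mean) / sd if sd > 0 else 0.0  # degenerate case: no spread
    return real, z


if __name__ == "__main__":
    real, z = jumbling_test("MKTAYIAKQR", "MKTAYIAKQR")
    print(real, round(z, 2))
```

A larger `z` suggests that the observed score is less likely to arise from amino acid composition alone. Any threshold used to call a relationship significant would need to come from the paper itself, not from this sketch.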

Citing Articles

Characterization on the oncogenic effect of the missense mutations of p53 via machine learning.

Pan Q, Portelli S, Nguyen T, Ascher D. Brief Bioinform. 2023; 25(1).

PMID: 38018912 PMC: 10685404. DOI: 10.1093/bib/bbad428.


Construction and characterization of an infectious cDNA clone of potato virus S developed from selected populations that survived genetic bottlenecks.

Li X, Hataya T. Virol J. 2019; 16(1):18.

PMID: 30728059 PMC: 6364481. DOI: 10.1186/s12985-019-1124-x.


Guiding the humoral response against HIV-1 toward a MPER adjacent region by immunization with a VLP-formulated antibody-selected envelope variant.

Beltran-Pavez C, Ferreira C, Merino-Mansilla A, Fabra-Garcia A, Casadella M, Noguera-Julian M. PLoS One. 2018; 13(12):e0208345.

PMID: 30566493 PMC: 6300218. DOI: 10.1371/journal.pone.0208345.


Differential Shape of Geminivirus Mutant Spectra Across Cultivated and Wild Hosts With Invariant Viral Consensus Sequences.

Sanchez-Campos S, Dominguez-Huerta G, Diaz-Martinez L, Tomas D, Navas-Castillo J, Moriones E. Front Plant Sci. 2018; 9:932.

PMID: 30013589 PMC: 6036239. DOI: 10.3389/fpls.2018.00932.


Lethal mutagenesis of an RNA plant virus via lethal defection.

Diaz-Martinez L, Brichette-Mieg I, Pineno-Ramos A, Dominguez-Huerta G, Grande-Perez A. Sci Rep. 2018; 8(1):1444.

PMID: 29362502 PMC: 5780445. DOI: 10.1038/s41598-018-19829-6.


References
1. Fitch W, Margoliash E. Construction of phylogenetic trees. Science. 1967; 155(3760):279-84. DOI: 10.1126/science.155.3760.279.

2. Rapp U, Goldsborough M, Mark G, Bonner T, Groffen J, Reynolds Jr F. Structure and biological activity of v-raf, a unique oncogene transduced by a retrovirus. Proc Natl Acad Sci U S A. 1983; 80(14):4218-22. PMC: 384008. DOI: 10.1073/pnas.80.14.4218.

3. McLachlan A. Repeating sequences and gene duplication in proteins. J Mol Biol. 1972; 64(2):417-37. DOI: 10.1016/0022-2836(72)90508-6.

4. Garlick R, Riggs A. The amino acid sequence of a major polypeptide chain of earthworm hemoglobin. J Biol Chem. 1982; 257(15):9005-15.

5. Kitamura N, Kitamura A, Toyoshima K, Hirayama Y, Yoshida M. Avian sarcoma virus Y73 genome sequence and structural similarity of its transforming gene product to that of Rous sarcoma virus. Nature. 1982; 297(5863):205-8. DOI: 10.1038/297205a0.