A general method applicable to the search for similarities in the amino acid sequence of two proteins

https://doi.org/10.1016/0022-2836(70)90057-4

Abstract

A computer-adaptable method for finding similarities in the amino acid sequences of two proteins has been developed. From these findings it is possible to determine whether significant homology exists between the proteins. This information is used to trace their possible evolutionary development.

The maximum match is a number dependent upon the similarity of the sequences. One of its definitions is the largest number of amino acids of one protein that can be matched with those of a second protein allowing for all possible interruptions in either of the sequences. While the interruptions give rise to a very large number of comparisons, the method efficiently excludes from consideration those comparisons that cannot contribute to the maximum match.

Comparisons are made from the smallest unit of significance, a pair of amino acids, one from each protein. All possible pairs are represented by a two-dimensional array, and all possible comparisons are represented by pathways through the array. For this maximum match, only certain of the possible pathways must be evaluated. A numerical value, one in this case, is assigned to every cell in the array representing like amino acids. The maximum match is the largest sum obtained by adding the cell values along any single pathway.
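The array-and-pathway description above can be illustrated with a small dynamic-programming sketch. This is not taken from the paper: it is a common modern formulation that assumes a cell value of one for like amino acids, zero otherwise, and no penalty for interruptions. The function name `maximum_match` and the example strings are hypothetical, chosen only for illustration.

```python
def maximum_match(seq_a: str, seq_b: str) -> int:
    """Largest number of like residues that can be paired in order,
    allowing interruptions in either sequence at no cost (an assumption)."""
    rows, cols = len(seq_a), len(seq_b)
    # best[i][j] holds the maximum match of seq_a[i:] against seq_b[j:].
    # The extra row and column of zeros stand for an empty remainder.
    best = [[0] * (cols + 1) for _ in range(rows + 1)]
    # Fill from the ends of the sequences back toward their starts, so each
    # cell is computed once instead of once per pathway that crosses it.
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            cell = 1 if seq_a[i] == seq_b[j] else 0  # like amino acids score one
            best[i][j] = max(
                cell + best[i + 1][j + 1],  # pair residue i with residue j
                best[i + 1][j],             # interruption: skip residue i
                best[i][j + 1],             # interruption: skip residue j
            )
    return best[0][0]


print(maximum_match("ABCNJRQCLCRPM", "AJCJNRCKCRBP"))  # 8
```

Because each cell reuses the best values already computed for its neighbours, pathways that cannot contribute to the maximum are never enumerated individually; the work grows with the product of the two sequence lengths rather than with the number of pathways.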

This work was supported in part by grants to one of us (S.B.N.) from the U.S. Public Health Service (1 501 FR 05370 02) and from Merck Sharp & Dohme.
