Abstract
We assemble here properties of certain dissimilarity coefficients and are specially concerned with their metric and Euclidean status. No attempt is made to be exhaustive as far as coefficients are concerned, but certain mathematical results that we have found useful are presented and should help establish similar properties for other coefficients. The response to different types of data is investigated, leading to guidance on the choice of an appropriate coefficient.
Résumé
This work presents some properties of certain resemblance coefficients, in particular their capacity to produce metric and Euclidean distance matrices. Without claiming to be exhaustive in this review of coefficients, we present certain mathematical results that we believe to be of interest and that could be established for other coefficients. Finally, we analyze the response of resemblance measures to different types of data, which allows us to formulate recommendations on the choice of a coefficient.
Additional information
The authors wish to thank the referees, one of whom did a magnificent job in painstakingly checking the detailed algebra and detecting several slips.
About this article
Cite this article
Gower, J.C., Legendre, P. Metric and Euclidean properties of dissimilarity coefficients. Journal of Classification 3, 5–48 (1986). https://doi.org/10.1007/BF01896809
DOI: https://doi.org/10.1007/BF01896809