Skip to main content

Evolutionary Multiobjective Clustering

  • Conference paper
Parallel Problem Solving from Nature - PPSN VIII (PPSN 2004)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 3242))

Included in the following conference series:

Abstract

A new approach to data clustering is proposed, in which two or more measures of cluster quality are simultaneously optimized using a multiobjective evolutionary algorithm (EA). For this purpose, the PESA-II EA is adapted for the clustering problem by the incorporation of specialized mutation and initialization procedures, described herein. Two conceptually orthogonal measures of cluster quality are selected for optimization, enabling, for the first time, a clustering algorithm to explore and improve different compromise solutions during the clustering process. Our results, on a diverse suite of 15 real and synthetic data sets – where the correct classes are known – demonstrate a clear advantage to the multiobjective approach: solutions in the discovered Pareto set are objectively better than those obtained when the same EA is applied to optimize just one measure. Moreover, the multiobjective EA exhibits a far more robust level of performance than both the classic k-means and average-link agglomerative clustering algorithms, outperforming them substantially on aggregate.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 74.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. Supporting material, http://dbk.ch.umist.ac.uk/handl/vienna/

  2. Blake, C., Merz, C.: UCI repository of machine learning databases. Technical report, Department of Information and Computer Sciences, University of California, Irvine (1998), http://, http://www.ics.uci.edu/~mlearn/MLRepository.html

  3. Brucker, P.: On the complexity of clustering problems. In: Optimization and Operations Research, pp. 45–54. Springer, New York (1977)

    Google Scholar 

  4. Cole, R.M.: Clustering with genetic algorithms. Master’s thesis, University of Western Australia, Nedlands 6907, Australia (1998)

    Google Scholar 

  5. Corne, D.W., Jerram, N.R., Knowles, J.D., Oates, M.J.: PESA-II: regionbased selection in evolutionary multiobjective optimization. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2001), pp. 283–290. Morgan Kaufmann Publishers, San Francisco (2001)

    Google Scholar 

  6. Corne, D.W., Knowles, J.D., Oates, M.J.: The Pareto envelope-based selection algorithm for multiobjective optimization. In: Proceedings of the Parallel Problem Solving from Nature VI Conference, pp. 839–848. Springer, Heidelberg (2000)

    Chapter  Google Scholar 

  7. Deb, K.: Multi-objective optimization using evolutionary algorithms. John Wiley & Sons, Chichester (2001)

    MATH  Google Scholar 

  8. Demiriz, A., Bennett, K., Embrechts, M.: Semi-supervised clustering using genetic algorithms. Technical report, Rensselaer Polytechnic Institute, Troy, New York (1999)

    Google Scholar 

  9. Falkenauer, E.: Genetic Algorithms and Grouping Problems. John Wiley & Sons, Chichester (1998)

    Google Scholar 

  10. Gablentz, W., Köppen, M., Dimitriadou, E.: Robust clustering by evolutionary computation. In: 5th Online World Conference on Soft Computing in Industrial Applications (WSC5), The Internet (2000)

    Google Scholar 

  11. Halkidi, M., Vazirgiannis, M., Batistakis, I.: Quality scheme assessment in the clustering process. In: Zighed, D.A., Komorowski, J., Żytkow, J.M. (eds.) PKDD 2000. LNCS (LNAI), vol. 1910, pp. 265–267. Springer, Heidelberg (2000)

    Chapter  Google Scholar 

  12. Jain, A.K., Murty, M.N., Flynn, P.: Data clustering: a review. ACM Computing Surveys 31(3), 264–323 (1999)

    Article  Google Scholar 

  13. Law, M., Topchy, A., Jain, A.K.: Multiobjective data clustering. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (June 2004) (to appear)

    Google Scholar 

  14. Lozano, J.A., Larrañaga, P.: Applying genetic algorithms to search for the best hierarchical clustering of a dataset. Pattern Recognition Letters 20, 911–918 (1999)

    Article  Google Scholar 

  15. MacQueen, L.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 281–297. University of California Press, Berkeley (1967)

    Google Scholar 

  16. Maulik, U., Bandyopadhyay, S.: Genetic algorithm-based clustering technique. Pattern Recognition 33, 1455–1465 (2000)

    Article  Google Scholar 

  17. Radcliffe, N.J.: Equivalence class analysis of genetic algorithms. Complex Systems 5, 183–205 (1991)

    MATH  MathSciNet  Google Scholar 

  18. Topchy, A., Jain, A.K., Punch, W.: A mixture model for clustering ensembles. In: Proceedings SIAM Conf. on Data Mining (2004) (in press)

    Google Scholar 

  19. van Rijsbergen, C.: Information Retrieval, 2nd edn. Butterworths, London (1979)

    Google Scholar 

  20. Vorhees, E.: The effectiveness and efficiency of agglomerative hierarchical clustering in document retrieval. PhD thesis, Department of Computer Science, Cornell University (1985)

    Google Scholar 

  21. Weisstein, E.W.: Box-and-whisker plot. From MathWorld – A Wolfram Web Resource, http://mathworld.wolfram.com/Box-and-WhiskerPlot.html

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Handl, J., Knowles, J. (2004). Evolutionary Multiobjective Clustering. In: Yao, X., et al. Parallel Problem Solving from Nature - PPSN VIII. PPSN 2004. Lecture Notes in Computer Science, vol 3242. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30217-9_109

Download citation

  • DOI: https://doi.org/10.1007/978-3-540-30217-9_109

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23092-2

  • Online ISBN: 978-3-540-30217-9

  • eBook Packages: Springer Book Archive

Publish with us

Policies and ethics