Sunday 10:00–10:45 in LG6

Assessing the quality of a clustering

Christian Hennig

Audience level:
Intermediate

Description

There are many different methods for finding groups in data (cluster analysis), and on many datasets they will deliver different results. How good a clustering is for a given dataset depends on the aim of clustering. I will present a number of methods that can be used to assess the quality of a clustering and to compare different clusterings, taking into account different aims of clustering.

Abstract

There are many different methods for finding groups in data (cluster analysis), and on many datasets they will deliver different results. How good a clustering is for a given dataset depends on the aim of clustering and on the user's concept of what makes objects "belong together". I will present some approaches to assessing the quality of a clustering and to comparing different clusterings. In particular, I will present some indexes that measure various desirable aspects of a clustering, such as stability and the separation between clusters. Different aims of clustering can be taken into account by specifying which aspects are particularly relevant in the situation at hand.