A comparison of cluster validity criteria for a mixture of normal distributed data

Amir B. Geva, Yossef Steinberg, Shay Bruckmair, Gerry Nahum

Research output: Contribution to journal › Article › peer-review

28 Scopus citations

Abstract

Many validity criteria have been proposed over the years for validating the clustering of unlabeled data sets. In this research we compared the performance of several known validity criteria with that of several new validity criteria on mixtures of normally distributed data. Most of the new criteria are modifications of the Gath and Geva partition and average density criteria, while one new criterion is based on the generalized Neyman-Pearson (GNP) test for normality. The comparison used simulated Gaussian data sets built from 1 to 5 clusters in 1 to 4 dimensions, with a variety of cluster means and variances. Clustering was performed by the unsupervised optimal fuzzy clustering (UOFC) algorithm, which combines the fuzzy c-means (FCM) algorithm with a fuzzy modification of the maximum likelihood estimation (FMLE) algorithm. We conclude that, in general, no single validity criterion consistently performed much better than the others under all conditions; nevertheless, some of the new validity criteria showed clear advantages in validating most of the simulated Gaussian data sets.
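As a rough illustration of the kind of experiment the abstract describes, the sketch below draws a small simulated Gaussian mixture, clusters it with plain fuzzy c-means, and scores each candidate number of clusters. It rests on several assumptions that are not taken from the paper: the function names are hypothetical, the components are spherical, only the FCM stage is shown (not the FMLE stage or the full UOFC algorithm), and Bezdek's partition coefficient stands in for a validity criterion; it is not necessarily one of the criteria the authors compared.

```python
import numpy as np


def make_gaussian_mixture(means, stds, n_per_cluster, rng):
    """Draw n_per_cluster points from each spherical Gaussian component.

    Hypothetical helper: the paper's actual data generator is not described
    in the abstract beyond "1 to 5 clusters in 1 to 4 dimensions".
    """
    parts = [
        rng.normal(loc=mean, scale=std, size=(n_per_cluster, len(mean)))
        for mean, std in zip(means, stds)
    ]
    return np.vstack(parts)


def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, rng=None):
    """Plain FCM. Returns (centers, U), with U of shape (n_points, c)."""
    rng = np.random.default_rng() if rng is None else rng
    # Random initial memberships, normalised so each row sums to 1.
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the data.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared Euclidean distance from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def partition_coefficient(U):
    """Bezdek's partition coefficient: mean over points of sum_k u_ik^2.

    Lies between 1/c (maximally fuzzy) and 1 (crisp partition).
    """
    return float((U ** 2).sum(axis=1).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Illustrative mixture: three 2-D clusters with different spreads.
    X = make_gaussian_mixture(
        means=[[0.0, 0.0], [6.0, 0.0], [3.0, 6.0]],
        stds=[1.0, 0.8, 1.2],
        n_per_cluster=100,
        rng=rng,
    )
    # Score each candidate cluster count; c = 1 is skipped because the
    # partition coefficient is trivially 1 for a single cluster.
    for c in range(2, 6):
        _, U = fuzzy_c_means(X, c, rng=rng)
        print(c, round(partition_coefficient(U), 3))
```

The partition coefficient cannot distinguish a single cluster from several, which is one reason a study covering the 1-cluster case would need other criteria, such as a test for normality.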

Original language: English
Pages (from-to): 511-529
Number of pages: 19
Journal: Pattern Recognition Letters
Volume: 21
Issue number: 6-7
State: Published - 1 Jan 2000

Keywords

  • Cluster validity
  • Entropy maximization
  • Generalized Neyman-Pearson (GNP) criterion
  • Hypothesis testing
  • Mixture of normal distributed data
  • Unsupervised clustering

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence
