A New Approach for Tuned Clustering Analysis

Roni Ben Ishay, Maya Herman, Chaim Yosefi

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

Abstract

In this work, we present a new data mining (DM) approach, called tuned clustering analysis, which integrates clustering and tuned clustering analysis. Usually, clusters that contain borderline results may be dismissed or ignored during the analysis stage. As a result, hidden insights that these clusters may represent are not revealed. This may harm the overall DM quality; in particular, important hidden insights may remain undiscovered. Our new approach offers an iterative process that assists the data miner in making appropriate analysis decisions and avoiding the dismissal of possible insights. The idea is to apply an iterative DM process: clustering, analyzing, presenting new insights, or tuning and re-clustering those clusters that have borderline values. Clusters with borderline values are chosen and a new sub-database is built. Then, the sub-database is split based on the attribute with the highest entropy value. The tuning iterations continue until new insights are found or the cluster quality falls below a certain threshold. We demonstrated the tuned clustering analysis on real Echo heart measurements, using the km-Impute clustering algorithm. During the implementation, initial clusters were produced. Although the quality of the clusters was high, no new medical insights were revealed. Therefore, we applied clustering tuning and succeeded in finding new medical insights, such as the influence of gender and age on cardiac functioning and clinical modifications, with regard to resilience to diastolic disorder. Our approach thus revealed new medical insights that were recovered from borderline-value clusters. This stands in contrast to traditional analysis methods, in which these potential insights may be missed or ignored.
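
The entropy-based split mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes categorical attributes and Shannon entropy in bits, and the function names (`shannon_entropy`, `split_on_highest_entropy`) and the sample records are hypothetical. How borderline clusters are selected, and how the sub-database is re-clustered with km-Impute, is not specified in the abstract and is left out.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(values):
    """Shannon entropy (in bits) of a sequence of categorical values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def split_on_highest_entropy(records, attributes):
    """Pick the attribute with the highest entropy and partition records by its value.

    Assumption: records are dicts of categorical attribute values.
    """
    best = max(attributes, key=lambda a: shannon_entropy([r[a] for r in records]))
    parts = defaultdict(list)
    for r in records:
        parts[r[best]].append(r)
    return best, dict(parts)


# Hypothetical sub-database built from borderline-value clusters.
sub_db = [
    {"gender": "F", "age_group": "60+"},
    {"gender": "M", "age_group": "60+"},
    {"gender": "F", "age_group": "60+"},
    {"gender": "M", "age_group": "<60"},
]

best, parts = split_on_highest_entropy(sub_db, ["gender", "age_group"])
print(best, {value: len(rows) for value, rows in parts.items()})
```

In this toy data the split attribute is `gender`, because its values are evenly balanced while `age_group` is skewed. Each resulting partition would then be re-clustered and analyzed in the next tuning iteration.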

Original language: English
Title of host publication: Machine Learning and Data Mining in Pattern Recognition - 14th International Conference, MLDM 2018, Proceedings
Editors: Petra Perner
Publisher: Springer Verlag
Pages: 436-452
Number of pages: 17
ISBN (Print): 9783319961354
State: Published - 1 Jan 2018
Event: 14th International Conference on Machine Learning and Data Mining in Pattern Recognition, MLDM 2018 - New York, United States
Duration: 15 Jul 2018 - 19 Jul 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10934 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 14th International Conference on Machine Learning and Data Mining in Pattern Recognition, MLDM 2018
Country/Territory: United States
City: New York
Period: 15/07/18 - 19/07/18

Keywords

  • Clustering
  • Clustering analysis
  • Data mining
  • Imputation
  • Medical data mining
  • Missing values
