Weighted segmental K-means initialization for SOM-based speaker clustering

Oshry Ben-Harush, Itshak Lapidot, Hugo Guterman

Research output: Contribution to journal › Conference article › peer-review

11 Scopus citations

Abstract

A new approach for the initial assignment of data in a speaker clustering application is presented. This approach employs the Weighted Segmental K-Means clustering algorithm prior to competitive-based learning. The clustering system relies on Self-Organizing Maps (SOM) for speaker modeling and likelihood estimation. Performance is evaluated on 108 two-speaker conversations taken from the LDC CALLHOME American English Speech corpus using the NIST criterion and shows an improvement of approximately 48% in Cluster Error Rate (CER) relative to the randomly initialized clustering system. The number of iterations was also reduced significantly, which improves both the speed and the efficiency of the clustering system.
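As a rough illustration of the initialization idea, the sketch below assigns whole segments (rather than individual frames) to clusters and updates each centroid as a weighted mean of its segments. It is not the authors' implementation: the abstract does not specify the weighting scheme, so the segment length in frames is used as the weight here, and the farthest-first seeding, the squared Euclidean distance, and all function names are assumptions made for this example. The SOM training that would follow the initial assignment is not shown.

```python
def segment_mean(segment):
    """Average the feature frames of one segment into a single vector."""
    dim = len(segment[0])
    return [sum(frame[d] for frame in segment) / len(segment) for d in range(dim)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_segmental_kmeans(segments, k=2, n_iter=20):
    """Cluster whole segments; returns (labels, centroids).

    Assumption: each segment is weighted by its number of frames.
    """
    means = [segment_mean(s) for s in segments]
    weights = [len(s) for s in segments]
    dim = len(means[0])

    # Assumed seeding: start from the first segment, then repeatedly add
    # the segment mean farthest from the centroids chosen so far.
    centroids = [list(means[0])]
    while len(centroids) < k:
        far = max(range(len(means)),
                  key=lambda i: min(sq_dist(means[i], c) for c in centroids))
        centroids.append(list(means[far]))

    labels = [-1] * len(segments)
    for _ in range(n_iter):
        # Assignment step: every frame of a segment shares one label.
        new_labels = [min(range(k), key=lambda c: sq_dist(m, centroids[c]))
                      for m in means]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: weighted mean of the member segments.
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                continue  # keep the previous centroid for an empty cluster
            total = sum(weights[i] for i in members)
            centroids[c] = [sum(weights[i] * means[i][d] for i in members) / total
                            for d in range(dim)]
    return labels, centroids


# Toy two-speaker example: 2-D "features" clustered around (0, 0) and (10, 10).
toy_segments = [
    [[0.0, 0.1], [0.2, 0.0], [0.1, 0.1]],
    [[10.0, 10.1], [9.9, 10.0]],
    [[0.1, -0.1], [0.0, 0.0]],
    [[10.2, 9.8], [10.0, 10.0], [9.9, 10.1], [10.1, 10.0]],
]
labels, _ = weighted_segmental_kmeans(toy_segments, k=2)
print(labels)  # [0, 1, 0, 1]
```

In a system like the one described, labels of this kind would serve only as the starting partition for training one model per speaker, in place of a random assignment.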

Original language: English
Pages (from-to): 24-27
Number of pages: 4
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
State: Published - 1 Dec 2008
Event: INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association - Brisbane, QLD, Australia
Duration: 22 Sep 2008 to 26 Sep 2008

Keywords

  • Clustering
  • Initial conditions
  • K-means
  • SOM
  • Speech

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Sensory Systems
