TY - GEN
T1 - Generalized Viterbi-based models for time-series segmentation applied to speaker diarization
AU - Lapidot, Itshak
AU - Bonastre, Jean Francois
N1 - Publisher Copyright:
© Odyssey 2012 - Speaker and Language Recognition Workshop. All rights reserved.
PY - 2012/1/1
Y1 - 2012/1/1
N2 - Time-series clustering is a process that takes into account the chronological order of the input samples. The samples are therefore not processed independently: the result for a given sample depends on the clustering result of the whole sequence. One of the popular clustering algorithms for handling such dependency is the well-known Hidden-Markov-Model (HMM) trained with Viterbi statistics. In this work we propose a generalization of the broadly used HMM, denoted Hidden-Distortion-Models (HDMs). Our proposal relies on distortion-based models and transition counts, so probabilistic calculations are no longer mandatory. We introduce the approach through its mathematical basis and show that the Viterbi-based HMM can be seen as a special case of the HDM. This proximity allows us to apply similar approaches for state-model training when the new paradigm is used to learn the sequence dependencies. A speaker diarization application is presented to show the advantages of the HDM as a clustering algorithm.
AB - Time-series clustering is a process that takes into account the chronological order of the input samples. The samples are therefore not processed independently: the result for a given sample depends on the clustering result of the whole sequence. One of the popular clustering algorithms for handling such dependency is the well-known Hidden-Markov-Model (HMM) trained with Viterbi statistics. In this work we propose a generalization of the broadly used HMM, denoted Hidden-Distortion-Models (HDMs). Our proposal relies on distortion-based models and transition counts, so probabilistic calculations are no longer mandatory. We introduce the approach through its mathematical basis and show that the Viterbi-based HMM can be seen as a special case of the HDM. This proximity allows us to apply similar approaches for state-model training when the new paradigm is used to learn the sequence dependencies. A speaker diarization application is presented to show the advantages of the HDM as a clustering algorithm.
UR - https://www.scopus.com/pages/publications/85068761782
M3 - Conference contribution
AN - SCOPUS:85068761782
T3 - Odyssey 2012 - Speaker and Language Recognition Workshop
SP - 138
EP - 145
BT - Odyssey 2012 - Speaker and Language Recognition Workshop
A2 - Li, Haizhou
A2 - Ma, Bin
A2 - Lee, Kong Aik
PB - Chinese and Oriental Languages Information Processing Society (COLIPS), Speaker and Language Characterization SIG
T2 - Speaker and Language Recognition Workshop, Odyssey 2012
Y2 - 25 June 2012 through 28 June 2012
ER -