Abstract
Modern speech processing applications must operate on a signal of interest that is contaminated by a high level of noise. This situation calls for greater robustness in the estimation of speech parameters, a task that is hard to achieve using standard speech models. In this paper, we present an optimal estimation procedure for sound signals (such as speech) that are modeled by harmonic sources. The harmonic model achieves more robust and accurate estimation of voiced speech parameters. Using a maximum a posteriori probability framework, successful tracking of pitch parameters is possible in ultra-low signal-to-noise ratio conditions (as low as -15 dB). The performance of the method is evaluated using the Keele pitch detection database with realistic background noise. The results show the best performance in comparison with other state-of-the-art pitch detectors. Application of the proposed algorithm in a simple speaker identification system shows a significant improvement in performance.
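The abstract does not give the model equations, so the following is only a minimal sketch of the general idea behind harmonic-model pitch estimation, not the paper's estimator. It assumes a single frame, white Gaussian noise, and a fixed number of harmonics. Under those assumptions, the maximum-likelihood pitch is the candidate whose harmonic sine/cosine basis captures the most frame energy in a least-squares fit. The function name `harmonic_ls_pitch`, the grid, and all parameter values are hypothetical choices made for illustration. The MAP tracking across frames mentioned in the abstract is not implemented here.

```python
import numpy as np


def harmonic_ls_pitch(frame, fs, f0_grid, n_harmonics=5):
    """Grid-search pitch estimate for one frame (illustrative sketch only).

    For each candidate f0, fit cos/sin components at f0, 2*f0, ..., K*f0 by
    least squares and keep the candidate whose fit has the largest energy.
    """
    t = np.arange(len(frame)) / fs            # sample times in seconds
    k = np.arange(1, n_harmonics + 1)         # harmonic indices 1..K
    best_f0, best_energy = None, -np.inf
    for f0 in f0_grid:
        phase = 2.0 * np.pi * f0 * np.outer(t, k)           # shape (N, K)
        basis = np.hstack([np.cos(phase), np.sin(phase)])   # shape (N, 2K)
        coef, *_ = np.linalg.lstsq(basis, frame, rcond=None)
        energy = float(np.sum((basis @ coef) ** 2))         # energy of the fit
        if energy > best_energy:
            best_f0, best_energy = float(f0), energy
    return best_f0


# Synthetic test frame (assumed values): 40 ms at 8 kHz, 120 Hz pitch,
# five harmonics with 1/h amplitudes, plus white Gaussian noise.
rng = np.random.default_rng(0)
fs = 8000.0
t = np.arange(320) / fs
clean = sum(np.cos(2 * np.pi * 120.0 * h * t + 0.3 * h) / h for h in range(1, 6))
noisy = clean + 0.2 * rng.standard_normal(t.size)

f0_hat = harmonic_ls_pitch(noisy, fs, np.arange(60.0, 401.0, 1.0))
print(f"estimated pitch: {f0_hat:.1f} Hz")
```

A per-frame search like this is prone to octave errors when the assumed number of harmonics does not match the signal, and it uses no information from neighbouring frames. Those are the kinds of weaknesses that a prior over pitch values or trajectories is typically meant to address.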
Original language | English |
---|---|
Pages (from-to) | 76-87 |
Number of pages | 12 |
Journal | IEEE Transactions on Speech and Audio Processing |
Volume | 12 |
Issue number | 1 |
State | Published - 1 Jan 2004 |
Keywords
- Cramer-Rao bound
- Harmonic model
- MAP estimator
- Markov model
- Maximum likelihood
- Noisy speech
- PDA
- Pitch detection
- Pitch tracking
- Speech denoising
ASJC Scopus subject areas
- Software
- Acoustics and Ultrasonics
- Computer Vision and Pattern Recognition
- Electrical and Electronic Engineering