TY - GEN
T1 - Spatial covariance matrix estimation for reverberant speech with application to speech enhancement
AU - Weisman, Ran
AU - Tourbabin, Vladimir
AU - Calamia, Paul
AU - Rafaely, Boaz
N1 - Publisher Copyright: © 2020 ISCA
PY - 2020/1/1
Y1 - 2020/1/1
AB - A wide range of applications in speech and audio signal processing incorporate a model of room reverberation based on the spatial covariance matrix (SCM). Typically, a diffuse sound field model is used. Although the diffuse model simplifies formulations, it may lead to limited accuracy in realistic sound fields, resulting in potential degradation in performance. While some extensions to the diffuse field SCM have recently been presented, accurate modeling for real sound fields remains an open problem. In this paper, a method for estimating the SCM of reverberant speech is proposed, based on the selection of time-frequency bins dominated by reverberation. The method is data-based and estimates the SCM for a specific acoustic scene. It is therefore applicable to realistic reverberant fields. An application of the proposed method to optimal beamforming for speech enhancement is presented, using the plane wave density function in the spherical harmonics (SH) domain. It is shown that the use of the proposed SCM outperforms the commonly used diffuse field SCM, suggesting the method is more successful in capturing the statistics of the late part of the reverberation.
KW - Minimum-variance distortionless response
KW - Reverberant speech
KW - Spatial correlation matrix
KW - Spherical arrays
UR - http://www.scopus.com/inward/record.url?scp=85098204487&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2020-2224
DO - 10.21437/Interspeech.2020-2224
M3 - Conference contribution
AN - SCOPUS:85098204487
SN - 9781713820697
T3 - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
SP - 4044
EP - 4048
BT - Interspeech 2020
PB - International Speech Communication Association
T2 - 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020
Y2 - 25 October 2020 through 29 October 2020
ER -