Abstract
Microphone arrays are used in speech signal processing applications such as teleconferencing and telepresence to enhance a desired speech signal in the presence of speech signals from other speakers, reverberation, and background noise. These arrays usually provide a single-channel output, so no spatial information is available in the output signal. However, spatial information about the sound sources may increase the intelligibility of a speech signal perceived by a human listener. This work presents a mathematical framework for generalized spherical array beamforming that, in addition to suppressing noise and reverberation, aims to preserve spatial information about the sources in the recording venue. The generalized beamforming, formulated in the spherical harmonics domain, is based on binaural sound reproduction, where the head-related transfer functions are incorporated into a headphone presentation. The performance of the proposed generalized beamformer is compared to that of a single-channel output maximum-directivity beamformer. Listening tests with human subjects show that intelligibility improves at low input SNRs when the generalized beamformer is used.
Original language | English |
---|---|
Pages (from-to) | 238-247 |
Number of pages | 10 |
Journal | IEEE Transactions on Audio, Speech and Language Processing |
Volume | 22 |
Issue number | 1 |
State | Published - 1 Jan 2014 |
Keywords
- Array processing
- Beamforming
- Binaural sound reproduction
- Spatial release from masking
- Speech intelligibility
- Spherical harmonics
- Spherical microphone arrays
ASJC Scopus subject areas
- Acoustics and Ultrasonics
- Electrical and Electronic Engineering