Spherical array processing with binaural sound reproduction for improved speech intelligibility

Noam R. Shabtai, Boaz Rafaely

Research output: Contribution to journal › Conference article › peer-review

3 Scopus citations

Abstract

In telecommunication applications, interfering sounds and reverberation can have a detrimental effect on speech intelligibility. For this reason, microphone arrays have recently been employed in telecommunication systems for natural environments. Currently applied array processing methods typically aim to produce array output that is optimal with respect to signal-based measures, e.g. signal-to-noise ratio (SNR). These measures may be particularly appropriate when the receiver is a machine. However, to enhance speech intelligibility when the receiver is another human, it may be desirable to trigger spatial hearing capabilities of the human auditory system, such as the cocktail party effect. In particular, spatial release from masking has been investigated. This work presents a spherical array signal processing framework in which array output is generated binaurally using the head-related transfer function. In this framework, the target direction is enhanced and the spatial information of all sources is perceived by the listener. The performance of the proposed binaural beamformer is compared to that of a non-binaural maximum directivity beamformer based on spatial reproduction listening tests. The average percentage of correct decisions is calculated over 5 subjects and is shown to be higher when the binaural beamformer is used for every tested SNR.
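
For orientation, the non-binaural baseline named in the abstract can be illustrated with the standard textbook form of an axis-symmetric maximum directivity beam pattern for a spherical array. The following is a minimal sketch under stated assumptions, not the authors' implementation: the array order, the function names, and the quadrature size are all hypothetical, since the abstract does not specify them.

```python
import numpy as np
from numpy.polynomial import legendre


def max_directivity_pattern(order, cos_angle):
    """Axis-symmetric beam pattern, normalized to 1 in the look direction.

    `order` is an assumed spherical-harmonic order; `cos_angle` is the cosine
    of the angle between the look direction and the arrival direction.
    """
    n = np.arange(order + 1)
    coeffs = (2 * n + 1) / (order + 1) ** 2  # Legendre-series weights
    return legendre.legval(cos_angle, coeffs)


def directivity_factor(order, quad_points=64):
    """Ratio of look-direction power to the power averaged over the sphere."""
    x, w = legendre.leggauss(quad_points)  # Gauss-Legendre nodes/weights on [-1, 1]
    pattern = max_directivity_pattern(order, x)
    mean_power = 0.5 * np.sum(w * pattern ** 2)  # spherical average of |B|^2
    return max_directivity_pattern(order, 1.0) ** 2 / mean_power
```

Under these assumptions, `directivity_factor(N)` evaluates numerically to the well-known value (N + 1)^2, for example 16 for an assumed order of 3. Such a beamformer yields a single-channel output, which is why it conveys no spatial information about the interfering sources to the listener.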

Original language: English
Article number: 055050
Journal: Proceedings of Meetings on Acoustics
Volume: 19
State: Published - 19 Jun 2013
Event: 21st International Congress on Acoustics, ICA 2013 - 165th Meeting of the Acoustical Society of America - Montreal, QC, Canada
Duration: 2 Jun 2013 to 7 Jun 2013

ASJC Scopus subject areas

  • Acoustics and Ultrasonics
