Learning Sub-Dimensional HRTF Representations Towards Individualization Applications - Traditional and Deep Learning Approaches

Devansh Zurale, Shlomo Dubnov

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Individualized Head-Related Transfer Functions (HRTFs) are indispensable in order to accurately reproduce spatial audio over headphones. Encoding the high-dimensional HRTFs to a sub-dimensional space has proven to be useful in many previous research efforts in predicting individualized HRTFs. In this work, we provide a comparative study of some traditional methods such as Principal Component Analysis (PCA) or Multi-Layer Perceptron (MLP) based Autoencoders and the more recent generative deep learning approaches such as a Convolutional Neural Network (CNN) based Vector Quantized Variational Autoencoder (VQ-VAE) for learning HRTF representations. We further demonstrate the benefits of using 3D-CNNs for this task to learn correlations between neighboring HRTFs, along both spatial and frequency dimensions. To this end, we provide evidence suggesting that such a 3D-CNN based approach enables the derived latent space to encode features more representative of the individuality of the HRTFs while also allowing for the representations to be significantly more compact. Finally, we also explore the advantages of such robust representations towards downstream applications of predicting individualized HRTFs.
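
As a rough illustration of what "encoding HRTFs to a sub-dimensional space" can mean for the PCA baseline named in the abstract, the sketch below projects spectra onto a few principal directions and reconstructs them. It is a minimal sketch under stated assumptions, not the authors' pipeline: the data is synthetic, and the helper names (`pca_fit`, `pca_encode`, `pca_decode`), the array sizes, and the choice of 8 components are all hypothetical.

```python
import numpy as np

def pca_fit(X, k):
    """Return the data mean and the top-k principal directions of X (rows = samples)."""
    mean = X.mean(axis=0)
    # SVD of the centered data; the rows of Vt are orthonormal principal directions.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def pca_encode(X, mean, components):
    """Project samples onto the principal directions (compact representation)."""
    return (X - mean) @ components.T

def pca_decode(Z, mean, components):
    """Map compact representations back to the original dimensionality."""
    return Z @ components + mean

# ASSUMPTION: synthetic stand-in for HRTF log-magnitude spectra, not real measurements.
rng = np.random.default_rng(0)
n_hrtfs, n_bins, true_rank = 200, 128, 5
X = rng.normal(size=(n_hrtfs, true_rank)) @ rng.normal(size=(true_rank, n_bins))
X += 0.01 * rng.normal(size=X.shape)  # small measurement-like noise

mean, comps = pca_fit(X, k=8)         # 128 frequency bins -> 8 coefficients
Z = pca_encode(X, mean, comps)
X_hat = pca_decode(Z, mean, comps)
rmse = float(np.sqrt(np.mean((X - X_hat) ** 2)))
print(Z.shape, rmse)
```

Because PCA is a linear map applied to each spectrum independently, it does not by itself exploit correlations between neighboring directions; the abstract attributes that capability to the 3D-CNN based models.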

Original language: English
Title of host publication: Proceedings of the 2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2023
Publisher: Institute of Electrical and Electronics Engineers
ISBN (Electronic): 9798350323726
State: Published - 1 Jan 2023
Externally published: Yes
Event: 2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2023 - New Paltz, United States
Duration: 22 Oct 2023 to 25 Oct 2023

Publication series

Name: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
Volume: 2023-October
ISSN (Print): 1931-1168
ISSN (Electronic): 1947-1629

Conference

Conference: 2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2023
Country/Territory: United States
City: New Paltz
Period: 22/10/23 to 25/10/23

Keywords

  • Generative AI
  • HRTF Modeling
  • PCA
  • Representation Learning
  • VQVAE

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Science Applications
