TOWARDS IMPROVING HARMONIC SENSITIVITY AND PREDICTION STABILITY FOR SINGING MELODY EXTRACTION

Keren Shao, Ke Chen, Taylor Berg-Kirkpatrick, Shlomo Dubnov

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

In deep learning research, many melody extraction models rely on redesigning neural network architectures to improve performance. In this paper, we propose an input feature modification and a training objective modification based on two assumptions. First, harmonics in the spectrograms of audio data decay rapidly along the frequency axis. To enhance the model's sensitivity to the trailing harmonics, we modify the Combined Frequency and Periodicity (CFP) representation using the discrete z-transform. Second, vocal and non-vocal segments of extremely short duration are uncommon. To ensure a more stable melody contour, we design a differentiable loss function that prevents the model from predicting such segments. We apply these modifications to several models, including MSNet, FTANet, and a newly introduced model, PianoNet, modified from a piano transcription network. Our experimental results demonstrate that the proposed modifications are empirically effective for singing melody extraction.
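
The abstract does not give the form of the loss, so the following is only an illustrative sketch of the general idea, not the authors' method. It swaps in a smoothed total-variation penalty over per-frame voicing probabilities, which grows with every vocal/non-vocal switch and therefore discourages very short segments. The function name `switch_penalty`, the `eps` smoothing constant, and the toy sequences are assumptions made for this example.

```python
import numpy as np

def switch_penalty(voicing_prob, eps=1e-8):
    """Smoothed total variation of a per-frame voicing probability sequence.

    Hypothetical stand-in for a stability loss: each vocal/non-vocal switch
    adds roughly the size of the jump, so brief flickers are costly.
    sqrt(d^2 + eps) is a smooth surrogate for |d|, differentiable everywhere.
    """
    p = np.asarray(voicing_prob, dtype=float)
    diffs = np.diff(p)  # frame-to-frame change in voicing probability
    return float(np.sum(np.sqrt(diffs ** 2 + eps)))

# Toy sequences (assumed data): one clean onset vs. a one-frame unvoiced gap.
stable = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0]
flicker = [0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0]

print(switch_penalty(stable) < switch_penalty(flicker))
```

In training, a term like this would typically be added to the main classification loss with a small weight, so that it smooths the contour without overriding the frame-level evidence.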

Original language: English
Title of host publication: 24th International Society for Music Information Retrieval Conference, ISMIR 2023 - Proceedings
Editors: Augusto Sarti, Fabio Antonacci, Mark Sandler, Paolo Bestagini, Simon Dixon, Beici Liang, Gael Richard, Johan Pauwels
Publisher: International Society for Music Information Retrieval
Pages: 657-663
Number of pages: 7
ISBN (Electronic): 9781732729933
State: Published - 1 Jan 2023
Externally published: Yes
Event: 24th International Society for Music Information Retrieval Conference, ISMIR 2023 - Milan, Italy
Duration: 5 Nov 2023 → 9 Nov 2023

Publication series

Name: 24th International Society for Music Information Retrieval Conference, ISMIR 2023 - Proceedings

Conference

Conference: 24th International Society for Music Information Retrieval Conference, ISMIR 2023
Country/Territory: Italy
City: Milan
Period: 5/11/23 → 9/11/23

ASJC Scopus subject areas

  • Music
  • Information Systems
