TY - GEN
T1 - Music SketchNet: Controllable Music Generation via Factorized Representations of Pitch and Rhythm
T2 - 21st International Society for Music Information Retrieval Conference, ISMIR 2020
AU - Chen, Ke
AU - Wang, Cheng-i
AU - Berg-Kirkpatrick, Taylor
AU - Dubnov, Shlomo
N1 - Publisher Copyright:
© Ke Chen, Cheng-i Wang, Taylor Berg-Kirkpatrick, Shlomo Dubnov.
PY - 2020/1/1
Y1 - 2020/1/1
AB - Drawing an analogy with automatic image completion systems, we propose Music SketchNet, a neural network framework that allows users to specify partial musical ideas guiding automatic music generation. We focus on generating the missing measures in incomplete monophonic musical pieces, conditioned on surrounding context, and optionally guided by user-specified pitch and rhythm snippets. First, we introduce SketchVAE, a novel variational autoencoder that explicitly factorizes rhythm and pitch contour to form the basis of our proposed model. Then we introduce two discriminative architectures, SketchInpainter and SketchConnector, that in conjunction perform the guided music completion, filling in representations for the missing measures conditioned on surrounding context and user-specified snippets. We evaluate SketchNet on a standard dataset of Irish folk music and compare with models from recent works. When used for music completion, our approach outperforms the state-of-the-art both in terms of objective metrics and subjective listening tests. Finally, we demonstrate that our model can successfully incorporate user-specified snippets during the generation process.
UR - http://www.scopus.com/inward/record.url?scp=85179211237&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85179211237
T3 - Proceedings of the 21st International Society for Music Information Retrieval Conference, ISMIR 2020
SP - 77
EP - 84
BT - Proceedings of the 21st International Society for Music Information Retrieval Conference, ISMIR 2020
A2 - Cumming, Julie
A2 - Lee, Jin Ha
A2 - McFee, Brian
A2 - Schedl, Markus
A2 - Devaney, Johanna
A2 - McKay, Cory
A2 - Zangerle, Eva
A2 - de Reuse, Timothy
PB - International Society for Music Information Retrieval
Y2 - 11 October 2020 through 16 October 2020
ER -