Deep Performer: Score-to-Audio Music Performance Synthesis

Dong, Hao-Wen; Zhou, Cong; Berg-Kirkpatrick, Taylor; McAuley, Julian

arXiv:2202.06034 (cs)

[Submitted on 12 Feb 2022 (v1), last revised 21 Feb 2022 (this version, v2)]

Abstract: Music performance synthesis aims to synthesize a musical score into a natural performance. In this paper, we borrow recent advances in text-to-speech synthesis and present the Deep Performer, a novel system for score-to-audio music performance synthesis. Unlike speech, music often contains polyphony and long notes. Hence, we propose two new techniques for handling polyphonic inputs and providing fine-grained conditioning in a transformer encoder-decoder model. To train our proposed system, we present a new violin dataset consisting of paired recordings and scores along with estimated alignments between them. We show that our proposed model can synthesize music with clear polyphony and harmonic structures. In a listening test, we achieve competitive quality against the baseline model, a conditional generative audio model, in terms of pitch accuracy, timbre, and noise level. Moreover, our proposed model significantly outperforms the baseline on an existing piano dataset in overall quality.

Comments:	ICASSP 2022 final version with appendix
Subjects:	Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
Cite as:	arXiv:2202.06034 [cs.SD]
	(or arXiv:2202.06034v2 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.2202.06034

Submission history

From: Hao-Wen Dong
[v1] Sat, 12 Feb 2022 10:36:52 UTC (1,529 KB)
[v2] Mon, 21 Feb 2022 03:29:43 UTC (1,606 KB)
