PiCoGen2: Piano cover generation with transfer learning approach and weakly aligned data

Tan, Chih-Pin; Ai, Hsin; Chang, Yi-Hsin; Guan, Shuen-Huei; Yang, Yi-Hsuan

arXiv:2408.01551 (cs)

[Submitted on 2 Aug 2024]

Abstract: Piano cover generation aims to create a piano cover from a pop song. Existing approaches mainly employ supervised learning, and the training demands strongly-aligned and paired song-to-piano data, which is built by remapping piano notes to song audio. This would, however, result in the loss of piano information and accordingly cause inconsistencies between the original and remapped piano versions. To overcome this limitation, we propose a transfer learning approach that pre-trains our model on piano-only data and fine-tunes it on weakly-aligned paired data constructed without note remapping. During pre-training, to guide the model to learn piano composition concepts instead of merely transcribing audio, we use an existing lead sheet transcription model as the encoder to extract high-level features from the piano recordings. The pre-trained model is then fine-tuned on the paired song-piano data to transfer the learned composition knowledge to the pop song domain. Our evaluation shows that this training strategy enables our model, named PiCoGen2, to attain high-quality results, outperforming baselines on both objective and subjective metrics across five pop genres.
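
The two-stage schedule described in the abstract (pre-train on plentiful piano-only data, then fine-tune on a small weakly-aligned paired set, with a fixed encoder supplying features) can be illustrated with a toy numeric sketch. This is not the PiCoGen2 model or its code: the scalar "recordings", the `frozen_encoder` function, the linear decoder, the learning rates, and the dataset sizes are all hypothetical stand-ins chosen only to show the order of the training stages.

```python
import random

def frozen_encoder(x):
    # Hypothetical stand-in for the existing lead sheet transcription encoder.
    # It has no trainable parameters here, so it stays fixed in both stages.
    return [x, x * x]

def predict(weights, features):
    # Toy linear "decoder" over the encoder features.
    return sum(w * f for w, f in zip(weights, features))

def train(weights, pairs, lr, epochs):
    # Plain SGD on squared error; only the decoder weights are updated.
    weights = list(weights)
    for _ in range(epochs):
        for source, target in pairs:
            feats = frozen_encoder(source)
            err = predict(weights, feats) - target
            weights = [w - lr * err * f for w, f in zip(weights, feats)]
    return weights

def mse(weights, pairs):
    # Mean squared error of the decoder on a list of (source, target) pairs.
    return sum((predict(weights, frozen_encoder(s)) - t) ** 2
               for s, t in pairs) / len(pairs)

rng = random.Random(0)

# Stage 1 data (assumed): large piano-only set, source and target from the same item.
piano_only = [(x, 2.0 * x + 0.5 * x * x)
              for x in (rng.uniform(-1, 1) for _ in range(200))]

# Stage 2 data (assumed): small paired set with a slightly shifted mapping and
# noisy targets, standing in for weakly-aligned song-piano pairs.
paired = [(x, 2.2 * x + 0.5 * x * x + rng.gauss(0, 0.05))
          for x in (rng.uniform(-1, 1) for _ in range(20))]

# Pre-train on piano-only data, then fine-tune the same weights on paired data.
pretrained = train([0.0, 0.0], piano_only, lr=0.1, epochs=20)
finetuned = train(pretrained, paired, lr=0.05, epochs=5)

# Reference point: the same fine-tuning budget without any pre-training.
scratch = train([0.0, 0.0], paired, lr=0.05, epochs=5)

print("fine-tuned MSE on paired data:", mse(finetuned, paired))
print("from-scratch MSE on paired data:", mse(scratch, paired))
```

In this toy setting the fine-tuned weights start close to a good solution, so the small paired set only has to correct the domain shift. That mirrors the motivation stated in the abstract, but it says nothing about the actual results reported in the paper.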

Comments:	Accepted at the 25th International Society for Music Information Retrieval Conference (ISMIR), 2024
Subjects:	Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2408.01551 [cs.SD]
	(or arXiv:2408.01551v1 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.2408.01551

Submission history

From: Chih-Pin Tan
[v1] Fri, 2 Aug 2024 19:45:18 UTC (5,526 KB)
