Modeling the Compatibility of Stem Tracks to Generate Music Mashups

Huang, Jiawen; Wang, Ju-Chiang; Smith, Jordan B. L.; Song, Xuchen; Wang, Yuxuan

arXiv:2103.14208 (cs)

[Submitted on 26 Mar 2021]

Abstract: A music mashup combines audio elements from two or more songs to create a new work. To reduce the time and effort required to make mashups, researchers have developed algorithms that predict the compatibility of audio elements. Prior work has focused on mixing unaltered excerpts, but advances in source separation enable the creation of mashups from isolated stems (e.g., vocals, drums, and bass). In this work, we take advantage of separated stems not just for creating mashups, but for training a model that predicts the mutual compatibility of groups of excerpts, using self-supervised and semi-supervised methods. Specifically, we first build a random mashup creation pipeline that combines stem tracks obtained via source separation, with key and tempo automatically adjusted to match, since these are prerequisites for high-quality mashups. To train a model to predict compatibility, we use stem tracks obtained from the same song as positive examples, and random combinations of stems with key and/or tempo unadjusted as negative examples. To improve the model and use more data, we also train on "average" examples: random combinations with matching key and tempo, which we treat as unlabeled data because their true compatibility is unknown. To determine whether the combined signal or the set of stem signals is more indicative of the quality of the result, we experiment with two model architectures and train them using a semi-supervised learning technique. Finally, we conduct objective and subjective evaluations of the system, comparing it to a standard rule-based system.
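
The three kinds of training examples described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Stem` and `Example` classes, the function names, the pitch-class representation of key, and the choice of three stem types are all assumptions made for illustration, and key and tempo matching is simulated on metadata only rather than by pitch-shifting or time-stretching audio.

```python
import random
from dataclasses import dataclass, replace
from typing import Dict, Optional, Tuple

# Assumption: three stem types, taken from the examples named in the abstract.
STEM_TYPES = ("vocals", "drums", "bass")


@dataclass(frozen=True)
class Stem:
    """Hypothetical metadata record for one separated stem."""
    song_id: str
    stem_type: str
    key: int      # pitch class 0-11 (assumed representation)
    tempo: float  # beats per minute


@dataclass(frozen=True)
class Example:
    stems: Tuple[Stem, ...]
    label: Optional[int]  # 1 = positive, 0 = negative, None = unlabeled


# Library layout assumed here: song_id -> stem_type -> Stem
Library = Dict[str, Dict[str, Stem]]


def semitone_shift(source_key: int, target_key: int) -> int:
    """Smallest pitch shift in semitones (range -6..5) from source to target."""
    return (target_key - source_key + 6) % 12 - 6


def stretch_rate(source_tempo: float, target_tempo: float) -> float:
    """Playback-rate factor that brings source_tempo to target_tempo."""
    return target_tempo / source_tempo


def positive_example(song_stems: Dict[str, Stem]) -> Example:
    """Stems from the same song are treated as compatible."""
    return Example(tuple(song_stems[t] for t in STEM_TYPES), label=1)


def random_combination(library: Library, rng: random.Random) -> Tuple[Stem, ...]:
    """Pick each stem type from a different, randomly chosen song."""
    song_ids = rng.sample(sorted(library), k=len(STEM_TYPES))
    return tuple(library[sid][t] for sid, t in zip(song_ids, STEM_TYPES))


def negative_example(library: Library, rng: random.Random) -> Example:
    """Random combination with key and tempo left unadjusted."""
    return Example(random_combination(library, rng), label=0)


def unlabeled_example(library: Library, rng: random.Random) -> Example:
    """Random combination with key and tempo matched to the first stem.

    Only the metadata is updated; a real pipeline would apply
    semitone_shift and stretch_rate to the audio itself.
    """
    stems = random_combination(library, rng)
    ref = stems[0]
    matched = tuple(replace(s, key=ref.key, tempo=ref.tempo) for s in stems)
    return Example(matched, label=None)
```

In this sketch the unlabeled examples carry `label=None`, so a semi-supervised training loop can route them to an unsupervised loss term while the labeled examples feed a standard classification loss. How the paper actually weights or uses these examples is not stated in the abstract.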

Comments:	This is a preprint of the paper accepted by AAAI-21. Please cite the version included in the Proceedings of the 35th AAAI Conference on Artificial Intelligence
Subjects:	Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2103.14208 [cs.SD]
	(or arXiv:2103.14208v1 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.2103.14208

Submission history

From: Ju-Chiang Wang
[v1] Fri, 26 Mar 2021 01:51:11 UTC (2,158 KB)
