DisCover: Disentangled Music Representation Learning for Cover Song Identification

Xun, Jiahao; Zhang, Shengyu; Yang, Yanting; Zhu, Jieming; Deng, Liqun; Zhao, Zhou; Dong, Zhenhua; Li, Ruiqi; Zhang, Lichao; Wu, Fei

Abstract: In the field of music information retrieval (MIR), cover song identification (CSI) is a challenging task that aims to identify cover versions of a query song from a massive collection. Existing works still suffer from high intra-song variances and inter-song correlations, due to the entangled nature of version-specific and version-invariant factors in their modeling. In this work, we set the goal of disentangling version-specific and version-invariant factors, which could make it easier for the model to learn invariant music representations for unseen query songs. We analyze the CSI task from a disentanglement view with the causal graph technique, and identify the intra-version and inter-version effects biasing the invariant learning. To block these effects, we propose the disentangled music representation learning framework (DisCover) for CSI. DisCover consists of two critical components: (1) Knowledge-guided Disentanglement Module (KDM) and (2) Gradient-based Adversarial Disentanglement Module (GADM), which block intra-version and inter-version biased effects, respectively. KDM minimizes the mutual information between the learned representations and version-variant factors that are identified with prior domain knowledge. GADM identifies version-variant factors by simulating the representation transitions between intra-song versions, and exploits adversarial distillation for effect blocking. Extensive comparisons with best-performing methods and in-depth analysis demonstrate the effectiveness of DisCover and the necessity of disentanglement for CSI.
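The retrieval setting the abstract describes (find cover versions of a query song in a collection) can be illustrated with a minimal sketch. This is not the DisCover model: it assumes each track has already been mapped to a fixed-length embedding by some encoder, and it assumes cosine similarity as the ranking score. The function names, track ids, and toy vectors are all hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query, collection):
    """Return (track_id, score) pairs sorted from most to least similar to the query."""
    scored = [
        (track_id, cosine_similarity(query, embedding))
        for track_id, embedding in collection.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored


# Hypothetical hand-made embeddings; a real system would obtain them from a trained encoder.
collection = {
    "cover_a": [0.9, 0.1, 0.0],
    "cover_b": [0.8, 0.2, 0.1],
    "other": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]

ranking = rank_candidates(query, collection)
for track_id, score in ranking:
    print(f"{track_id}: {score:.3f}")
```

In this framing, a version-invariant representation is one where covers of the same song score close to each other regardless of key, tempo, or instrumentation, which is the property the disentanglement modules are meant to encourage.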

Subjects:	Information Retrieval (cs.IR); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2307.09775 [cs.IR]
	(or arXiv:2307.09775v1 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.2307.09775
