Task-customized Masked AutoEncoder via Mixture of Cluster-conditional Experts

Liu, Zhili; Chen, Kai; Han, Jianhua; Hong, Lanqing; Xu, Hang; Li, Zhenguo; Kwok, James T.

arXiv:2402.05382 (cs)

[Submitted on 8 Feb 2024]

Abstract: Masked Autoencoder (MAE) is a prevailing self-supervised learning method that achieves promising results in model pre-training. However, when the various downstream tasks have data distributions different from the pre-training data, the semantically irrelevant pre-training information might result in negative transfer, impeding MAE's scalability. To address this issue, we propose a novel MAE-based pre-training paradigm, Mixture of Cluster-conditional Experts (MoCE), which can be trained once but provides customized pre-training models for diverse downstream tasks. Different from the mixture of experts (MoE), our MoCE trains each expert only with semantically relevant images by using cluster-conditional gates. Thus, each downstream task can be allocated to its customized model pre-trained with data most similar to the downstream data. Experiments on a collection of 11 downstream tasks show that MoCE outperforms the vanilla MAE by 2.45% on average. It also obtains new state-of-the-art self-supervised learning results on detection and segmentation.
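The idea of a cluster-conditional gate can be illustrated with a minimal sketch. The abstract does not specify how clusters are formed, what features are used, or how clusters map to experts, so everything below is an assumption made for illustration: the centroids, the nearest-centroid rule, the `cluster_to_expert` table, and both function names are hypothetical and are not the authors' implementation.

```python
from collections import defaultdict


def nearest_cluster(feature, centroids):
    """Return the index of the centroid closest to `feature` (squared Euclidean)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(centroids)), key=lambda k: sq_dist(feature, centroids[k]))


def cluster_conditional_route(features, centroids, cluster_to_expert):
    """Group sample indices by expert, using only each sample's cluster id.

    Assumption: the gate is a fixed lookup from cluster id to expert id, so an
    expert only ever receives samples from the clusters assigned to it.
    """
    batches = defaultdict(list)
    for i, feature in enumerate(features):
        cluster = nearest_cluster(feature, centroids)
        batches[cluster_to_expert[cluster]].append(i)
    return dict(batches)


# Toy data: three hypothetical clusters in a 2-D feature space, two experts.
centroids = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
cluster_to_expert = {0: 0, 1: 1, 2: 1}
features = [(0.5, -0.2), (9.0, 11.0), (1.0, 9.5), (-1.0, 0.3)]

routing = cluster_conditional_route(features, centroids, cluster_to_expert)
print(routing)
```

Under the same assumptions, a downstream task could be matched to an expert by assigning its images to clusters with the same rule and picking the expert that receives most of them.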

Comments:	Accepted by ICLR 2023
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:2402.05382 [cs.CV]
	(or arXiv:2402.05382v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2402.05382

Submission history

From: Zhili Liu
[v1] Thu, 8 Feb 2024 03:46:32 UTC (6,938 KB)
