Adversarially Domain-adaptive Latent Diffusion for Unsupervised Semantic Segmentation

Yu, Jongmin; Sun, Zhongtian; Chi, Chen Bene; Yang, Jinhong; Luo, Shan

arXiv:2412.16859 (cs)

[Submitted on 22 Dec 2024 (v1), last revised 7 Apr 2025 (this version, v2)]

Abstract: Semantic segmentation requires extensive pixel-level annotation, motivating unsupervised domain adaptation (UDA) to transfer knowledge from labelled source domains to unlabelled or weakly labelled target domains. One of the most efficient strategies involves using synthetic datasets generated within controlled virtual environments, such as video games or traffic simulators, which can automatically generate pixel-level annotations. However, even when such datasets are available, learning a well-generalised representation that captures both domains remains challenging, owing to probabilistic and geometric discrepancies between the virtual world and real-world imagery. This work introduces a semantic segmentation method based on latent diffusion models, termed Inter-Coder Connected Latent Diffusion (ICCLD), alongside an unsupervised domain adaptation approach. The model employs an inter-coder connection to enhance contextual understanding and preserve fine details, while adversarial learning aligns latent feature distributions across domains during the latent diffusion process. Experiments on GTA5, Synthia, and Cityscapes demonstrate that ICCLD outperforms state-of-the-art UDA methods, achieving mIoU scores of 74.4 (GTA5$\rightarrow$Cityscapes) and 67.2 (Synthia$\rightarrow$Cityscapes).
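For readers unfamiliar with the metric, the mIoU figures quoted in the abstract are mean intersection-over-union percentages. The following is a minimal sketch of how such a score is commonly computed, not the authors' evaluation code. The function name `mean_iou`, the flat label lists, and the `ignore_index` default of 255 (the usual Cityscapes "void" label) are assumptions made for illustration. Benchmark scores are normally accumulated over the whole validation set rather than computed per image.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean intersection-over-union, in percent, over flat label sequences.

    Hypothetical helper for illustration only. Assumes every entry of
    `pred` is a valid class id in range(num_classes).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            # Pixels marked as void in the ground truth are not scored.
            continue
        if p == t:
            # Correct pixel: counts once in both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of both classes involved.
            union[t] += 1
            union[p] += 1
    # Average only over classes that appear in prediction or ground truth.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return 100.0 * sum(ious) / len(ious) if ious else float("nan")


# Toy example: four pixels, two classes, last pixel ignored.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 255], num_classes=2)
```

In this toy case each class has one correctly labelled pixel and a union of two pixels, so both per-class IoUs are 0.5.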

Comments: Accepted at the CVPR 2025 Workshop PVUW
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as: arXiv:2412.16859 [cs.CV] (or arXiv:2412.16859v2 [cs.CV] for this version)
DOI: https://doi.org/10.48550/arXiv.2412.16859

Submission history

From: Jongmin Yu
[v1] Sun, 22 Dec 2024 04:55:41 UTC (9,157 KB)
[v2] Mon, 7 Apr 2025 02:01:25 UTC (9,329 KB)
