SDMatte: Grafting Diffusion Models for Interactive Matting

Huang, Longfei; Liang, Yu; Zhang, Hao; Chen, Jinwei; Dong, Wei; Chen, Lunde; Liu, Wanyu; Li, Bo; Jiang, Peng-Tao

arXiv:2508.00443 (cs)

[Submitted on 1 Aug 2025 (v1), last revised 4 Aug 2025 (this version, v2)]

Abstract: Recent interactive matting methods perform satisfactorily in capturing the primary regions of objects, but they fall short in extracting fine-grained details in edge regions. Diffusion models trained on billions of image-text pairs demonstrate exceptional capability in modeling highly complex data distributions and synthesizing realistic texture details, while exhibiting robust text-driven interaction capabilities, making them an attractive solution for interactive matting. To this end, we propose SDMatte, a diffusion-driven interactive matting model, with three key contributions. First, we exploit the powerful priors of diffusion models and transform their text-driven interaction capability into a visual prompt-driven interaction capability to enable interactive matting. Second, we integrate coordinate embeddings of visual prompts and opacity embeddings of target objects into the U-Net, enhancing SDMatte's sensitivity to spatial position information and opacity information. Third, we propose a masked self-attention mechanism that enables the model to focus on areas specified by visual prompts, leading to better performance. Extensive experiments on multiple datasets demonstrate the superior performance of our method, validating its effectiveness in interactive matting. Our code and model are available at this https URL.
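
The abstract does not spell out how the masked self-attention is built, so the following is only a minimal sketch of the generic technique, not the authors' implementation. It assumes that image patches are represented as token vectors, that a hypothetical boolean `key_mask` marks the patches covered by the visual prompt, and that identity projections (Q = K = V) are acceptable for illustration. The function name `masked_self_attention` is likewise an assumption.

```python
import math


def masked_self_attention(tokens, key_mask):
    """Generic masked self-attention over a list of token vectors.

    tokens:   list of equal-length float lists (one per image patch).
    key_mask: list of bools; True marks patches inside the prompted region.
    Identity projections (Q = K = V = tokens) keep the sketch short;
    this is an illustrative assumption, not the SDMatte architecture.
    """
    if not any(key_mask):
        raise ValueError("key_mask must select at least one token")
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    for q in tokens:
        # Scaled dot-product scores; keys outside the mask get -inf.
        scores = [
            scale * sum(a * b for a, b in zip(q, k)) if keep else float("-inf")
            for k, keep in zip(tokens, key_mask)
        ]
        # Numerically stable softmax; exp(-inf) evaluates to 0.0,
        # so masked-out keys receive zero attention weight.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Weighted sum of the value vectors.
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, tokens)) for d in range(dim)]
        )
    return outputs
```

In this sketch every query, including those outside the prompted region, can only draw information from the patches selected by `key_mask`, which is one simple way of steering attention toward a user-specified area.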

Comments: Accepted at ICCV 2025, 11 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2508.00443 [cs.CV] (or arXiv:2508.00443v2 [cs.CV] for this version)
DOI: https://doi.org/10.48550/arXiv.2508.00443

Submission history

From: Longfei Huang
[v1] Fri, 1 Aug 2025 09:00:48 UTC (2,977 KB)
[v2] Mon, 4 Aug 2025 15:30:18 UTC (2,977 KB)
