Chimera: Compositional Image Generation using Part-based Concepting

Singh, Shivam; Chen, Yiming; Chatterjee, Agneet; Raj, Amit; Hays, James; Yang, Yezhou; Baral, Chitta

arXiv:2510.18083 (cs)

[Submitted on 20 Oct 2025 (v1), last revised 22 Oct 2025 (this version, v2)]

Abstract: Personalized image generative models are highly proficient at synthesizing images from text or a single image, yet they lack explicit control for composing objects from specific parts of multiple source images without user-specified masks or annotations. To address this, we introduce Chimera, a personalized image generation model that generates novel objects by combining specified parts from different source images according to textual instructions. To train our model, we first construct a dataset from a taxonomy built on 464 unique (part, subject) pairs, which we term semantic atoms. From this, we generate 37k prompts and synthesize the corresponding images with a high-fidelity text-to-image model. We train a custom diffusion prior model with part-conditional guidance, which steers the image-conditioning features to enforce both semantic identity and spatial layout. We also introduce an objective metric, PartEval, to assess the fidelity and compositional accuracy of generation pipelines. Human evaluations and our proposed metric show that Chimera outperforms other baselines by 14% in part alignment and compositional accuracy and 21% in visual quality.
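
To make the idea of semantic atoms concrete, the following is a minimal sketch of how (part, subject) pairs might be combined into text prompts. It is not the authors' pipeline: the four atoms, the `compose_prompts` helper, and the prompt template are all hypothetical and chosen only for illustration; the abstract does not describe how the 37k prompts were actually built.

```python
from itertools import combinations

# Hypothetical semantic atoms as (part, subject) pairs.
# The paper's taxonomy has 464 of these; these four are invented examples.
ATOMS = [
    ("head", "lion"),
    ("wings", "eagle"),
    ("tail", "fish"),
    ("body", "horse"),
]


def compose_prompts(atoms, k=2):
    """Yield one prompt per k-subset of atoms with distinct parts and subjects.

    The template string is an assumption, not the one used in the paper.
    """
    for combo in combinations(atoms, k):
        parts = {part for part, _ in combo}
        subjects = {subject for _, subject in combo}
        # Skip combinations that reuse a part or a subject, since the goal
        # is to merge different parts taken from different sources.
        if len(parts) < k or len(subjects) < k:
            continue
        phrases = [f"the {part} of the {subject}" for part, subject in combo]
        yield "a creature with " + " and ".join(phrases)


prompts = list(compose_prompts(ATOMS))
for prompt in prompts:
    print(prompt)
```

With four atoms and pairs of size two, this toy version yields six prompts; a real taxonomy would likely need further filtering for plausible part combinations.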

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2510.18083 [cs.CV]
	(or arXiv:2510.18083v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2510.18083

Submission history

From: Shivam Singh
[v1] Mon, 20 Oct 2025 20:20:47 UTC (45,297 KB)
[v2] Wed, 22 Oct 2025 04:47:22 UTC (45,297 KB)
