Test-time Prompt Refinement for Text-to-Image Models

Khan, Mohammad Abdul Hafeez; Jain, Yash; Bhattacharyya, Siddhartha; Vineet, Vibhav

arXiv:2507.22076 (cs)

[Submitted on 22 Jul 2025]

Abstract: Text-to-image (T2I) generation models have made significant strides but still struggle with prompt sensitivity: even minor changes in prompt wording can yield inconsistent or inaccurate outputs. To address this challenge, we introduce TIR, a closed-loop, test-time prompt refinement framework that requires no additional training of the underlying T2I model. In our approach, each generation step is followed by a refinement step, where a pretrained multimodal large language model (MLLM) analyzes the output image and the user's prompt. The MLLM detects misalignments (e.g., missing objects, incorrect attributes) and produces a refined and physically grounded prompt for the next round of image generation. By iteratively refining the prompt and verifying alignment between the prompt and the image, TIR corrects errors, mirroring the iterative refinement process of human artists. We demonstrate that this closed-loop strategy improves alignment and visual coherence across multiple benchmark datasets, all while maintaining plug-and-play integration with black-box T2I models.
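
The generate-then-refine loop described in the abstract might be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the names `refine_loop`, `Critique`, `max_rounds`, and the two toy stand-ins are assumptions introduced here, and the stopping rule (stop once the critic reports alignment, or after a fixed number of rounds) is likewise assumed. The T2I model and the MLLM are treated as opaque callables, which matches the black-box, training-free setting the abstract describes.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class Critique:
    """Hypothetical container for the MLLM's verdict on one image."""
    aligned: bool          # True if the image matches the user's prompt
    refined_prompt: str    # prompt to use for the next generation round


def refine_loop(
    user_prompt: str,
    generate: Callable[[str], Any],            # black-box T2I model
    critique: Callable[[str, Any], Critique],  # MLLM: (user prompt, image) -> verdict
    max_rounds: int = 3,                       # assumed round budget
) -> Tuple[Any, List[Tuple[str, Any]]]:
    """Alternate generation and refinement until aligned or out of rounds."""
    prompt = user_prompt
    history: List[Tuple[str, Any]] = []
    image = None
    for _ in range(max_rounds):
        image = generate(prompt)
        history.append((prompt, image))
        verdict = critique(user_prompt, image)  # always judged against the user's prompt
        if verdict.aligned:
            break
        prompt = verdict.refined_prompt
    return image, history


# Toy stand-ins so the sketch runs without any real model.
def toy_generate(prompt: str) -> frozenset:
    """Pretend generator: an 'image' is just the set of objects it drew.
    It forgets the hat unless the prompt asks for it emphatically."""
    objects = set()
    if "cat" in prompt:
        objects.add("cat")
    if "clearly visible hat" in prompt:
        objects.add("hat")
    return frozenset(objects)


def toy_critique(user_prompt: str, image: frozenset) -> Critique:
    """Pretend MLLM: reports missing objects and rewrites the prompt."""
    missing = sorted({"cat", "hat"} - image)
    if not missing:
        return Critique(True, user_prompt)
    refined = user_prompt + ", with a clearly visible " + " and ".join(missing)
    return Critique(False, refined)


if __name__ == "__main__":
    final_image, rounds = refine_loop("a cat wearing a hat", toy_generate, toy_critique)
    for step, (used_prompt, img) in enumerate(rounds, start=1):
        print(step, repr(used_prompt), sorted(img))
```

In this toy run the first image omits the hat, the critic rewrites the prompt, and the second round is judged aligned. With real models, `generate` and `critique` would wrap API calls to a T2I system and an MLLM respectively.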

Comments:	Accepted to ICCV 2025, MARS2 Workshop. Total 14 pages, 12 figures and 3 tables
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2507.22076 [cs.LG]
	(or arXiv:2507.22076v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2507.22076

Submission history

From: Mohammed Abdul Hafeez Khan [view email]
[v1] Tue, 22 Jul 2025 20:30:13 UTC (107,692 KB)
