Optimized Couplings for Watermarking Large Language Models

Tsur, Dor; Long, Carol Xuan; Verdun, Claudio Mayrink; Hsu, Hsiang; Permuter, Haim; Calmon, Flavio P.

arXiv:2505.08878 (cs)

[Submitted on 13 May 2025]

Abstract: Large language models (LLMs) can now produce text that is, in many cases, seemingly indistinguishable from human-generated content. This has fueled the development of watermarks that imprint a "signal" in LLM-generated text with minimal perturbation of an LLM's output. This paper provides an analysis of text watermarking in a one-shot setting. Through the lens of hypothesis testing with side information, we formulate and analyze the fundamental trade-off between watermark detection power and distortion in the quality of the generated text. We argue that a key component in watermark design is generating a coupling between the side information shared with the watermark detector and a random partition of the LLM vocabulary. Our analysis identifies the optimal coupling and randomization strategy under the worst-case LLM next-token distribution that satisfies a min-entropy constraint. We provide a closed-form expression of the resulting detection rate under the proposed scheme and quantify the cost in a max-min sense. Finally, we provide an array of numerical results, comparing the proposed scheme with the theoretical optimum and existing schemes, on both synthetic data and LLM watermarking. Our code is available at this https URL

Comments:	Accepted at ISIT25
Subjects:	Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Information Theory (cs.IT)
Cite as:	arXiv:2505.08878 [cs.CR]
	(or arXiv:2505.08878v1 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2505.08878

Submission history

From: Dor Tsur
[v1] Tue, 13 May 2025 18:08:12 UTC (1,933 KB)
