BiMark: Unbiased Multilayer Watermarking for Large Language Models

Feng, Xiaoyan; Zhang, He; Zhang, Yanjun; Zhang, Leo Yu; Pan, Shirui

arXiv:2506.21602 (cs)

[Submitted on 19 Jun 2025]

Abstract: Recent advances in Large Language Models (LLMs) have raised urgent concerns about LLM-generated text authenticity, prompting regulatory demands for reliable identification mechanisms. Although watermarking offers a promising solution, existing approaches struggle to simultaneously achieve three critical requirements: text quality preservation, model-agnostic detection, and message embedding capacity, which are crucial for practical implementation. To achieve these goals, the key challenge lies in balancing the trade-off between text quality preservation and message embedding capacity. To address this challenge, we propose BiMark, a novel watermarking framework that achieves these requirements through three key innovations: (1) a bit-flip unbiased reweighting mechanism enabling model-agnostic detection, (2) a multilayer architecture enhancing detectability without compromising generation quality, and (3) an information encoding approach supporting multi-bit watermarking. Through theoretical analysis and extensive experiments, we validate that, compared to state-of-the-art multi-bit watermarking methods, BiMark achieves up to 30% higher extraction rates for short texts while maintaining text quality indicated by lower perplexity, and performs comparably to non-watermarked text on downstream tasks such as summarization and translation.
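
The abstract names an "unbiased" reweighting mechanism without giving its details. As a rough illustration of what unbiasedness means in this setting, the sketch below reweights a next-token distribution according to a fair coin flip, so that the average over both flip outcomes equals the original distribution. This is a generic construction written for illustration, not the paper's algorithm: the function names `reweight` and `average`, the vocabulary split `green`, and the `strength` parameter are all hypothetical.

```python
from typing import Dict, Set


def reweight(p: Dict[str, float], green: Set[str], flip: int,
             strength: float = 0.5) -> Dict[str, float]:
    """Shift probability mass toward one side of a vocabulary split.

    Hypothetical sketch: flip == 0 moves mass toward `green` tokens,
    flip == 1 moves the same amount of mass away from them.
    """
    if flip not in (0, 1):
        raise ValueError("flip must be 0 or 1")
    p_green = sum(prob for tok, prob in p.items() if tok in green)
    p_red = sum(prob for tok, prob in p.items() if tok not in green)
    # If one side has no mass, there is nothing to shift.
    if p_green <= 0.0 or p_red <= 0.0:
        return dict(p)
    # The shifted mass is capped so that no probability becomes negative.
    delta = strength * min(p_green, p_red)
    sign = 1.0 if flip == 0 else -1.0
    green_scale = (p_green + sign * delta) / p_green
    red_scale = (p_red - sign * delta) / p_red
    return {tok: prob * (green_scale if tok in green else red_scale)
            for tok, prob in p.items()}


def average(q0: Dict[str, float], q1: Dict[str, float]) -> Dict[str, float]:
    """Expectation of the reweighted distribution under a fair coin flip."""
    return {tok: 0.5 * (q0[tok] + q1[tok]) for tok in q0}


# Toy next-token distribution and an arbitrary vocabulary split.
p = {"a": 0.5, "b": 0.2, "c": 0.2, "d": 0.1}
green = {"a", "c"}
q0 = reweight(p, green, flip=0)
q1 = reweight(p, green, flip=1)
mean = average(q0, q1)
```

In this toy setup, each individual flip visibly skews the distribution toward or away from the `green` tokens, which is the kind of signal a detector could test for, while `mean` matches `p` up to floating-point error, which is why such a scheme leaves the expected output distribution unchanged.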

Comments: Accepted at the International Conference on Machine Learning (ICML) 2025
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as: arXiv:2506.21602 [cs.CL] (or arXiv:2506.21602v1 [cs.CL] for this version)
DOI: https://doi.org/10.48550/arXiv.2506.21602

Submission history

From: Xiaoyan Feng
[v1] Thu, 19 Jun 2025 11:08:59 UTC (3,150 KB)
