LLM-Text Watermarking based on Lagrange Interpolation

Authors: Jarosław Janas, Paweł Morawiecki, Josef Pieprzyk

Abstract: The rapid advancement of LLMs (Large Language Models) has established them as a foundational technology for many AI and ML-powered human computer interactions. A critical challenge in this context is the attribution of LLM-generated text -- either to the specific language model that produced it or to the individual user who embedded their identity via a so-called multi-bit watermark. This capability is essential for combating misinformation, fake news, misinterpretation, and plagiarism. One of the key techniques for addressing this challenge is digital watermarking. This work presents a watermarking scheme for LLM-generated text based on Lagrange interpolation, enabling the recovery of a multi-bit author identity even when the text has been heavily redacted by an adversary. The core idea is to embed a continuous sequence of points $(x, f(x))$ that lie on a single straight line. The $x$-coordinates are computed pseudorandomly using a cryptographic hash function $H$ applied to the concatenation of the previous token's identity and a secret key $s_k$. Crucially, the $x$-coordinates do not need to be embedded into the text -- only the corresponding $f(x)$ values are embedded. During extraction, the algorithm recovers the original points along with many spurious ones, forming an instance of the Maximum Collinear Points (MCP) problem, which can be solved efficiently. Experimental results demonstrate that the proposed method is highly effective, allowing the recovery of the author identity even when as few as three genuine points remain after adversarial manipulation.

Submitted 12 May, 2025; v1 submitted 8 May, 2025; originally announced May 2025.
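
The extraction step described in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: the prime modulus `P`, the use of SHA-256 for $H$, the 4-byte token encoding, and the function names `x_coord` and `max_collinear` are all assumptions made for illustration. The sketch derives an $x$-coordinate from the previous token and a secret key, then solves a Maximum Collinear Points instance over a prime field with a simple quadratic-time slope count.

```python
import hashlib
from collections import Counter

P = 65537  # hypothetical prime field size (assumption, not from the paper)


def x_coord(prev_token_id: int, secret_key: bytes, p: int = P) -> int:
    """Pseudorandom x from H(previous token || secret key); SHA-256 is an assumed choice of H."""
    digest = hashlib.sha256(prev_token_id.to_bytes(4, "big") + secret_key).digest()
    return int.from_bytes(digest, "big") % p


def max_collinear(points, p: int = P):
    """Return (a, b, count) for the line y = a*x + b (mod p) passing through the most points."""
    best = (0, 0, 0)
    for i, (xi, yi) in enumerate(points):
        slopes = Counter()
        for j, (xj, yj) in enumerate(points):
            dx = (xj - xi) % p
            if i == j or dx == 0:
                continue  # skip the anchor itself and points sharing its x
            slope = (yj - yi) * pow(dx, -1, p) % p  # modular inverse, Python 3.8+
            slopes[slope] += 1
        if slopes:
            a, hits = slopes.most_common(1)[0]
            if hits + 1 > best[2]:  # +1 counts the anchor point
                best = (a, (yi - a * xi) % p, hits + 1)
    return best


# Three genuine points on y = 5x + 7 (mod P), mixed with three spurious ones.
genuine = [(x, (5 * x + 7) % P) for x in (10, 200, 3000)]
spurious = [(1, 1), (2, 50), (4, 9)]
recovered = max_collinear(genuine + spurious)
```

In this toy instance the three genuine points are enough to single out the line, which mirrors the abstract's claim that recovery remains possible with as few as three surviving points. How the line's coefficients map back to the author identity is left out here, since the listing does not specify it.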

arXiv:2410.21986 [pdf, other]

From 5G to 6G: A Survey on Security, Privacy, and Standardization Pathways

Authors: Mengmeng Yang, Youyang Qu, Thilina Ranbaduge, Chandra Thapa, Nazatul Sultan, Ming Ding, Hajime Suzuki, Wei Ni, Sharif Abuadbba, David Smith, Paul Tyler, Josef Pieprzyk, Thierry Rakotoarivelo, Xinlong Guan, Sirine M'rabet

Abstract: The vision for 6G aims to enhance network capabilities with faster data rates, near-zero latency, and higher capacity, supporting more connected devices and seamless experiences within an intelligent digital ecosystem where artificial intelligence (AI) plays a crucial role in network management and data analysis. This advancement seeks to enable immersive mixed-reality experiences, holographic communications, and smart city infrastructures. However, the expansion of 6G raises critical security and privacy concerns, such as unauthorized access and data breaches. This is due to the increased integration of IoT devices, edge computing, and AI-driven analytics. This paper provides a comprehensive overview of 6G protocols, focusing on security and privacy, identifying risks, and presenting mitigation strategies. The survey examines current risk assessment frameworks and advocates for tailored 6G solutions. We further discuss industry visions, government projects, and standardization efforts to balance technological innovation with robust security and privacy measures.

Submitted 3 October, 2024; originally announced October 2024.

arXiv:2209.02228 [pdf, ps, other]

doi 10.3390/e25040672

Compression Optimality of Asymmetric Numeral Systems

Authors: Josef Pieprzyk, Jarek Duda, Marcin Pawlowski, Seyit Camtepe, Arash Mahboubi, Pawel Morawiecki

Abstract: Compression, also known as entropy coding, has a rich and long history. However, a recent explosion of multimedia Internet applications (such as teleconferencing and video streaming) has renewed interest in fast compression that also squeezes out as much redundancy as possible. In 2009 Jarek Duda invented his asymmetric numeral system (ANS). Apart from a beautiful mathematical structure, it is very efficient and offers compression with a very low residual redundancy. ANS works well for any symbol source statistics. Besides, ANS has become a preferred compression algorithm in the IT industry. However, designing an ANS instance requires a random selection of its symbol spread function. Consequently, each ANS instance offers compression with a slightly different compression rate. The paper investigates compression optimality of ANS. It shows that ANS is optimal (i.e. the entropies of encoding and source are equal) for any symbol sources whose probability distribution is described by natural powers of 1/2. We use Markov chains to calculate ANS state probabilities. This allows us to determine the ANS compression rate precisely. We present two algorithms for finding ANS instances with high compression rates. The first explores state probability approximations in order to choose ANS instances with better compression rates. The second algorithm is a probabilistic one. It finds ANS instances whose compression rate can be made as close to the best rate as required. This is done at the expense of the number $\theta$ of internal random "coin" tosses. The algorithm complexity is ${\cal O}(\theta L^3)$, where $L$ is the number of ANS states. The complexity can be reduced to ${\cal O}(\theta L\log{L})$ if we use a fast matrix inversion. If the algorithm is implemented on a quantum computer, its complexity becomes ${\cal O}(\theta(\log{L})^3)$.

Submitted 6 September, 2022; originally announced September 2022.
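
The optimality condition in the abstract above (probabilities that are natural powers of 1/2) can be made concrete with a short calculation. The sketch below does not implement ANS. Under the assumption that each symbol is charged a whole number of bits, $\lceil -\log_2 p \rceil$, it only shows why such dyadic sources leave no gap between the source entropy and the expected code length. The function names are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a symbol source, in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_integer_length(probs):
    """Expected code length when each symbol costs ceil(-log2 p) whole bits."""
    return sum(p * math.ceil(-math.log2(p)) for p in probs if p > 0)


def redundancy(probs):
    """Gap between expected whole-bit code length and source entropy."""
    return expected_integer_length(probs) - entropy(probs)


dyadic = [1 / 2, 1 / 4, 1 / 8, 1 / 8]  # every probability is a power of 1/2
skewed = [0.6, 0.4]                    # not dyadic

dyadic_gap = redundancy(dyadic)  # no gap: entropy and code length coincide
skewed_gap = redundancy(skewed)  # strictly positive gap
```

For the dyadic source both quantities come to 1.75 bits per symbol, while the non-dyadic source carries a residual redundancy. That residual is the quantity the paper's algorithms try to minimise when choosing ANS instances.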

arXiv:2204.03214 [pdf, other]

Transformer-Based Language Models for Software Vulnerability Detection

Authors: Chandra Thapa, Seung Ick Jang, Muhammad Ejaz Ahmed, Seyit Camtepe, Josef Pieprzyk, Surya Nepal

Abstract: Large transformer-based language models demonstrate excellent performance in natural language processing. By considering the transferability of the knowledge gained by these models in one domain to other related domains, and the closeness of natural languages to high-level programming languages, such as C/C++, this work studies how to leverage (large) transformer-based language models in detecting software vulnerabilities and how good these models are for vulnerability detection tasks. In this regard, firstly, a systematic (cohesive) framework that details source code translation, model preparation, and inference is presented. Then, an empirical analysis is performed with software vulnerability datasets with C/C++ source code having multiple vulnerabilities corresponding to the library function call, pointer usage, array usage, and arithmetic expression. Our empirical results demonstrate the good performance of the language models in vulnerability detection. Moreover, these language models have better performance metrics, such as F1-score, than the contemporary models, namely bidirectional long short-term memory and bidirectional gated recurrent unit. Experimenting with the language models is always challenging due to the requirement of computing resources, platforms, libraries, and dependencies. Thus, this paper also analyses the popular platforms to efficiently fine-tune these models and presents recommendations for choosing among the platforms.

Submitted 5 September, 2022; v1 submitted 7 April, 2022; originally announced April 2022.

Comments: 16 pages

arXiv:2007.06884 [pdf, ps, other]

Lattice Blind Signatures with Forward Security

Authors: Huy Quoc Le, Dung Hoang Duong, Willy Susilo, Ha Thanh Nguyen Tran, Viet Cuong Trinh, Josef Pieprzyk, Thomas Plantard

Abstract: Blind signatures play an important role in both electronic cash and electronic voting systems. Blind signatures should be secure against various attacks (such as signature forgeries). The work pays special attention to secret key exposure attacks, which totally break digital signatures. Signatures that resist secret key exposure attacks are called forward secure in the sense that disclosure of a current secret key does not compromise past secret keys. This means that forward-secure signatures must include a mechanism for secret-key evolution over time periods. This paper gives a construction of the first blind signature that is forward secure. The construction is based on the SIS assumption in the lattice setting. The core techniques applied are the binary tree data structure for the time periods and the trapdoor delegation for the key-evolution mechanism.

Submitted 14 July, 2020; originally announced July 2020.

Comments: ACISP 2020

arXiv:2007.06881 [pdf, ps, other]

Trapdoor Delegation and HIBE from Middle-Product LWE in Standard Model

Authors: Huy Quoc Le, Dung Hoang Duong, Willy Susilo, Josef Pieprzyk

Abstract: At CRYPTO 2017, Rosca, Sakzad, Stehle and Steinfeld introduced the Middle-Product LWE (MPLWE) assumption, which is as secure as Polynomial-LWE for a large class of polynomials, making the corresponding cryptographic schemes more flexible in choosing the underlying polynomial ring in design while still keeping the equivalent efficiency. Recently at TCC 2019, Lombardi, Vaikuntanathan and Vuong introduced a variant of the MPLWE assumption and constructed the first IBE scheme based on MPLWE. Their core technique is to construct lattice trapdoors compatible with MPLWE in the same paradigm as Gentry, Peikert and Vaikuntanathan at STOC 2008. However, their method cannot directly offer a Hierarchical IBE construction. In this paper, we make a step further by proposing a novel trapdoor delegation mechanism for an extended family of polynomials from which we construct, for the first time, a Hierarchical IBE scheme from MPLWE. Our Hierarchical IBE scheme is provably secure in the standard model.

Submitted 14 July, 2020; originally announced July 2020.

Comments: ACNS 2020

arXiv:2007.06353 [pdf, ps, other]

Puncturable Encryption: A Generic Construction from Delegatable Fully Key-Homomorphic Encryption

Authors: Willy Susilo, Dung Hoang Duong, Huy Quoc Le, Josef Pieprzyk

Abstract: Puncturable encryption (PE), proposed by Green and Miers at IEEE S&P 2015, is a kind of public key encryption that allows recipients to revoke individual messages by repeatedly updating decryption keys without communicating with senders. PE is an essential tool for constructing many interesting applications, such as asynchronous messaging systems, forward-secret zero round-trip time protocols, public-key watermarking schemes and forward-secret proxy re-encryptions. This paper revisits PEs from the observation that the puncturing property can be implemented as efficiently computable functions. From this view, we propose a generic PE construction from the fully key-homomorphic encryption, augmented with a key delegation mechanism (DFKHE) from Boneh et al. at Eurocrypt 2014. We show that our PE construction enjoys selective security under chosen plaintext attacks (which can be converted into adaptive security with some efficiency loss) from that of DFKHE in the standard model. Based on the framework, we obtain the first post-quantum secure PE instantiation that is based on the learning with errors problem and is selectively secure under chosen plaintext attacks (CPA) in the standard model. We also discuss how our framework can be modified to support an unbounded number of ciphertext tags, inspired by the work of Brakerski and Vaikuntanathan at CRYPTO 2016.

Submitted 13 July, 2020; originally announced July 2020.

arXiv:1905.08561 [pdf, other]

Dynamic Searchable Symmetric Encryption Schemes Supporting Range Queries with Forward/Backward Privacy

Authors: Cong Zuo, Shi-Feng Sun, Joseph K. Liu, Jun Shao, Josef Pieprzyk

Abstract: Dynamic searchable symmetric encryption (DSSE) is a useful cryptographic tool in encrypted cloud storage. However, it has been reported that DSSE usually suffers from file-injection attacks and content leak of deleted documents. To mitigate these attacks, forward privacy and backward privacy have been proposed. Nevertheless, the existing forward/backward-private DSSE schemes can only support single keyword queries. To address this problem, in this paper, we propose two DSSE schemes supporting range queries. One is forward-private and supports a large number of documents. The other can achieve backward privacy, while it can only support a limited number of documents. Finally, we also give the security proofs of the proposed DSSE schemes in the random oracle model.

Submitted 21 May, 2019; originally announced May 2019.

Comments: ESORICS 2018

arXiv:math/0211267 [pdf, ps]

On alternative approach for verifiable secret sharing

Authors: Kamil Kulesza, Zbigniew Kotulski, Joseph Pieprzyk

Abstract: Secret sharing allows split/distributed control over the secret (e.g. master key). Verifiable secret sharing (VSS) is secret sharing extended by verification capacity. Usually verification comes at a price. We propose a "free lunch", an approach that overcomes this inconvenience.

Submitted 18 November, 2002; originally announced November 2002.

Comments: This is a poster that was presented at the ESORICS 2002 conference in Zurich. It consists of 4 color pages, with a proposal and flowcharts

MSC Class: D.4.6; E.4

Showing 1–9 of 9 results for author: Pieprzyk, J