
Showing 1–4 of 4 results for author: Long, C X

Searching in archive cs.
  1. arXiv:2506.06409  [pdf, other]

    cs.CR cs.AI cs.CL cs.CY cs.IT cs.LG

    HeavyWater and SimplexWater: Watermarking Low-Entropy Text Distributions

    Authors: Dor Tsur, Carol Xuan Long, Claudio Mayrink Verdun, Hsiang Hsu, Chen-Fu Chen, Haim Permuter, Sajani Vithana, Flavio P. Calmon

    Abstract: Large language model (LLM) watermarks enable authentication of text provenance, curb misuse of machine-generated text, and promote trust in AI systems. Current watermarks operate by changing the next-token predictions output by an LLM. The updated (i.e., watermarked) predictions depend on random side information produced, for example, by hashing previously generated tokens. LLM watermarking is par…

    Submitted 6 June, 2025; originally announced June 2025.

  2. arXiv:2505.08878  [pdf, other]

    cs.CR cs.AI cs.IT

    Optimized Couplings for Watermarking Large Language Models

    Authors: Dor Tsur, Carol Xuan Long, Claudio Mayrink Verdun, Hsiang Hsu, Haim Permuter, Flavio P. Calmon

    Abstract: Large-language models (LLMs) are now able to produce text that is, in many cases, seemingly indistinguishable from human-generated content. This has fueled the development of watermarks that imprint a "signal" in LLM-generated text with minimal perturbation of an LLM's output. This paper provides an analysis of text watermarking in a one-shot setting. Through the lens of hypothesis testing with…

    Submitted 13 May, 2025; originally announced May 2025.

    Comments: Accepted at ISIT25

  3. arXiv:2407.08571  [pdf, other]

    cs.AI cs.IR cs.IT cs.LG stat.ML

    Multi-Group Proportional Representation in Retrieval

    Authors: Alex Oesterling, Claudio Mayrink Verdun, Carol Xuan Long, Alexander Glynn, Lucas Monteiro Paes, Sajani Vithana, Martina Cardone, Flavio P. Calmon

    Abstract: Image search and retrieval tasks can perpetuate harmful stereotypes, erase cultural identities, and amplify social disparities. Current approaches to mitigate these representational harms balance the number of retrieved items across population groups defined by a small number of (often binary) attributes. However, most existing methods overlook intersectional groups determined by combinations of g…

    Submitted 31 October, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

    Comments: 48 pages, 33 figures. Accepted as poster at NeurIPS 2024. Code can be found at https://github.com/alex-oesterling/multigroup-proportional-representation

  4. arXiv:2306.09425  [pdf, other]

    cs.LG cs.CY cs.IT

    Arbitrariness Lies Beyond the Fairness-Accuracy Frontier

    Authors: Carol Xuan Long, Hsiang Hsu, Wael Alghamdi, Flavio P. Calmon

    Abstract: Machine learning tasks may admit multiple competing models that achieve similar performance yet produce conflicting outputs for individual samples -- a phenomenon known as predictive multiplicity. We demonstrate that fairness interventions in machine learning optimized solely for group fairness and accuracy can exacerbate predictive multiplicity. Consequently, state-of-the-art fairness interventio…

    Submitted 15 June, 2023; originally announced June 2023.