Showing 1–2 of 2 results for author: Ranchin, N

  1. arXiv:2506.01084  [pdf, other]

    cs.CL cs.LG

    zip2zip: Inference-Time Adaptive Vocabularies for Language Models via Token Compression

    Authors: Saibo Geng, Nathan Ranchin, Yunzhen Yao, Maxime Peyrard, Chris Wendler, Michael Gastpar, Robert West

    Abstract: Tokenization efficiency plays a critical role in the performance and cost of large language models (LLMs), yet most models rely on static tokenizers optimized for general-purpose corpora. These tokenizers' fixed vocabularies often fail to adapt to domain- or language-specific inputs, leading to longer token sequences and higher computational costs. We introduce zip2zip, a framework that enables LL…

    Submitted 1 June, 2025; originally announced June 2025.

    Comments: Code will be released at https://github.com/epfl-dlab/zip2zip

  2. arXiv:2501.10868  [pdf, other]

    cs.CL cs.AI

    JSONSchemaBench: A Rigorous Benchmark of Structured Outputs for Language Models

    Authors: Saibo Geng, Hudson Cooper, Michał Moskal, Samuel Jenkins, Julian Berman, Nathan Ranchin, Robert West, Eric Horvitz, Harsha Nori

    Abstract: Reliably generating structured outputs has become a critical capability for modern language model (LM) applications. Constrained decoding has emerged as the dominant technology across sectors for enforcing structured outputs during generation. Despite its growing adoption, little has been done with the systematic evaluation of the behaviors and performance of constrained decoding. Constrained deco…

    Submitted 27 February, 2025; v1 submitted 18 January, 2025; originally announced January 2025.