Showing 1–5 of 5 results for author: Geuter, J

  1. arXiv:2506.15889  [pdf, ps, other]

    cs.CL

    Entropy-Driven Pre-Tokenization for Byte-Pair Encoding

    Authors: Yifan Hu, Frank Liang, Dachuan Zhao, Jonathan Geuter, Varshini Reddy, Craig W. Schmidt, Chris Tanner

    Abstract: Byte-Pair Encoding (BPE) has become a widely adopted subword tokenization method in modern language models due to its simplicity and strong empirical performance across downstream tasks. However, applying BPE to unsegmented languages such as Chinese presents significant challenges, as its frequency-driven merge operation is agnostic to linguistic boundaries. To address this, we propose two entropy…

    Submitted 18 June, 2025; originally announced June 2025.

  2. arXiv:2506.04118  [pdf, ps, other]

    cs.LG stat.ML

    Guided Speculative Inference for Efficient Test-Time Alignment of LLMs

    Authors: Jonathan Geuter, Youssef Mroueh, David Alvarez-Melis

    Abstract: We propose Guided Speculative Inference (GSI), a novel algorithm for efficient reward-guided decoding in large language models. GSI combines soft best-of-$n$ test-time scaling with a reward model $r(x,y)$ and speculative samples from a small auxiliary model $\pi_S(y\mid x)$. We provably approximate the optimal tilted policy $\pi_{\beta,B}(y\mid x) \propto \pi_B(y\mid x)\exp(\beta\,r(x,y))$ of soft best-of-$n$ u…

    Submitted 4 June, 2025; originally announced June 2025.

    Comments: 12 pages, 2 figures

    ACM Class: I.2.7

  3. arXiv:2503.01140  [pdf, other]

    cs.LG stat.ML

    DDEQs: Distributional Deep Equilibrium Models through Wasserstein Gradient Flows

    Authors: Jonathan Geuter, Clément Bonet, Anna Korba, David Alvarez-Melis

    Abstract: Deep Equilibrium Models (DEQs) are a class of implicit neural networks that solve for a fixed point of a neural network in their forward pass. Traditionally, DEQs take sequences as inputs, but have since been applied to a variety of data. In this work, we present Distributional Deep Equilibrium Models (DDEQs), extending DEQs to discrete measure inputs, such as sets or point clouds. We provide a th…

    Submitted 22 March, 2025; v1 submitted 2 March, 2025; originally announced March 2025.

    Comments: 39 pages, 17 figures. To be published in AISTATS 2025

  4. arXiv:2307.11224  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    Jina Embeddings: A Novel Set of High-Performance Sentence Embedding Models

    Authors: Michael Günther, Louis Milliken, Jonathan Geuter, Georgios Mastrapas, Bo Wang, Han Xiao

    Abstract: Jina Embeddings constitutes a set of high-performance sentence embedding models adept at translating textual inputs into numerical representations, capturing the semantics of the text. These models excel in applications like dense retrieval and semantic textual similarity. This paper details the development of Jina Embeddings, starting with the creation of high-quality pairwise and triplet dataset…

    Submitted 20 October, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: 9 pages, 2 page appendix

    MSC Class: 68T50

    ACM Class: H.3.1; H.3.3; I.2.7; I.5.4

  5. arXiv:2212.00133  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Universal Neural Optimal Transport

    Authors: Jonathan Geuter, Gregor Kornhardt, Ingimar Tomasson, Vaios Laschos

    Abstract: Optimal Transport (OT) problems are a cornerstone of many applications, but solving them is computationally expensive. To address this problem, we propose UNOT (Universal Neural Optimal Transport), a novel framework capable of accurately predicting (entropic) OT distances and plans between discrete measures for a given cost function. UNOT builds on Fourier Neural Operators, a universal class of ne…

    Submitted 12 June, 2025; v1 submitted 30 November, 2022; originally announced December 2022.

    Comments: 37 pages, 19 figures, accepted to ICML 2025

    MSC Class: 68T07 (Primary) 90C08 (Secondary)

    ACM Class: I.2.6; G.3; G.4