Showing 1–6 of 6 results for author: Cohen-Wang, B

  1. arXiv:2504.13752  [pdf, other]

    cs.LG cs.CL

    Learning to Attribute with Attention

    Authors: Benjamin Cohen-Wang, Yung-Sung Chuang, Aleksander Madry

    Abstract: Given a sequence of tokens generated by a language model, we may want to identify the preceding tokens that influence the model to generate this sequence. Performing such token attribution is expensive; a common approach is to ablate preceding tokens and directly measure their effects. To reduce the cost of token attribution, we revisit attention weights as a heuristic for how a language model use…

    Submitted 18 April, 2025; originally announced April 2025.

  2. arXiv:2502.09604  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

    Authors: Yung-Sung Chuang, Benjamin Cohen-Wang, Shannon Zejiang Shen, Zhaofeng Wu, Hu Xu, Xi Victoria Lin, James Glass, Shang-Wen Li, Wen-tau Yih

    Abstract: We introduce SelfCite, a novel self-supervised approach that aligns LLMs to generate high-quality, fine-grained, sentence-level citations for the statements in their generated responses. Instead of only relying on costly and labor-intensive annotations, SelfCite leverages a reward signal provided by the LLM itself through context ablation: If a citation is necessary, removing the cited text from t…

    Submitted 15 June, 2025; v1 submitted 13 February, 2025; originally announced February 2025.

    Comments: ICML 2025 main conference paper. The source code is available at https://github.com/facebookresearch/SelfCite

  3. arXiv:2409.00729  [pdf, other]

    cs.LG cs.CL

    ContextCite: Attributing Model Generation to Context

    Authors: Benjamin Cohen-Wang, Harshay Shah, Kristian Georgiev, Aleksander Madry

    Abstract: How do language models use information provided as context when generating a response? Can we infer whether a particular generated statement is actually grounded in the context, a misinterpretation, or fabricated? To help answer these questions, we introduce the problem of context attribution: pinpointing the parts of the context (if any) that led a model to generate a particular statement. We the…

    Submitted 13 September, 2024; v1 submitted 1 September, 2024; originally announced September 2024.

  4. arXiv:2403.00194  [pdf, other]

    cs.LG

    Ask Your Distribution Shift if Pre-Training is Right for You

    Authors: Benjamin Cohen-Wang, Joshua Vendrow, Aleksander Madry

    Abstract: Pre-training is a widely used approach to develop models that are robust to distribution shifts. However, in practice, its effectiveness varies: fine-tuning a pre-trained model improves robustness significantly in some cases but not at all in others (compared to training from scratch). In this work, we seek to characterize the failure modes that pre-training can and cannot address. In particular,…

    Submitted 22 December, 2024; v1 submitted 29 February, 2024; originally announced March 2024.

  5. arXiv:2312.02132  [pdf, other]

    cs.LG cs.AI cs.CR cs.DS

    Hot PATE: Private Aggregation of Distributions for Diverse Tasks

    Authors: Edith Cohen, Benjamin Cohen-Wang, Xin Lyu, Jelani Nelson, Tamas Sarlos, Uri Stemmer

    Abstract: The Private Aggregation of Teacher Ensembles (PATE) framework enables privacy-preserving machine learning by aggregating responses from disjoint subsets of sensitive data. Adaptations of PATE to tasks with inherent output diversity such as text generation face a core tension: preserving output diversity reduces teacher agreement, which in turn increases the noise required for differential privacy,…

    Submitted 17 May, 2025; v1 submitted 4 December, 2023; originally announced December 2023.

  6. arXiv:2103.02761  [pdf, other]

    cs.LG stat.ML

    Comparing the Value of Labeled and Unlabeled Data in Method-of-Moments Latent Variable Estimation

    Authors: Mayee F. Chen, Benjamin Cohen-Wang, Stephen Mussmann, Frederic Sala, Christopher Ré

    Abstract: Labeling data for modern machine learning is expensive and time-consuming. Latent variable models can be used to infer labels from weaker, easier-to-acquire sources operating on unlabeled data. Such models can also be trained using labeled data, presenting a key question: should a user invest in few labeled or many unlabeled points? We answer this via a framework centered on model misspecification…

    Submitted 3 March, 2021; originally announced March 2021.

    Comments: To appear in AISTATS 2021