Showing 1–5 of 5 results for author: Goldman-Wetzler, J

Searching in archive cs.
  1. arXiv:2507.14417

    cs.AI cs.CL

    Inverse Scaling in Test-Time Compute

    Authors: Aryo Pradipta Gema, Alexander Hägele, Runjin Chen, Andy Arditi, Jacob Goldman-Wetzler, Kit Fraser-Taliente, Henry Sleight, Linda Petrini, Julian Michael, Beatrice Alex, Pasquale Minervini, Yanda Chen, Joe Benton, Ethan Perez

    Abstract: We construct evaluation tasks where extending the reasoning length of Large Reasoning Models (LRMs) deteriorates performance, exhibiting an inverse scaling relationship between test-time compute and accuracy. Our evaluation tasks span four categories: simple counting tasks with distractors, regression tasks with spurious features, deduction tasks with constraint tracking, and advanced AI risks. We…

    Submitted 18 July, 2025; originally announced July 2025.

  2. arXiv:2506.10139

    cs.CL cs.AI

    Unsupervised Elicitation of Language Models

    Authors: Jiaxin Wen, Zachary Ankner, Arushi Somani, Peter Hase, Samuel Marks, Jacob Goldman-Wetzler, Linda Petrini, Henry Sleight, Collin Burns, He He, Shi Feng, Ethan Perez, Jan Leike

    Abstract: To steer pretrained language models for downstream tasks, today's post-training paradigm relies on humans to specify desired behaviors. However, for models with superhuman capabilities, it is difficult or impossible to get high-quality human supervision. To address this challenge, we introduce a new unsupervised algorithm, Internal Coherence Maximization (ICM), to fine-tune pretrained language mod…

    Submitted 11 June, 2025; originally announced June 2025.

  3. arXiv:2506.06278

    cs.LG cs.AI

    Distillation Robustifies Unlearning

    Authors: Bruce W. Lee, Addie Foote, Alex Infanger, Leni Shor, Harish Kamath, Jacob Goldman-Wetzler, Bryce Woodworth, Alex Cloud, Alexander Matt Turner

    Abstract: Current LLM unlearning methods are not robust: they can be reverted easily with a few steps of finetuning. This is true even for the idealized unlearning method of training to imitate an oracle model that was never exposed to unwanted information, suggesting that output-based finetuning is insufficient to achieve robust unlearning. In a similar vein, we find that training a randomly initialized st…

    Submitted 9 June, 2025; v1 submitted 6 June, 2025; originally announced June 2025.

  4. arXiv:2410.04332

    cs.LG cs.AI

    Gradient Routing: Masking Gradients to Localize Computation in Neural Networks

    Authors: Alex Cloud, Jacob Goldman-Wetzler, Evžen Wybitul, Joseph Miller, Alexander Matt Turner

    Abstract: Neural networks are trained primarily based on their inputs and outputs, without regard for their internal mechanisms. These neglected mechanisms determine properties that are critical for safety, like (i) transparency; (ii) the absence of sensitive information or harmful capabilities; and (iii) reliable generalization of goals beyond the training distribution. To address this shortcoming, we intr…

    Submitted 29 November, 2024; v1 submitted 5 October, 2024; originally announced October 2024.

  5. arXiv:2401.16645

    cs.LG

    Speeding up and reducing memory usage for scientific machine learning via mixed precision

    Authors: Joel Hayford, Jacob Goldman-Wetzler, Eric Wang, Lu Lu

    Abstract: Scientific machine learning (SciML) has emerged as a versatile approach to address complex computational science and engineering problems. Within this field, physics-informed neural networks (PINNs) and deep operator networks (DeepONets) stand out as the leading techniques for solving partial differential equations by incorporating both physical equations and experimental data. However, training P…

    Submitted 29 January, 2024; originally announced January 2024.

    Comments: 25 pages, 7 figures