Showing 1–7 of 7 results for author: Reusch, A

  1. arXiv:2506.03817  [pdf, ps, other]

    cs.LG

    Survey of Active Learning Hyperparameters: Insights from a Large-Scale Experimental Grid

    Authors: Julius Gonsior, Tim Rieß, Anja Reusch, Claudio Hartmann, Maik Thiele, Wolfgang Lehner

    Abstract: Annotating data is a time-consuming and costly task, but it is inherently required for supervised machine learning. Active Learning (AL) is an established method that minimizes human labeling effort by iteratively selecting the most informative unlabeled samples for expert annotation, thereby improving the overall classification performance. Even though AL has been known for decades, AL is still r…

    Submitted 4 June, 2025; originally announced June 2025.

  2. arXiv:2504.13151  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    MIB: A Mechanistic Interpretability Benchmark

    Authors: Aaron Mueller, Atticus Geiger, Sarah Wiegreffe, Dana Arad, Iván Arcuschin, Adam Belfki, Yik Siu Chan, Jaden Fiotto-Kaufman, Tal Haklay, Michael Hanna, Jing Huang, Rohan Gupta, Yaniv Nikankin, Hadas Orgad, Nikhil Prakash, Anja Reusch, Aruna Sankaranarayanan, Shun Shao, Alessandro Stolfo, Martin Tutek, Amir Zur, David Bau, Yonatan Belinkov

    Abstract: How can we know whether new mechanistic interpretability methods achieve real improvements? In pursuit of lasting evaluation standards, we propose MIB, a Mechanistic Interpretability Benchmark, with two tracks spanning four tasks and five models. MIB favors methods that precisely and concisely recover relevant causal pathways or causal variables in neural language models. The circuit localization…

    Submitted 9 June, 2025; v1 submitted 17 April, 2025; originally announced April 2025.

    Comments: Accepted to ICML 2025. Project website at https://mib-bench.github.io

  3. arXiv:2503.19715  [pdf, other]

    cs.IR

    Reverse-Engineering the Retrieval Process in GenIR Models

    Authors: Anja Reusch, Yonatan Belinkov

    Abstract: Generative Information Retrieval (GenIR) is a novel paradigm in which a transformer encoder-decoder model predicts document rankings based on a query in an end-to-end fashion. These GenIR models have received significant attention due to their simple retrieval architecture while maintaining high retrieval effectiveness. However, in contrast to established retrieval architectures like cross-encoder…

    Submitted 8 April, 2025; v1 submitted 25 March, 2025; originally announced March 2025.

    ACM Class: H.3.3

  4. arXiv:2502.20855  [pdf, other]

    cs.CL cs.LG

    MAMUT: A Novel Framework for Modifying Mathematical Formulas for the Generation of Specialized Datasets for Language Model Training

    Authors: Jonathan Drechsel, Anja Reusch, Steffen Herbold

    Abstract: Mathematical formulas are a fundamental and widely used component in various scientific fields, serving as a universal language for expressing complex concepts and relationships. While state-of-the-art transformer models excel in processing and understanding natural language, they encounter challenges with mathematical notation, which involves a complex structure and diverse representations. This…

    Submitted 28 February, 2025; originally announced February 2025.

  5. arXiv:2410.21272  [pdf, other]

    cs.CL

    Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics

    Authors: Yaniv Nikankin, Anja Reusch, Aaron Mueller, Yonatan Belinkov

    Abstract: Do large language models (LLMs) solve reasoning tasks by learning robust generalizable algorithms, or do they memorize training data? To investigate this question, we use arithmetic reasoning as a representative task. Using causal analysis, we identify a subset of the model (a circuit) that explains most of the model's behavior for basic arithmetic logic and examine its functionality. By zooming i…

    Submitted 20 May, 2025; v1 submitted 28 October, 2024; originally announced October 2024.

    MSC Class: 68T5; ACM Class: I.2.7

  6. arXiv:2210.03005  [pdf, other]

    cs.LG cs.AI cs.CL cs.DB

    To Softmax, or not to Softmax: that is the question when applying Active Learning for Transformer Models

    Authors: Julius Gonsior, Christian Falkenberg, Silvio Magino, Anja Reusch, Maik Thiele, Wolfgang Lehner

    Abstract: Despite achieving state-of-the-art results in nearly all Natural Language Processing applications, fine-tuning Transformer-based language models still requires a significant amount of labeled data to work. A well-known technique to reduce the amount of human effort in acquiring a labeled dataset is Active Learning (AL): an iterative process in which only the minimal amount of samples is l…

    Submitted 6 October, 2022; originally announced October 2022.

  7. Entanglement Witnesses for Indistinguishable Particles

    Authors: A. Reusch, J. Sperling, W. Vogel

    Abstract: We study the problem of witnessing entanglement among indistinguishable particles. For this purpose, we derive a set of equations which results in necessary and sufficient conditions for probing multipartite entanglement between arbitrary systems of Bosons or Fermions. The solution of these equations yields the construction of optimal entanglement witnesses for partial and full entanglement in dis…

    Submitted 21 April, 2015; v1 submitted 12 January, 2015; originally announced January 2015.

    Comments: 11 pages, 2 figures

    Journal ref: Phys. Rev. A 91, 042324 (2015)