Showing 1–13 of 13 results for author: Benz, C

  1. arXiv:2506.06446

    cs.CL cs.AI cs.LG stat.ML

    Canonical Autoregressive Generation

    Authors: Ivi Chatzi, Nina Corvelo Benz, Stratis Tsirtsis, Manuel Gomez-Rodriguez

    Abstract: State of the art large language models are trained using large amounts of tokens derived from raw text using what is called a tokenizer. Crucially, the tokenizer determines the (token) vocabulary a model will use during inference as well as, in principle, the (token) language. This is because, while the token vocabulary may allow for different tokenizations of a string, the tokenizer always maps t…

    Submitted 6 June, 2025; originally announced June 2025.

  2. arXiv:2502.01754

    cs.CL cs.AI cs.LG

    Evaluation of Large Language Models via Coupled Token Generation

    Authors: Nina Corvelo Benz, Stratis Tsirtsis, Eleni Straitouri, Ivi Chatzi, Ander Artola Velasco, Suhas Thejaswi, Manuel Gomez-Rodriguez

    Abstract: State of the art large language models rely on randomization to respond to a prompt. As an immediate consequence, a model may respond differently to the same prompt if asked multiple times. In this work, we argue that the evaluation and ranking of large language models should control for the randomization underpinning their functioning. Our starting point is the development of a causal model for c…

    Submitted 3 February, 2025; originally announced February 2025.

  3. arXiv:2501.14035

    cs.AI

    Human-Alignment Influences the Utility of AI-assisted Decision Making

    Authors: Nina L. Corvelo Benz, Manuel Gomez Rodriguez

    Abstract: Whenever an AI model is used to predict a relevant (binary) outcome in AI-assisted decision making, it is widely agreed that, together with each prediction, the model should provide an AI confidence value. However, it has been unclear why decision makers have often difficulties to develop a good sense on when to trust a prediction using AI confidence values. Very recently, Corvelo Benz and Gomez R…

    Submitted 23 January, 2025; originally announced January 2025.

  4. arXiv:2409.17027

    cs.LG cs.AI cs.CL

    Counterfactual Token Generation in Large Language Models

    Authors: Ivi Chatzi, Nina Corvelo Benz, Eleni Straitouri, Stratis Tsirtsis, Manuel Gomez-Rodriguez

    Abstract: "Sure, I am happy to generate a story for you: Captain Lyra stood at the helm of her trusty ship, the Maelstrom's Fury, gazing out at the endless sea. [...] Lyra's eyes welled up with tears as she realized the bitter truth - she had sacrificed everything for fleeting riches, and lost the love of her crew, her family, and herself." Although this story, generated by a large language model, is captiv…

    Submitted 24 March, 2025; v1 submitted 25 September, 2024; originally announced September 2024.

    Comments: Accepted at CLeaR 2025

  5. arXiv:2407.13052

    cs.CY cs.DS cs.LG

    Matchings, Predictions and Counterfactual Harm in Refugee Resettlement Processes

    Authors: Seungeon Lee, Nina Corvelo Benz, Suhas Thejaswi, Manuel Gomez-Rodriguez

    Abstract: Resettlement agencies have started to adopt data-driven algorithmic matching to match refugees to locations using employment rate as a measure of utility. Given a pool of refugees, data-driven algorithmic matching utilizes a classifier to predict the probability that each refugee would find employment at any given location. Then, it uses the predicted probabilities to estimate the expected utility…

    Submitted 24 May, 2024; originally announced July 2024.

    Comments: 24 pages including references and appendix

  6. arXiv:2401.03298

    cs.CV

    ENSTRECT: A Stage-based Approach to 2.5D Structural Damage Detection

    Authors: Christian Benz, Volker Rodehorst

    Abstract: To effectively assess structural damage, it is essential to localize the instances of damage in the physical world of a civil structure. ENSTRECT is a stage-based approach designed to accomplish 2.5D structural damage detection. The method requires an image collection, the relative orientation, and a point cloud. Using these inputs, surface damages are segmented at the image level and then mapped…

    Submitted 2 October, 2024; v1 submitted 6 January, 2024; originally announced January 2024.

  7. arXiv:2309.09742

    cs.CV

    Drawing the Same Bounding Box Twice? Coping Noisy Annotations in Object Detection with Repeated Labels

    Authors: David Tschirschwitz, Christian Benz, Morris Florek, Henrik Norderhus, Benno Stein, Volker Rodehorst

    Abstract: The reliability of supervised machine learning systems depends on the accuracy and availability of ground truth labels. However, the process of human annotation, being prone to error, introduces the potential for noisy labels, which can impede the practicality of these systems. While training with noisy labels is a significant consideration, the reliability of test data is also crucial to ascertai…

    Submitted 18 September, 2023; originally announced September 2023.

  8. arXiv:2306.00074

    cs.LG cs.CY cs.HC stat.ML

    Human-Aligned Calibration for AI-Assisted Decision Making

    Authors: Nina L. Corvelo Benz, Manuel Gomez Rodriguez

    Abstract: Whenever a binary classifier is used to provide decision support, it typically provides both a label prediction and a confidence value. Then, the decision maker is supposed to use the confidence value to calibrate how much to trust the prediction. In this context, it has been often argued that the confidence value should correspond to a well calibrated estimate of the probability that the predicte…

    Submitted 23 February, 2024; v1 submitted 31 May, 2023; originally announced June 2023.

  9. Appropriate Reliance on AI Advice: Conceptualization and the Effect of Explanations

    Authors: Max Schemmer, Niklas Kühl, Carina Benz, Andrea Bartos, Gerhard Satzger

    Abstract: AI advice is becoming increasingly popular, e.g., in investment and medical treatment decisions. As this advice is typically imperfect, decision-makers have to exert discretion as to whether actually follow that advice: they have to "appropriately" rely on correct and turn down incorrect advice. However, current research on appropriate reliance still lacks a common definition as well as an operati…

    Submitted 13 April, 2023; v1 submitted 4 February, 2023; originally announced February 2023.

    Comments: arXiv admin note: text overlap with arXiv:2204.06916

    Journal ref: ACM 28th International Conference on Intelligent User Interfaces (IUI), 2023

  10. arXiv:2210.02123

    cs.HC

    On the Influence of Cognitive Styles on Users' Understanding of Explanations

    Authors: Lara Riefle, Patrick Hemmer, Carina Benz, Michael Vössing, Jannik Pries

    Abstract: Artificial intelligence (AI) is becoming increasingly complex, making it difficult for users to understand how the AI has derived its prediction. Using explainable AI (XAI)-methods, researchers aim to explain AI decisions to users. So far, XAI-based explanations pursue a technology-focused approach - neglecting the influence of users' cognitive abilities and differences in information processing o…

    Submitted 5 October, 2022; originally announced October 2022.

    Comments: Accepted at 43rd International Conference on Information Systems (ICIS 2022)

  11. arXiv:2204.08859

    cs.HC cs.AI

    On the Influence of Explainable AI on Automation Bias

    Authors: Max Schemmer, Niklas Kühl, Carina Benz, Gerhard Satzger

    Abstract: Artificial intelligence (AI) is gaining momentum, and its importance for the future of work in many areas, such as medicine and banking, is continuously rising. However, insights on the effective collaboration of humans and AI are still rare. Typically, AI supports humans in decision-making by addressing human limitations. However, it may also evoke human bias, especially in the form of automation…

    Submitted 19 April, 2022; originally announced April 2022.

    Comments: Thirtieth European Conference on Information Systems (ECIS 2022)

  12. arXiv:2204.06916

    cs.HC cs.AI

    Should I Follow AI-based Advice? Measuring Appropriate Reliance in Human-AI Decision-Making

    Authors: Max Schemmer, Patrick Hemmer, Niklas Kühl, Carina Benz, Gerhard Satzger

    Abstract: Many important decisions in daily life are made with the help of advisors, e.g., decisions about medical treatments or financial investments. Whereas in the past, advice has often been received from human experts, friends, or family, advisors based on artificial intelligence (AI) have become more and more present nowadays. Typically, the advice generated by AI is judged by a human and either deeme…

    Submitted 14 April, 2022; originally announced April 2022.

    Comments: ACM Conference on Human Factors in Computing Systems (CHI '22), Workshop on Trust and Reliance in AI-Human Teams (trAIt)

  13. arXiv:2203.08653

    cs.LG cs.CY cs.HC stat.ME stat.ML

    Counterfactual Inference of Second Opinions

    Authors: Nina L. Corvelo Benz, Manuel Gomez Rodriguez

    Abstract: Automated decision support systems that are able to infer second opinions from experts can potentially facilitate a more efficient allocation of resources; they can help decide when and from whom to seek a second opinion. In this paper, we look at the design of this type of support systems from the perspective of counterfactual inference. We focus on a multiclass classification setting and first s…

    Submitted 30 June, 2022; v1 submitted 16 March, 2022; originally announced March 2022.