Showing 1–6 of 6 results for author: Mackraz, N

  1. arXiv:2505.23996  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Is Your Model Fairly Certain? Uncertainty-Aware Fairness Evaluation for LLMs

    Authors: Yinong Oliver Wang, Nivedha Sivakumar, Falaah Arif Khan, Rin Metcalf Susa, Adam Golinski, Natalie Mackraz, Barry-John Theobald, Luca Zappella, Nicholas Apostoloff

    Abstract: The recent rapid adoption of large language models (LLMs) highlights the critical need for benchmarking their fairness. Conventional fairness metrics, which focus on discrete accuracy-based evaluations (i.e., prediction correctness), fail to capture the implicit impact of model uncertainty (e.g., higher model confidence about one group over another despite similar accuracy). To address this limita…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: 9 pages, 8 figures, and 1 table in main paper. Supplementary appendix attached. Accepted at ICML 2025

  2. arXiv:2505.23815  [pdf, ps, other]

    cs.CL cs.LG

    Aligning LLMs by Predicting Preferences from User Writing Samples

    Authors: Stéphane Aroca-Ouellette, Natalie Mackraz, Barry-John Theobald, Katherine Metcalf

    Abstract: Accommodating human preferences is essential for creating aligned LLM agents that deliver personalized and effective interactions. Recent work has shown the potential for LLMs acting as writing agents to infer a description of user preferences. Agent alignment then comes from conditioning on the inferred preference description. However, existing methods often produce generic preference description…

    Submitted 27 May, 2025; originally announced May 2025.

    Comments: Accepted to ICML 2025. 32 pages total: 9 main, 2 references, 21 appendix. arXiv admin note: substantial text overlap with arXiv:2410.06273

  3. arXiv:2412.03537  [pdf, other]

    cs.CL cs.AI cs.LG

    Evaluating Gender Bias Transfer between Pre-trained and Prompt-Adapted Language Models

    Authors: Natalie Mackraz, Nivedha Sivakumar, Samira Khorshidi, Krishna Patel, Barry-John Theobald, Luca Zappella, Nicholas Apostoloff

    Abstract: Large language models (LLMs) are increasingly being adapted to achieve task-specificity for deployment in real-world decision systems. Several previous works have investigated the bias transfer hypothesis (BTH) by studying the effect of the fine-tuning adaptation strategy on model fairness to find that fairness in pre-trained masked language models have limited effect on the fairness of models whe…

    Submitted 4 December, 2024; originally announced December 2024.

  4. arXiv:2410.06273  [pdf, other]

    cs.AI cs.HC

    PREDICT: Preference Reasoning by Evaluating Decomposed preferences Inferred from Candidate Trajectories

    Authors: Stephane Aroca-Ouellette, Natalie Mackraz, Barry-John Theobald, Katherine Metcalf

    Abstract: Accommodating human preferences is essential for creating AI agents that deliver personalized and effective interactions. Recent work has shown the potential for LLMs to infer preferences from user interactions, but they often produce broad and generic preferences, failing to capture the unique and individualized nature of human preferences. This paper introduces PREDICT, a method designed to enha…

    Submitted 8 October, 2024; originally announced October 2024.

  5. arXiv:2402.17975  [pdf, other]

    cs.AI cs.LG

    Sample-Efficient Preference-based Reinforcement Learning with Dynamics Aware Rewards

    Authors: Katherine Metcalf, Miguel Sarabia, Natalie Mackraz, Barry-John Theobald

    Abstract: Preference-based reinforcement learning (PbRL) aligns a robot behavior with human preferences via a reward function learned from binary feedback over agent behaviors. We show that dynamics-aware reward functions improve the sample efficiency of PbRL by an order of magnitude. In our experiments we iterate between: (1) learning a dynamics-aware state-action representation (z^{sa}) via a self-supervi…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: CoRL 2023. arXiv admin note: substantial text overlap with arXiv:2211.06527

  6. arXiv:2310.17722  [pdf, other]

    cs.LG cs.AI cs.CL

    Large Language Models as Generalizable Policies for Embodied Tasks

    Authors: Andrew Szot, Max Schwarzer, Harsh Agrawal, Bogdan Mazoure, Walter Talbott, Katherine Metcalf, Natalie Mackraz, Devon Hjelm, Alexander Toshev

    Abstract: We show that large language models (LLMs) can be adapted to be generalizable policies for embodied visual tasks. Our approach, called Large LAnguage model Reinforcement Learning Policy (LLaRP), adapts a pre-trained frozen LLM to take as input text instructions and visual egocentric observations and output actions directly in the environment. Using reinforcement learning, we train LLaRP to see and…

    Submitted 16 April, 2024; v1 submitted 26 October, 2023; originally announced October 2023.