Showing 1–13 of 13 results for author: Fogliato, R

  1. arXiv:2502.17427

    stat.ME cs.LG math.ST stat.ML

    Stronger Neyman Regret Guarantees for Adaptive Experimental Design

    Authors: Georgy Noarov, Riccardo Fogliato, Martin Bertran, Aaron Roth

    Abstract: We study the design of adaptive, sequential experiments for unbiased average treatment effect (ATE) estimation in the design-based potential outcomes setting. Our goal is to develop adaptive designs offering sublinear Neyman regret, meaning their efficiency must approach that of the hindsight-optimal nonadaptive design. Recent work [Dai et al, 2023] introduced ClipOGD, the first method achieving…

    Submitted 24 February, 2025; originally announced February 2025.

  2. arXiv:2412.04642

    cs.LG cs.AI

    Improving LLM Group Fairness on Tabular Data via In-Context Learning

    Authors: Valeriia Cherepanova, Chia-Jung Lee, Nil-Jana Akpinar, Riccardo Fogliato, Martin Andres Bertran, Michael Kearns, James Zou

    Abstract: Large language models (LLMs) have been shown to be effective on tabular prediction tasks in the low-data regime, leveraging their internal knowledge and ability to learn from instructions and examples. However, LLMs can fail to generate predictions that satisfy group fairness, that is, produce equitable outcomes across groups. Critically, conventional debiasing approaches for natural language task…

    Submitted 5 December, 2024; originally announced December 2024.

  3. arXiv:2410.05222

    cs.LG cs.CL cs.CV stat.AP

    Precise Model Benchmarking with Only a Few Observations

    Authors: Riccardo Fogliato, Pratik Patil, Nil-Jana Akpinar, Mathew Monfort

    Abstract: How can we precisely estimate a large language model's (LLM) accuracy on questions belonging to a specific topic within a larger question-answering dataset? The standard direct estimator, which averages the model's accuracy on the questions in each subgroup, may exhibit high variance for subgroups (topics) with small sample sizes. Synthetic regression modeling, which leverages the model's accuracy…

    Submitted 7 October, 2024; originally announced October 2024.

    Comments: To appear at EMNLP 2024

  4. arXiv:2406.07320

    cs.CV stat.AP

    A Framework for Efficient Model Evaluation through Stratification, Sampling, and Estimation

    Authors: Riccardo Fogliato, Pratik Patil, Mathew Monfort, Pietro Perona

    Abstract: Model performance evaluation is a critical and expensive task in machine learning and computer vision. Without clear guidelines, practitioners often estimate model accuracy using a one-time completely random selection of the data. However, by employing tailored sampling and estimation strategies, one can obtain more precise estimates and reduce annotation costs. In this paper, we propose a statist…

    Submitted 18 July, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

    Comments: To appear at ECCV 2024

  5. arXiv:2404.04689

    stat.ML cs.CL cs.LG

    Multicalibration for Confidence Scoring in LLMs

    Authors: Gianluca Detommaso, Martin Bertran, Riccardo Fogliato, Aaron Roth

    Abstract: This paper proposes the use of "multicalibration" to yield interpretable and reliable confidence scores for outputs generated by large language models (LLMs). Multicalibration asks for calibration not just marginally, but simultaneously across various intersecting groupings of the data. We show how to form groupings for prompt/completion pairs that are correlated with the probability of correctnes…

    Submitted 6 April, 2024; originally announced April 2024.

  6. arXiv:2306.01198

    stat.ME cs.CV stat.ML

    Confidence Intervals for Error Rates in 1:1 Matching Tasks: Critical Statistical Analysis and Recommendations

    Authors: Riccardo Fogliato, Pratik Patil, Pietro Perona

    Abstract: Matching algorithms are commonly used to predict matches between items in a collection. For example, in 1:1 face verification, a matching algorithm predicts whether two face images depict the same person. Accurately assessing the uncertainty of the error rates of such algorithms can be challenging when data are dependent and error rates are low, two aspects that have been often overlooked in the l…

    Submitted 26 April, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

  7. The Progression of Disparities within the Criminal Justice System: Differential Enforcement and Risk Assessment Instruments

    Authors: Miri Zilka, Riccardo Fogliato, Jiri Hron, Bradley Butcher, Carolyn Ashurst, Adrian Weller

    Abstract: Algorithmic risk assessment instruments (RAIs) increasingly inform decision-making in criminal justice. RAIs largely rely on arrest records as a proxy for underlying crime. Problematically, the extent to which arrests reflect overall offending can vary with the person's characteristics. We examine how the disconnect between crime and arrest rates impacts RAIs and their evaluation. Our main contrib…

    Submitted 12 May, 2023; originally announced May 2023.

    Comments: Accepted to FAccT '23

  8. arXiv:2205.09701

    cs.HC cs.CY

    Homophily and Incentive Effects in Use of Algorithms

    Authors: Riccardo Fogliato, Sina Fazelpour, Shantanu Gupta, Zachary Lipton, David Danks

    Abstract: As algorithmic tools increasingly aid experts in making consequential decisions, the need to understand the precise factors that mediate their influence has grown commensurately. In this paper, we present a crowdsourcing vignette study designed to assess the impacts of two plausible factors on AI-informed decision-making. First, we examine homophily -- do people defer more to models that tend to a…

    Submitted 19 May, 2022; originally announced May 2022.

    Comments: Accepted at CogSci, 2022

  9. arXiv:2205.09696

    cs.HC cs.AI

    Who Goes First? Influences of Human-AI Workflow on Decision Making in Clinical Imaging

    Authors: Riccardo Fogliato, Shreya Chappidi, Matthew Lungren, Michael Fitzke, Mark Parkinson, Diane Wilson, Paul Fisher, Eric Horvitz, Kori Inkpen, Besmira Nushi

    Abstract: Details of the designs and mechanisms in support of human-AI collaboration must be considered in the real-world fielding of AI technologies. A critical aspect of interaction design for AI-assisted human decision making are policies about the display and sequencing of AI inferences within larger decision-making workflows. We have a poor understanding of the influences of making AI inferences availa…

    Submitted 19 May, 2022; originally announced May 2022.

    Comments: Accepted at ACM Conference on Fairness, Accountability, and Transparency (FAccT), 2022

  10. arXiv:2109.01443

    cs.HC cs.AI

    The Impact of Algorithmic Risk Assessments on Human Predictions and its Analysis via Crowdsourcing Studies

    Authors: Riccardo Fogliato, Alexandra Chouldechova, Zachary Lipton

    Abstract: As algorithmic risk assessment instruments (RAIs) are increasingly adopted to assist decision makers, their predictive performance and potential to promote inequity have come under scrutiny. However, while most studies examine these tools in isolation, researchers have come to recognize that assessing their impact requires understanding the behavior of their human interactants. In this paper, buil…

    Submitted 3 September, 2021; originally announced September 2021.

    Comments: Proceedings of the ACM on Human-Computer Interaction 5, CSCW2, Article 428 (October 2021)

  11. arXiv:2011.07586

    cs.CY cs.HC cs.LG

    Uncertainty as a Form of Transparency: Measuring, Communicating, and Using Uncertainty

    Authors: Umang Bhatt, Javier Antorán, Yunfeng Zhang, Q. Vera Liao, Prasanna Sattigeri, Riccardo Fogliato, Gabrielle Gauthier Melançon, Ranganath Krishnan, Jason Stanley, Omesh Tickoo, Lama Nachman, Rumi Chunara, Madhulika Srikumar, Adrian Weller, Alice Xiang

    Abstract: Algorithmic transparency entails exposing system properties to various stakeholders for purposes that include understanding, improving, and contesting predictions. Until now, most research into algorithmic transparency has predominantly focused on explainability. Explainability attempts to provide reasons for a machine learning model's behavior to stakeholders. However, understanding a model's spe…

    Submitted 4 May, 2021; v1 submitted 15 November, 2020; originally announced November 2020.

    Comments: AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society (AIES) 2021

  12. arXiv:2003.13808

    stat.ME cs.CY

    Fairness Evaluation in Presence of Biased Noisy Labels

    Authors: Riccardo Fogliato, Max G'Sell, Alexandra Chouldechova

    Abstract: Risk assessment tools are widely used around the country to inform decision making within the criminal justice system. Recently, considerable attention has been devoted to the question of whether such tools may suffer from racial bias. In this type of assessment, a fundamental issue is that the training and evaluation of the model is based on a variable (arrest) that may represent a noisy version…

    Submitted 30 March, 2020; originally announced March 2020.

    Comments: Accepted at International Conference on Artificial Intelligence and Statistics (AISTATS), 2020

  13. A Case for Humans-in-the-Loop: Decisions in the Presence of Erroneous Algorithmic Scores

    Authors: Maria De-Arteaga, Riccardo Fogliato, Alexandra Chouldechova

    Abstract: The increased use of algorithmic predictions in sensitive domains has been accompanied by both enthusiasm and concern. To understand the opportunities and risks of these technologies, it is key to study how experts alter their decisions when using such tools. In this paper, we study the adoption of an algorithmic tool used to assist child maltreatment hotline screening decisions. We focus on the q…

    Submitted 20 February, 2020; v1 submitted 19 February, 2020; originally announced February 2020.

    Comments: Accepted at ACM Conference on Human Factors in Computing Systems (ACM CHI), 2020