
Showing 1–22 of 22 results for author: Shen, J H

  1. arXiv:2504.16277  [pdf, other]

    cs.LG cs.AI

    DataS^3: Dataset Subset Selection for Specialization

    Authors: Neha Hulkund, Alaa Maalouf, Levi Cai, Daniel Yang, Tsun-Hsuan Wang, Abigail O'Neil, Timm Haucke, Sandeep Mukherjee, Vikram Ramaswamy, Judy Hanwen Shen, Gabriel Tseng, Mike Walmsley, Daniela Rus, Ken Goldberg, Hannah Kerner, Irene Chen, Yogesh Girdhar, Sara Beery

    Abstract: In many real-world machine learning (ML) applications (e.g. detecting broken bones in x-ray images, detecting species in camera traps), in practice models need to perform well on specific deployments (e.g. a specific hospital, a specific national park) rather than the domain broadly. However, deployments often have imbalanced, unique data distributions. Discrepancy between the training distributio…

    Submitted 22 April, 2025; originally announced April 2025.

  2. arXiv:2504.06549  [pdf, other]

    cs.CY cs.AI

    Societal Impacts Research Requires Benchmarks for Creative Composition Tasks

    Authors: Judy Hanwen Shen, Carlos Guestrin

    Abstract: Foundation models that are capable of automating cognitive tasks represent a pivotal technological shift, yet their societal implications remain unclear. These systems promise exciting advances, yet they also risk flooding our information ecosystem with formulaic, homogeneous, and potentially misleading synthetic content. Developing benchmarks grounded in real use cases where these risks are most…

    Submitted 24 May, 2025; v1 submitted 8 April, 2025; originally announced April 2025.

    Comments: v1: ICLR 2025 Workshop on Bidirectional Human-AI Alignment (BiAlign) v2: ICML 2025 Position Paper

  3. arXiv:2502.13221  [pdf, other]

    cs.LG cs.AI cs.CY cs.GT

    Two Tickets are Better than One: Fair and Accurate Hiring Under Strategic LLM Manipulations

    Authors: Lee Cohen, Jack Hsieh, Connie Hong, Judy Hanwen Shen

    Abstract: In an era of increasingly capable foundation models, job seekers are turning to generative AI tools to enhance their application materials. However, unequal access to and knowledge about generative AI tools can harm both employers and candidates by reducing the accuracy of hiring decisions and giving some candidates an unfair advantage. To address these challenges, we introduce a new variant of th…

    Submitted 18 February, 2025; originally announced February 2025.

  4. arXiv:2502.02861  [pdf, ps, other]

    stat.ML cs.DS cs.LG

    Algorithms with Calibrated Machine Learning Predictions

    Authors: Judy Hanwen Shen, Ellen Vitercik, Anders Wikum

    Abstract: The field of algorithms with predictions incorporates machine learning advice in the design of online algorithms to improve real-world performance. A central consideration is the extent to which predictions can be trusted -- while existing approaches often require users to specify an aggregate trust level, modern machine learning models can provide estimates of prediction-level uncertainty. In thi…

    Submitted 14 June, 2025; v1 submitted 4 February, 2025; originally announced February 2025.

    Comments: v2 matches the camera-ready version accepted at ICML 2025

  5. arXiv:2409.09603  [pdf, other]

    cs.AI cs.CL cs.LG

    Towards Data-Centric RLHF: Simple Metrics for Preference Dataset Comparison

    Authors: Judy Hanwen Shen, Archit Sharma, Jun Qin

    Abstract: The goal of aligning language models to human preferences requires data that reveal these preferences. Ideally, time and money can be spent carefully collecting and tailoring bespoke preference data to each downstream application. However, in practice, a select few publicly available preference datasets are often used to train reward models for reinforcement learning from human feedback (RLHF). Wh…

    Submitted 14 September, 2024; originally announced September 2024.

    Comments: Working Paper

  6. arXiv:2408.04154  [pdf, other]

    cs.LG cs.AI stat.ML

    The Data Addition Dilemma

    Authors: Judy Hanwen Shen, Inioluwa Deborah Raji, Irene Y. Chen

    Abstract: In many machine learning for healthcare tasks, standard datasets are constructed by amassing data across many, often fundamentally dissimilar, sources. But when does adding more data help, and when does it hinder progress on desired model outcomes in real-world settings? We identify this situation as the Data Addition Dilemma, demonstrating that adding training data in this multi-source s…

    Submitted 7 August, 2024; originally announced August 2024.

    Comments: Machine Learning For Health Care 2024 (MLHC)

  7. arXiv:2405.00614  [pdf, other]

    cs.LG

    Multigroup Robustness

    Authors: Lunjia Hu, Charlotte Peale, Judy Hanwen Shen

    Abstract: To address the shortcomings of real-world datasets, robust learning algorithms have been designed to overcome arbitrary and indiscriminate data corruption. However, practical processes of gathering data may lead to patterns of data corruption that are localized to specific partitions of the training dataset. Motivated by critical applications where the learned model is deployed to make predictions…

    Submitted 1 May, 2024; originally announced May 2024.

  8. arXiv:2311.12233  [pdf, other]

    cs.CL

    Unifying Corroborative and Contributive Attributions in Large Language Models

    Authors: Theodora Worledge, Judy Hanwen Shen, Nicole Meister, Caleb Winston, Carlos Guestrin

    Abstract: As businesses, products, and services spring up around large language models, the trustworthiness of these models hinges on the verifiability of their outputs. However, methods for explaining language model outputs largely fall across two distinct fields of study which both use the term "attribution" to refer to entirely separate techniques: citation generation and training data attribution. In ma…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: NeurIPS ATTRIB Workshop 2023

  9. arXiv:2308.10888  [pdf, other]

    cs.LG cs.CV cs.CY

    Unlocking Accuracy and Fairness in Differentially Private Image Classification

    Authors: Leonard Berrada, Soham De, Judy Hanwen Shen, Jamie Hayes, Robert Stanforth, David Stutz, Pushmeet Kohli, Samuel L. Smith, Borja Balle

    Abstract: Privacy-preserving machine learning aims to train models on private data without leaking sensitive information. Differential privacy (DP) is considered the gold standard framework for privacy-preserving training, as it provides formal privacy guarantees. However, compared to their non-private counterparts, models trained with DP often have significantly reduced accuracy. Private classifiers are al…

    Submitted 21 August, 2023; originally announced August 2023.

  10. arXiv:2307.07636  [pdf, other]

    cs.AI

    Dissenting Explanations: Leveraging Disagreement to Reduce Model Overreliance

    Authors: Omer Reingold, Judy Hanwen Shen, Aditi Talati

    Abstract: While explainability is a desirable characteristic of increasingly complex black-box models, modern explanation methods have been shown to be inconsistent and contradictory. The semantics of explanations is not always fully understood - to what extent do explanations "explain" a decision and to what extent do they merely advocate for a decision? Can we help humans gain insights from explanations a…

    Submitted 7 August, 2024; v1 submitted 14 July, 2023; originally announced July 2023.

    Comments: V2: AAAI 2024 V1: AI & HCI Workshop at ICML 2023

    MSC Class: 68 ACM Class: I.2

  11. Bidding Strategies for Proportional Representation in Advertisement Campaigns

    Authors: Inbal Livni Navon, Charlotte Peale, Omer Reingold, Judy Hanwen Shen

    Abstract: Many companies rely on advertising platforms such as Google, Facebook, or Instagram to recruit a large and diverse applicant pool for job openings. Prior works have shown that equitable bidding may not result in equitable outcomes due to heterogeneous levels of competition for different types of individuals. Suggestions have been made to address this problem via revisions to the advertising platfo…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: Foundations of Responsible Computing (FORC 2023)

    ACM Class: F.0

  12. arXiv:2205.01157  [pdf, other]

    cs.DS cs.CY

    Leximax Approximations and Representative Cohort Selection

    Authors: Monika Henzinger, Charlotte Peale, Omer Reingold, Judy Hanwen Shen

    Abstract: Finding a representative cohort from a broad pool of candidates is a goal that arises in many contexts such as choosing governing committees and consumer panels. While there are many ways to define the degree to which a cohort represents a population, a very appealing solution concept is lexicographic maximality (leximax) which offers a natural (pareto-optimal like) interpretation that the utility…

    Submitted 17 May, 2022; v1 submitted 2 May, 2022; originally announced May 2022.

    Comments: 27 pages. Shortened version to appear in FORC 2022

  13. arXiv:2204.07073  [pdf, other]

    cs.CY

    Longitudinal Complex Dynamics of Labour Markets Reveal Increasing Polarisation

    Authors: Shahad Althobaiti, Ahmad Alabdulkareem, Judy Hanwen Shen, Iyad Rahwan, Morgan Frank, Esteban Moro, Alex Rutherford

    Abstract: In this paper we conduct a longitudinal analysis of the structure of labour markets in the US over 7 decades of technological, economic and policy change. We make use of network science, natural language processing and machine learning to uncover structural changes in the labour market over time. We find a steady rate of both disappearance of jobs and a shift in the required work tasks, despite mu…

    Submitted 14 April, 2022; originally announced April 2022.

  14. arXiv:2106.09680  [pdf, other]

    cs.LG cs.CR

    Accuracy, Interpretability, and Differential Privacy via Explainable Boosting

    Authors: Harsha Nori, Rich Caruana, Zhiqi Bu, Judy Hanwen Shen, Janardhan Kulkarni

    Abstract: We show that adding differential privacy to Explainable Boosting Machines (EBMs), a recent method for training interpretable ML models, yields state-of-the-art accuracy while protecting privacy. Our experiments on multiple classification and regression datasets show that DP-EBM models suffer surprisingly little accuracy loss even with strong differential privacy guarantees. In addition to high acc…

    Submitted 17 June, 2021; originally announced June 2021.

    Comments: To be published in ICML 2021. 12 pages, 6 figures

  15. arXiv:2102.03013  [pdf, other]

    cs.LG cs.CR

    Fast and Memory Efficient Differentially Private-SGD via JL Projections

    Authors: Zhiqi Bu, Sivakanth Gopi, Janardhan Kulkarni, Yin Tat Lee, Judy Hanwen Shen, Uthaipon Tantipongpipat

    Abstract: Differentially Private-SGD (DP-SGD) of Abadi et al. (2016) and its variations are the only known algorithms for private training of large scale neural networks. This algorithm requires computation of per-sample gradient norms, which is extremely slow and memory intensive in practice. In this paper, we present a new framework to design differentially private optimizers called DP-SGD-JL and DP-Adam-…

    Submitted 5 February, 2021; originally announced February 2021.

  16. arXiv:2010.05848  [pdf, other]

    cs.CL cs.LG

    Human-centric Dialog Training via Offline Reinforcement Learning

    Authors: Natasha Jaques, Judy Hanwen Shen, Asma Ghandeharioun, Craig Ferguson, Agata Lapedriza, Noah Jones, Shixiang Shane Gu, Rosalind Picard

    Abstract: How can we train a dialog model to produce better conversations by learning from human feedback, without the risk of humans teaching it harmful chat behaviors? We start by hosting models online, and gather human feedback from real-time, open-ended conversations, which we then use to train and improve the models using offline reinforcement learning (RL). We identify implicit conversational cues inc…

    Submitted 12 October, 2020; originally announced October 2020.

    Comments: To appear in EMNLP 2020 (long paper)

  17. arXiv:2002.09745  [pdf, other]

    cs.CR cs.DS cs.LG stat.ML

    Differentially Private Set Union

    Authors: Sivakanth Gopi, Pankaj Gulhane, Janardhan Kulkarni, Judy Hanwen Shen, Milad Shokouhi, Sergey Yekhanin

    Abstract: We study the basic operation of set union in the global model of differential privacy. In this problem, we are given a universe $U$ of items, possibly of infinite size, and a database $D$ of users. Each user $i$ contributes a subset $W_i \subseteq U$ of items. We want an ($ε$,$δ$)-differentially private algorithm which outputs a subset $S \subset \cup_i W_i$ such that the size of $S$ is as large a…

    Submitted 6 April, 2022; v1 submitted 22 February, 2020; originally announced February 2020.

    Comments: 23 pages, 7 figures

  18. arXiv:1909.07547  [pdf, other]

    cs.LG cs.AI stat.ML

    Hierarchical Reinforcement Learning for Open-Domain Dialog

    Authors: Abdelrhman Saleh, Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Rosalind Picard

    Abstract: Open-domain dialog generation is a challenging problem; maximum likelihood training can lead to repetitive outputs, models have difficulty tracking long-term conversational goals, and training on standard movie or online datasets may lead to the generation of inappropriate, biased, or offensive text. Reinforcement Learning (RL) is a powerful framework that could potentially address these issues, f…

    Submitted 31 December, 2019; v1 submitted 16 September, 2019; originally announced September 2019.

  19. arXiv:1907.00456  [pdf, other]

    cs.LG cs.AI stat.ML

    Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog

    Authors: Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Craig Ferguson, Agata Lapedriza, Noah Jones, Shixiang Gu, Rosalind Picard

    Abstract: Most deep reinforcement learning (RL) systems are not able to learn effectively from off-policy data, especially if they cannot explore online in the environment. These are critical shortcomings for applying RL to real-world problems where collecting data is expensive, and models must be tested offline before being deployed to interact with the environment -- e.g. systems that learn from human int…

    Submitted 8 July, 2019; v1 submitted 30 June, 2019; originally announced July 2019.

  20. arXiv:1906.09308  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems

    Authors: Asma Ghandeharioun, Judy Hanwen Shen, Natasha Jaques, Craig Ferguson, Noah Jones, Agata Lapedriza, Rosalind Picard

    Abstract: Building an open-domain conversational agent is a challenging problem. Current evaluation methods, mostly post-hoc judgments of static conversation, do not capture conversation quality in a realistic interactive context. In this paper, we investigate interactive human evaluation and provide evidence for its necessity; we then introduce a novel, model-agnostic, and dataset-agnostic method to approx…

    Submitted 3 November, 2019; v1 submitted 21 June, 2019; originally announced June 2019.

    Comments: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada

  21. arXiv:1810.03717  [pdf, other]

    cs.CL

    Comparing Models of Associative Meaning: An Empirical Investigation of Reference in Simple Language Games

    Authors: Judy Hanwen Shen, Matthias Hofer, Bjarke Felbo, Roger Levy

    Abstract: Simple reference games are of central theoretical and empirical importance in the study of situated language use. Although language provides rich, compositional truth-conditional semantics to facilitate reference, speakers and listeners may sometimes lack the overall lexical and cognitive resources to guarantee successful reference through these means alone. However, language also has rich associa…

    Submitted 8 October, 2018; originally announced October 2018.

    Comments: Conference on Computational Natural Language Learning (CoNLL) 2018

  22. arXiv:1803.07233  [pdf, other]

    cs.CY cs.AI

    Closing the AI Knowledge Gap

    Authors: Ziv Epstein, Blakeley H. Payne, Judy Hanwen Shen, Abhimanyu Dubey, Bjarke Felbo, Matthew Groh, Nick Obradovich, Manuel Cebrian, Iyad Rahwan

    Abstract: AI researchers employ not only the scientific method, but also methodology from mathematics and engineering. However, the use of the scientific method - specifically hypothesis testing - in AI is typically conducted in service of engineering objectives. Growing interest in topics such as fairness and algorithmic bias shows that engineering-focused questions only comprise a subset of the important q…

    Submitted 19 March, 2018; originally announced March 2018.

    Comments: 8 pages, 3 figures, under review