-
Representative Ranking for Deliberation in the Public Sphere
Authors:
Manon Revel,
Smitha Milli,
Tyler Lu,
Jamelle Watson-Daniels,
Max Nickel
Abstract:
Online comment sections, such as those on news sites or social media, have the potential to foster informal public deliberation, However, this potential is often undermined by the frequency of toxic or low-quality exchanges that occur in these settings. To combat this, platforms increasingly leverage algorithmic ranking to facilitate higher-quality discussions, e.g., by using civility classifiers…
▽ More
Online comment sections, such as those on news sites or social media, have the potential to foster informal public deliberation, However, this potential is often undermined by the frequency of toxic or low-quality exchanges that occur in these settings. To combat this, platforms increasingly leverage algorithmic ranking to facilitate higher-quality discussions, e.g., by using civility classifiers or forms of prosocial ranking. Yet, these interventions may also inadvertently reduce the visibility of legitimate viewpoints, undermining another key aspect of deliberation: representation of diverse views. We seek to remedy this problem by introducing guarantees of representation into these methods. In particular, we adopt the notion of justified representation (JR) from the social choice literature and incorporate a JR constraint into the comment ranking setting. We find that enforcing JR leads to greater inclusion of diverse viewpoints while still being compatible with optimizing for user engagement or other measures of conversational quality.
△ Less
Submitted 19 March, 2025;
originally announced March 2025.
-
Cross-Modal Projection in Multimodal LLMs Doesn't Really Project Visual Attributes to Textual Space
Authors:
Gaurav Verma,
Minje Choi,
Kartik Sharma,
Jamelle Watson-Daniels,
Sejoon Oh,
Srijan Kumar
Abstract:
Multimodal large language models (MLLMs) like LLaVA and GPT-4(V) enable general-purpose conversations about images with the language modality. As off-the-shelf MLLMs may have limited capabilities on images from domains like dermatology and agriculture, they must be fine-tuned to unlock domain-specific applications. The prevalent architecture of current open-source MLLMs comprises two major modules…
▽ More
Multimodal large language models (MLLMs) like LLaVA and GPT-4(V) enable general-purpose conversations about images with the language modality. As off-the-shelf MLLMs may have limited capabilities on images from domains like dermatology and agriculture, they must be fine-tuned to unlock domain-specific applications. The prevalent architecture of current open-source MLLMs comprises two major modules: an image-language (cross-modal) projection network and a large language model. It is desirable to understand the roles of these two modules in modeling domain-specific visual attributes to inform the design of future models and streamline the interpretability efforts on the current models. To this end, via experiments on 4 datasets and under 2 fine-tuning settings, we find that as the MLLM is fine-tuned, it indeed gains domain-specific visual capabilities, but the updates do not lead to the projection extracting relevant domain-specific visual attributes. Our results indicate that the domain-specific visual attributes are modeled by the LLM, even when only the projection is fine-tuned. Through this study, we offer a potential reinterpretation of the role of cross-modal projections in MLLM architectures. Project webpage: https://claws-lab.github.io/projection-in-MLLMs/
△ Less
Submitted 21 July, 2024; v1 submitted 26 February, 2024;
originally announced February 2024.
-
Algorithmic Fairness and Color-blind Racism: Navigating the Intersection
Authors:
Jamelle Watson-Daniels
Abstract:
Our focus lies at the intersection between two broader research perspectives: (1) the scientific study of algorithms and (2) the scholarship on race and racism. Many streams of research related to algorithmic fairness have been born out of interest at this intersection. We think about this intersection as the product of work derived from both sides. From (1) algorithms to (2) racism, the starting…
▽ More
Our focus lies at the intersection between two broader research perspectives: (1) the scientific study of algorithms and (2) the scholarship on race and racism. Many streams of research related to algorithmic fairness have been born out of interest at this intersection. We think about this intersection as the product of work derived from both sides. From (1) algorithms to (2) racism, the starting place might be an algorithmic question or method connected to a conceptualization of racism. On the other hand, from (2) racism to (1) algorithms, the starting place could be recognizing a setting where a legacy of racism is known to persist and drawing connections between that legacy and the introduction of algorithms into this setting. In either direction, meaningful disconnection can occur when conducting research at the intersection of racism and algorithms. The present paper urges collective reflection on research directions at this intersection. Despite being primarily motivated by instances of racial bias, research in algorithmic fairness remains mostly disconnected from scholarship on racism. In particular, there has not been an examination connecting algorithmic fairness discussions directly to the ideology of color-blind racism; we aim to fill this gap. We begin with a review of an essential account of color-blind racism then we review racial discourse within algorithmic fairness research and underline significant patterns, shifts and disconnects. Ultimately, we argue that researchers can improve the navigation of the landscape at the intersection by recognizing ideological shifts as such and iteratively re-orienting towards maintaining meaningful connections across interdisciplinary lines.
△ Less
Submitted 25 April, 2025; v1 submitted 12 February, 2024;
originally announced February 2024.
-
Predictive Churn with the Set of Good Models
Authors:
Jamelle Watson-Daniels,
Flavio du Pin Calmon,
Alexander D'Amour,
Carol Long,
David C. Parkes,
Berk Ustun
Abstract:
Issues can arise when research focused on fairness, transparency, or safety is conducted separately from research driven by practical deployment concerns and vice versa. This separation creates a growing need for translational work that bridges the gap between independently studied concepts that may be fundamentally related. This paper explores connections between two seemingly unrelated concepts…
▽ More
Issues can arise when research focused on fairness, transparency, or safety is conducted separately from research driven by practical deployment concerns and vice versa. This separation creates a growing need for translational work that bridges the gap between independently studied concepts that may be fundamentally related. This paper explores connections between two seemingly unrelated concepts of predictive inconsistency that share intriguing parallels. The first, known as predictive multiplicity, occurs when models that perform similarly (e.g., nearly equivalent training loss) produce conflicting predictions for individual samples. This concept is often emphasized in algorithmic fairness research as a means of promoting transparency in ML model development. The second concept, predictive churn, examines the differences in individual predictions before and after model updates, a key challenge in deploying ML models in consumer-facing applications. We present theoretical and empirical results that uncover links between these previously disconnected concepts.
△ Less
Submitted 25 April, 2025; v1 submitted 12 February, 2024;
originally announced February 2024.
-
Multi-Target Multiplicity: Flexibility and Fairness in Target Specification under Resource Constraints
Authors:
Jamelle Watson-Daniels,
Solon Barocas,
Jake M. Hofman,
Alexandra Chouldechova
Abstract:
Prediction models have been widely adopted as the basis for decision-making in domains as diverse as employment, education, lending, and health. Yet, few real world problems readily present themselves as precisely formulated prediction tasks. In particular, there are often many reasonable target variable options. Prior work has argued that this is an important and sometimes underappreciated choice…
▽ More
Prediction models have been widely adopted as the basis for decision-making in domains as diverse as employment, education, lending, and health. Yet, few real world problems readily present themselves as precisely formulated prediction tasks. In particular, there are often many reasonable target variable options. Prior work has argued that this is an important and sometimes underappreciated choice, and has also shown that target choice can have a significant impact on the fairness of the resulting model. However, the existing literature does not offer a formal framework for characterizing the extent to which target choice matters in a particular task. Our work fills this gap by drawing connections between the problem of target choice and recent work on predictive multiplicity. Specifically, we introduce a conceptual and computational framework for assessing how the choice of target affects individuals' outcomes and selection rate disparities across groups. We call this multi-target multiplicity. Along the way, we refine the study of single-target multiplicity by introducing notions of multiplicity that respect resource constraints -- a feature of many real-world tasks that is not captured by existing notions of predictive multiplicity. We apply our methods on a healthcare dataset, and show that the level of multiplicity that stems from target variable choice can be greater than that stemming from nearly-optimal models of a single target.
△ Less
Submitted 23 June, 2023;
originally announced June 2023.
-
Predictive Multiplicity in Probabilistic Classification
Authors:
Jamelle Watson-Daniels,
David C. Parkes,
Berk Ustun
Abstract:
Machine learning models are often used to inform real world risk assessment tasks: predicting consumer default risk, predicting whether a person suffers from a serious illness, or predicting a person's risk to appear in court. Given multiple models that perform almost equally well for a prediction task, to what extent do predictions vary across these models? If predictions are relatively consisten…
▽ More
Machine learning models are often used to inform real world risk assessment tasks: predicting consumer default risk, predicting whether a person suffers from a serious illness, or predicting a person's risk to appear in court. Given multiple models that perform almost equally well for a prediction task, to what extent do predictions vary across these models? If predictions are relatively consistent for similar models, then the standard approach of choosing the model that optimizes a penalized loss suffices. But what if predictions vary significantly for similar models? In machine learning, this is referred to as predictive multiplicity i.e. the prevalence of conflicting predictions assigned by near-optimal competing models. In this paper, we present a framework for measuring predictive multiplicity in probabilistic classification (predicting the probability of a positive outcome). We introduce measures that capture the variation in risk estimates over the set of competing models, and develop optimization-based methods to compute these measures efficiently and reliably for convex empirical risk minimization problems. We demonstrate the incidence and prevalence of predictive multiplicity in real-world tasks. Further, we provide insight into how predictive multiplicity arises by analyzing the relationship between predictive multiplicity and data set characteristics (outliers, separability, and majority-minority structure). Our results emphasize the need to report predictive multiplicity more widely.
△ Less
Submitted 23 June, 2023; v1 submitted 2 June, 2022;
originally announced June 2022.
-
Trapping in irradiated p-on-n silicon sensors at fluences anticipated at the HL-LHC outer tracker
Authors:
W. Adam,
T. Bergauer,
M. Dragicevic,
M. Friedl,
R. Fruehwirth,
M. Hoch,
J. Hrubec,
M. Krammer,
W. Treberspurg,
W. Waltenberger,
S. Alderweireldt,
W. Beaumont,
X. Janssen,
S. Luyckx,
P. Van Mechelen,
N. Van Remortel,
A. Van Spilbeeck,
P. Barria,
C. Caillol,
B. Clerbaux,
G. De Lentdecker,
D. Dobur,
L. Favart,
A. Grebenyuk,
Th. Lenzi
, et al. (663 additional authors not shown)
Abstract:
The degradation of signal in silicon sensors is studied under conditions expected at the CERN High-Luminosity LHC. 200 $μ$m thick n-type silicon sensors are irradiated with protons of different energies to fluences of up to $3 \cdot 10^{15}$ neq/cm$^2$. Pulsed red laser light with a wavelength of 672 nm is used to generate electron-hole pairs in the sensors. The induced signals are used to determi…
▽ More
The degradation of signal in silicon sensors is studied under conditions expected at the CERN High-Luminosity LHC. 200 $μ$m thick n-type silicon sensors are irradiated with protons of different energies to fluences of up to $3 \cdot 10^{15}$ neq/cm$^2$. Pulsed red laser light with a wavelength of 672 nm is used to generate electron-hole pairs in the sensors. The induced signals are used to determine the charge collection efficiencies separately for electrons and holes drifting through the sensor. The effective trapping rates are extracted by comparing the results to simulation. The electric field is simulated using Synopsys device simulation assuming two effective defects. The generation and drift of charge carriers are simulated in an independent simulation based on PixelAV. The effective trapping rates are determined from the measured charge collection efficiencies and the simulated and measured time-resolved current pulses are compared. The effective trapping rates determined for both electrons and holes are about 50% smaller than those obtained using standard extrapolations of studies at low fluences and suggests an improved tracker performance over initial expectations.
△ Less
Submitted 7 May, 2015;
originally announced May 2015.