-
Preserving Task-Relevant Information Under Linear Concept Removal
Authors:
Floris Holstege,
Shauli Ravfogel,
Bram Wouters
Abstract:
Modern neural networks often encode unwanted concepts alongside task-relevant information, leading to fairness and interpretability concerns. Existing post-hoc approaches can remove undesired concepts but often degrade useful signals. We introduce SPLICE (Simultaneous Projection for LInear concept removal and Covariance prEservation), which eliminates sensitive concepts from representations while exactly preserving their covariance with a target label. SPLICE achieves this via an oblique projection that "splices out" the unwanted direction yet protects important label correlations. Theoretically, it is the unique solution that removes linear concept predictability and maintains target covariance with minimal embedding distortion. Empirically, SPLICE outperforms baselines on benchmarks such as Bias in Bios and Winobias, removing protected attributes while minimally damaging main-task information.
Submitted 12 June, 2025;
originally announced June 2025.
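The two constraints named in the abstract above (zero covariance with the concept, unchanged covariance with the target label) can be illustrated with a small NumPy sketch. This is not the authors' SPLICE construction: it builds one oblique projection that satisfies both constraints and makes no claim about minimal distortion. The function name `oblique_concept_projection` and the synthetic data are assumptions made for illustration only.

```python
import numpy as np


def oblique_concept_projection(cov_xz, cov_xy):
    """Return P with P @ cov_xz = 0 and P @ cov_xy = cov_xy.

    cov_xz: (d, k) covariance between embeddings and concept labels.
    cov_xy: (d, m) covariance between embeddings and target labels.
    Assumes cov_xz has full column rank and is not contained in span(cov_xy).
    """
    d = cov_xz.shape[0]
    # Orthonormal basis for the label-covariance directions.
    q, _ = np.linalg.qr(cov_xy)
    # Part of the concept directions that is orthogonal to the label directions.
    v = cov_xz - q @ (q.T @ cov_xz)
    # Oblique projector whose kernel is span(cov_xz); v.T @ cov_xy is zero,
    # so the label covariance passes through unchanged.
    gram = v.T @ cov_xz
    return np.eye(d) - cov_xz @ np.linalg.solve(gram, v.T)


# Synthetic example (hypothetical data): concept z and label y share a direction.
rng = np.random.default_rng(0)
n, d = 2000, 8
x = rng.normal(size=(n, d))
z = x[:, 0] + 0.1 * rng.normal(size=n)
y = x[:, 1] + 0.5 * x[:, 0] + 0.1 * rng.normal(size=n)

xc = x - x.mean(axis=0)
cov_xz = (xc.T @ (z - z.mean()) / n)[:, None]
cov_xy = (xc.T @ (y - y.mean()) / n)[:, None]

P = oblique_concept_projection(cov_xz, cov_xy)
x_clean = x @ P.T  # transformed embeddings
```

Because the kernel of `P` is the concept-covariance direction while its range is not the orthogonal complement of that direction, the projection is oblique rather than orthogonal, which is what lets the label covariance survive.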
-
Auditing a Dutch Public Sector Risk Profiling Algorithm Using an Unsupervised Bias Detection Tool
Authors:
Floris Holstege,
Mackenzie Jorgensen,
Kirtan Padh,
Jurriaan Parie,
Joel Persson,
Krsto Prorokovic,
Lukas Snoek
Abstract:
Algorithms are increasingly used to automate or aid human decisions, yet recent research shows that these algorithms may exhibit bias across legally protected demographic groups. However, data on these groups may be unavailable to organizations or external auditors due to privacy legislation. This paper studies bias detection using an unsupervised clustering tool when data on demographic groups are unavailable. We collaborate with the Dutch Executive Agency for Education to audit an algorithm that was used to assign risk scores to college students at the national level in the Netherlands between 2012 and 2023. Our audit covers more than 250,000 students from the whole country. The unsupervised clustering tool highlights known disparities between students with a non-European migration background and students of Dutch origin. Our contributions are three-fold: (1) we assess bias in a real-world, large-scale and high-stakes decision-making process by a governmental organization; (2) we use simulation studies to highlight potential pitfalls of using the unsupervised clustering tool to detect true bias when demographic group data are unavailable and provide recommendations for valid inferences; (3) we provide the unsupervised clustering tool in an open-source library. Our work serves as a starting point for a deliberative assessment by human experts to evaluate potential discrimination in algorithmic-supported decision-making processes.
Submitted 5 May, 2025; v1 submitted 3 February, 2025;
originally announced February 2025.
-
Optimizing importance weighting in the presence of sub-population shifts
Authors:
Floris Holstege,
Bram Wouters,
Noud van Giersbergen,
Cees Diks
Abstract:
A distribution shift between the training and test data can severely harm performance of machine learning models. Importance weighting addresses this issue by assigning different weights to data points during training. We argue that existing heuristics for determining the weights are suboptimal, as they neglect the increase of the variance of the estimated model due to the finite sample size of the training data. We interpret the optimal weights in terms of a bias-variance trade-off, and propose a bi-level optimization procedure in which the weights and model parameters are optimized simultaneously. We apply this optimization to existing importance weighting techniques for last-layer retraining of deep neural networks in the presence of sub-population shifts and show empirically that optimizing weights significantly improves generalization performance.
Submitted 18 October, 2024;
originally announced October 2024.
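For readers unfamiliar with importance weighting, the sketch below shows the kind of standard heuristic the abstract above argues is suboptimal: each training point is weighted by the ratio of its group's test proportion to its training proportion. It does not implement the bi-level optimization proposed in the paper. The function names and the toy group labels are assumptions made for illustration.

```python
import numpy as np


def group_importance_weights(train_groups, test_proportions):
    """Heuristic per-sample weights w_i = p_test(g_i) / p_train(g_i)."""
    train_groups = np.asarray(train_groups)
    groups, counts = np.unique(train_groups, return_counts=True)
    # Empirical group proportions in the training data.
    p_train = dict(zip(groups.tolist(), counts / counts.sum()))
    return np.array(
        [test_proportions[g] / p_train[g] for g in train_groups.tolist()]
    )


def weighted_mean_loss(per_sample_loss, weights):
    """Importance-weighted average of per-sample losses."""
    return float(np.sum(weights * per_sample_loss) / np.sum(weights))


# Toy example: group 1 is rare in training but makes up half of the test data.
train_groups = [0, 0, 0, 1]
weights = group_importance_weights(train_groups, {0: 0.5, 1: 0.5})
loss = weighted_mean_loss(np.array([1.0, 1.0, 1.0, 3.0]), weights)
```

Upweighting the rare group reduces bias under the shifted test distribution but relies more heavily on few samples, which is the variance cost the abstract refers to.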
-
Removing Spurious Concepts from Neural Network Representations via Joint Subspace Estimation
Authors:
Floris Holstege,
Bram Wouters,
Noud van Giersbergen,
Cees Diks
Abstract:
Out-of-distribution generalization in neural networks is often hampered by spurious correlations. A common strategy is to mitigate this by removing spurious concepts from the neural network representation of the data. Existing concept-removal methods tend to be overzealous by inadvertently eliminating features associated with the main task of the model, thereby harming model performance. We propose an iterative algorithm that separates spurious from main-task concepts by jointly identifying two low-dimensional orthogonal subspaces in the neural network representation. We evaluate the algorithm on benchmark datasets for computer vision (Waterbirds, CelebA) and natural language processing (MultiNLI), and show that it outperforms existing concept-removal methods.
Submitted 22 July, 2024; v1 submitted 18 October, 2023;
originally announced October 2023.
-
Female scholars need to achieve more for equal public recognition
Authors:
Menno H. Schellekens,
Floris Holstege,
Taha Yasseri
Abstract:
Different kinds of "gender gap" have been reported in different walks of scientific life, almost always favouring male scientists over females. In this work, for the first time, we present a large-scale empirical analysis to ask whether female scientists with the same level of scientific accomplishment are as likely as males to be recognised. We focus in particular on Wikipedia, the open online encyclopedia, whose open nature makes it a proxy for community recognition. We calculate the probability of appearing on Wikipedia as a scientist for both male and female scholars in three different fields. We find that women in Physics, Economics and Philosophy are considerably less likely than men to be recognised on Wikipedia across all levels of achievement.
Submitted 16 April, 2019; v1 submitted 12 April, 2019;
originally announced April 2019.