Uncovering Bias in Foundation Models: Impact, Testing, Harm, and Mitigation

Sun, Shuzhou; Liu, Li; Liu, Yongxiang; Liu, Zhen; Zhang, Shuanghui; Heikkilä, Janne; Li, Xiang

arXiv:2501.10453 (cs)

[Submitted on 14 Jan 2025]

Authors: Shuzhou Sun (1 and 2), Li Liu (3), Yongxiang Liu (3), Zhen Liu (3), Shuanghui Zhang (3), Janne Heikkilä (2), Xiang Li (3) ((1) The College of Computer Science, Nankai University, Tianjin, China, (2) The Center for Machine Vision and Signal Analysis, University of Oulu, Finland, (3) The College of Electronic Science, National University of Defense Technology, China)

Abstract: Bias in Foundation Models (FMs) - trained on vast datasets spanning societal and historical knowledge - poses significant challenges for fairness and equity across fields such as healthcare, education, and finance. These biases, rooted in the overrepresentation of stereotypes and societal inequalities in training data, exacerbate real-world discrimination, reinforce harmful stereotypes, and erode trust in AI systems. To address this, we introduce Trident Probe Testing (TriProTesting), a systematic testing method that detects explicit and implicit biases using semantically designed probes. Here we show that FMs, including CLIP, ALIGN, BridgeTower, and OWLv2, demonstrate pervasive biases across single and mixed social attributes (gender, race, age, and occupation). Notably, we uncover mixed biases when social attributes are combined, such as gender × race, gender × age, and gender × occupation, revealing deeper layers of discrimination. We further propose Adaptive Logit Adjustment (AdaLogAdjustment), a post-processing technique that dynamically redistributes probability power to mitigate these biases effectively, achieving significant improvements in fairness without retraining models. These findings highlight the urgent need for ethical AI practices and interdisciplinary solutions to address biases not only at the model level but also in societal structures. Our work provides a scalable and interpretable solution that advances fairness in AI systems while offering practical insights for future research on fair AI technologies.
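
The abstract does not give the details of TriProTesting or AdaLogAdjustment, so the following is only a rough illustration of the general idea of post-processing logits without retraining, not the authors' method. It assumes a generic prior-subtraction logit adjustment: the average class probability over a set of neutral probes is treated as an estimate of the model's bias, and its logarithm is subtracted from each logit. All function names, the `tau` parameter, and the probe logits are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mean_probs(logit_rows):
    """Average softmax probability per class over a batch of probes."""
    probs = [softmax(row) for row in logit_rows]
    n = len(probs)
    return [sum(p[k] for p in probs) / n for k in range(len(probs[0]))]

def parity_gap(avg):
    """Largest minus smallest average class probability (0 means balanced)."""
    return max(avg) - min(avg)

def adjust(logits, prior, tau=1.0):
    """Generic logit adjustment (an assumption, not the paper's rule):
    subtract tau * log(prior) from each logit. No retraining is involved."""
    return [z - tau * math.log(p) for z, p in zip(logits, prior)]

# Hypothetical similarity logits for three neutral probes over two labels.
probe_logits = [[2.0, 1.0], [1.5, 0.5], [2.2, 1.4]]

prior = mean_probs(probe_logits)            # estimated bias of the model
adjusted = [adjust(row, prior) for row in probe_logits]

gap_before = parity_gap(prior)
gap_after = parity_gap(mean_probs(adjusted))
print(f"parity gap before: {gap_before:.3f}, after: {gap_after:.3f}")
```

In this toy setting the gap between the two labels shrinks after adjustment because the estimated prior is removed from every prediction. How the paper adapts the adjustment per input is not stated in the abstract and is not modeled here.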

Comments:	60 pages, 5 figures
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computers and Society (cs.CY)
Cite as:	arXiv:2501.10453 [cs.LG]
	(or arXiv:2501.10453v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2501.10453

Submission history

From: Shuzhou Sun
[v1] Tue, 14 Jan 2025 19:06:37 UTC (5,939 KB)
