A Tale of Two Classes: Adapting Supervised Contrastive Learning to Binary Imbalanced Datasets

Mildenberger, David; Hager, Paul; Rueckert, Daniel; Menten, Martin J

arXiv:2503.17024 (cs)

[Submitted on 21 Mar 2025]

Abstract: Supervised contrastive learning (SupCon) has proven to be a powerful alternative to the standard cross-entropy loss for classification of multi-class balanced datasets. However, it struggles to learn well-conditioned representations of datasets with long-tailed class distributions. This problem is potentially exacerbated for binary imbalanced distributions, which are common in real-world problems such as medical diagnosis. In experiments on seven binary datasets of natural and medical images, we show that the performance of SupCon decreases with increasing class imbalance. To substantiate these findings, we introduce two novel metrics that evaluate the quality of the learned representation space. By measuring the class distribution in local neighborhoods, we are able to uncover structural deficiencies of the representation space that classical metrics cannot detect. Informed by these insights, we propose two new supervised contrastive learning strategies tailored to binary imbalanced datasets that improve the structure of the representation space and increase downstream classification accuracy over standard SupCon by up to 35%. We make our code available.
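To make the idea of "measuring the class distribution in local neighborhoods" concrete, here is a minimal sketch. The abstract does not define the paper's two metrics, so this is not the authors' method. The function `same_class_neighbor_fraction`, its parameter `k`, and the toy embeddings are all hypothetical and chosen for illustration only. For each sample, it computes the fraction of its k nearest neighbors (Euclidean distance) that share the sample's label, then averages that fraction per class.

```python
import math
from collections import defaultdict


def same_class_neighbor_fraction(embeddings, labels, k=1):
    """Hypothetical neighborhood statistic (NOT the paper's metric).

    Returns, for each class, the mean fraction of a sample's k nearest
    neighbors that carry the same label as the sample itself.
    """
    n = len(embeddings)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of samples")
    per_class = defaultdict(list)
    for i in range(n):
        # Distances from sample i to every other sample (self excluded).
        dists = sorted(
            (math.dist(embeddings[i], embeddings[j]), j)
            for j in range(n)
            if j != i
        )
        neighbors = [j for _, j in dists[:k]]
        same = sum(labels[j] == labels[i] for j in neighbors)
        per_class[labels[i]].append(same / k)
    return {c: sum(v) / len(v) for c, v in per_class.items()}


# Toy 1-D embeddings: four majority samples (label 0), two minority (label 1).
labels = [0, 0, 0, 0, 1, 1]

# Case 1: the two classes form well-separated clusters.
separated = [(0.0,), (1.0,), (2.0,), (3.0,), (10.0,), (11.0,)]
print(same_class_neighbor_fraction(separated, labels, k=1))  # {0: 1.0, 1: 1.0}

# Case 2: one minority sample sits inside the majority cluster, one is isolated.
mixed = [(0.0,), (1.0,), (2.0,), (3.0,), (1.4,), (10.0,)]
print(same_class_neighbor_fraction(mixed, labels, k=1))  # {0: 0.5, 1: 0.0}
```

In the second toy case the minority class scores 0.0 because neither minority sample has a same-class nearest neighbor. A neighborhood-level statistic of this kind exposes local structure that a single global summary may hide, which is the sort of deficiency the abstract says classical metrics cannot detect.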

Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2503.17024 [cs.LG]
	(or arXiv:2503.17024v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2503.17024

Submission history

From: Paul Hager
[v1] Fri, 21 Mar 2025 10:34:51 UTC (18,106 KB)
