Tailoring Mixup to Data for Calibration

Bouniot, Quentin; Mozharovskyi, Pavlo; d'Alché-Buc, Florence

Computer Science > Machine Learning

arXiv:2311.01434v2 (cs)

[Submitted on 2 Nov 2023 (v1), revised 11 Jun 2024 (this version, v2), latest version 18 Mar 2025 (v3)]

Abstract: Among all data augmentation techniques proposed so far, linear interpolation of training samples, also called Mixup, has been found to be effective for a wide range of applications. Along with improved performance, Mixup is also a good technique for improving calibration and predictive uncertainty. However, mixing data carelessly can lead to manifold intrusion, i.e., conflicts between the synthetic labels assigned and the true label distributions, which can deteriorate calibration. In this work, we argue that the likelihood of manifold intrusion increases with the distance between the data points being mixed. To address this, we propose to dynamically change the underlying distributions of interpolation coefficients depending on the similarity between the samples to mix, and we define a flexible framework to do so without losing diversity. We provide extensive experiments on classification and regression tasks, showing that our proposed method improves the performance and calibration of models, while being much more efficient. The code for our work is available at this https URL.
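
The idea of making the interpolation coefficient depend on sample similarity can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how similarity is turned into a distribution, so the Gaussian kernel on batch-normalized Euclidean distances, the function name `similarity_mixup`, and the parameters `tau_max` and `tau_std` are all assumptions made for illustration. Each pair draws its coefficient from Beta(alpha, alpha), where alpha shrinks as the pair grows more distant, so distant pairs tend to stay close to one of the two original samples.

```python
import numpy as np


def similarity_mixup(x, y, tau_max=1.0, tau_std=0.5, rng=None):
    """Mix each sample with a random partner, using a per-pair Beta parameter.

    Hypothetical sketch: the Gaussian kernel and the names tau_max / tau_std
    are illustrative assumptions, not taken from the abstract.
    x: array of shape (n, ...), y: float array of shape (n, ...) (e.g. one-hot).
    """
    rng = np.random.default_rng() if rng is None else rng
    n = x.shape[0]
    perm = rng.permutation(n)  # random partner for every sample

    # Euclidean distance between each sample and its partner,
    # normalized by the mean distance over the batch.
    flat = x.reshape(n, -1)
    dist = np.linalg.norm(flat - flat[perm], axis=1)
    mean_dist = dist.mean()
    norm_dist = dist / mean_dist if mean_dist > 0 else np.zeros_like(dist)

    # Assumed Gaussian kernel: close pairs get alpha near tau_max,
    # distant pairs get alpha near 0 (Beta mass pushed toward 0 and 1).
    alpha = tau_max * np.exp(-(norm_dist ** 2) / (2.0 * tau_std ** 2))
    alpha = np.maximum(alpha, 1e-3)  # Beta parameters must be positive

    lam = rng.beta(alpha, alpha)  # one coefficient per pair

    # Broadcast the coefficients over the trailing dimensions of x and y.
    lam_x = lam.reshape((n,) + (1,) * (x.ndim - 1))
    lam_y = lam.reshape((n,) + (1,) * (y.ndim - 1))
    x_mix = lam_x * x + (1.0 - lam_x) * x[perm]
    y_mix = lam_y * y + (1.0 - lam_y) * y[perm]
    return x_mix, y_mix, lam
```

In a training loop, this function would be called once per batch with inputs `x` and floating-point targets `y` (one-hot labels for classification, real values for regression), and the model would then be trained on `x_mix` and `y_mix`. Standard Mixup corresponds to using the same fixed `alpha` for every pair.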

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as:	arXiv:2311.01434 [cs.LG]
	(or arXiv:2311.01434v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2311.01434

Submission history

From: Quentin Bouniot
[v1] Thu, 2 Nov 2023 17:48:28 UTC (6,739 KB)
[v2] Tue, 11 Jun 2024 12:22:27 UTC (9,179 KB)
[v3] Tue, 18 Mar 2025 21:28:33 UTC (9,839 KB)
