Search | arXiv e-print repository

An Empirical Study of Self-supervised Learning with Wasserstein Distance

Authors: Makoto Yamada, Yuki Takezawa, Guillaume Houry, Kira Michaela Dusterwald, Deborah Sulem, Han Zhao, Yao-Hung Hubert Tsai

Abstract: In this study, we delve into the problem of self-supervised learning (SSL) utilizing the 1-Wasserstein distance on a tree structure (a.k.a., Tree-Wasserstein distance (TWD)), where TWD is defined as the L1 distance between two tree-embedded vectors. In SSL methods, the cosine similarity is often utilized as an objective function; however, it has not been well studied when utilizing the Wasserstein distance. Training the Wasserstein distance is numerically challenging. Thus, this study empirically investigates a strategy for optimizing the SSL with the Wasserstein distance and finds a stable training procedure. More specifically, we evaluate the combination of two types of TWD (total variation and ClusterTree) and several probability models, including the softmax function, the ArcFace probability model, and simplicial embedding. We propose a simple yet effective Jeffrey divergence-based regularization method to stabilize optimization. Through empirical experiments on STL10, CIFAR10, CIFAR100, and SVHN, we find that a simple combination of the softmax function and TWD can obtain significantly lower results than the standard SimCLR. Moreover, a simple combination of TWD and SimSiam fails to train the model. We find that the model performance depends on the combination of TWD and probability model, and that the Jeffrey divergence regularization helps in model training. Finally, we show that the appropriate combination of the TWD and probability model outperforms cosine similarity-based representation learning.

Submitted 5 February, 2024; v1 submitted 16 October, 2023; originally announced October 2023.
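
The abstract above describes TWD as the L1 distance between two tree-embedded vectors. A minimal sketch of that idea follows, assuming the common formulation in which each tree edge contributes its weight times the difference in probability mass lying below it. The function name `tree_wasserstein`, the 0/1 membership-matrix encoding, and the toy trees are illustrative assumptions, not taken from the paper.

```python
def tree_wasserstein(mu, nu, subtree_membership, edge_weights):
    """Weighted L1 distance between tree-embedded vectors (illustrative sketch).

    mu, nu: probability vectors over the leaves.
    subtree_membership[e][i]: 1 if leaf i lies below edge e, else 0 (assumed encoding).
    edge_weights[e]: non-negative length of edge e.
    """
    total = 0.0
    for row, weight in zip(subtree_membership, edge_weights):
        # Mass of each distribution that sits in the subtree below this edge.
        mass_mu = sum(b * m for b, m in zip(row, mu))
        mass_nu = sum(b * n for b, n in zip(row, nu))
        total += weight * abs(mass_mu - mass_nu)
    return total


# Star tree: every leaf hangs off the root by an edge of weight 1/2.
# Under this assumption the distance reduces to total variation.
star_membership = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
star_weights = [0.5, 0.5, 0.5]
tv = tree_wasserstein([0.5, 0.5, 0.0], [0.0, 0.5, 0.5], star_membership, star_weights)

# Two-level tree: leaves 0 and 1 share an internal node, leaf 2 hangs off the root.
# Edges (all of length 1): root-internal, internal-leaf0, internal-leaf1, root-leaf2.
nested_membership = [[1, 1, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
nested_weights = [1.0, 1.0, 1.0, 1.0]
siblings = tree_wasserstein([1, 0, 0], [0, 1, 0], nested_membership, nested_weights)
cousins = tree_wasserstein([1, 0, 0], [0, 0, 1], nested_membership, nested_weights)
```

For point masses on two leaves, the value equals the length of the tree path between them, which is why the sibling pair comes out closer than the pair separated by the root.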

arXiv:2211.09075

Keeping it sparse: Computing Persistent Homology revisited

Authors: Ulrich Bauer, Talha Bin Masood, Barbara Giunti, Guillaume Houry, Michael Kerber, Abhishek Rathod

Abstract: In this work, we study several variants of matrix reduction via Gaussian elimination that try to keep the reduced matrix sparse. The motivation comes from the growing field of topological data analysis where matrix reduction is the major subroutine to compute barcodes, the main invariant therein. We propose two novel variants of the standard algorithm, called swap and retrospective reductions. We test them on a large collection of data against other known variants to compare their efficiency, and we find that sometimes they provide a considerable speed-up. We also present novel output-sensitive bounds for the retrospective variant which better explain the discrepancy between the cubic worst-case complexity bound and the almost linear practical behavior of matrix reduction. Finally, we provide several constructions on which one of the variants performs strictly better than the others.

Submitted 13 June, 2024; v1 submitted 16 November, 2022; originally announced November 2022.

Comments: 26 pages, 6 tables
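
For readers unfamiliar with the subroutine the abstract above refers to, here is a minimal sketch of the standard left-to-right column reduction over Z/2 that is commonly used to compute barcodes. It is not the swap or retrospective variant proposed in the paper; the function name, the set-based column encoding, and the small triangle filtration are assumptions made for illustration.

```python
def reduce_boundary_matrix(columns):
    """Standard column reduction over Z/2 (illustrative sketch).

    columns[j] is the set of row indices holding a 1 in column j,
    with simplices indexed in filtration order.
    Returns (reduced_columns, pairs), where each pair is (birth_index, death_index).
    """
    reduced = [set(col) for col in columns]
    pivot_owner = {}  # lowest row index -> column whose lowest entry it is
    pairs = []
    for j, col in enumerate(reduced):
        # Add earlier columns until the lowest entry is unique or the column is zero.
        while col and max(col) in pivot_owner:
            col ^= reduced[pivot_owner[max(col)]]  # symmetric difference = addition mod 2
        if col:
            pivot_owner[max(col)] = j
            pairs.append((max(col), j))
    return reduced, pairs


# Filled triangle: vertices 0, 1, 2; edges 3 = {0,1}, 4 = {1,2}, 5 = {0,2}; face 6.
boundary = [set(), set(), set(), {0, 1}, {1, 2}, {0, 2}, {3, 4, 5}]
reduced, pairs = reduce_boundary_matrix(boundary)

# Number of non-zero entries left after reduction.
fill_in = sum(len(col) for col in reduced)
```

In this toy filtration, edge 5 closes a loop, so its column reduces to zero and it is later paired with the face that fills the loop. Vertex 0 stays unpaired and corresponds to the single connected component that never dies.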

arXiv:2111.02125

Expected Complexity of Barcode Computation via Matrix Reduction

Authors: Barbara Giunti, Guillaume Houry, Michael Kerber, Matthias Söls

Abstract: We study the algorithmic complexity of computing persistent homology of a randomly generated filtration. We prove upper bounds for the average fill-in (number of non-zero entries) of the boundary matrix on Čech, Vietoris–Rips and Erdős–Rényi filtrations after matrix reduction, which in turn provide bounds on the expected complexity of the barcode computation. Our method is based on previous results on the expected Betti numbers of the corresponding complexes, which we link to the fill-in of the boundary matrix. Our fill-in bounds for Čech and Vietoris–Rips complexes are asymptotically tight up to a logarithmic factor. In particular, both our fill-in and computation bounds are better than the worst-case estimates. We also provide an Erdős–Rényi filtration realising the worst-case fill-in and computation.

Submitted 12 February, 2025; v1 submitted 3 November, 2021; originally announced November 2021.

Comments: Extended version of the previous conference article "Average complexity of matrix reduction for clique filtrations" by Giunti, Houry, Kerber

Showing 1–3 of 3 results for author: Houry, G