-
Sparse Higher Order Čech Filtrations
Authors:
Mickaël Buchet,
Bianca B. Dornelas,
Michael Kerber
Abstract:
For a finite set of balls of radius $r$, the $k$-fold cover is the space covered by at least $k$ balls. Fixing the ball centers and varying the radius, we obtain a nested sequence of spaces that is called the $k$-fold filtration of the centers. For $k=1$, the construction is the union-of-balls filtration that is popular in topological data analysis. For larger $k$, it yields a cleaner shape reconstruction in the presence of outliers. We contribute a sparsification algorithm to approximate the topology of the $k$-fold filtration. Our method is a combination and adaptation of several techniques from the well-studied case $k=1$, resulting in a sparsification of linear size that can be computed in expected near-linear time with respect to the number of input points. Our method also extends to the multicover bifiltration, composed of the $k$-fold filtrations for several values of $k$, with the same size and complexity bounds.
Submitted 17 May, 2023; v1 submitted 12 March, 2023;
originally announced March 2023.
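
The definition of the $k$-fold cover in the abstract above can be made concrete with a small sketch. A point lies in at least $k$ closed balls of radius $r$ exactly when its distance to the $k$-th nearest center is at most $r$, so that distance is the radius at which the point enters the $k$-fold filtration. The function names and the sample data below are illustrative assumptions, not taken from the paper, and the code is a brute-force membership test, not the sparsification algorithm the authors contribute.

```python
import math

def kth_nearest_distance(point, centers, k):
    """Distance from `point` to its k-th nearest center (k is 1-based).

    This is the smallest radius r at which `point` belongs to the
    k-fold cover of `centers`.
    """
    if not 1 <= k <= len(centers):
        raise ValueError("k must be between 1 and the number of centers")
    dists = sorted(math.dist(point, c) for c in centers)
    return dists[k - 1]

def in_k_fold_cover(point, centers, r, k):
    """True if `point` lies in at least k closed balls of radius r."""
    return kth_nearest_distance(point, centers, k) <= r

# Hypothetical example: three centers in the plane.
centers = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
x = (1.0, 0.0)

print(in_k_fold_cover(x, centers, r=1.0, k=2))  # covered by two balls
print(in_k_fold_cover(x, centers, r=1.0, k=3))  # third center is too far
```

For $k=1$ the test reduces to membership in the union of balls, matching the special case named in the abstract.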
-
Declutter and Resample: Towards parameter free denoising
Authors:
Mickaël Buchet,
Tamal K. Dey,
Jiayuan Wang,
Yusu Wang
Abstract:
In many data analysis applications the following scenario is commonplace: we are given a point set that is supposed to sample a hidden ground truth $K$ in a metric space, but it has been corrupted with noise, so that some of the data points lie far away from $K$, creating outliers, also termed {\em ambient noise}. One of the main goals of denoising algorithms is to eliminate such noise so that the curated data lie within a bounded Hausdorff distance of $K$. Popular denoising approaches such as deconvolution and thresholding often require the user to set several parameters and/or to choose an appropriate noise model while guaranteeing only asymptotic convergence. Our goal is to lighten this burden as much as possible while ensuring theoretical guarantees in all cases.
Specifically, first, we propose a simple denoising algorithm that requires only a single parameter but provides a theoretical guarantee on the quality of the output on general input points. We argue that this single parameter cannot be avoided. We next present a simple algorithm that avoids even this parameter by paying for it with a slight strengthening of the sampling condition on the input points which is not unrealistic. We also provide some preliminary empirical evidence that our algorithms are effective in practice.
Submitted 26 March, 2017; v1 submitted 17 November, 2015;
originally announced November 2015.
-
Topological analysis of scalar fields with outliers
Authors:
Mickaël Buchet,
Frédéric Chazal,
Tamal K. Dey,
Fengtao Fan,
Steve Y. Oudot,
Yusu Wang
Abstract:
Given a real-valued function $f$ defined over a manifold $M$ embedded in $\mathbb{R}^d$, we are interested in recovering structural information about $f$ from the sole information of its values on a finite sample $P$. Existing methods approximate the persistence diagram of $f$ when geometric noise and functional noise are bounded. However, they fail in the presence of aberrant values, also called outliers, both in theory and in practice.
We propose a new algorithm that deals with outliers. We handle aberrant functional values with a method inspired by $k$-nearest-neighbor regression and local median filtering, while the geometric outliers are handled using the distance to a measure. Combined with topological results on nested filtrations, our algorithm performs robust topological analysis of scalar fields in a wider range of noise models than handled by current methods. We provide theoretical guarantees and experimental results on the quality of our approximation of the sampled scalar field.
Submitted 7 April, 2015; v1 submitted 4 December, 2014;
originally announced December 2014.
-
Efficient and Robust Persistent Homology for Measures
Authors:
Mickael Buchet,
Frederic Chazal,
Steve Y. Oudot,
Donald R. Sheehy
Abstract:
We extend the notion of the distance to a measure from Euclidean space to probability measures on general metric spaces, as a way to do topological data analysis that is robust to noise and outliers. We then give an efficient way to approximate the sub-level sets of this function by a union of metric balls and extend previous results on sparse Rips filtrations to this setting. This robust and efficient approach to topological data analysis is illustrated with several examples from an implementation.
Submitted 8 October, 2014; v1 submitted 31 May, 2013;
originally announced June 2013.
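
As a rough illustration of the distance to a measure mentioned in the abstract above, the sketch below evaluates it for the uniform empirical measure on a finite point set. It assumes the standard closed form for that case with mass parameter $m = k/n$: the squared distance at $x$ is the mean of the squared distances from $x$ to its $k$ nearest sample points. The function name, the Euclidean metric, and the sample data are assumptions made for the example; this is a brute-force evaluation, not the approximation scheme or the implementation described by the authors.

```python
import math

def distance_to_measure(x, points, k):
    """Empirical distance to a measure at `x` with mass parameter k / len(points).

    Assumes the uniform measure on `points` and the Euclidean metric:
    the result is the root mean of the k smallest squared distances.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    squared = sorted(math.dist(x, p) ** 2 for p in points)
    return math.sqrt(sum(squared[:k]) / k)

# Hypothetical sample: three clustered points and one far outlier.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
outlier = (10.0, 10.0)

# With k = 1 this is the usual distance to the point set, which is zero
# at the outlier itself; with k = 3 the outlier is pushed far from zero.
print(distance_to_measure(outlier, points, k=1))
print(distance_to_measure(outlier, points, k=3))
```

Taking $k > 1$ is what makes the function insensitive to isolated outliers: a single stray point no longer creates a zero of the function, so it does not appear early in the sub-level set filtration.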