-
Drawing Reeb Graphs
Authors:
Erin Chambers,
Brittany Terese Fasy,
Erfan Hosseini Sereshgi,
Maarten Löffler
Abstract:
Reeb graphs are simple topological descriptors with applications in many areas like topological data analysis and computational geometry. Despite their prevalence, visualization of Reeb graphs has received less attention. In this paper, we bridge an essential gap in the literature by exploring the complexity of drawing Reeb graphs. Specifically, we demonstrate that Reeb graph crossing number minimization is NP-hard, both for straight-lined and curved edges. On the other hand, we identify specific classes of Reeb graphs, namely paths and caterpillars, for which crossing-free drawings exist. We also give an optimal algorithm for drawing cycle-shaped Reeb graphs with the least number of crossings and provide initial observations on the complexities of drawing multi-cycle Reeb graphs. We hope that this work establishes the foundation for an understanding of the graph drawing challenges inherent in Reeb graph visualization and paves the way for future work in this area.
Submitted 18 May, 2025; v1 submitted 30 April, 2025;
originally announced April 2025.
-
Rapid and Precise Topological Comparison with Merge Tree Neural Networks
Authors:
Yu Qin,
Brittany Terese Fasy,
Carola Wenk,
Brian Summa
Abstract:
Merge trees are a valuable tool in the scientific visualization of scalar fields; however, current methods for merge tree comparisons are computationally expensive, primarily due to the exhaustive matching between tree nodes. To address this challenge, we introduce the Merge Tree Neural Network (MTNN), a learned neural network model designed for merge tree comparison. The MTNN enables rapid and high-quality similarity computation. We first demonstrate how to train graph neural networks, which emerged as effective encoders for graphs, in order to produce embeddings of merge trees in vector spaces for efficient similarity comparison. Next, we formulate the novel MTNN model that further improves the similarity comparisons by integrating the tree and node embeddings with a new topological attention mechanism. We demonstrate the effectiveness of our model on real-world data in different domains and examine our model's generalizability across various datasets. Our experimental analysis demonstrates our approach's superiority in accuracy and efficiency. In particular, we speed up the prior state-of-the-art by more than $100\times$ on the benchmark datasets while maintaining an error rate below $0.1\%$.
Submitted 4 October, 2024; v1 submitted 8 April, 2024;
originally announced April 2024.
-
How Small Can Faithful Sets Be? Ordering Topological Descriptors
Authors:
Brittany Terese Fasy,
David L. Millman,
Anna Schenfisch
Abstract:
Recent developments in shape reconstruction and comparison call for the use of many different (topological) descriptor types, such as persistence diagrams and Euler characteristic functions. We establish a framework to quantitatively compare the strength of different descriptor types, setting up a theory that allows for future comparisons and analysis of descriptor types and that can inform choices made in applications. We use this framework to partially order a set of six common descriptor types. We then give lower bounds on the size of sets of descriptors that uniquely correspond to simplicial complexes, giving insight into the advantages of using verbose rather than concise topological descriptors.
Submitted 8 July, 2024; v1 submitted 21 February, 2024;
originally announced February 2024.
-
The Manifold Density Function: An Intrinsic Method for the Validation of Manifold Learning
Authors:
Benjamin Holmgren,
Eli Quist,
Jordan Schupbach,
Brittany Terese Fasy,
Bastian Rieck
Abstract:
We introduce the manifold density function, which is an intrinsic method to validate manifold learning techniques. Our approach adapts and extends Ripley's $K$-function, and categorizes in an unsupervised setting the extent to which an output of a manifold learning algorithm captures the structure of a latent manifold. Our manifold density function generalizes to broad classes of Riemannian manifolds. In particular, we extend the manifold density function to general two-manifolds using the Gauss-Bonnet theorem, and demonstrate that the manifold density function for hypersurfaces is well approximated using the first Laplacian eigenvalue. We prove desirable convergence and robustness properties.
Submitted 14 February, 2024;
originally announced February 2024.
-
Visualizing Topological Importance: A Class-Driven Approach
Authors:
Yu Qin,
Brittany Terese Fasy,
Carola Wenk,
Brian Summa
Abstract:
This paper presents the first approach to visualize the importance of topological features that define classes of data. Topological features, with their ability to abstract the fundamental structure of complex data, are an integral component of visualization and analysis pipelines. However, not all topological features present in data are of equal importance. To date, the default definition of feature importance is often assumed and fixed. This work shows how proven explainable deep learning approaches can be adapted for use in topological classification. In doing so, it provides the first technique that illuminates what topological structures are important in each dataset with regard to their class label. In particular, the approach uses a learned metric classifier with a density estimator of the points of a persistence diagram as input. This metric learns how to reweigh this density such that classification accuracy is high. By extracting this weight, an importance field on persistent point density can be created. This provides an intuitive representation of persistence point importance that can be used to drive new visualizations. This work provides two examples: visualization on each diagram directly and, in the case of sublevel set filtrations on images, directly on the images themselves. This work highlights real-world examples of this approach visualizing the important topological features in graph, 3D shape, and medical image data.
Submitted 22 September, 2023;
originally announced September 2023.
-
From Curves to Words and Back Again: Geometric Computation of Minimum-Area Homotopy
Authors:
Hsien-Chih Chang,
Brittany Terese Fasy,
Bradley McCoy,
David L. Millman,
Carola Wenk
Abstract:
Let $γ$ be a generic closed curve in the plane. Samuel Blank, in his 1967 Ph.D. thesis, determined if $γ$ is self-overlapping by geometrically constructing a combinatorial word from $γ$. More recently, Zipei Nie, in an unpublished manuscript, computed the minimum homotopy area of $γ$ by constructing a combinatorial word algebraically. We provide a unified framework for working with both words and determine the settings under which Blank's word and Nie's word are equivalent. Using this equivalence, we give a new geometric proof for the correctness of Nie's algorithm. Unlike previous work, our proof is constructive which allows us to naturally compute the actual homotopy that realizes the minimum area. Furthermore, we contribute to the theory of self-overlapping curves by providing the first polynomial-time algorithm to compute a self-overlapping decomposition of any closed curve $γ$ with minimum area.
Submitted 5 September, 2023;
originally announced September 2023.
-
Metric and Path-Connectedness Properties of the Frechet Distance for Paths and Graphs
Authors:
Erin Chambers,
Brittany Fasy,
Benjamin Holmgren,
Sushovan Majhi,
Carola Wenk
Abstract:
The Frechet distance is often used to measure distances between paths, with applications in areas ranging from map matching to GPS trajectory analysis to handwriting recognition. More recently, the Frechet distance has been generalized to a distance between two copies of the same graph embedded or immersed in a metric space; this more general setting opens up a wide range of more complex applications in graph analysis. In this paper, we initiate a study of some of the fundamental topological properties of spaces of paths and of graphs mapped to R^n under the Frechet distance, in an effort to lay the theoretical groundwork for understanding how these distances can be used in practice. In particular, we prove whether or not these spaces, and the metric balls therein, are path-connected.
Submitted 1 August, 2023;
originally announced August 2023.
-
The Weighted Euler Characteristic Transform for Image Shape Classification
Authors:
Jessi Cisewski-Kehe,
Brittany Terese Fasy,
Dhanush Giriyan,
Eli Quist
Abstract:
The weighted Euler characteristic transform (WECT) is a new tool for extracting shape information from data equipped with a weight function. Image data may benefit from the WECT where the intensity of the pixels is used to define the weight function. In this work, an empirical assessment of the WECT's ability to distinguish shapes on images with different pixel intensity distributions is considered, along with visualization techniques to improve the intuition and understanding of what is captured by the WECT. Additionally, the expected weighted Euler characteristic and the expected WECT are derived.
Submitted 25 July, 2023;
originally announced July 2023.
-
Efficient Graph Reconstruction and Representation Using Augmented Persistence Diagrams
Authors:
Brittany Terese Fasy,
Samuel Micka,
David L. Millman,
Anna Schenfisch,
Lucia Williams
Abstract:
Persistent homology is a tool that can be employed to summarize the shape of data by quantifying homological features. When the data is an object in $\mathbb{R}^d$, the (augmented) persistent homology transform ((A)PHT) is a family of persistence diagrams, parameterized by directions in the ambient space. A recent advance in understanding the PHT used the framework of reconstruction in order to find a finite set of directions to faithfully represent the shape, a result that is of both theoretical and practical interest. In this paper, we improve upon this result and present an improved algorithm for graph -- and, more generally, one-skeleton -- reconstruction. The improvement comes in reconstructing the edges, where we use a radial binary (multi-)search. The binary search employed takes advantage of the fact that the edges can be ordered radially with respect to a reference plane, a feature unique to graphs.
Submitted 26 December, 2022;
originally announced December 2022.
-
Combinatorial Persistent Homology Transform
Authors:
Brittany Terese Fasy,
Amit Patel
Abstract:
The combinatorial interpretation of the persistence diagram as a Möbius inversion was recently shown to be functorial. We employ this discovery to recast the Persistent Homology Transform of a geometric complex as a representation of a cellulation on $\mathbb{S}^n$ to the category of combinatorial persistence diagrams. Detailed examples are provided. We hope this recasting of the PH transform will allow for the adoption of existing methods from algebraic and topological combinatorics to the study of shapes.
Submitted 15 May, 2024; v1 submitted 10 August, 2022;
originally announced August 2022.
-
Extremal Event Graphs: A (Stable) Tool for Analyzing Noisy Time Series Data
Authors:
Robin Belton,
Bree Cummins,
Brittany Terese Fasy,
Tomáš Gedeon
Abstract:
Local maxima and minima, or extremal events, in experimental time series can be used as a coarse summary to characterize data. However, the discrete sampling in recording experimental measurements suggests uncertainty on the true timing of extrema during the experiment. This in turn gives uncertainty in the timing order of extrema within the time series. Motivated by applications in genomic time series and biological network analysis, we construct a weighted directed acyclic graph (DAG) called an extremal event DAG using techniques from persistent homology that is robust to measurement noise. Furthermore, we define a distance between extremal event DAGs based on the edit distance between strings. We prove several properties including local stability for the extremal event DAG distance with respect to pairwise $L_{\infty}$ distances between functions in the time series data. Lastly, we provide algorithms, publicly free software, and implementations on extremal event DAG construction and comparison.
Submitted 23 August, 2022; v1 submitted 17 March, 2022;
originally announced March 2022.
-
Combinatorial Conditions for Directed Collapsing
Authors:
Robin Belton,
Robyn Brooks,
Stefania Ebli,
Lisbeth Fajstrup,
Brittany Terese Fasy,
Nicole Sanderson,
Elizabeth Vidaurre
Abstract:
The purpose of this article is to study directed collapsibility of directed Euclidean cubical complexes. One application of this is in the nontrivial task of verifying the execution of concurrent programs. The classical definition of collapsibility involves certain conditions on a pair of cubes of the complex. The direction of the space can be taken into account by requiring that the past links of vertices remain homotopy equivalent after collapsing. We call this type of collapse a link-preserving directed collapse. In this paper, we give combinatorially equivalent conditions for preserving the topology of the links, allowing for the implementation of an algorithm for collapsing a directed Euclidean cubical complex. Furthermore, we give conditions for when link-preserving directed collapses preserve the contractibility and connectedness of directed path spaces, as well as examples when link-preserving directed collapses do not preserve the number of connected components of the path space between the minimum and a given vertex.
Submitted 25 May, 2022; v1 submitted 2 June, 2021;
originally announced June 2021.
-
A Domain-Oblivious Approach for Learning Concise Representations of Filtered Topological Spaces for Clustering
Authors:
Yu Qin,
Brittany Terese Fasy,
Carola Wenk,
Brian Summa
Abstract:
Persistence diagrams have been widely used to quantify the underlying features of filtered topological spaces in data visualization. In many applications, computing distances between diagrams is essential; however, computing these distances has been challenging due to the computational cost. In this paper, we propose a persistence diagram hashing framework that learns a binary code representation of persistence diagrams, which allows for fast computation of distances. This framework is built upon a generative adversarial network (GAN) with a diagram distance loss function to steer the learning process. Instead of using standard representations, we hash diagrams into binary codes, which have natural advantages in large-scale tasks. The training of this model is domain-oblivious in that it can be computed purely from synthetic, randomly created diagrams. As a consequence, our proposed method is directly applicable to various datasets without the need for retraining the model. These binary codes, when compared using fast Hamming distance, better maintain topological similarity properties between datasets than other vectorized representations. To evaluate this method, we apply our framework to the problem of diagram clustering and we compare the quality and performance of our approach to the state-of-the-art. In addition, we show the scalability of our approach on a dataset with 10k persistence diagrams, which is not possible with current techniques. Moreover, our experimental results demonstrate that our method is significantly faster with the potential of less memory usage, while retaining comparable or better quality comparisons.
Submitted 10 August, 2021; v1 submitted 25 May, 2021;
originally announced May 2021.
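The abstract above compares binary codes using the Hamming distance. As a minimal sketch of that comparison step only (not the authors' GAN-based hashing model), and assuming each code is stored as a non-negative Python integer whose bits are the hash bits, the distance is the number of bit positions in which two codes differ. The function name and the example codes are hypothetical.

```python
def hamming_distance(code_a: int, code_b: int) -> int:
    """Count the bit positions in which two binary codes differ."""
    if code_a < 0 or code_b < 0:
        raise ValueError("codes must be non-negative integers")
    # XOR leaves a 1 exactly where the two codes disagree.
    return bin(code_a ^ code_b).count("1")


# Hypothetical 8-bit codes standing in for two hashed persistence diagrams.
code_x = 0b10110010
code_y = 0b10011010
print(hamming_distance(code_x, code_y))
```

Because the comparison is a single XOR followed by a bit count, its cost depends only on the code length, not on the number of points in the original diagrams.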
-
If You Must Choose Among Your Children, Pick the Right One
Authors:
Benjamin Holmgren,
Bradley McCoy,
Brittany Fasy,
David Millman
Abstract:
Given a simplicial complex $K$ and an injective function $f$ from the vertices of $K$ to $\mathbb{R}$, we consider algorithms that extend $f$ to a discrete Morse function on $K$. We show that an algorithm of King, Knudson and Mramor can be described on the directed Hasse diagram of $K$. Our description has a faster runtime for high dimensional data with no increase in space.
Submitted 22 March, 2021;
originally announced March 2021.
-
A Faithful Discretization of the Verbose Persistent Homology Transform
Authors:
Brittany Terese Fasy,
Samuel Micka,
David L. Millman,
Anna Schenfisch,
Lucia Williams
Abstract:
The persistent homology transform (PHT) represents a shape with a multiset of persistence diagrams parameterized by the sphere of directions in the ambient space. In this work, we describe a finite set of diagrams that discretize the PHT such that it faithfully represents the underlying shape. We provide a discretization that is exponential in the dimension of the shape. Moreover, we show that this discretization is stable with respect to various perturbations and we provide an algorithm for computing the discretization. Our approach relies only on knowing the heights and dimensions of topological events, which means that it can be adapted to provide discretizations of other dimension-returning topological transforms, including the Betti function transform. With mild alterations, we also adapt our methods to faithfully discretize the Euler characteristic function transform.
Submitted 13 February, 2024; v1 submitted 29 December, 2019;
originally announced December 2019.
-
Reconstructing Embedded Graphs from Persistence Diagrams
Authors:
Robin Lynne Belton,
Brittany Terese Fasy,
Rostik Mertz,
Samuel Micka,
David L. Millman,
Daniel Salinas,
Anna Schenfisch,
Jordan Schupbach,
Lucia Williams
Abstract:
The persistence diagram (PD) is an increasingly popular topological descriptor. By encoding the size and prominence of topological features at varying scales, the PD provides important geometric and topological information about a space. Recent work has shown that well-chosen (finite) sets of PDs can differentiate between geometric simplicial complexes, providing a method for representing complex shapes using a finite set of descriptors. A related inverse problem is the following: given a set of PDs (or an oracle we can query for persistence diagrams), what is the underlying geometric simplicial complex? In this paper, we present an algorithm for reconstructing embedded graphs in $\mathbb{R}^d$ (plane graphs in $\mathbb{R}^2$) with $n$ vertices from $n^2 - n + d + 1$ directional (augmented) PDs. Additionally, we empirically validate the correctness and time-complexity of our algorithm in $\mathbb{R}^2$ on randomly generated plane graphs using our implementation, and explain the numerical limitations of implementing our algorithm.
Submitted 18 June, 2020; v1 submitted 18 December, 2019;
originally announced December 2019.
-
Topological and Geometric Reconstruction of Metric Graphs in $\mathbb{R}^n$
Authors:
Brittany Terese Fasy,
Rafal Komendarczyk,
Sushovan Majhi,
Carola Wenk
Abstract:
We propose an algorithm to estimate the topology of an embedded metric graph from a well-sampled finite subset of the underlying graph.
Submitted 6 December, 2019;
originally announced December 2019.
-
Threshold-Based Graph Reconstruction Using Discrete Morse Theory
Authors:
Brittany Terese Fasy,
Sushovan Majhi,
Carola Wenk
Abstract:
Discrete Morse theory has recently been applied in metric graph reconstruction from a given density function concentrated around an (unknown) underlying embedded graph. We propose a new noise model for the density function to reconstruct a connected graph both topologically and geometrically.
Submitted 28 November, 2019;
originally announced November 2019.
-
Approximate Nearest Neighbors in the Space of Persistence Diagrams
Authors:
Brittany Terese Fasy,
Xiaozhou He,
Zhihui Liu,
Samuel Micka,
David L. Millman,
Binhai Zhu
Abstract:
Persistence diagrams are important tools in the field of topological data analysis that describe the presence and magnitude of features in a filtered topological space. However, current approaches for comparing a persistence diagram to a set of other persistence diagrams are linear in the number of diagrams or do not offer performance guarantees. In this paper, we apply concepts from locality-sensitive hashing to support approximate nearest neighbor search in the space of persistence diagrams. Given a set $Γ$ of $n$ $(M,m)$-bounded persistence diagrams, each with at most $m$ points, we snap-round the points of each diagram to points on a cubical lattice and produce a key for each possible snap-rounding. Specifically, we fix a grid over each diagram at several resolutions and consider the snap-roundings of each diagram to the four nearest lattice points. Then, we propose a data structure with $τ$ levels $\mathbb{D}_τ$ that stores all snap-roundings of each persistence diagram in $Γ$ at each resolution. This data structure has size $O(n5^mτ)$ to account for varying lattice resolutions as well as snap-roundings and the deletion of points with low persistence. To search for a persistence diagram, we compute a key for a query diagram by snapping each point to a lattice and deleting points of low persistence. Furthermore, as the lattice parameter decreases, searching our data structure yields a six-approximation of the nearest diagram in $Γ$ in $O((m\log{n}+m^2)\logτ)$ time and a constant factor approximation of the $k$th nearest diagram in $O((m\log{n}+m^2+k)\logτ)$ time.
Submitted 22 March, 2021; v1 submitted 28 December, 2018;
originally announced December 2018.
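The query step described in the abstract above (snap each point to a lattice, delete points of low persistence, and use the result as a key) can be illustrated with a simplified sketch. It makes several assumptions that are not from the paper: it uses a single lattice resolution, snaps each point only to its one nearest lattice point rather than considering the four nearest, and the names `snap`, `diagram_key`, `spacing`, and `min_persistence` are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (birth, death)


def snap(value: float, spacing: float) -> int:
    """Index of the lattice coordinate nearest to value (ties round up)."""
    return math.floor(value / spacing + 0.5)


def diagram_key(
    diagram: List[Point], spacing: float, min_persistence: float
) -> Tuple[Tuple[int, int], ...]:
    """Build a hashable key: drop low-persistence points, snap the rest."""
    # Assumption: "low persistence" means death - birth below a fixed threshold.
    kept = [(b, d) for (b, d) in diagram if d - b >= min_persistence]
    # Sorting makes the key independent of the order of the input points.
    return tuple(sorted((snap(b, spacing), snap(d, spacing)) for b, d in kept))


# Two hypothetical diagrams that differ only slightly and by one noisy point.
diagram_a = [(0.1, 0.9), (0.5, 0.55)]
diagram_b = [(0.12, 0.88)]
key_a = diagram_key(diagram_a, spacing=0.5, min_persistence=0.1)
key_b = diagram_key(diagram_b, spacing=0.5, min_persistence=0.1)
print(key_a == key_b)
```

Diagrams that map to the same key land in the same bucket of a dictionary, which is what makes a hash-based lookup possible in this simplified setting.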
-
Challenges in Reconstructing Shapes from Euler Characteristic Curves
Authors:
Brittany Terese Fasy,
Samuel Micka,
David L. Millman,
Anna Schenfisch,
Lucia Williams
Abstract:
Shape recognition and classification is a problem with a wide variety of applications. Several recent works have demonstrated that topological descriptors can be used as summaries of shapes and utilized to compute distances. In this abstract, we explore the use of a finite number of Euler Characteristic Curves (ECC) to reconstruct plane graphs. We highlight difficulties that occur when attempting to adopt approaches for reconstruction with persistence diagrams to reconstruction with ECCs. Furthermore, we highlight specific arrangements of vertices that create problems for reconstruction and present several observations about how they affect the ECC-based reconstruction. Finally, we show that plane graphs without degree two vertices can be reconstructed using a finite number of ECCs.
Submitted 27 November, 2018;
originally announced November 2018.
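For readers unfamiliar with the descriptor in the abstract above, the following sketch computes an Euler characteristic curve of a straight-line plane graph in one direction. It assumes the usual height-filtration convention, which the abstract does not spell out: a vertex appears at its height in the chosen direction, an edge appears at the height of its higher endpoint, and the curve records (number of vertices) minus (number of edges) at each vertex height. The function name and the example graph are hypothetical.

```python
from typing import List, Sequence, Tuple


def euler_characteristic_curve(
    vertices: Sequence[Tuple[float, float]],
    edges: Sequence[Tuple[int, int]],
    direction: Tuple[float, float],
) -> List[Tuple[float, int]]:
    """Return (height, Euler characteristic) pairs at each distinct vertex height."""
    # Height of a vertex is its dot product with the direction vector.
    heights = [sum(c * s for c, s in zip(v, direction)) for v in vertices]
    # An edge enters the filtration at the height of its higher endpoint.
    edge_heights = [max(heights[i], heights[j]) for i, j in edges]
    curve = []
    for t in sorted(set(heights)):
        num_vertices = sum(1 for h in heights if h <= t)
        num_edges = sum(1 for h in edge_heights if h <= t)
        curve.append((t, num_vertices - num_edges))
    return curve


# A triangle, filtered in the upward direction (0, 1).
triangle_vertices = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
triangle_edges = [(0, 1), (1, 2), (0, 2)]
print(euler_characteristic_curve(triangle_vertices, triangle_edges, (0.0, 1.0)))
```

In this example the bottom edge gives one connected component at height 0, and the cycle closing at height 1 lowers the Euler characteristic to 0.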
-
Learning Simplicial Complexes from Persistence Diagrams
Authors:
Robin Lynne Belton,
Brittany Terese Fasy,
Rostik Mertz,
Samuel Micka,
David L. Millman,
Daniel Salinas,
Anna Schenfisch,
Jordan Schupbach,
Lucia Williams
Abstract:
Topological Data Analysis (TDA) studies the shape of data. A common topological descriptor is the persistence diagram, which encodes topological features in a topological space at different scales. Turner, Mukherjee, and Boyer showed that one can reconstruct a simplicial complex embedded in R^3 using persistence diagrams generated from all possible height filtrations (an uncountably infinite number of directions). In this paper, we present an algorithm for reconstructing plane graphs K=(V,E) in R^2, i.e., a planar graph with vertices in general position and a straight-line embedding, from a quadratic number of height filtrations and their respective persistence diagrams.
Submitted 31 July, 2018; v1 submitted 27 May, 2018;
originally announced May 2018.
-
Approximating Nearest Neighbor Distances
Authors:
Michael B. Cohen,
Brittany Terese Fasy,
Gary L. Miller,
Amir Nayyeri,
Donald R. Sheehy,
Ameya Velingker
Abstract:
Several researchers proposed using non-Euclidean metrics on point sets in Euclidean space for clustering noisy data. Almost always, a distance function is desired that recognizes the closeness of the points in the same cluster, even if the Euclidean cluster diameter is large. Therefore, it is preferred to assign smaller costs to the paths that stay close to the input points.
In this paper, we consider the most natural metric with this property, which we call the nearest neighbor metric. Given a point set P and a path $γ$, our metric charges each point of $γ$ with its distance to P. The total charge along $γ$ determines its nearest neighbor length, which is formally defined as the integral of the distance to the input points along the curve. We describe a $(3+\varepsilon)$-approximation algorithm and a $(1+\varepsilon)$-approximation algorithm to compute the nearest neighbor metric. Both approximation algorithms work in near-linear time. The former uses shortest paths on a sparse graph built only on the input points. The latter uses a sparse sample of the ambient space to find good approximate geodesic paths.
Submitted 27 February, 2015;
originally announced February 2015.
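As a rough illustration of the nearest neighbor length described in the abstract above (the integral of the distance to P along γ), the following Python sketch approximates that integral for a polygonal path with a midpoint rule. It is not the paper's $(3+\varepsilon)$- or $(1+\varepsilon)$-approximation algorithm; the names `dist_to_set`, `nn_length`, and `samples_per_segment` are assumptions made for this example only.

```python
import math


def dist_to_set(x, points):
    """Euclidean distance from x to the nearest point of the input set P."""
    return min(math.dist(x, p) for p in points)


def nn_length(path, points, samples_per_segment=1000):
    """Approximate the integral of dist(., P) along a polygonal path.

    `path` is a list of vertices; each straight segment is sampled with
    the midpoint rule (a hypothetical choice, not taken from the paper).
    """
    total = 0.0
    for a, b in zip(path, path[1:]):
        step = math.dist(a, b) / samples_per_segment
        for i in range(samples_per_segment):
            t = (i + 0.5) / samples_per_segment
            # Point at parameter t on the segment from a to b.
            x = tuple(ai + t * (bi - ai) for ai, bi in zip(a, b))
            total += dist_to_set(x, points) * step
    return total


if __name__ == "__main__":
    P = [(0.0, 0.0), (1.0, 0.0)]
    gamma = [(0.0, 0.0), (1.0, 0.0)]
    # The exact value is the integral of min(t, 1 - t) over [0, 1], i.e. 0.25.
    print(nn_length(gamma, P))
```

A path that hugs the input points accumulates little charge, so it is short in this metric even when its Euclidean length is large.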
-
Robust Topological Inference: Distance To a Measure and Kernel Distance
Authors:
Frédéric Chazal,
Brittany T. Fasy,
Fabrizio Lecci,
Bertrand Michel,
Alessandro Rinaldo,
Larry Wasserman
Abstract:
Let P be a distribution with support S. The salient features of S can be quantified with persistent homology, which summarizes topological features of the sublevel sets of the distance function (the distance of any point x to S). Given a sample from P we can infer the persistent homology using an empirical version of the distance function. However, the empirical distance function is highly non-robust to noise and outliers. Even one outlier is deadly. The distance-to-a-measure (DTM), introduced by Chazal et al. (2011), and the kernel distance, introduced by Phillips et al. (2014), are smooth functions that provide useful topological information but are robust to noise and outliers. Chazal et al. (2014) derived concentration bounds for DTM. Building on these results, we derive limiting distributions and confidence sets, and we propose a method for choosing tuning parameters.
Submitted 22 December, 2014;
originally announced December 2014.
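To make the robustness claim in the abstract above concrete, here is a minimal Python sketch comparing the empirical distance function with an empirical distance-to-a-measure. It assumes the usual empirical form of the DTM, in which the squared DTM at x is the average squared distance from x to its k = ⌈m0·n⌉ nearest sample points; the function names and the parameter name `m0` are choices made for this example, and the kernel distance is not covered.

```python
import math


def distance_function(x, sample):
    """Empirical distance function: distance from x to the nearest sample point."""
    return min(math.dist(x, p) for p in sample)


def dtm(x, sample, m0):
    """Empirical distance-to-a-measure with mass parameter 0 < m0 <= 1.

    Assumed form: square root of the mean squared distance from x to its
    k = ceil(m0 * n) nearest sample points.
    """
    if not 0.0 < m0 <= 1.0:
        raise ValueError("m0 must lie in (0, 1]")
    k = math.ceil(m0 * len(sample))
    squared = sorted(math.dist(x, p) ** 2 for p in sample)
    return math.sqrt(sum(squared[:k]) / k)


if __name__ == "__main__":
    # Four points on a unit square plus a single far-away outlier.
    sample = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10)]
    outlier = (10, 10)
    print(distance_function(outlier, sample))  # zero: the outlier is a sample point
    print(dtm(outlier, sample, m0=0.5))        # large: the outlier is far from most of the mass
```

At the outlier the distance function vanishes, which would create a spurious topological feature, whereas the DTM stays large because it averages over several neighbors.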
-
Introduction to the R package TDA
Authors:
Brittany Terese Fasy,
Jisu Kim,
Fabrizio Lecci,
Clément Maria
Abstract:
We present a short tutorial and introduction to using the R package TDA, which provides some tools for Topological Data Analysis. In particular, it includes implementations of functions that, given some data, provide topological information about the underlying space, such as the distance function, the distance to a measure, the kNN density estimator, the kernel density estimator, and the kernel distance. The salient topological features of the sublevel sets (or superlevel sets) of these functions can be quantified with persistent homology. We provide an R interface for the efficient algorithms of the C++ libraries GUDHI, Dionysus and PHAT, including a function for the persistent homology of the Rips filtration, and one for the persistent homology of sublevel sets (or superlevel sets) of arbitrary functions evaluated over a grid of points. The significance of the features in the resulting persistence diagrams can be analyzed with functions that implement recently developed statistical methods. The R package TDA also includes the implementation of an algorithm for density clustering, which allows us to identify the spatial organization of the probability mass associated to a density function and visualize it by means of a dendrogram, the cluster tree.
Submitted 29 January, 2015; v1 submitted 7 November, 2014;
originally announced November 2014.
-
Subsampling Methods for Persistent Homology
Authors:
Frédéric Chazal,
Brittany Terese Fasy,
Fabrizio Lecci,
Bertrand Michel,
Alessandro Rinaldo,
Larry Wasserman
Abstract:
Persistent homology is a multiscale method for analyzing the shape of sets and functions from point cloud data arising from an unknown distribution supported on those sets. When the size of the sample is large, direct computation of the persistent homology is prohibitive due to the combinatorial nature of the existing algorithms. We propose to compute the persistent homology of several subsamples of the data and then combine the resulting estimates. We study the risk of two estimators and we prove that the subsampling approach carries stable topological information while achieving a great reduction in computational complexity.
Submitted 7 June, 2014;
originally announced June 2014.
-
Stochastic Convergence of Persistence Landscapes and Silhouettes
Authors:
Frédéric Chazal,
Brittany Terese Fasy,
Fabrizio Lecci,
Alessandro Rinaldo,
Larry Wasserman
Abstract:
Persistent homology is a widely used tool in Topological Data Analysis that encodes multiscale topological information as a multi-set of points in the plane called a persistence diagram. It is difficult to apply statistical theory directly to a random sample of diagrams. Instead, we can summarize the persistent homology with the persistence landscape, introduced by Bubenik, which converts a diagram into a well-behaved real-valued function. We investigate the statistical properties of landscapes, such as weak convergence of the average landscapes and convergence of the bootstrap. In addition, we introduce an alternate functional summary of persistent homology, which we call the silhouette, and derive an analogous statistical theory.
Submitted 1 December, 2013;
originally announced December 2013.
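The abstract above does not spell out how a diagram becomes a function, so the following Python sketch uses Bubenik's standard definition as an assumption: each diagram point (b, d) contributes a tent function max(0, min(t - b, d - t)), and the k-th landscape at t is the k-th largest tent value. The function names `landscape` and `average_landscape` are hypothetical, and the silhouette is not implemented here.

```python
def landscape(diagram, k, t):
    """Value of the k-th persistence landscape (k = 1, 2, ...) at t.

    `diagram` is a list of (birth, death) pairs with birth <= death.
    """
    tents = sorted(
        (max(0.0, min(t - b, d - t)) for b, d in diagram),
        reverse=True,
    )
    # Beyond the number of diagram points the landscape is identically zero.
    return tents[k - 1] if k <= len(tents) else 0.0


def average_landscape(diagrams, k, t):
    """Pointwise mean of the k-th landscapes of several diagrams."""
    return sum(landscape(dgm, k, t) for dgm in diagrams) / len(diagrams)


if __name__ == "__main__":
    dgm = [(0.0, 4.0), (1.0, 3.0)]
    print(landscape(dgm, 1, 2.0))
    print(landscape(dgm, 2, 2.0))
    print(average_landscape([[(0.0, 4.0)], [(0.0, 2.0)]], 1, 1.0))
```

Because landscapes are ordinary real-valued functions, averaging them pointwise is well defined, which is what makes the average landscape a natural object for statistical analysis.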
-
On the Bootstrap for Persistence Diagrams and Landscapes
Authors:
Frédéric Chazal,
Brittany Terese Fasy,
Fabrizio Lecci,
Alessandro Rinaldo,
Aarti Singh,
Larry Wasserman
Abstract:
Persistent homology probes topological properties from point clouds and functions. By looking at multiple scales simultaneously, one can record the births and deaths of topological features as the scale varies. In this paper we use a statistical technique, the empirical bootstrap, to separate topological signal from topological noise. In particular, we derive confidence sets for persistence diagrams and confidence bands for persistence landscapes.
Submitted 22 January, 2014; v1 submitted 2 November, 2013;
originally announced November 2013.
-
Path-Based Distance for Street Map Comparison
Authors:
Mahmuda Ahmed,
Brittany Terese Fasy,
Kyle S. Hickmann,
Carola Wenk
Abstract:
Comparing two geometric graphs embedded in space is important in the field of transportation network analysis. Given street maps of the same city collected from different sources, researchers often need to know how and where they differ. However, the majority of current graph comparison algorithms are based on structural properties of graphs, such as their degree distribution or their local connectivity properties, and do not consider their spatial embedding. This ignores a key property of road networks since similarity of travel over two road networks is intimately tied to the specific spatial embedding. Likewise, many current street map comparison algorithms focus on the spatial embeddings only and do not take structural properties into account, which makes these algorithms insensitive to local connectivity properties and shortest path similarities. We propose a new path-based distance measure to compare two planar geometric graphs embedded in the plane. Our distance measure takes structural as well as spatial properties into account by imposing a distance measure between two road networks based on the Hausdorff distance between the two sets of travel paths they represent. We show that this distance can be approximated in polynomial time and that it preserves structural and spatial properties of the graphs.
Submitted 13 February, 2015; v1 submitted 24 September, 2013;
originally announced September 2013.
-
Confidence sets for persistence diagrams
Authors:
Brittany Terese Fasy,
Fabrizio Lecci,
Alessandro Rinaldo,
Larry Wasserman,
Sivaraman Balakrishnan,
Aarti Singh
Abstract:
Persistent homology is a method for probing topological properties of point clouds and functions. The method involves tracking the birth and death of topological features as one varies a tuning parameter. Features with short lifetimes are informally considered to be "topological noise," and those with a long lifetime are considered to be "topological signal." In this paper, we bring some statistical ideas to persistent homology. In particular, we derive confidence sets that allow us to separate topological signal from topological noise.
Submitted 20 November, 2014; v1 submitted 28 March, 2013;
originally announced March 2013.
-
Persistence Diagrams and the Heat Equation Homotopy
Authors:
Brittany Terese Fasy
Abstract:
Persistent homology is a tool used to measure topological features that are present in data sets and functions. Persistence pairs births and deaths of these features as we iterate through the sublevel sets of the data or function of interest. I am concerned with using persistence to characterize the difference between two functions f, g : M -> R, where M is a topological space. Furthermore, I formulate a homotopy from g to f by applying the heat equation to the difference function g-f. By stacking the persistence diagrams associated with this homotopy, we create a vineyard of curves that connect the points in the diagram for f with the points in the diagram for g. I look at the diagrams where M is a square, a sphere, a torus, and a Klein bottle. Looking at these four topologies, we notice trends (and differences) as the persistence diagrams change with respect to time.
Submitted 9 February, 2010;
originally announced February 2010.