-
A Distance for Geometric Graphs via the Labeled Merge Tree Interleaving Distance
Authors:
Erin Wolf Chambers,
Elizabeth Munch,
Sarah Percival,
Xinyi Wang
Abstract:
Geometric graphs appear in many real-world data sets, such as road networks, sensor networks, and molecules. We investigate the notion of distance between embedded graphs and present a metric to measure the distance between two geometric graphs via merge trees. In order to preserve as much useful information as possible from the original data, we introduce a way of rotating the sublevel set to obtain the merge trees via the idea of the directional transform. We represent the merge trees using a surjective multi-labeling scheme and then compute the distance between two representative matrices. We show some theoretically desirable qualities and present two methods of computation: approximation via sampling and exact distance using a kinetic data structure, both in polynomial time. We illustrate its utility by implementing it on two data sets.
Submitted 12 July, 2024;
originally announced July 2024.
-
Bounding the Interleaving Distance for Mapper Graphs with a Loss Function
Authors:
Erin W. Chambers,
Elizabeth Munch,
Sarah Percival,
Bei Wang
Abstract:
Data consisting of a graph with a function mapping into $\mathbb{R}^d$ arise in many data applications, encompassing structures such as Reeb graphs, geometric graphs, and knot embeddings. As such, the ability to compare and cluster such objects is required in a data analysis pipeline, leading to a need for distances between them. In this work, we study the interleaving distance on discretization of these objects, called mapper graphs when $d=1$, where functor representations of the data can be compared by finding pairs of natural transformations between them. However, in many cases, computation of the interleaving distance is NP-hard. For this reason, we take inspiration from recent work by Robinson to find quality measures for families of maps that do not rise to the level of a natural transformation, called assignments. We then endow the functor images with the extra structure of a metric space and define a loss function which measures how far an assignment is from making the required diagrams of an interleaving commute. Finally we show that the computation of the loss function is polynomial with a given assignment. We believe this idea is both powerful and translatable, with the potential to provide approximations and bounds on interleavings in a broad array of contexts.
Submitted 19 May, 2025; v1 submitted 27 July, 2023;
originally announced July 2023.
-
Comparing representations of high-dimensional data with persistent homology: a case study in neuroimaging
Authors:
Ty Easley,
Kevin Freese,
Elizabeth Munch,
Janine Bijsterbosch
Abstract:
Despite much attention, the comparison of reduced-dimension representations of high-dimensional data remains a challenging problem in multiple fields, especially when representations remain high-dimensional compared to sample size. We offer a framework for evaluating the topological similarity of high-dimensional representations of very high-dimensional data, a regime where topological structure is more likely captured in the distribution of topological "noise" than a few prominent generators. Treating each representational map as a metric embedding, we compute the Vietoris-Rips persistence of its image. We then use the topological bootstrap to analyze the re-sampling stability of each representation, assigning a "prevalence score" for each nontrivial basis element of its persistence module. Finally, we compare the persistent homology of representations using a prevalence-weighted variant of the Wasserstein distance. Notably, our method is able to compare representations derived from different samples of the same distribution and, in particular, is not restricted to comparisons of graphs on the same vertex set. In addition, representations need not lie in the same metric space. We apply this analysis to a cross-sectional sample of representations of functional neuroimaging data in a large cohort and hierarchically cluster under the prevalence-weighted Wasserstein. We find that the ambient dimension of a representation is a stronger predictor of the number and stability of topological features than its decomposition rank. Our findings suggest that important topological information lies in repeatable, low-persistence homology generators, whose distributions capture important and interpretable differences between high-dimensional data representations.
Submitted 23 November, 2023; v1 submitted 23 June, 2023;
originally announced June 2023.
-
A Topological Framework for Identifying Phenomenological Bifurcations in Stochastic Dynamical Systems
Authors:
Sunia Tanweer,
Firas A. Khasawneh,
Elizabeth Munch,
Joshua R. Tempelman
Abstract:
Changes in the parameters of dynamical systems can cause the state of the system to shift between different qualitative regimes. These shifts, known as bifurcations, are critical to study as they can indicate when the system is about to undergo harmful changes in its behavior. In stochastic dynamical systems, there is particular interest in P-type (phenomenological) bifurcations, which can include transitions from a mono-stable state to multi-stable states, the appearance of stochastic limit cycles, and other features in the probability density function (PDF) of the system's state. Current practices are limited to systems with small state spaces, cannot detect all possible behaviours of the PDFs, and mandate human intervention for visually identifying the change in the PDF. In contrast, this study presents a new approach based on Topological Data Analysis (TDA) that uses superlevel persistence to mathematically quantify P-type bifurcations in stochastic systems through a "homological bifurcation plot" -- which shows the changing ranks of 0th and 1st homology groups. Using these plots, we demonstrate the successful detection of P-bifurcations on the stochastic Duffing, Rayleigh-Van der Pol and Quintic Oscillators given their analytical PDFs, and elaborate on how to generate an estimated homological bifurcation plot given a kernel density estimate (KDE) of these systems by employing a tool for finding topological consistency between PDFs and KDEs.
Submitted 9 July, 2023; v1 submitted 4 May, 2023;
originally announced May 2023.
-
Detecting bifurcations in dynamical systems with CROCKER plots
Authors:
İsmail Güzel,
Elizabeth Munch,
Firas A. Khasawneh
Abstract:
Existing tools for bifurcation detection from signals of dynamical systems typically are either limited to a special class of systems, or they require carefully chosen input parameters, and significant expertise to interpret the results. Therefore, we describe an alternative method based on persistent homology -- a tool from Topological Data Analysis (TDA) -- that utilizes Betti numbers and CROCKER plots. Betti numbers are topological invariants of topological spaces, while the CROCKER plot is a coarsened but easy to visualize data representation of a one-parameter varying family of persistence barcodes. The specific bifurcations we investigate are transitions from periodic to chaotic behavior or vice versa in a one-parameter family of differential equations. We validate our methods using numerical experiments on ten dynamical systems and contrast the results with existing tools that use the maximum Lyapunov exponent. We further prove the relationship between the Wasserstein distance to the empty diagram and the norm of the Betti vector, which shows that an even more simplified version of the information has the potential to provide insight into the bifurcation parameter. The results show that our approach reveals more information about the shape of the periodic attractor than standard tools, and it has more favorable computational time in comparison to the Rosenstein algorithm for computing the Lyapunov exponent from time series.
Submitted 3 August, 2022; v1 submitted 9 June, 2022;
originally announced June 2022.
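The CROCKER construction described in this abstract can be illustrated with a minimal sketch. It assumes each parameter value already comes with a barcode for one fixed homology dimension, given as a list of (birth, death) pairs; the names `betti_curve`, `crocker_matrix`, and the toy barcodes are hypothetical and are not taken from the paper. The Betti number at scale eps is counted as the number of bars with birth <= eps < death.

```python
def betti_curve(barcode, scales):
    """Betti number at each scale: bars with birth <= eps < death."""
    return [sum(1 for birth, death in barcode if birth <= eps < death)
            for eps in scales]


def crocker_matrix(barcodes, scales):
    """One Betti curve per parameter value (rows), one column per scale."""
    return [betti_curve(barcode, scales) for barcode in barcodes]


INF = float("inf")

# Toy input (an assumption for illustration): two parameter values,
# each with a small 0-dimensional barcode.
example_barcodes = [
    [(0.0, 0.5), (0.0, INF)],
    [(0.0, 0.2), (0.0, 1.2), (0.0, INF)],
]
example_scales = [0.0, 0.25, 0.75, 1.5]

example_crocker = crocker_matrix(example_barcodes, example_scales)
```

Plotting `example_crocker` as a heat map, with the bifurcation parameter on one axis and the scale on the other, gives the coarsened picture the abstract refers to. Computing the barcodes themselves is outside the scope of this sketch.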
-
Persistent Homology of Coarse Grained State Space Networks
Authors:
Audun D. Myers,
Max M. Chumley,
Firas A. Khasawneh,
Elizabeth Munch
Abstract:
This work is dedicated to the topological analysis of complex transitional networks for dynamic state detection. Transitional networks are formed from time series data and they leverage graph theory tools to reveal information about the underlying dynamic system. However, traditional tools can fail to summarize the complex topology present in such graphs. In this work, we leverage persistent homology from topological data analysis to study the structure of these networks. We contrast dynamic state detection from time series using a coarse-grained state-space network (CGSSN) and topological data analysis (TDA) to two state of the art approaches: ordinal partition networks (OPNs) combined with TDA and the standard application of persistent homology to the time-delay embedding of the signal. We show that the CGSSN captures rich information about the dynamic state of the underlying dynamical system as evidenced by a significant improvement in dynamic state detection and noise robustness in comparison to OPNs. We also show that because the computational time of CGSSN is not linearly dependent on the signal's length, it is more computationally efficient than applying TDA to the time-delay embedding of the time series.
Submitted 4 August, 2023; v1 submitted 20 May, 2022;
originally announced June 2022.
-
A Case Study on Identifying Bifurcation and Chaos with CROCKER Plots
Authors:
İsmail Güzel,
Elizabeth Munch,
Firas Khasawneh
Abstract:
The CROCKER plot is a coarsened but easy to visualize representation of the data in a one-parameter varying family of persistence barcodes. In this paper, we use the CROCKER plot to view changes in the persistence under a varying bifurcation parameter. We perform experiments to support our methods using the Rössler and Lorenz system and show the relationship with common methods for bifurcation analysis such as the Lyapunov exponent.
Submitted 11 April, 2022;
originally announced April 2022.
-
Reeb Graph Metrics from the Ground Up
Authors:
Brian Bollen,
Erin Chambers,
Joshua A. Levine,
Elizabeth Munch
Abstract:
The Reeb graph has been utilized in various applications including the analysis of scalar fields. Recently, research has been focused on using topological signatures such as the Reeb graph to compare multiple scalar fields by defining distance metrics on the topological signatures themselves. Here we survey five existing metrics that have been defined on Reeb graphs: the bottleneck distance, the interleaving distance, functional distortion distance, the Reeb graph edit distance, and the universal edit distance. Our goal is to (1) provide definitions and concrete examples of these distances in order to develop the intuition of the reader, (2) visit previously proven results of stability, universality, and discriminativity, (3) identify and complete any remaining properties which have only been proven (or disproven) for a subset of these metrics, (4) expand the taxonomy of the bottleneck distance to better distinguish between variations which have been commonly miscited, and (5) reconcile the various definitions and requirements on the underlying spaces for these metrics to be defined and properties to be proven.
Submitted 19 October, 2022; v1 submitted 11 October, 2021;
originally announced October 2021.
-
A Relative Theory of Interleavings
Authors:
Magnus Bakke Botnan,
Justin Curry,
Elizabeth Munch
Abstract:
The interleaving distance, although originally developed for persistent homology, has been generalized to measure the distance between functors modeled on many posets or even small categories. Existing theories require that such a poset have a superlinear family of translations or a similar structure. However, many posets of interest to topological data analysis, such as zig-zag posets and the face relation poset of a cell-complex, do not admit interesting translations, and consequently don't admit a nice theory of interleavings. In this paper we show how one can side-step this limitation by providing a general theory where one maps to a poset that does admit interesting translations, such as the lattice of down sets, and then defines interleavings relative to this map. Part of our theory includes a rigorous notion of discretization or "pixelization" of poset modules, which in turn we use for interleaving inference. We provide an approximation condition that in the setting of lattices gives rise to two possible pixelizations, both of which are guaranteed to be close in the interleaving distance. Finally, we conclude by considering interleaving inference for cosheaves over a metric space and give an explicit description of interleavings over a grid structure on Euclidean space.
Submitted 29 April, 2020;
originally announced April 2020.
-
Probabilistic Convergence and Stability of Random Mapper Graphs
Authors:
Adam Brown,
Omer Bobrowski,
Elizabeth Munch,
Bei Wang
Abstract:
We study the probabilistic convergence between the mapper graph and the Reeb graph of a topological space $\mathbb{X}$ equipped with a continuous function $f: \mathbb{X} \rightarrow \mathbb{R}$. We first give a categorification of the mapper graph and the Reeb graph by interpreting them in terms of cosheaves and stratified covers of the real line $\mathbb{R}$. We then introduce a variant of the classic mapper graph of Singh et al.~(2007), referred to as the enhanced mapper graph, and demonstrate that such a construction approximates the Reeb graph of $(\mathbb{X}, f)$ when it is applied to points randomly sampled from a probability density function concentrated on $(\mathbb{X}, f)$.
Our techniques are based on the interleaving distance of constructible cosheaves and topological estimation via kernel density estimates. Following Munch and Wang (2018), we first show that the mapper graph of $(\mathbb{X}, f)$, a constructible $\mathbb{R}$-space (with a fixed open cover), approximates the Reeb graph of the same space. We then construct an isomorphism between the mapper of $(\mathbb{X},f)$ and the mapper of a super-level set of a probability density function concentrated on $(\mathbb{X}, f)$. Finally, building on the approach of Bobrowski et al.~(2017), we show that, with high probability, we can recover the mapper of the super-level set given a sufficiently large sample. Our work is the first to consider the mapper construction using the theory of cosheaves in a probabilistic setting. It is part of an ongoing effort to combine sheaf theory, probability, and statistics, to support topological data analysis with random data.
Submitted 14 August, 2020; v1 submitted 8 September, 2019;
originally announced September 2019.
-
A Structural Average of Labeled Merge Trees for Uncertainty Visualization
Authors:
Lin Yan,
Yusu Wang,
Elizabeth Munch,
Ellen Gasparovic,
Bei Wang
Abstract:
Physical phenomena in science and engineering are frequently modeled using scalar fields. In scalar field topology, graph-based topological descriptors such as merge trees, contour trees, and Reeb graphs are commonly used to characterize topological changes in the (sub)level sets of scalar fields. One of the biggest challenges and opportunities to advance topology-based visualization is to understand and incorporate uncertainty into such topological descriptors to effectively reason about their underlying data. In this paper, we study a structural average of a set of labeled merge trees and use it to encode uncertainty in data. Specifically, we compute a 1-center tree that minimizes its maximum distance to any other tree in the set under a well-defined metric called the interleaving distance. We provide heuristic strategies that compute structural averages of merge trees whose labels do not fully agree. We further provide an interactive visualization system that resembles a numerical calculator that takes as input a set of merge trees and outputs a tree as their structural average. We also highlight structural similarities between the input and the average and incorporate uncertainty information for visual exploration. We develop a novel measure of uncertainty, referred to as consistency, via a metric-space view of the input trees. Finally, we demonstrate an application of our framework through merge trees that arise from ensembles of scalar fields. Our work is the first to employ interleaving distances and consistency to study a global, mathematically rigorous, structural average of merge trees in the context of uncertainty visualization.
Submitted 8 October, 2019; v1 submitted 31 July, 2019;
originally announced August 2019.
-
Intrinsic Interleaving Distance for Merge Trees
Authors:
Ellen Gasparovic,
Elizabeth Munch,
Steve Oudot,
Katharine Turner,
Bei Wang,
Yusu Wang
Abstract:
Merge trees are a type of graph-based topological summary that tracks the evolution of connected components in the sublevel sets of scalar functions. They enjoy widespread applications in data analysis and scientific visualization. In this paper, we consider the problem of comparing two merge trees via the notion of interleaving distance in the metric space setting. We investigate various theoretical properties of such a metric. In particular, we show that the interleaving distance is intrinsic on the space of labeled merge trees and provide an algorithm to construct metric 1-centers for collections of labeled merge trees. We further prove that the intrinsic property of the interleaving distance also holds for the space of unlabeled merge trees. Our results are a first step toward performing statistics on graph-based topological summaries.
Submitted 2 February, 2022; v1 submitted 31 July, 2019;
originally announced August 2019.
-
Approximating Continuous Functions on Persistence Diagrams Using Template Functions
Authors:
Jose A. Perea,
Elizabeth Munch,
Firas A. Khasawneh
Abstract:
The persistence diagram is an increasingly useful tool from Topological Data Analysis, but its use alongside typical machine learning techniques requires mathematical finesse. The most success to date has come from methods that map persistence diagrams into vector spaces, in a way which maximizes the structure preserved. This process is commonly referred to as featurization. In this paper, we describe a mathematical framework for featurization called \emph{template functions}, and we show that it addresses the problem of approximating continuous functions on compact subsets of the space of persistence diagrams. Specifically, we begin by characterizing relative compactness with respect to the bottleneck distance, and then provide explicit theoretical methods for constructing compact-open dense subsets of continuous functions on persistence diagrams. These dense subsets -- obtained via template functions -- are leveraged for supervised learning tasks with persistence diagrams. Specifically, we test the method for classification and regression algorithms on several examples including shape data and dynamical systems.
Submitted 12 April, 2022; v1 submitted 19 February, 2019;
originally announced February 2019.
-
The $\ell^\infty$-Cophenetic Metric for Phylogenetic Trees as an Interleaving Distance
Authors:
Elizabeth Munch,
Anastasios Stefanou
Abstract:
There are many metrics available to compare phylogenetic trees since this is a fundamental task in computational biology. In this paper, we focus on one such metric, the $\ell^\infty$-cophenetic metric introduced by Cardona et al. This metric works by representing a phylogenetic tree with $n$ labeled leaves as a point in $\mathbb{R}^{n(n+1)/2}$ known as the cophenetic vector, then comparing the two resulting Euclidean points using the $\ell^\infty$ distance. Meanwhile, the interleaving distance is a formal categorical construction generalized from the definition of Chazal et al., originally introduced to compare persistence modules arising from the field of topological data analysis. We show that the $\ell^\infty$-cophenetic metric is an example of an interleaving distance. To do this, we define phylogenetic trees as a category of merge trees with some additional structure; namely labelings on the leaves plus a requirement that morphisms respect these labels. Then we can use the definition of a flow on this category to give an interleaving distance. Finally, we show that, because of the additional structure given by the categories defined, the map sending a labeled merge tree to the cophenetic vector is, in fact, an isometric embedding, thus proving that the $\ell^\infty$-cophenetic metric is, in fact, an interleaving distance.
Submitted 28 February, 2018;
originally announced March 2018.
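The cophenetic-vector comparison summarized in this abstract can be sketched in a few lines. The sketch assumes a rooted tree stored as a parent map with edge weights, and takes the cophenetic value of a pair of leaves to be the depth of their lowest common ancestor (for a pair (i, i), the depth of leaf i), which yields n(n+1)/2 entries for n leaves. The data layout, the function names, and the two toy trees are hypothetical and are not taken from the paper.

```python
from itertools import combinations_with_replacement


def depths(parent, weight):
    """Depth of every node: sum of edge weights on the path from the root."""
    out = {}

    def depth(v):
        if v not in out:
            p = parent[v]
            out[v] = 0.0 if p is None else depth(p) + weight[v]
        return out[v]

    for v in parent:
        depth(v)
    return out


def ancestors(parent, v):
    """Nodes on the path from v up to the root, starting with v itself."""
    chain = []
    while v is not None:
        chain.append(v)
        v = parent[v]
    return chain


def cophenetic_vector(parent, weight, leaves):
    """Depth of the lowest common ancestor for every unordered leaf pair."""
    d = depths(parent, weight)
    vec = []
    for a, b in combinations_with_replacement(sorted(leaves), 2):
        anc_a = set(ancestors(parent, a))
        lca = next(v for v in ancestors(parent, b) if v in anc_a)
        vec.append(d[lca])
    return vec


def linf_cophenetic_distance(tree1, tree2, leaves):
    """Largest entrywise gap between the two cophenetic vectors."""
    v1 = cophenetic_vector(*tree1, leaves)
    v2 = cophenetic_vector(*tree2, leaves)
    return max(abs(x - y) for x, y in zip(v1, v2))


# Two toy trees on leaves a, b, c (weight[v] is the edge from parent[v] to v).
tree_ab = ({"r": None, "u": "r", "a": "u", "b": "u", "c": "r"},
           {"u": 1.0, "a": 1.0, "b": 1.0, "c": 2.0})
tree_bc = ({"r": None, "u": "r", "b": "u", "c": "u", "a": "r"},
           {"u": 1.5, "b": 0.5, "c": 0.5, "a": 2.0})
example_leaves = ["a", "b", "c"]

example_distance = linf_cophenetic_distance(tree_ab, tree_bc, example_leaves)
```

In the first toy tree the leaves a and b merge at depth 1, while in the second b and c merge at depth 1.5, so the largest entrywise difference is 1.5. Both trees must carry the same leaf labels for the vectors to be comparable entry by entry.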
-
Evolutionary homology on coupled dynamical systems
Authors:
Zixuan Cang,
Elizabeth Munch,
Guo-Wei Wei
Abstract:
Time dependence is a universal phenomenon in nature, and a variety of mathematical models in terms of dynamical systems have been developed to understand the time-dependent behavior of real-world problems. Originally constructed to analyze the topological persistence over spatial scales, persistent homology has rarely been devised for time evolution. We propose the use of a new filtration function for persistent homology which takes as input the adjacent oscillator trajectories of a dynamical system. We also regulate the dynamical system by a weighted graph Laplacian matrix derived from the network of interest, which embeds the topological connectivity of the network into the dynamical system. The resulting topological signatures, which we call evolutionary homology (EH) barcodes, reveal the topology-function relationship of the network and thus give rise to the quantitative analysis of nodal properties. The proposed EH is applied to protein residue networks for protein thermal fluctuation analysis, rendering the most accurate B-factor prediction of a set of 364 proteins. This work extends the utility of dynamical systems to the quantitative modeling and analysis of realistic physical systems.
Submitted 13 February, 2018;
originally announced February 2018.
-
Theory of interleavings on categories with a flow
Authors:
Vin de Silva,
Elizabeth Munch,
Anastasios Stefanou
Abstract:
The interleaving distance was originally defined in the field of Topological Data Analysis (TDA) by Chazal et al. as a metric on the class of persistence modules parametrized over the real line. Bubenik et al. subsequently extended the definition to categories of functors on a poset, the objects in these categories being regarded as `generalized persistence modules'. These metrics typically depend on the choice of a lax semigroup of endomorphisms of the poset. The purpose of the present paper is to develop a more general framework for the notion of interleaving distance using the theory of `actegories'. Specifically, we extend the notion of interleaving distance to arbitrary categories equipped with a flow, i.e. a lax monoidal action by the monoid $[0,\infty)$. In this way, the class of objects in such a category acquires the structure of a Lawvere metric space. Functors that are colax $[0,\infty)$-equivariant yield maps that are $1$-Lipschitz. This leads to concise proofs of various known stability results from TDA, by considering appropriate colax $[0,\infty)$-equivariant functors. Along the way, we show that several common metrics, including the Hausdorff distance and the $L^{\infty}$-norm, can be realized as interleaving distances in this general perspective.
Submitted 30 May, 2018; v1 submitted 13 June, 2017;
originally announced June 2017.
-
Convergence between Categorical Representations of Reeb Space and Mapper
Authors:
Elizabeth Munch,
Bei Wang
Abstract:
The Reeb space, which generalizes the notion of a Reeb graph, is one of the few tools in topological data analysis and visualization suitable for the study of multivariate scientific datasets. First introduced by Edelsbrunner et al., it compresses the components of the level sets of a multivariate mapping and obtains a summary representation of their relationships. A related construction called mapper, and a special case of the mapper construction called the Joint Contour Net have been shown to be effective in visual analytics. Mapper and JCN are intuitively regarded as discrete approximations of the Reeb space, however without formal proofs or approximation guarantees. An open question has been proposed by Dey et al. as to whether the mapper construction converges to the Reeb space in the limit.
In this paper, we are interested in developing the theoretical understanding of the relationship between the Reeb space and its discrete approximations to support its use in practical data analysis. Using tools from category theory, we formally prove the convergence between the Reeb space and mapper in terms of an interleaving distance between their categorical representations. Given a sequence of refined discretizations, we prove that these approximations converge to the Reeb space in the interleaving distance; this also helps to quantify the approximation quality of the discretization at a fixed resolution.
Submitted 12 April, 2016; v1 submitted 13 December, 2015;
originally announced December 2015.
-
Strong Equivalence of the Interleaving and Functional Distortion Metrics for Reeb Graphs
Authors:
Ulrich Bauer,
Elizabeth Munch,
Yusu Wang
Abstract:
The Reeb graph is a construction that studies a topological space through the lens of a real-valued function. It has been widely used in applications; however, its use on real data makes methods for comparing Reeb graphs desirable and increasingly necessary. Recently, several methods for defining metrics on the space of Reeb graphs have been presented. In this paper, we focus on two: the functional distortion distance and the interleaving distance. The former is based on the Gromov--Hausdorff distance, while the latter utilizes the equivalence between Reeb graphs and a particular class of cosheaves. However, both are defined by constructing a near-isomorphism between the two graphs under study. In this paper, we show that the two metrics are strongly equivalent on the space of Reeb graphs. In particular, this gives an immediate proof of bottleneck stability for persistence diagrams in terms of the Reeb graph interleaving distance.
Submitted 20 December, 2014;
originally announced December 2014.
-
Topological and Statistical Behavior Classifiers for Tracking Applications
Authors:
Paul Bendich,
Sang Chin,
Jesse Clarke,
Jonathan deSena,
John Harer,
Elizabeth Munch,
Andrew Newman,
David Porter,
David Rouse,
Nate Strawn,
Adam Watkins
Abstract:
We introduce the first unified theory for target tracking using Multiple Hypothesis Tracking, Topological Data Analysis, and machine learning. Our string of innovations is as follows: 1) robust topological features are used to encode behavioral information, 2) statistical models are fitted to distributions over these topological features, and 3) the target type classification methods of Wigren and Bar-Shalom et al. are employed to exploit the resulting likelihoods for topological features inside the tracking procedure. To demonstrate the efficacy of our approach, we test our procedure on synthetic vehicular data generated by the Simulation of Urban Mobility package.
Submitted 1 June, 2014;
originally announced June 2014.
-
Probabilistic Fréchet Means for Time Varying Persistence Diagrams
Authors:
Elizabeth Munch,
Katharine Turner,
Paul Bendich,
Sayan Mukherjee,
Jonathan Mattingly,
John Harer
Abstract:
In order to use persistence diagrams as a true statistical tool, it would be very useful to have a good notion of mean and variance for a set of diagrams. In 2011, Mileyko and his collaborators made the first study of the properties of the Fréchet mean in $(\mathcal{D}_p,W_p)$, the space of persistence diagrams equipped with the $p$-th Wasserstein metric. In particular, they showed that the Fréchet mean of a finite set of diagrams always exists, but is not necessarily unique. The means of a continuously varying set of diagrams do not themselves (necessarily) vary continuously, which presents obvious problems when trying to extend the Fréchet mean definition to the realm of vineyards.
We fix this problem by altering the original definition of Fréchet mean so that it now becomes a probability measure on the set of persistence diagrams; in a nutshell, the mean of a set of diagrams will be a weighted sum of atomic measures, where each atom is itself a persistence diagram determined using a perturbation of the input diagrams. This definition gives for each $N$ a map $(\mathcal{D}_p)^N \to \mathbb{P}(\mathcal{D}_p)$. We show that this map is Hölder continuous on finite diagrams and thus can be used to build a useful statistic on time-varying persistence diagrams, better known as vineyards.
Submitted 17 November, 2014; v1 submitted 24 July, 2013;
originally announced July 2013.