Search | arXiv e-print repository

arXiv:2506.19857 [pdf, ps, other]

Finding the Cores of Higher Graphs Using Geometric and Topological Means: A Survey

Authors: Inés García-Redondo, Claudia Landi, Sarah Percival, Anda Skeja, Bei Wang, Ling Zhou

Abstract: In this survey, we explore recent literature on finding the cores of higher graphs using geometric and topological means. We study graphs, hypergraphs, and simplicial complexes, all of which are models of higher graphs. We study the notion of a core, which is a minimalist representation of a higher graph that retains its geometric or topological information. We focus on geometric and topological methods based on discrete curvatures, effective resistance, and persistent homology. We aim to connect tools from graph theory, discrete geometry, and computational topology to inspire new research on the simplification of higher graphs.

Submitted 9 June, 2025; originally announced June 2025.

Comments: 54 pages

arXiv:2504.03865 [pdf, other]

Towards an Optimal Bound for the Interleaving Distance on Mapper Graphs

Authors: Erin Wolf Chambers, Ishika Ghosh, Elizabeth Munch, Sarah Percival, Bei Wang

Abstract: Mapper graphs are a widely used tool in topological data analysis and visualization. They can be viewed as discrete approximations of Reeb graphs, offering insight into the shape and connectivity of complex data. Given a high-dimensional point cloud $\mathbb{X}$ equipped with a function $f: \mathbb{X} \to \mathbb{R}$, a mapper graph provides a summary of the topological structure of $\mathbb{X}$ induced by $f$, where each node represents a local neighborhood, and edges connect nodes whose corresponding neighborhoods overlap. Our focus is the interleaving distance for mapper graphs, arising from a discretization of the version for Reeb graphs, which is NP-hard to compute. This distance quantifies the similarity between two mapper graphs by measuring the extent to which they must be "stretched" to become comparable. Recent work introduced a loss function that provides an upper bound on the interleaving distance for mapper graphs, which evaluates how far a given assignment is from being a true interleaving. Finding the loss is computationally tractable, offering a practical way to estimate the distance. In this paper, we employ a categorical formulation of mapper graphs and develop the first framework for computing the associated loss function. Since the quality of the bound depends on the chosen assignment, we optimize this loss function by formulating the problem of finding the best assignment as an integer linear programming problem.
To evaluate the effectiveness of our optimization, we apply it to small mapper graphs where the interleaving distance is known, demonstrating that the optimized upper bound successfully matches the interleaving distance in these cases. Additionally, we conduct an experiment on the MPEG-7 dataset, computing the pairwise optimal loss on a collection of mapper graphs derived from images and leveraging the distance bound for image classification.

Submitted 4 April, 2025; originally announced April 2025.

MSC Class: 55N31

arXiv:2408.11180 [pdf, other]

Any Graph is a Mapper Graph

Authors: Enrique G Alvarado, Robin Belton, Kang-Ju Lee, Sourabh Palande, Sarah Percival, Emilie Purvine, Sarah Tymochko

Abstract: The Mapper algorithm is a popular tool for visualization and data exploration in topological data analysis. We investigate an inverse problem for the Mapper algorithm: Given a dataset $X$ and a graph $G$, does there exist a set of Mapper parameters such that the output Mapper graph of $X$ is isomorphic to $G$? We provide constructions that affirmatively answer this question. Our results demonstrate that it is possible to engineer Mapper parameters to generate a desired graph.

Submitted 20 August, 2024; originally announced August 2024.

Comments: 13 pages, 4 figures

arXiv:2407.09442 [pdf, other]

A Distance for Geometric Graphs via the Labeled Merge Tree Interleaving Distance

Authors: Erin Wolf Chambers, Elizabeth Munch, Sarah Percival, Xinyi Wang

Abstract: Geometric graphs appear in many real-world data sets, such as road networks, sensor networks, and molecules. We investigate the notion of distance between embedded graphs and present a metric to measure the distance between two geometric graphs via merge trees. In order to preserve as much useful information as possible from the original data, we introduce a way of rotating the sublevel set to obtain the merge trees via the idea of the directional transform. We represent the merge trees using a surjective multi-labeling scheme and then compute the distance between two representative matrices. We show some theoretically desirable qualities and present two methods of computation: approximation via sampling and exact distance using a kinetic data structure, both in polynomial time. We illustrate its utility by implementing it on two data sets.

Submitted 12 July, 2024; originally announced July 2024.

arXiv:2309.06634 [pdf, other]

G-Mapper: Learning a Cover in the Mapper Construction

Authors: Enrique Alvarado, Robin Belton, Emily Fischer, Kang-Ju Lee, Sourabh Palande, Sarah Percival, Emilie Purvine

Abstract: The Mapper algorithm is a visualization technique in topological data analysis (TDA) that outputs a graph reflecting the structure of a given dataset. However, the Mapper algorithm requires tuning several parameters in order to generate a "nice" Mapper graph. This paper focuses on selecting the cover parameter. We present an algorithm that optimizes the cover of a Mapper graph by splitting a cover repeatedly according to a statistical test for normality. Our algorithm is based on G-means clustering which searches for the optimal number of clusters in $k$-means by iteratively applying the Anderson-Darling test. Our splitting procedure employs a Gaussian mixture model to carefully choose the cover according to the distribution of the given data. Experiments for synthetic and real-world datasets demonstrate that our algorithm generates covers so that the Mapper graphs retain the essence of the datasets, while also running significantly faster than a previous iterative method.

Submitted 18 February, 2025; v1 submitted 12 September, 2023; originally announced September 2023.

Comments: 22 pages, to appear in SIAM Journal on Mathematics of Data Science (SIMODS)

MSC Class: 62R40; 62R07; 68T09; 62H30

arXiv:2307.15130 [pdf, other]

Bounding the Interleaving Distance for Mapper Graphs with a Loss Function

Authors: Erin W. Chambers, Elizabeth Munch, Sarah Percival, Bei Wang

Abstract: Data consisting of a graph with a function mapping into $\mathbb{R}^d$ arise in many data applications, encompassing structures such as Reeb graphs, geometric graphs, and knot embeddings. As such, the ability to compare and cluster such objects is required in a data analysis pipeline, leading to a need for distances between them. In this work, we study the interleaving distance on discretization of these objects, called mapper graphs when $d=1$, where functor representations of the data can be compared by finding pairs of natural transformations between them. However, in many cases, computation of the interleaving distance is NP-hard. For this reason, we take inspiration from recent work by Robinson to find quality measures for families of maps that do not rise to the level of a natural transformation, called assignments. We then endow the functor images with the extra structure of a metric space and define a loss function which measures how far an assignment is from making the required diagrams of an interleaving commute. Finally we show that the computation of the loss function is polynomial with a given assignment. We believe this idea is both powerful and translatable, with the potential to provide approximations and bounds on interleavings in a broad array of contexts.

Submitted 19 May, 2025; v1 submitted 27 July, 2023; originally announced July 2023.

Comments: Accepted version

MSC Class: 55N31

arXiv:1804.00605 [pdf, other]

On the Reeb spaces of definable maps

Authors: Saugata Basu, Nathanael Cox, Sarah Percival

Abstract: We prove that the Reeb space of a proper definable map $f:X \rightarrow Y$ in an arbitrary o-minimal expansion of a real closed field is realizable as a proper definable quotient. This result can be seen as an o-minimal analog of Stein factorization of proper morphisms in algebraic geometry. We also show that the Betti numbers of the Reeb space of $f$ can be arbitrarily large compared to those of $X$, unlike in the special case of Reeb graphs of manifolds. Nevertheless, in the special case when $f:X \rightarrow Y$ is a semi-algebraic map and $X$ is closed and bounded, we prove a singly exponential upper bound on the Betti numbers of the Reeb space of $f$ in terms of the number and degrees of the polynomials defining $X,Y$, and $f$.

Submitted 27 July, 2020; v1 submitted 2 April, 2018; originally announced April 2018.

Comments: 34 pages. Major revision with expanded proof of a key proposition

MSC Class: 14P10; 03C64; 55R70

Showing 1–7 of 7 results for author: Percival, S