-
The Fisher Geometry and Geodesics of the Multivariate Normals, without Differential Geometry
Authors:
Brodie A. J. Lawson,
Kevin Burrage,
Kerrie Mengersen,
Rodrigo Weber dos Santos
Abstract:
Choosing the Fisher information as the metric tensor for a Riemannian manifold provides a powerful yet fundamental way to understand statistical distribution families. Distances along this manifold become a compelling measure of statistical distance, and paths of shorter distance improve sampling techniques that leverage a sequence of distributions in their operation. Unfortunately, even for a distribution as generally tractable as the multivariate normal distribution, this information geometry proves unwieldy enough that closed-form solutions for shortest-distance paths or their lengths remain unavailable outside of limited special cases. In this review we present for general statisticians the most practical aspects of the Fisher geometry for this fundamental distribution family. Rather than a differential geometric treatment, we use an intuitive understanding of the covariance-induced curvature of this manifold to unify the special cases with known closed-form solution and review approximate solutions for the general case. We also use the multivariate normal information geometry to better understand the paths or distances commonly used in statistics (annealing, Wasserstein). Given the unavailability of a general solution, we also discuss the methods used for numerically obtaining geodesics in the space of multivariate normals, identifying remaining challenges and suggesting methodological improvements.
Submitted 2 June, 2023;
originally announced June 2023.
-
Gaussian Persistence Curves
Authors:
Yu-Min Chung,
Michael Hull,
Austin Lawson,
Neil Pritchard
Abstract:
Topological data analysis (TDA) is a rising field at the intersection of mathematics, statistics, and computer science/data science. The cornerstone of TDA is persistent homology, which produces a summary of topological information called a persistence diagram. To utilize machine and deep learning methods on persistence diagrams, these diagrams are further summarized by transforming them into functions. In this paper we investigate the stability and injectivity of a class of smooth, one-dimensional functional summaries called Gaussian persistence curves.
Submitted 23 May, 2022;
originally announced May 2022.
-
Horospherical random graphs
Authors:
Indira Chatterji,
Austin Lawson
Abstract:
Expanders are sparse graphs that are strongly connected, where connectivity is quantified using eigenvalues of the adjacency matrix, and sparsity in terms of vertex valency. We give a model of random graphs and study their connectivity and sparsity. This model is a particular case of soft geometric random graphs, and allows one to construct sparse graphs with good expansion properties, as well as highly clustered ones. On those graphs, we study the speed at which random walks spread in the graph and visit all vertices. As an illustration, we build a model for mainland France and study the spread of random walks under several types of lockdown. Our experiments show that completely closing medium and long distance travel to slow down the spread of a random walk is more efficient than local restrictions.
Submitted 28 December, 2021; v1 submitted 7 December, 2021;
originally announced December 2021.
-
A Random Persistence Diagram Generator
Authors:
Theodore Papamarkou,
Farzana Nasrin,
Austin Lawson,
Na Gong,
Orlando Rios,
Vasileios Maroulas
Abstract:
Topological data analysis (TDA) studies the shape patterns of data. Persistent homology is a widely used method in TDA that summarizes homological features of data at multiple scales and stores them in persistence diagrams (PDs). In this paper, we propose a random persistence diagram generator (RPDG) method that generates a sequence of random PDs from the ones produced by the data. RPDG is underpinned by a model based on pairwise interacting point processes, and a reversible jump Markov chain Monte Carlo (RJ-MCMC) algorithm. A first example, which is based on a synthetic dataset, demonstrates the efficacy of RPDG and provides a comparison with another method for sampling PDs. A second example demonstrates the utility of RPDG to solve a materials science problem given a real dataset of small sample size.
Submitted 14 September, 2022; v1 submitted 15 April, 2021;
originally announced April 2021.
-
On finite molecularization domains
Authors:
Andrew J. Hetzel,
Anna L. Lawson,
Andreas Reinhart
Abstract:
In this paper, we advance an ideal-theoretic analogue of a "finite factorization domain" (FFD), giving such a domain the moniker "finite molecularization domain" (FMD). We characterize FMD's as those factorable domains (termed "molecular domains" in the paper) for which every nonzero ideal is divisible by only finitely many nonfactorable ideals (termed "molecules" in the paper) and the monoid of nonzero ideals of the domain is unit-cancellative, in the language of Fan, Geroldinger, Kainrath, and Tringali. We develop a number of connections, particularly at the local level, amongst the concepts of "FMD", "FFD", and the "finite superideal domains" (FSD's) of Hetzel and Lawson. Characterizations of when $k[X^2, X^3]$, where $k$ is a field, and the classical $D+M$ construction are FMD's are provided. We also demonstrate that if $R$ is a Dedekind domain with the finite norm property, then $R[X]$ is an FMD.
Submitted 7 January, 2021;
originally announced January 2021.
-
Homogenisation for the monodomain model in the presence of microscopic fibrotic structures
Authors:
Brodie A. J. Lawson,
Rodrigo Weber dos Santos,
Ian W. Turner,
Alfonso Bueno-Orovio,
Pamela Burrage,
Kevin Burrage
Abstract:
Computational models in cardiac electrophysiology are notorious for long runtimes, restricting the numbers of nodes and mesh elements in the numerical discretisations used for their solution. This makes it particularly challenging to incorporate structural heterogeneities on small spatial scales, preventing a full understanding of the critical arrhythmogenic effects of conditions such as cardiac fibrosis. In this work, we explore the technique of homogenisation by volume averaging for the inclusion of non-conductive micro-structures into larger-scale cardiac meshes with minor computational overhead. Importantly, our approach is not restricted to periodic patterns, enabling homogenised models to represent, for example, the intricate patterns of collagen deposition present in different types of fibrosis. We first highlight the importance of appropriate boundary condition choice for the closure problems that define the parameters of homogenised models. Then, we demonstrate the technique's ability to correctly upscale the effects of fibrotic patterns with a spatial resolution of 10 $\mu$m into much larger numerical mesh sizes of 100-250 $\mu$m. The homogenised models using these coarser meshes correctly predict critical pro-arrhythmic effects of fibrosis, including slowed conduction, source/sink mismatch, and stabilisation of re-entrant activation patterns. As such, this approach to homogenisation represents a significant step towards whole organ simulations that unravel the effects of microscopic cardiac tissue heterogeneities.
Submitted 10 December, 2020;
originally announced December 2020.
-
Coarse free products
Authors:
Greg Bell,
Austin Lawson
Abstract:
We define a notion of free product for coarse spaces that generalizes the corresponding notion of a free product for groups. We show that free products preserve coarse properties such as coarse property C, finite coarse decomposition complexity, and coarse property A. We also give an upper bound estimate on the dimension of a coarse free product in terms of the dimension of its factors.
Submitted 24 June, 2020; v1 submitted 16 May, 2019;
originally announced May 2019.
-
Persistence Curves: A canonical framework for summarizing persistence diagrams
Authors:
Yu-Min Chung,
Austin Lawson
Abstract:
Persistence diagrams are one of the main tools in the field of Topological Data Analysis (TDA). They contain rich information about the shape of data. The use of machine learning algorithms on the space of persistence diagrams proves to be challenging as the space lacks an inner product. For that reason, transforming these diagrams in a way that is compatible with machine learning is an important topic currently researched in TDA. In this paper, our main contribution consists of three components. First, we develop a general and unifying framework for vectorizing diagrams that we call Persistence Curves (PCs), and show that several well-known summaries, such as Persistence Landscapes, fall under the PC framework. Second, we propose several new summaries based on the PC framework and provide a theoretical foundation for their stability analysis. Finally, we apply the proposed PCs to two applications, texture classification and determining the parameters of a discrete dynamical system; their performance is competitive with other TDA methods.
Submitted 9 August, 2021; v1 submitted 16 April, 2019;
originally announced April 2019.
-
The space of persistence diagrams fails to have Yu's property A
Authors:
Greg Bell,
Austin Lawson,
C. Neil Pritchard,
Dan Yasaki
Abstract:
We define a simple obstruction to Yu's property A that we call $k$-prisms. This structure allows for a straightforward proof that the space of persistence diagrams fails to have property A in a Wasserstein metric.
Submitted 20 January, 2021; v1 submitted 6 February, 2019;
originally announced February 2019.
-
Coarse direct products and property C
Authors:
G. Bell,
A. Lawson
Abstract:
We show that coarse property C is preserved by finite coarse direct products. We also show that the coarse analog of Dydak's countable asymptotic dimension is equivalent to the coarse version of straight finite decomposition complexity and is therefore preserved by direct products.
Submitted 16 October, 2018; v1 submitted 8 December, 2017;
originally announced December 2017.
-
An exploration of Nathanson's $g$-adic representations of integers
Authors:
Greg Bell,
Austin Lawson,
Neil Pritchard,
Dan Yasaki
Abstract:
We use Nathanson's $g$-adic representation of integers to relate metric properties of Cayley graphs of the integers with respect to various infinite generating sets $S$ to problems in additive number theory. If $S$ consists of all powers of a fixed integer $g$, we find explicit formulas for the smallest positive integer of a given length. This is related to finding the smallest positive integer expressible as a fixed number of sums and differences of powers of $g$. We also consider $S$ to be the set of all powers of all primes and bound the diameter of the Cayley graph by relating it to Goldbach's conjecture.
Submitted 17 January, 2019; v1 submitted 2 November, 2017;
originally announced November 2017.
-
Weighted Persistent Homology
Authors:
Greg Bell,
Austin Lawson,
Joshua Martin,
James Rudzinski,
Clifford Smyth
Abstract:
We introduce weighted versions of the classical Čech and Vietoris-Rips complexes. We show that a version of the Vietoris-Rips Lemma holds for these weighted complexes and that they enjoy appropriate stability properties. We also give some preliminary applications of these weighted complexes.
Submitted 5 December, 2018; v1 submitted 31 August, 2017;
originally announced September 2017.
-
A Bayesian Hierarchical Modeling Approach to Dietary Assessment via Food Frequency
Authors:
Andrew Lawson,
Daniela Nitcheva
Abstract:
Previous likelihood-based linear modeling of nutritional data has been limited by the availability of software that allows flexible error structures in the data. We demonstrate the use of a Bayesian modeling approach to the analysis of such data. Our goal is to model the relationship between the energy intake derived from Food Frequency Questionnaires (FFQs) and the energy expenditure estimated from the doubly labeled water method. We consider models with different distributions for the FFQ energy intake. The models include previously identified covariates describing social desirability, education and their possible interaction that are felt to impact the reported FFQ. The models also include random effects to account for subject-specific random variation (frailty) and for the complex patterns of measurement error inherent in these data. Issues arising within the work relate to the selection of relevant linear and non-linear models, the use of random effects, and the relevance of goodness-of-fit criteria such as DIC and PPL in assessing the most appropriate model.
Submitted 12 January, 2007;
originally announced January 2007.