-
Bayesian modeling of nearly mutually orthogonal processes
Authors:
James Matuk,
Amy H. Herring,
David B. Dunson
Abstract:
Functional factor analysis is an important dimension reduction method for functional and longitudinal data. Factor loadings give insight into patterns of variability of the observations, while latent factors provide a low-dimensional representation of the data that is useful for inferential tasks. Constraining the functional factor loadings to be mutually orthogonal is desirable for model parsimony, but is computationally challenging. In this work, we introduce nearly mutually orthogonal processes, which can be used to effectively enforce mutual orthogonality of the factor loadings, while maintaining computational simplicity and efficiency. The joint distribution is governed by a penalty parameter that determines the degree to which the processes are mutually orthogonal and is related to ease of posterior computation. We demonstrate that our approach can be used for flexible and interpretable inference in an application to studying the effects of breastfeeding status, illness, and demographic factors on weight dynamics in early childhood. Code is available on GitHub: https://github.com/jamesmatuk/NeMO-FFA
Submitted 6 March, 2023; v1 submitted 24 May, 2022;
originally announced May 2022.
-
Topo-Geometric Analysis of Variability in Point Clouds using Persistence Landscapes
Authors:
James Matuk,
Sebastian Kurtek,
Karthik Bharath
Abstract:
Topological data analysis provides a set of tools to uncover low-dimensional structure in noisy point clouds. Prominent amongst the tools is persistence homology, which summarizes birth-death times of homological features using data objects known as persistence diagrams. To better aid statistical analysis, a functional representation of the diagrams, known as persistence landscapes, enables the use of functional data analysis and machine learning tools. Topological and geometric variabilities inherent in point clouds are confounded in both persistence diagrams and landscapes, and it is important to distinguish topological signal from noise to draw reliable conclusions on the structure of the point clouds when using persistence homology. We develop a framework for decomposing variability in persistence diagrams into topological signal and topological noise through alignment of persistence landscapes using an elastic Riemannian metric. Aligned landscapes (amplitude) isolate the topological signal. Reparameterizations used for landscape alignment (phase) are linked to a resolution parameter used to generate persistence diagrams, and capture topological noise in the form of geometric, global scaling and sampling variabilities. We illustrate the importance of decoupling topological signal and topological noise in persistence diagrams (landscapes) using several simulated examples. We also demonstrate that our approach provides novel insights in two real data studies.
Submitted 1 February, 2024; v1 submitted 29 June, 2021;
originally announced June 2021.
-
Bayesian Inference for Polycrystalline Materials
Authors:
James Matuk,
Oksana Chkrebtii,
Stephen Niezgoda
Abstract:
Polycrystalline materials, such as metals, are composed of heterogeneously oriented crystals. Observed crystal orientations are modelled as a sample from an orientation distribution function (ODF), which determines a variety of material properties and is therefore of great interest to practitioners. Observations consist of quaternions, 4-dimensional unit vectors reflecting both orientation and rotation of a single crystal. Thus, an ODF must account for known crystal symmetries as well as satisfy the unit length constraint. A popular method for estimating ODFs non-parametrically is symmetrized kernel density estimation. However, disadvantages of this approach include difficulty in interpreting results quantitatively, as well as in quantifying uncertainty in the ODF. We propose to use a mixture of symmetric Bingham distributions as a flexible parametric ODF model, inferring the number of mixture components, the mixture weights, and scale and location parameters based on crystal orientation data. Furthermore, our Bayesian approach allows for structured uncertainty quantification of the parameters of interest. We discuss details of the sampling methodology and conclude with analyses of various orientation datasets, interpretations of parameters of interest, and comparison with kernel density estimation methods.
Submitted 8 December, 2020;
originally announced December 2020.
-
Bayesian Framework for Simultaneous Registration and Estimation of Noisy, Sparse and Fragmented Functional Data
Authors:
James Matuk,
Karthik Bharath,
Oksana Chkrebtii,
Sebastian Kurtek
Abstract:
In many applications, smooth processes generate data that is recorded under a variety of observation regimes, such as dense, sparse or fragmented observations that are often contaminated with error. The statistical goal of registering and estimating the individual underlying functions from discrete observations has thus far been mainly approached sequentially without formal uncertainty propagation, or in an application-specific manner. We propose a unified Bayesian framework for simultaneous registration and estimation, which is flexible enough to accommodate inference on individual functions under general observation regimes. Our ability to do this relies on the specification of strongly informative prior models over the amplitude component of function variability. We provide two strategies for this critical choice: a data-driven approach that defines an empirical basis for the amplitude subspace based on training data, and a shape-restricted approach for when the relative location and number of local extrema are well understood. The proposed methods build on elastic functional data analysis, which separately models amplitude and phase variability inherent in functional data. We emphasize the importance of uncertainty quantification and visualization of these two components, as they provide complementary information about the estimated functions. We validate the framework using simulations and real applications to medical imaging and biometrics.
Submitted 11 December, 2019;
originally announced December 2019.