-
Robust Comparison of Kernel Densities on Spherical Domains
Authors:
Zhengwu Zhang,
Eric Klassen,
Anuj Srivastava
Abstract:
While spherical data arises in many contexts, including in directional statistics, the current tools for density estimation and population comparison on spheres are quite limited. Popular approaches for comparing populations (on Euclidean domains) mostly involve a two-step procedure: (1) estimate probability density functions (pdfs) from their respective samples, most commonly using the kernel density estimator, and (2) compare pdfs using a metric such as the L2 norm. However, both the estimated pdfs and their differences depend heavily on the chosen kernels, bandwidths, and sample sizes. Here we develop a framework for comparing spherical populations that is robust to these choices. Essentially, we characterize pdfs on spherical domains by quantifying their smoothness. Our framework uses a spectral representation, with densities represented by their coefficients with respect to the eigenfunctions of the Laplacian operator on a sphere. The change in smoothness, akin to using different kernel bandwidths, is controlled by exponential decays in coefficient values. Then we derive a proper distance for comparing pdf coefficients while equalizing smoothness levels, negating influences of sample size and bandwidth. This yields fair and meaningful comparisons of populations, despite vastly different sample sizes, and leads to robust and improved performance. We demonstrate this framework using examples of variables on S1 and S2, and evaluate its performance using a number of simulations and real data experiments.
Submitted 11 May, 2018;
originally announced May 2018.
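The spectral idea in the abstract above can be illustrated on the circle S1, where the Laplacian eigenfunctions are the complex exponentials e^{ikθ} with eigenvalues -k². The following is only a minimal sketch under stated assumptions, not the authors' estimator or their distance: the function names (`empirical_coeffs`, `smooth`, `evaluate`), the truncation order `K`, and the smoothing levels are hypothetical choices made for illustration. It shows how multiplying the k-th coefficient by exp(-k² t) plays the role of a bandwidth.

```python
import numpy as np

def empirical_coeffs(theta, K):
    """Fourier coefficients c_k, k = -K..K, of the empirical measure on S1."""
    k = np.arange(-K, K + 1)
    return np.exp(-1j * np.outer(k, theta)).mean(axis=1) / (2 * np.pi)

def smooth(coeffs, t):
    """Exponential decay of coefficients: c_k -> c_k * exp(-k^2 t)."""
    K = (len(coeffs) - 1) // 2
    k = np.arange(-K, K + 1)
    return coeffs * np.exp(-(k ** 2) * t)

def evaluate(coeffs, grid):
    """Evaluate the truncated series sum_k c_k exp(i k theta) on a grid."""
    K = (len(coeffs) - 1) // 2
    k = np.arange(-K, K + 1)
    return np.real(np.exp(1j * np.outer(grid, k)) @ coeffs)

# Hypothetical sample: 500 draws from a von Mises distribution on the circle.
rng = np.random.default_rng(0)
theta = rng.vonmises(mu=0.0, kappa=4.0, size=500)

c = empirical_coeffs(theta, K=20)
grid = np.linspace(-np.pi, np.pi, 400, endpoint=False)
f_rough = evaluate(smooth(c, 0.005), grid)   # little smoothing
f_smooth = evaluate(smooth(c, 0.05), grid)   # heavier smoothing
```

Larger values of `t` damp the high-frequency coefficients more strongly, while the k = 0 coefficient, and hence the total mass, is left unchanged.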
-
Nonparametric Spherical Regression Using Diffeomorphic Mappings
Authors:
Michael Rosenthal,
Wei Wu,
Eric Klassen,
Anuj Srivastava
Abstract:
Spherical regression explores relationships between variables on spherical domains. We develop a nonparametric model that uses a diffeomorphic map from a sphere to itself. The restriction of this mapping to diffeomorphisms is natural in several settings. The model is estimated in a penalized maximum-likelihood framework using gradient-based optimization. Towards that goal, we specify a first-order roughness penalty using the Jacobian of diffeomorphisms. We compare the prediction performance of the proposed model with state-of-the-art methods using simulated and real data involving cloud deformations, wind directions, and vector-cardiograms. This model is found to outperform others in capturing relationships between spherical variables.
Submitted 2 February, 2017;
originally announced February 2017.
-
Phase-Amplitude Separation and Modeling of Spherical Trajectories
Authors:
Zhengwu Zhang,
Eric Klassen,
Anuj Srivastava
Abstract:
This paper studies the problem of separating phase-amplitude components in sample paths of a spherical process (longitudinal data on a unit two-sphere). Such separation is essential for efficient modeling and statistical analysis of spherical longitudinal data in a manner that is invariant to any phase variability. The key idea is to represent each path or trajectory with a pair of variables, a starting point and a Transported Square-Root Velocity Curve (TSRVC). A TSRVC is a curve in the tangent (vector) space at the starting point and has some important invariance properties under the L2 norm. The space of all such curves forms a vector bundle and the L2 norm, along with the standard Riemannian metric on S2, provides a natural metric on this vector bundle. This invariant representation allows for separating phase and amplitude components in given data, using a template-based idea. Furthermore, the metric property is useful in deriving computational procedures for clustering, mean computation, principal component analysis (PCA), and modeling. This comprehensive framework is demonstrated using two datasets: a set of bird-migration trajectories and a set of hurricane paths in the Atlantic Ocean.
Submitted 23 March, 2016;
originally announced March 2016.
-
Statistical analysis of trajectories on Riemannian manifolds: Bird migration, hurricane tracking and video surveillance
Authors:
Jingyong Su,
Sebastian Kurtek,
Eric Klassen,
Anuj Srivastava
Abstract:
We consider the statistical analysis of trajectories on Riemannian manifolds that are observed under arbitrary temporal evolutions. Past methods rely on cross-sectional analysis, with the given temporal registration, and consequently may lose the mean structure and artificially inflate observed variances. We introduce a quantity that provides both a cost function for temporal registration and a proper distance for comparison of trajectories. This distance is used to define statistical summaries, such as sample means and covariances, of synchronized trajectories and "Gaussian-type" models to capture their variability at discrete times. It is invariant to identical time-warpings (or temporal reparameterizations) of trajectories. This is based on a novel mathematical representation of trajectories, termed transported square-root vector field (TSRVF), and the $\mathbb{L}^2$ norm on the space of TSRVFs. We illustrate this framework using three representative manifolds---$\mathbb{S}^2$, $\mathrm {SE}(2)$ and shape space of planar contours---involving both simulated and real data. In particular, we demonstrate: (1) improvements in mean structures and significant reductions in cross-sectional variances using real data sets, (2) statistical modeling for capturing variability in aligned trajectories, and (3) evaluating random trajectories under these models. Experimental results concern bird migration, hurricane tracking and video surveillance.
Submitted 5 May, 2014;
originally announced May 2014.
-
Registration of Functional Data Using Fisher-Rao Metric
Authors:
Anuj Srivastava,
Wei Wu,
Sebastian Kurtek,
Eric Klassen,
J. S. Marron
Abstract:
We introduce a novel geometric framework for separating the phase and the amplitude variability in functional data of the type frequently studied in growth curve analysis. This framework uses the Fisher-Rao Riemannian metric to derive a proper distance on the quotient space of functions modulo the time-warping group. A convenient square-root velocity function (SRVF) representation transforms the Fisher-Rao metric into the standard $\mathbb{L}^2$ metric, simplifying the computations. This distance is then used to define a Karcher mean template and warp the individual functions to align them with the Karcher mean template. The strength of this framework is demonstrated by deriving a consistent estimator of a signal observed under random warping, scaling, and vertical translation. These ideas are demonstrated using both simulated and real data from different application domains: the Berkeley growth study, handwritten signature curves, neuroscience spike trains, and gene expression signals. The proposed method is empirically shown to be superior in performance to several recently published methods for functional alignment.
Submitted 16 May, 2011; v1 submitted 19 March, 2011;
originally announced March 2011.
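The SRVF representation mentioned in the abstract above maps a function f to q(t) = sign(f'(t)) sqrt(|f'(t)|). A small numerical sketch can illustrate why this is convenient: the L2 distance between SRVFs is unchanged when both functions are warped by the same time-warping γ. This is only an illustration under stated assumptions, not the authors' implementation; the test functions `f1` and `f2`, the warping `gamma`, the grid size, and the helper names `srvf` and `l2_distance` are hypothetical.

```python
import numpy as np

def srvf(f_vals, t):
    """Square-root velocity function q = sign(f') * sqrt(|f'|) on grid t."""
    df = np.gradient(f_vals, t)
    return np.sign(df) * np.sqrt(np.abs(df))

def l2_distance(q1, q2, t):
    """L2 distance between two sampled functions (trapezoidal rule)."""
    d2 = (q1 - q2) ** 2
    return np.sqrt(np.sum(0.5 * (d2[1:] + d2[:-1]) * np.diff(t)))

t = np.linspace(0.0, 1.0, 2001)

# Two hypothetical functions on [0, 1].
f1 = lambda s: np.sin(2 * np.pi * s)
f2 = lambda s: s + 0.5 * np.sin(4 * np.pi * s)

# A hypothetical warping: increasing, with gamma(0) = 0 and gamma(1) = 1.
gamma = t + 0.1 * np.sin(2 * np.pi * t)

# Distance between SRVFs before and after warping both functions by gamma.
d_before = l2_distance(srvf(f1(t), t), srvf(f2(t), t), t)
d_after = l2_distance(srvf(f1(gamma), t), srvf(f2(gamma), t), t)
```

Up to discretization error, `d_before` and `d_after` agree, which is the invariance that makes the SRVF distance suitable for defining a distance modulo the time-warping group.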