-
Two-Sample Tests for Optimal Lifts, Manifold Stability and Reverse Labeling Reflection Shap
Authors:
Do Tran Van,
Susovan Pal,
Benjamin Eltzner,
Stephan F. Huckemann
Abstract:
We consider a quotient of a complete Riemannian manifold modulo an isometrically and properly acting Lie group and lifts of the quotient to the manifold in optimal position to a reference point on the manifold. With respect to the pushed forward Riemannian volume onto the quotient we derive continuity and uniqueness a.e. and smoothness to a large extent, also with respect to the reference point. In consequence we derive a general manifold stability theorem: the Fréchet mean lies in the highest dimensional stratum assumed with positive probability, and a strong law for optimal lifts. This allows us to define new two-sample tests utilizing individual optimal lifts which outperform existing two-sample tests on simulated data. They also outperform existing tests on a newly derived reverse labeling reflection shape space that is used to model filament data of microtubules within cells in a biological application.
Submitted 22 March, 2025;
originally announced March 2025.
-
Constrained Shape Analysis with Applications to RNA Structure
Authors:
Kanti V. Mardia,
Benjamin Eltzner,
Stephan F. Huckemann
Abstract:
In many applications of shape analysis, lengths between some landmarks are constrained. For instance, biomolecules often have some bond lengths and some bond angles constrained, and variation occurs only along unconstrained bonds and constrained bonds' torsions where the latter are conveniently modelled by dihedral angles. Our work has been motivated by low resolution biomolecular chain RNA where only some prominent atomic bonds can be well identified. Here, we propose a new modelling strategy for such constrained shape analysis starting with a product of polar coordinates (polypolars), where, due to constraints, for example, some radial coordinates should be omitted, leaving products of spheres (polyspheres). We give insight into these coordinates for particular cases such as five landmarks which are motivated by a practical RNA application. We also discuss distributions for polypolar coordinates and give a specific methodology with illustration when the constrained size-and-shape variables are concentrated. There are applications of this in clustering and we give some insight into a modified version of the MINT-AGE algorithm.
Submitted 22 February, 2025;
originally announced February 2025.
-
Drift Models on Complex Projective Space for Electron-Nuclear Double Resonance
Authors:
Henrik Wiechers,
Markus Zobel,
Marina Bennati,
Igor Tkach,
Benjamin Eltzner,
Stephan Huckemann,
Yvo Pokern
Abstract:
ENDOR spectroscopy is an important tool to determine the complicated three-dimensional structure of biomolecules and in particular enables measurements of intramolecular distances. Usually, spectra are determined by averaging the data matrix, which does not take into account the significant thermal drifts that occur in the measurement process. In contrast, we present an asymptotic analysis for the homoscedastic drift model, a pioneering parametric model that achieves striking model fits in practice and allows both hypothesis testing and confidence intervals for spectra. The ENDOR spectrum and an orthogonal component are modeled as an element of complex projective space, and formulated in the framework of generalized Fréchet means. To this end, two general formulations of strong consistency for set-valued Fréchet means are extended and subsequently applied to the homoscedastic drift model to prove strong consistency. Building on this, central limit theorems for the ENDOR spectrum are shown. Furthermore, we extend applicability by taking into account a phase noise contribution leading to the heteroscedastic drift model. Both drift models offer improved signal-to-noise ratio over pre-existing models.
Submitted 23 July, 2023;
originally announced July 2023.
-
Exploring Uniform Finite Sample Stickiness
Authors:
Susanne Ulmer,
Do Tran Van,
Stephan F. Huckemann
Abstract:
It is well known that Fréchet means on non-Euclidean spaces may exhibit nonstandard asymptotic rates depending on curvature. Even for distributions featuring standard asymptotic rates, there are non-Euclidean effects, altering finite sampling rates up to considerable sample sizes. These effects can be measured by the variance modulation function proposed by Pennec (2019). Among others, in view of statistical inference, it is important to bound this function on intervals of sampling sizes. As a first step in this direction, for the special case of a K-spider we give such an interval, based only on folded moments and total probabilities of spider legs, and illustrate the method by simulations.
Submitted 17 May, 2023;
originally announced May 2023.
-
Diffusion Means in Geometric Spaces
Authors:
Benjamin Eltzner,
Pernille Hansen,
Stephan F. Huckemann,
Stefan Sommer
Abstract:
We introduce a location statistic for distributions on non-linear geometric spaces, the diffusion mean, serving as an extension and an alternative to the Fréchet mean. The diffusion mean arises as the generalization of Gaussian maximum likelihood analysis to non-linear spaces by maximizing the likelihood of a Brownian motion. The diffusion mean depends on a time parameter $t$, which admits the interpretation of the allowed variance of the diffusion. The diffusion $t$-mean of a distribution $X$ is the most likely origin of a Brownian motion at time $t$, given the end-point distribution $X$. We give a detailed description of the asymptotic behavior of the diffusion estimator and provide sufficient conditions for the diffusion estimator to be strongly consistent. Particularly, we present a smeary central limit theorem for diffusion means and we show that joint estimation of the mean and diffusion variance rules out smeariness in all directions simultaneously in general situations. Furthermore, we investigate properties of the diffusion mean for distributions on the sphere $\mathbb S^n$. Experimentally, we consider simulated data and data from magnetic pole reversals, all indicating similar or improved convergence rate compared to the Fréchet mean. Here, we additionally estimate $t$ and consider its effects on smeariness and uniqueness of the diffusion mean for distributions on the sphere.
Submitted 4 December, 2022; v1 submitted 25 May, 2021;
originally announced May 2021.
-
Clustering Schemes on the Torus with Application to RNA Clashes
Authors:
Henrik Wiechers,
Benjamin Eltzner,
Stephan F. Huckemann,
Kanti V. Mardia
Abstract:
Molecular structures of RNA molecules reconstructed from X-ray crystallography frequently contain errors. Motivated by this problem we examine clustering on a torus since RNA shapes can be described by dihedral angles. A previously developed clustering method for torus data involves two tuning parameters and we assess clustering results for different parameter values in relation to the problem of so-called RNA clashes. This clustering problem is part of the dynamically evolving field of statistics on manifolds. Statistical problems on the torus highlight general challenges for statistics on manifolds. Therefore, the torus PCA and clustering methods we propose make an important contribution to directional statistics and statistics on manifolds in general.
Submitted 28 February, 2021;
originally announced April 2021.
-
Finite Sample Smeariness on Spheres
Authors:
Benjamin Eltzner,
Shayan Hundrieser,
Stephan F. Huckemann
Abstract:
Finite Sample Smeariness (FSS) has been recently discovered. It means that the distribution of sample Fréchet means of underlying rather unsuspicious random variables can behave as if it were smeary for quite large regimes of finite sample sizes. In effect, classical quantile-based statistical testing procedures do not preserve nominal size: they reject too often under the null hypothesis. Suitably designed bootstrap tests, however, correct for FSS. On the circle it has been known that arbitrarily sized FSS is possible, and that all distributions with a nonvanishing density feature FSS. These results are extended to spheres of arbitrary dimension. In particular, all rotationally symmetric distributions, not necessarily supported on the entire sphere, feature FSS of Type I. While on the circle there is also FSS of Type II, it is conjectured that this is not possible on higher-dimensional spheres.
Submitted 28 February, 2021;
originally announced March 2021.
-
Characteristic and Necessary Minutiae in Fingerprints
Authors:
Johannes Wieditz,
Yvo Pokern,
Dominic Schuhmacher,
Stephan Huckemann
Abstract:
Fingerprints feature a ridge pattern with moderately varying ridge frequency (RF), following an orientation field (OF), which usually features some singularities. Additionally at some points, called minutiae, ridge lines end or fork and this point pattern is usually used for fingerprint identification and authentication. Whenever the OF features divergent ridge lines (e.g. near singularities), a nearly constant RF necessitates the generation of more ridge lines, originating at minutiae. We call these the necessary minutiae. It turns out that fingerprints feature additional minutiae which occur at rather arbitrary locations. We call these the random minutiae or, since they may convey fingerprint individuality beyond the OF, the characteristic minutiae. In consequence, the minutiae point pattern is assumed to be a realization of the superposition of two stochastic point processes: a Strauss point process (whose activity function is given by the divergence field) with an additional hard core, and a homogeneous Poisson point process, modelling the necessary and the characteristic minutiae, respectively. We perform Bayesian inference using an MCMC-based minutiae separating algorithm (MiSeal). In simulations, it provides good mixing and good estimation of underlying parameters. In application to fingerprints, we can separate the two minutiae patterns and verify by example of two different prints with similar OF that characteristic minutiae convey fingerprint individuality.
Submitted 1 June, 2021; v1 submitted 16 September, 2020;
originally announced September 2020.
-
Finite Sample Smeariness of Fréchet Means and Application to Climate
Authors:
Shayan Hundrieser,
Benjamin Eltzner,
Stephan F. Huckemann
Abstract:
Fréchet means on non-Euclidean spaces may exhibit nonstandard asymptotic rates rendering quantile-based asymptotic inference inapplicable. We show here that this affects, among others, all circular distributions whose support exceeds a half circle. We exhaustively describe this phenomenon and introduce a new concept which we call finite sample smeariness (FSS). In the presence of FSS, it turns out that quantile-based tests for equality of Fréchet means systematically feature effective levels higher than their nominal level, which persists asymptotically in case of Type I FSS. In contrast, suitable bootstrap-based tests correct for FSS and asymptotically attain the correct level. For illustration of the relevance of FSS in real data, we apply our method to directional wind data from two European cities. It turns out that quantile-based tests, not correcting for FSS, find a multitude of significant wind changes. This multitude condenses to a few years featuring significant wind changes when our bootstrap tests, correcting for FSS, are applied.
Submitted 26 July, 2021; v1 submitted 5 May, 2020;
originally announced May 2020.
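For readers unfamiliar with the object these abstracts study, the following is a minimal sketch of a sample Fréchet mean on the circle, i.e. a minimizer of the mean squared arc-length distance to the data. It is not code from any of the listed papers. All function names are hypothetical, and the candidate enumeration relies on an assumption not stated in the abstracts: that every local minimizer of the circular Fréchet function has the form (arithmetic mean of the wrapped angles) + 2πk/n for some integer k.

```python
import math

TWO_PI = 2 * math.pi

def wrap(angle):
    """Wrap an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % TWO_PI - math.pi

def circular_distance(a, b):
    """Arc-length (intrinsic) distance between two angles, in [0, pi]."""
    d = (a - b) % TWO_PI
    return min(d, TWO_PI - d)

def frechet_function(mu, angles):
    """Mean squared intrinsic distance from mu to the sample."""
    return sum(circular_distance(mu, t) ** 2 for t in angles) / len(angles)

def intrinsic_circular_mean(angles):
    """Return one minimizer of the sample Fréchet function on the circle.

    Assumption (hypothetical helper, see lead-in): it suffices to compare
    the n candidates base + 2*pi*k/n, k = 0..n-1, where base is the
    arithmetic mean of the angles wrapped to [-pi, pi).
    """
    n = len(angles)
    wrapped = [wrap(t) for t in angles]
    base = sum(wrapped) / n
    candidates = [wrap(base + TWO_PI * k / n) for k in range(n)]
    return min(candidates, key=lambda mu: frechet_function(mu, wrapped))
```

The minimizer need not be unique (for example, for two antipodal points); in that case this sketch returns just one of the minimizers.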
-
Information geometry for phylogenetic trees
Authors:
Maryam K. Garba,
Tom M. W. Nye,
Jonas Lueg,
Stephan F. Huckemann
Abstract:
We propose a new space of phylogenetic trees which we call wald space. The motivation is to develop a space suitable for statistical analysis of phylogenies, but with a geometry based on more biologically principled assumptions than existing spaces: in wald space, trees are close if they induce similar distributions on genetic sequence data. As a point set, wald space contains the previously developed Billera-Holmes-Vogtmann (BHV) tree space; it also contains disconnected forests, like the edge-product (EP) space but without certain singularities of the EP space. We investigate two related geometries on wald space. The first is the geometry of the Fisher information metric of character distributions induced by the two-state symmetric Markov substitution process on each tree. Infinitesimally, the metric is proportional to the Kullback-Leibler divergence, or equivalently, as we show, to any f-divergence. The second geometry is obtained analogously but using a related continuous-valued Gaussian process on each tree, and it can be viewed as the trace metric of the affine-invariant metric for covariance matrices. We derive a gradient descent algorithm to project from the ambient space of covariance matrices to wald space. For both geometries we derive computational methods to compute geodesics in polynomial time and show numerically that the two information geometries (discrete and continuous) are very similar. In particular geodesics are approximated extrinsically. Comparison with the BHV geometry shows that our canonical and biologically motivated space is substantially different.
Submitted 17 September, 2020; v1 submitted 29 March, 2020;
originally announced March 2020.
-
Confidence Tubes for Curves on SO(3) and Identification of Subject-Specific Gait Change after Kneeling
Authors:
Fabian J. E. Telschow,
Michael R. Pierrynowski,
Stephan F. Huckemann
Abstract:
In order to identify changes of gait patterns, e.g. due to prolonged occupational kneeling, which is believed to be a major risk factor, among others, for the development of knee osteoarthritis, we develop confidence tubes for curves following a Gaussian perturbation model on SO(3). These are based on an application of the Gaussian kinematic formula to a process of Hotelling statistics and we approximate them by a computable version, for which we show convergence. Simulations endorse our method, which, in application to gait curves from eight volunteers undergoing kneeling tasks, identifies phases of the gait cycle that have changed due to kneeling tasks. We find that after kneeling, deviation from normal gait is stronger, in particular for older aged male volunteers. Notably our method adjusts for different walking speeds and marker replacement at different visits.
Submitted 14 September, 2019;
originally announced September 2019.
-
Detecting Anisotropy in Fingerprint Growth
Authors:
Karla Markert,
Karolin Krehl,
Carsten Gottschlich,
Stephan F. Huckemann
Abstract:
From infancy to adulthood, human growth is anisotropic, much more along the proximal-distal axis (height) than along the medial-lateral axis (width), particularly at extremities. Detecting and modeling the rate of anisotropy in fingerprint growth, and possibly other growth patterns as well, facilitates the use of children's fingerprints for long-term biometric identification. Using standard fingerprint scanners, anisotropic growth is highly overshadowed by the varying distortions created by each imprint, and it seems that this difficulty has to date hampered the development of suitable methods for detecting anisotropy, let alone designing models. We provide a tool chain to statistically detect, with a given confidence, anisotropic growth in fingerprints and its preferred axis, where we only require a standard fingerprint scanner and a minutiae matcher. We build on a perturbation model and a new Procrustes-type algorithm, and use and develop several parametric and non-parametric tests for different hypotheses, in particular for neighborhood hypotheses to detect the axis of anisotropy, where the latter tests are tunable to measurement accuracy. Taking into account realistic distortions caused by pressing fingers on scanners, our simulations based on real data indicate that, for example, already in rather small samples (56 matches) we can significantly detect proximal-distal growth if it exceeds medial-lateral growth by only around 5 percent. Our method is well applicable to future datasets of children's fingerprint time series and we provide an implementation of our algorithms and tests with matched minutiae pattern data.
Submitted 19 January, 2018;
originally announced January 2018.
-
Möbius Moduli for Fingerprint Orientation Fields
Authors:
Christina Imdahl,
Carsten Gottschlich,
Stephan Huckemann,
Ken'ichi Ohshika
Abstract:
We propose a novel fingerprint descriptor, namely Möbius moduli, measuring local deviation of orientation fields (OF) of fingerprints from conformal fields, and we propose a method to robustly measure them, based on tetraquadrilaterals to approximate a conformal modulus locally with one due to a Möbius transformation. Conformal fields arise by the approximation of fingerprint OFs given by zero-pole models, which are determined by the singular points and a rotation. This approximation is very coarse, e.g. for fingerprints with no singular points (arch type), the zero-pole model's OF has parallel lines. Quadratic differential (QD) models, which are obtained from zero-pole models by adding suitable singularities outside the observation window, approximate real fingerprints much better. For example, for arch type fingerprints, parallel lines along the distal joint change slowly into circular lines around the nail furrow. Still, QD models are not fully realistic because, for example along the central axis of arch type fingerprints, ridge line curvatures usually first increase and then decrease again. It is impossible to model this with QDs, which, due to complex analyticity, also produce conformal fields only. In fact, as one of many applications of the new descriptor, we show, using histograms of curvature and conformality index (log of the absolute value of the Möbius modulus), that local deviation from conformality in fingerprints occurs systematically at high curvature. This is not reflected by state-of-the-art fingerprint models as are used, for instance, in the well-known synthetic fingerprint generation tool SFinGe, and these differences robustly discriminate real prints from SFinGe's synthetic prints.
Submitted 7 August, 2017;
originally announced August 2017.
-
Small sphere distributions for directional data with application to medical imaging
Authors:
Byungwon Kim,
Stephan Huckemann,
Jörn Schulz,
Sungkyu Jung
Abstract:
We propose new small-sphere distributional families for modeling multivariate directional data on $(\mathbb{S}^{p-1})^K$ for $p \ge 3$ and $K \ge 1$. In a special case of univariate directions in $\Re^3$, the new densities model random directions on $\mathbb{S}^2$ with a tendency to vary along a small circle on the sphere, and with a unique mode on the small circle. The proposed multivariate densities enable us to model association among multivariate directions, and are useful in medical imaging, where multivariate directions are used to represent shape and shape changes of 3-dimensional objects. When the underlying objects are rotationally deformed under noise, for instance, twisted and/or bent, corresponding directions tend to follow the proposed small-sphere distributions. The proposed models have several advantages over other methods analyzing small-circle-concentrated data, including inference procedures on the association and small-circle fitting. We demonstrate the use of the proposed multivariate small-sphere distributions in analyses of skeletally-represented object shapes and human knee gait data.
Submitted 28 May, 2017;
originally announced May 2017.
-
Functional Inference on Rotational Curves and Identification of Human Gait at the Knee Joint
Authors:
Fabian J. E. Telschow,
Stephan F. Huckemann,
Michael R. Pierrynowski
Abstract:
We extend Gaussian perturbation models in classical functional data analysis to the three-dimensional rotational group where a zero-mean Gaussian process in the Lie algebra under the Lie exponential spreads multiplicatively around a central curve. As an estimator, we introduce point-wise extrinsic mean curves which feature strong perturbation consistency, and which are asymptotically a.s. unique and differentiable, if the model is so. Further, we consider the group action of time warping and that of spatial isometries that are connected to the identity. The latter can be asymptotically consistently estimated if lifted to the unit quaternions. Introducing a generic loss for Lie groups, the former can be estimated, and based on curve length, due to asymptotic differentiability, we propose two-sample permutation tests involving various combinations of the group actions. This methodology allows inference on gait patterns due to the rotational motion of the lower leg with respect to the upper leg. This was previously not possible because, among others, the usual analysis of separate Euler angles is not independent of marker placement, even if performed by trained specialists.
Submitted 11 November, 2016;
originally announced November 2016.
-
Backward Nested Descriptors Asymptotics with Inference on Stem Cell Differentiation
Authors:
Stephan F. Huckemann,
Benjamin Eltzner
Abstract:
For sequences of random backward nested subspaces as occur, say, in dimension reduction for manifold or stratified space valued data, asymptotic results are derived. In fact, we formulate our results more generally for backward nested families of descriptors (BNFD). Under rather general conditions, asymptotic strong consistency holds. Under additional, still rather general hypotheses, among them existence of a.s. local twice differentiable charts, asymptotic joint normality of a BNFD can be shown. If charts factor suitably, this leads to individual asymptotic normality for the last element, a principal nested mean or a principal nested geodesic, say. It turns out that these results pertain to principal nested spheres (PNS) and principal nested great subsphere (PNGS) analysis by Jung et al. (2010) as well as to the intrinsic mean on a first geodesic principal component (IMo1GPC) for manifolds and Kendall's shape spaces. A nested bootstrap two-sample test is derived and illustrated with simulations. In a study on real data, PNGS is applied to track early human mesenchymal stem cell differentiation over a coarse time grid and, among others, to locate a change point with direct consequences for the design of further studies.
Submitted 3 September, 2016;
originally announced September 2016.
-
Torus Principal Component Analysis with an Application to RNA Structures
Authors:
Benjamin Eltzner,
Stephan Huckemann,
Kanti V. Mardia
Abstract:
There are several cutting-edge applications needing PCA methods for data on tori and we propose a novel torus-PCA method with important properties that can be generally applied. There are two existing general methods: tangent space PCA and geodesic PCA. However, unlike tangent space PCA, our torus-PCA honors the cyclic topology of the data space whereas, unlike geodesic PCA, our torus-PCA produces a variety of non-winding, non-dense descriptors. This is achieved by deforming tori into spheres and then using a variant of the recently developed principal nested spheres analysis. This PCA analysis involves a step of small sphere fitting and we provide an improved test to avoid overfitting. However, deforming tori into spheres creates singularities. We introduce a data-adaptive pre-clustering technique to keep the singularities away from the data. For the frequently encountered case that the residual variance around the PCA main component is small, we use a post-mode hunting technique for more fine-grained clustering. Thus in general, there are three successive interrelated key steps of torus-PCA in practice: pre-clustering, deformation, and post-mode hunting. We illustrate our method with two recently studied RNA structure (tori) data sets: one is a small RNA data set which is established as the benchmark for PCA and we validate our method through this data. The other is a large RNA data set (containing the small RNA data set) for which we show that our method provides interpretable principal components as well as giving further insight into its structure.
Submitted 16 November, 2015;
originally announced November 2015.
-
The circular SiZer, inferred persistence of shape parameters and application to early stem cell differentiation
Authors:
Stephan Huckemann,
Kwang-Rae Kim,
Axel Munk,
Florian Rehfeldt,
Max Sommerfeld,
Joachim Weickert,
Carina Wollnik
Abstract:
We generalize the SiZer of Chaudhuri and Marron (J. Amer. Statist. Assoc. 94 (1999) 807-823, Ann. Statist. 28 (2000) 408-428) for the detection of shape parameters of densities on the real line to the case of circular data. It turns out that only the wrapped Gaussian kernel gives a symmetric, strongly Lipschitz semi-group satisfying "circular" causality, that is, not introducing possibly artificial modes with increasing levels of smoothing. Some notable differences between Euclidean and circular scale space theory are highlighted. Based on this, we provide an asymptotic theory to make inference about the persistence of shape features. The resulting circular mode persistence diagram is applied to the analysis of early mechanically-induced differentiation in adult human stem cells from their actin-myosin filament structure. As a consequence, the circular SiZer based on the wrapped Gaussian kernel (WiZer) allows the verification at a controlled error level of the observation reported by Zemel et al. (Nat. Phys. 6 (2010) 468-473): Within early stem cell differentiation, polarizations of stem cells exhibit preferred directions in three different micro-environments.
Submitted 5 July, 2016; v1 submitted 12 April, 2014;
originally announced April 2014.
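
The wrapped Gaussian smoothing described in the abstract above can be illustrated with a minimal sketch. It is not the authors' WiZer implementation and contains none of their inference machinery: the function names, the truncation of the wrapping sum at `n_wraps` terms, and the naive grid-based mode count are all assumptions made for illustration.

```python
import numpy as np

def wrapped_gaussian_kernel(theta, bandwidth, n_wraps=10):
    """Wrapped Gaussian density at angle(s) theta.

    The infinite wrapping sum is truncated at |k| <= n_wraps, which is
    an approximation that is adequate for moderate bandwidths only.
    """
    theta = np.asarray(theta, dtype=float)
    k = np.arange(-n_wraps, n_wraps + 1)
    shifted = theta[..., None] + 2.0 * np.pi * k
    dens = np.exp(-0.5 * (shifted / bandwidth) ** 2)
    return dens.sum(axis=-1) / (bandwidth * np.sqrt(2.0 * np.pi))

def circular_kde(grid, data, bandwidth):
    """Kernel density estimate on the circle, evaluated on a grid of angles."""
    grid = np.asarray(grid, dtype=float)
    data = np.asarray(data, dtype=float)
    diffs = grid[:, None] - data[None, :]
    return wrapped_gaussian_kernel(diffs, bandwidth).mean(axis=1)

def count_modes(density):
    """Count strict local maxima of values sampled on a periodic grid."""
    left = np.roll(density, 1)
    right = np.roll(density, -1)
    return int(np.sum((density > left) & (density > right)))

# Hypothetical data: three observations at angle 0 and one at angle pi.
grid = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
data = np.array([0.0, 0.0, 0.0, np.pi])
modes_fine = count_modes(circular_kde(grid, data, bandwidth=0.3))
modes_coarse = count_modes(circular_kde(grid, data, bandwidth=1.5))
```

In this toy example the number of modes does not grow when the bandwidth is increased, which is the "circular causality" behaviour the abstract attributes to the wrapped Gaussian kernel.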
-
Drift Estimation in Sparse Sequential Dynamic Imaging: with Application to Nanoscale Fluorescence Microscopy
Authors:
Alexander Hartmann,
Stephan Huckemann,
Jörn Dannemann,
Oskar Laitenberger,
Claudia Geisler,
Alexander Egner,
Axel Munk
Abstract:
A major challenge in many modern superresolution fluorescence microscopy techniques at the nanoscale lies in the correct alignment of long sequences of sparse but spatially and temporally highly resolved images. This is caused by the temporal drift of the protein structure, e.g. due to temporal thermal inhomogeneity of the object of interest or its supporting area during the observation process. We develop a simple semiparametric model for drift correction in SMS microscopy. Then we propose an M-estimator for the drift and show its asymptotic normality. This is used to correct the final image, and it is shown that this purely statistical method is competitive with state-of-the-art calibration techniques which require incorporating fiducial markers into the specimen. Moreover, a simple bootstrap algorithm allows one to quantify the precision of the drift estimate and its effect on the final image estimation. We argue that purely statistical drift correction is even more robust than fiducial tracking, rendering the latter superfluous in many applications. The practicability of our method is demonstrated by a simulation study and by an SMS application. This serves as a prototype for many other typical imaging techniques where sparse observations with high temporal resolution are blurred by motion of the object to be reconstructed.
Submitted 21 December, 2014; v1 submitted 6 March, 2014;
originally announced March 2014.
-
Intrinsic Means on the Circle: Uniqueness, Locus and Asymptotics
Authors:
Thomas Hotz,
Stephan Huckemann
Abstract:
This paper gives a comprehensive treatment of local uniqueness, asymptotics and numerics for intrinsic means on the circle. It turns out that local uniqueness as well as rates of convergence are governed by the distribution near the antipode. In a nutshell, if the distribution there is locally less than uniform, we have local uniqueness and asymptotic normality with a rate of $1/\sqrt{n}$. With increased proximity to the uniform distribution the rate can be arbitrarily slow, and in the limit, local uniqueness is lost. Further, we give general distributional conditions, e.g. unimodality, that ensure global uniqueness. Along the way, we discover that sample means can occur only at the vertices of a regular polygon, which allows one to compute intrinsic sample means in linear time from sorted data. This algorithm is finally applied in a simulation study demonstrating the dependence of the convergence rates on the behavior of the density at the antipode.
Submitted 10 August, 2011;
originally announced August 2011.
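
The regular-polygon property mentioned in the abstract above can be sketched in code. The sketch below is a brute-force O(n²) illustration, not the linear-time algorithm of the paper. It assumes that the polygon consists of the n points obtained by shifting the ordinary average of the angle representatives in [0, 2π) by multiples of 2π/n; the function names are hypothetical.

```python
import numpy as np

def arc_distance(a, b):
    """Arc-length (intrinsic) distance between angles on the unit circle."""
    d = np.abs(a - b) % (2.0 * np.pi)
    return np.minimum(d, 2.0 * np.pi - d)

def intrinsic_circle_mean(angles):
    """Return one minimizer of the sample Frechet function on the circle.

    Assumption: every local minimizer lies on the regular n-gon
    {mean(theta) + 2*pi*k/n}, so it suffices to evaluate the sum of
    squared arc distances at these n candidates. If the minimizer is
    not unique, the first one found is returned.
    """
    theta = np.mod(np.asarray(angles, dtype=float), 2.0 * np.pi)
    n = theta.size
    candidates = np.mod(theta.mean() + 2.0 * np.pi * np.arange(n) / n,
                        2.0 * np.pi)
    frechet = (arc_distance(candidates[:, None], theta[None, :]) ** 2).sum(axis=1)
    return candidates[np.argmin(frechet)]

# Two points straddling angle 0: the naive average of the representatives
# is pi, whereas the intrinsic mean is 0 (modulo 2*pi).
mean_wrapped = intrinsic_circle_mean([0.1, 2.0 * np.pi - 0.1])
mean_plain = intrinsic_circle_mean([1.0, 1.2, 1.4])
```

Evaluating all n candidates against all n data points keeps the sketch short; exploiting sorted data to update the Fréchet function incrementally is what would be needed to approach the linear-time behaviour the abstract reports.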
-
Intrinsic Inference on the Mean Geodesic of Planar Shapes and Tree Discrimination by Leaf Growth
Authors:
Stephan Huckemann
Abstract:
For planar landmark-based shapes, taking into account the non-Euclidean geometry of the shape space, a statistical test for a common mean first geodesic principal component (GPC) is devised. It rests on one of two asymptotic scenarios, both of which are identical in a Euclidean geometry. For both scenarios, strong consistency and central limit theorems are established, along with an algorithm for the computation of a Ziezold mean geodesic. In application, this allows one to verify the geodesic hypothesis for leaf growth of Canadian black poplars and to discriminate genetically different trees by observations of leaf shape growth over brief time intervals. With a test based on Procrustes tangent space coordinates, not involving the shape space's curvature, neither can be achieved.
Submitted 16 September, 2010;
originally announced September 2010.
-
On the meaning of mean shape
Authors:
Stephan Huckemann
Abstract:
Various concepts of mean shape previously unrelated in the literature are brought into relation. In particular, for non-manifolds such as Kendall's 3D shape space, this paper answers the question of for which means one may apply a two-sample test. The answer is positive if intrinsic or Ziezold means are used. The underlying general result of manifold stability of a mean on a shape space, the quotient due to an isometric action of a compact Lie group on a Riemannian manifold, blends the Slice Theorem from differential geometry with the statistics of shape. For 3D Procrustes means, however, a counterexample is given. To further elucidate subtleties of means, for spheres and Kendall's shape spaces, a first order relationship between intrinsic, residual/Procrustean and extrinsic/Ziezold means is derived, stating that for high concentration the latter approximately divides the (generalized) geodesic segment between the former two by the ratio $1:3$. This fact, consequences of coordinate choices for the power of tests, and other details, e.g. that extrinsic Schoenberg means may increase dimension, are discussed and illustrated by simulations and exemplary datasets.
Submitted 12 May, 2011; v1 submitted 3 February, 2010;
originally announced February 2010.
-
Inference on 3D Procrustes means: tree bole growth, rank-deficient diffusion tensors and perturbation models
Authors:
Stephan Huckemann
Abstract:
The Central Limit Theorem (CLT) for extrinsic and intrinsic means on manifolds is extended to a generalization of Fréchet means. Examples are the Procrustes mean for 3D Kendall shapes as well as a mean introduced by Ziezold. This allows for one-sample tests previously not possible, and to numerically assess the 'inconsistency of the Procrustes mean' for a perturbation model and 'inconsistency' within a model recently proposed for diffusion tensor imaging. Also it is shown that the CLT can be extended to mildly rank deficient diffusion tensors. An application to forestry gives the temporal evolution of Douglas fir tree stems tending strongly towards cylinders at early ages and tending away with increased competition.
Submitted 2 February, 2011; v1 submitted 3 February, 2010;
originally announced February 2010.
-
Dynamic shape analysis and comparison of leaf growth
Authors:
Stephan Huckemann
Abstract:
In the statistical analysis of shape, a goal beyond the analysis of static shapes lies in the quantification of the 'same' deformation of different shapes. Typically, shape spaces are modelled as Riemannian manifolds on which parallel transport along geodesics naturally qualifies as a measure for the 'similarity' of deformation. Since these spaces are usually defined as combinations of Riemannian immersions and submersions, parallel transport along geodesics can be computed explicitly only for a few well-featured spaces such as spheres or complex projective spaces (which are Kendall's spaces for 2D shapes). In this contribution, a general numerical method is provided to compute parallel transport along geodesics when no explicit formula is available. This method is applied to the shape spaces of closed 2D contours based on angular direction and to Kendall's spaces of shapes of arbitrary dimension. In application to the temporal evolution of leaf shape over a growing period, one leaf's shape-growth dynamics can be applied to another leaf. For a specific poplar tree investigated, it is found that leaves of initially and terminally different shape evolve nearly in parallel, i.e. with comparable dynamics.
Submitted 3 February, 2010;
originally announced February 2010.
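
As a small illustration of the explicit case mentioned in the abstract above, parallel transport along a minimizing geodesic on the unit sphere has a closed form. The sketch below uses the standard formula for the round sphere; it is not the general numerical method of the paper, and the function name is hypothetical.

```python
import numpy as np

def sphere_parallel_transport(p, q, v):
    """Transport tangent vector v at p to q along the shortest great arc.

    p and q are unit vectors with q != -p, and v is orthogonal to p.
    Closed form for the unit sphere: v - <q, v> / (1 + <p, q>) * (p + q).
    """
    p, q, v = (np.asarray(x, dtype=float) for x in (p, q, v))
    c = p @ q
    if np.isclose(c, -1.0):
        raise ValueError("antipodal points: the minimizing geodesic is not unique")
    return v - (q @ v) / (1.0 + c) * (p + q)

p = np.array([1.0, 0.0, 0.0])
q = np.array([0.0, 1.0, 0.0])
# The velocity of the geodesic from p to q is rotated along with the arc,
w_along = sphere_parallel_transport(p, q, [0.0, 1.0, 0.0])
# while a vector normal to the plane of the arc is left unchanged.
w_normal = sphere_parallel_transport(p, q, [0.0, 0.0, 1.0])
```

The transported vector is tangent at `q` and has the same length as `v`, which are the two properties any parallel transport must satisfy.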