-
Data-driven modelling of nonlinear dynamics by barycentric coordinates and memory
Authors:
Niklas Wulkow,
Péter Koltai,
Vikram Sunkara,
Christof Schütte
Abstract:
We present a numerical method to model dynamical systems from data. We use the recently introduced method Scalable Probabilistic Approximation (SPA) to project points from a Euclidean space to convex polytopes and represent these projected states of a system in new, lower-dimensional coordinates denoting their position in the polytope. We then introduce a specific nonlinear transformation to construct a model of the dynamics in the polytope and to transform back into the original state space. To overcome the potential loss of information from the projection to a lower-dimensional polytope, we use memory in the sense of the delay-embedding theorem of Takens. By construction, our method produces stable models. We illustrate the capacity of the method to reproduce even chaotic dynamics and attractors with multiple connected components on various examples.
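The memory step mentioned in this abstract can be pictured with a generic delay-coordinate construction. The sketch below is only an illustration of stacking a state with its predecessors; it is not the authors' SPA pipeline, and the function name `delay_embed` and the parameter `memory_depth` are hypothetical.

```python
import numpy as np

def delay_embed(x, memory_depth):
    """Stack each state with its memory_depth - 1 predecessors (newest first).

    Hypothetical helper for illustration; not taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a scalar series as one-dimensional states
    n_steps, _ = x.shape
    if not 1 <= memory_depth <= n_steps:
        raise ValueError("memory_depth must be between 1 and the series length")
    n_rows = n_steps - memory_depth + 1
    # Block k holds the states delayed by k steps; row t of the result is
    # [x_{t+m-1}, x_{t+m-2}, ..., x_t] with m = memory_depth.
    blocks = [
        x[memory_depth - 1 - k : memory_depth - 1 - k + n_rows]
        for k in range(memory_depth)
    ]
    return np.hstack(blocks)

embedded = delay_embed(np.arange(5), memory_depth=3)
```

A model fitted on `embedded` sees the current state together with two past states, which is the sense in which memory can compensate for information lost in a projection.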
Submitted 16 February, 2022; v1 submitted 13 December, 2021;
originally announced December 2021.
-
Nonparametric approximation of conditional expectation operators
Authors:
Mattes Mollenhauer,
Péter Koltai
Abstract:
Given the joint distribution of two random variables $X,Y$ on some second countable locally compact Hausdorff space, we investigate the statistical approximation of the $L^2$-operator defined by $[Pf](x) := \mathbb{E}[ f(Y) \mid X = x ]$ under minimal assumptions. By modifying its domain, we prove that $P$ can be arbitrarily well approximated in operator norm by Hilbert-Schmidt operators acting on a reproducing kernel Hilbert space. This fact allows $P$ to be estimated uniformly by finite-rank operators over a dense subspace even when $P$ is not compact. In terms of modes of convergence, we thereby obtain the superiority of kernel-based techniques over classically used parametric projection approaches such as Galerkin methods. This also provides a novel perspective on which limiting object the nonparametric estimate of $P$ converges to. As an application, we show that these results are particularly important for a large family of spectral analysis techniques for Markov transition operators. Our investigation also gives a new asymptotic perspective on the so-called kernel conditional mean embedding, which is the theoretical foundation of a wide variety of techniques in kernel-based nonparametric inference.
Submitted 5 August, 2023; v1 submitted 23 December, 2020;
originally announced December 2020.
-
Kernel Autocovariance Operators of Stationary Processes: Estimation and Convergence
Authors:
Mattes Mollenhauer,
Stefan Klus,
Christof Schütte,
Péter Koltai
Abstract:
We consider autocovariance operators of a stationary stochastic process on a Polish space that is embedded into a reproducing kernel Hilbert space. We investigate how empirical estimates of these operators converge along realizations of the process under various conditions. In particular, we examine ergodic and strongly mixing processes and obtain several asymptotic results as well as finite sample error bounds. We provide applications of our theory in terms of consistency results for kernel PCA with dependent data and the conditional mean embedding of transition probabilities. Finally, we use our approach to examine the nonparametric estimation of Markov transition operators and highlight how our theory can give a consistency analysis for a large family of spectral analysis methods including kernel-based dynamic mode decomposition.
Submitted 29 November, 2022; v1 submitted 2 April, 2020;
originally announced April 2020.
-
Dimensionality Reduction of Complex Metastable Systems via Kernel Embeddings of Transition Manifolds
Authors:
Andreas Bittracher,
Stefan Klus,
Boumediene Hamzi,
Péter Koltai,
Christof Schütte
Abstract:
We present a novel kernel-based machine learning algorithm for identifying the low-dimensional geometry of the effective dynamics of high-dimensional multiscale stochastic systems. Recently, the authors developed a mathematical framework for the computation of optimal reaction coordinates of such systems that is based on learning a parametrization of a low-dimensional transition manifold in a certain function space. In this article, we enhance this approach by embedding and learning this transition manifold in a reproducing kernel Hilbert space, exploiting the favorable properties of kernel embeddings. Under mild assumptions on the kernel, the manifold structure is shown to be preserved under the embedding, and distortion bounds can be derived. This leads to a more robust and more efficient algorithm compared to previous parametrization approaches.
Submitted 3 February, 2020; v1 submitted 18 April, 2019;
originally announced April 2019.
-
Markov-chain-inspired search for MH370
Authors:
P. Miron,
F. J. Beron-Vera,
M. J. Olascoaga,
P. Koltai
Abstract:
Markov-chain models are constructed for the probabilistic description of the drift of marine debris from Malaysia Airlines flight MH370. En route from Kuala Lumpur to Beijing, the MH370 mysteriously disappeared in the southeastern Indian Ocean on 8 March 2014, somewhere along the arc of the 7th ping ring around the Inmarsat-3F1 satellite position when the airplane lost contact. The models are obtained by discretizing the motion of undrogued satellite-tracked surface drifting buoys from the global historical data bank. A spectral analysis, Bayesian estimation, and the computation of most probable paths between the Inmarsat arc and confirmed airplane debris beaching sites are shown to constrain the crash site, near 25$^{\circ}$S on the Inmarsat arc.
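Discretizing trajectories into a Markov chain, as this abstract describes, is commonly done by counting transitions between boxes and normalizing each row. The sketch below shows that generic counting estimator on toy data; the function name, the `lag` parameter, and the box labels are assumptions for illustration and do not reproduce the authors' drifter processing.

```python
import numpy as np

def transition_matrix(box_trajectories, n_boxes, lag=1):
    """Row-stochastic transition matrix from box-index trajectories.

    box_trajectories: iterable of integer sequences, each giving the box
    occupied at successive time steps (hypothetical input format).
    """
    counts = np.zeros((n_boxes, n_boxes))
    for traj in box_trajectories:
        traj = np.asarray(traj, dtype=int)
        # Count every observed jump from box traj[t] to box traj[t + lag].
        np.add.at(counts, (traj[:-lag], traj[lag:]), 1.0)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Boxes that were never left in the data keep an all-zero row.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

P = transition_matrix([[0, 1, 1, 0], [1, 0]], n_boxes=3)
```

Entry `P[i, j]` estimates the probability of moving from box `i` to box `j` in one lag step; pushing a probability vector forward is then a matter of repeated multiplication `p @ P`.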
Submitted 14 March, 2019;
originally announced March 2019.
-
Variational Koopman models: slow collective variables and molecular kinetics from short off-equilibrium simulations
Authors:
Hao Wu,
Feliks Nüske,
Fabian Paul,
Stefan Klus,
Peter Koltai,
Frank Noé
Abstract:
Markov state models (MSMs) and Master equation models are popular approaches to approximate molecular kinetics, equilibria, metastable states, and reaction coordinates in terms of a state space discretization usually obtained by clustering. Recently, a powerful generalization of MSMs has been introduced, the variational approach (VA) of molecular kinetics and its special case the time-lagged independent component analysis (TICA), which allow us to approximate slow collective variables and molecular kinetics by linear combinations of smooth basis functions or order parameters. While it is known how to estimate MSMs from trajectories whose starting points are not sampled from an equilibrium ensemble, this has not yet been the case for TICA and the VA. Previous estimates from short trajectories have been strongly biased and thus not variationally optimal. Here, we employ Koopman operator theory and ideas from dynamic mode decomposition (DMD) to extend the VA and TICA to non-equilibrium data. The main insight is that the VA and TICA provide a coefficient matrix that we call Koopman model, as it approximates the underlying dynamical (Koopman) operator in conjunction with the basis set used. This Koopman model can be used to compute a stationary vector to reweight the data to equilibrium. From such a Koopman-reweighted sample, equilibrium expectation values and variationally optimal reversible Koopman models can be constructed even with short simulations. The Koopman model can be used to propagate densities, and its eigenvalue decomposition provides estimates of relaxation timescales and slow collective variables for dimension reduction. Koopman models are generalizations of Markov state models, TICA and the linear VA and allow molecular kinetics to be described without a cluster discretization.
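The coefficient matrix described in this abstract can be pictured as a least-squares regression of time-lagged basis-function values on their current values, in the spirit of DMD. The sketch below is a minimal illustration under that assumption; it omits the paper's reweighting and reversible estimation steps, and the names `koopman_model` and `implied_timescales` are hypothetical.

```python
import numpy as np

def koopman_model(basis_t, basis_lagged):
    """Least-squares matrix K with basis_t @ K ~ basis_lagged.

    Rows are samples and columns are basis functions evaluated at time t
    and at time t + lag. For full column rank this equals C00^{-1} C0t.
    """
    X = np.asarray(basis_t, dtype=float)
    Y = np.asarray(basis_lagged, dtype=float)
    K, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return K

def implied_timescales(K, lag_time):
    """Relaxation timescales -lag / ln|lambda| for eigenvalues with 0 < |lambda| < 1."""
    magnitudes = np.abs(np.linalg.eigvals(K))
    magnitudes = magnitudes[(magnitudes > 0) & (magnitudes < 1)]
    return np.sort(-lag_time / np.log(magnitudes))[::-1]

# Toy data: an observable that decays by a factor 0.5 per step.
x = 0.5 ** np.arange(6)
K = koopman_model(x[:-1, None], x[1:, None])
timescales = implied_timescales(K, lag_time=1.0)
```

On this toy series the single eigenvalue of `K` recovers the decay factor, and the timescale follows from the lag time and the logarithm of that eigenvalue.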
Submitted 22 January, 2017; v1 submitted 20 October, 2016;
originally announced October 2016.