-
Investigating swimming technical skills by a double partition clustering of multivariate functional data allowing for dimension selection
Authors:
Antoine Bouvet,
Salima El Kolei,
Matthieu Marbac
Abstract:
Investigating the technical skills of swimmers is a challenge for performance improvement that can be addressed by analyzing multivariate functional data recorded by Inertial Measurement Units (IMU). To investigate the technical levels of front-crawl swimmers, a new model-based approach is introduced to obtain two complementary partitions reflecting, for each swimmer, their swimming pattern and their ability to reproduce it. Contrary to the usual approaches for functional data clustering, the proposed approach also considers the information in the residuals resulting from the functional basis decomposition. Indeed, after decomposing into a functional basis both the original signal (measuring the swimming pattern) and the signal of squared residuals (measuring the ability to reproduce the swimming pattern), the method fits the joint distribution of the coefficients related to both decompositions by considering the dependency between both partitions. Modeling this dependency is mandatory since the difficulty of reproducing a swimming pattern depends on its shape. Moreover, a sparse decomposition of the distribution within components that permits a selection of the relevant dimensions during clustering is proposed. The partitions obtained on the IMU data aggregate the kinematical stroke variability linked to swimming technical skills and allow relevant biomechanical strategies for front-crawl sprint performance to be identified.
Submitted 28 March, 2023;
originally announced March 2023.
-
Estimation of the Order of Non-Parametric Hidden Markov Models using the Singular Values of an Integral Operator
Authors:
Marie Du Roy de Chaumaray,
Salima El Kolei,
Marie-Pierre Etienne,
Matthieu Marbac
Abstract:
We are interested in assessing the order of a finite-state Hidden Markov Model (HMM) under only two assumptions: the transition matrix of the latent Markov chain has full rank, and the density functions of the emission distributions are linearly independent. We introduce a new procedure for estimating this order by investigating the rank of a well-chosen integral operator which relies on the distribution of a pair of consecutive observations. This method circumvents the usual limits of the spectral method when it is used for estimating the order of an HMM: it avoids the choice of the basis functions; it does not require any knowledge of an upper bound on the order of the HMM (for the spectral method, such an upper bound is defined by the number of basis functions); and it easily handles different types of data (including continuous data, circular data or multivariate continuous data) with a suitable choice of kernel. The method relies on the fact that the order of the HMM can be identified from the distribution of a pair of consecutive observations and that this order is equal to the rank of an integral operator (i.e. the number of its singular values that are non-zero). Since only the empirical counterpart of the singular values of the operator can be obtained, we propose a data-driven thresholding procedure. An upper bound on the probability of overestimating the order of the HMM is established. Moreover, sufficient conditions on the bandwidth used for kernel density estimation and on the threshold are stated to obtain the consistency of the estimator of the order of the HMM. The procedure is easily implemented since the values of all the tuning parameters are determined by the sample size.
Submitted 27 November, 2023; v1 submitted 7 October, 2022;
originally announced October 2022.
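The rank identity stated in this abstract can be illustrated in a simplified setting. The sketch below is not the authors' procedure: it assumes discrete emissions (so the integral operator reduces to a matrix), works with the exact distribution of a pair of consecutive observations rather than an estimate, and uses a fixed numerical tolerance `tol` in place of the data-driven threshold. The matrices `A` and `B` and the function names are illustrative assumptions.

```python
import numpy as np

def pair_distribution(A, B):
    """Joint law of two consecutive observations of a stationary discrete HMM.

    A is the K x K transition matrix, B the K x m emission matrix.
    Entry (i, j) of the result is P(Y_1 = i, Y_2 = j).
    """
    K = A.shape[0]
    # Stationary distribution by power iteration (assumes an ergodic chain).
    pi = np.full(K, 1.0 / K)
    for _ in range(1000):
        pi = pi @ A
    return B.T @ np.diag(pi) @ A @ B

def order_from_singular_values(N, tol=1e-10):
    """Count the singular values above a fixed tolerance (hypothetical threshold)."""
    s = np.linalg.svd(N, compute_uv=False)
    return int(np.sum(s > tol))

# Two hidden states, four observable symbols (illustrative values).
A = np.array([[0.9, 0.1],
              [0.2, 0.8]])            # full-rank transition matrix
B = np.array([[0.5, 0.3, 0.1, 0.1],
              [0.1, 0.1, 0.3, 0.5]])  # linearly independent emission laws

N = pair_distribution(A, B)
order = order_from_singular_values(N)
print(order)
```

With data, only a noisy version of `N` is available, so every singular value is non-zero; this is why the abstract replaces the fixed tolerance by a threshold chosen from the sample size.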
-
Nonparametric estimation in a regression model with additive and multiplicative noise
Authors:
Christophe Chesneau,
Salima El Kolei,
Junke Kou,
Fabien Navarro
Abstract:
In this paper, we consider the problem of estimating an unknown function in a general nonparametric regression model that has both multiplicative and additive noise. We propose two new wavelet estimators in this general context. We prove that they achieve fast convergence rates under the mean integrated square error over Besov spaces. The obtained rates have the particularity of being established under weak conditions on the model. A numerical study in a context comparable to stochastic frontier estimation (with the difference that the boundary is not necessarily a production function) supports the theory.
Submitted 20 June, 2020; v1 submitted 18 June, 2019;
originally announced June 2019.
-
Adaptive Density Estimation on Bounded Domains
Authors:
Karine Bertin,
Salima El Kolei,
Nicolas Klutchnikoff
Abstract:
We study the estimation, in Lp-norm, of density functions defined on [0,1]^d. We construct a new family of kernel density estimators that do not suffer from the so-called boundary bias problem and we propose a data-driven procedure based on the Goldenshluger and Lepski approach that jointly selects a kernel and a bandwidth. We derive two estimators that satisfy oracle-type inequalities. They are also proved to be adaptive over a scale of anisotropic or isotropic Sobolev-Slobodetskii classes (which are particular cases of Besov or Sobolev classical classes). The main interest of the isotropic procedure is to obtain adaptive results without any restriction on the smoothness parameter.
Submitted 25 October, 2018;
originally announced October 2018.
-
Analysis, detection and correction of misspecified discrete time state space models
Authors:
Salima El Kolei,
Frédéric Patras
Abstract:
Misspecifications (i.e. errors on the parameters) of state space models lead to incorrect inference of the hidden states. This paper studies weakly nonlinear state space models with additive Gaussian noise and proposes a method for detecting and correcting misspecifications. The latter induce a biased estimator of the hidden state but also happen to induce correlation in the innovations and other residuals. This property is used to find a well-defined objective function to which an optimisation routine is applied to recover the true parameters of the model. It is argued that this method can consistently estimate the bias on the parameter. We demonstrate the algorithm on various models of increasing complexity.
Submitted 3 April, 2017;
originally announced April 2017.
-
Parametric inference of hidden discrete-time diffusion processes by deconvolution
Authors:
Salima El Kolei,
Florian Pelgrin
Abstract:
We study a new parametric approach for hidden discrete-time diffusion models. This method is based on contrast minimization and deconvolution and allows the estimation of a large class of stochastic models with nonlinear drift and nonlinear diffusion. It can be applied, for example, to ecological and financial state space models. After proving consistency and asymptotic normality of the estimator, leading to asymptotic confidence intervals, we provide a thorough numerical study, which compares many classical methods used in practice (Non Linear Least Square estimator, Monte Carlo Expectation Maximization Likelihood estimator and Bayesian estimators) for estimating a stochastic volatility model. We prove that our estimator clearly outperforms the Maximum Likelihood Estimator in terms of computing time, but also most of the other methods. We also show that this contrast method is the most stable and does not need any tuning parameter.
Submitted 19 December, 2016; v1 submitted 27 December, 2015;
originally announced December 2015.
-
Propagation of initial errors on the parameters for linear and Gaussian state space models
Authors:
Salima El Kolei
Abstract:
For linear and Gaussian state space models parametrized by $θ_0 \in Θ \subset \mathbb{R}^r$, $r \geq 1$, the vector of parameters of the model, the Kalman filter gives the exact solution of the optimal filtering problem under weak assumptions. This result supposes that $θ_0$ is perfectly known. In most real applications, this assumption is not realistic since $θ_0$ is unknown and has to be estimated. In this paper, we analyze the Kalman filter for a biased estimator of $θ_0$. We show the propagation of this bias on the estimation of the hidden state. We give an expression of this propagation for linear and Gaussian state space models and we extend this result to almost linear models estimated by the Extended Kalman filter. An illustration is given for the autoregressive process with measurement noise, widely studied in econometrics to model economic and financial data.
Submitted 14 March, 2013;
originally announced March 2013.
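The effect described in this abstract can be observed numerically on the autoregressive process with measurement noise. The sketch below is only a simulation under assumed values, not the paper's analytical expression: it takes a scalar model $x_t = \phi x_{t-1} + \eta_t$, $y_t = x_t + \varepsilon_t$, with hypothetical parameters `phi = 0.9`, state noise variance `q = 1` and measurement noise variance `r = 1`, and compares the Kalman filter run with the true `phi` against one run with a biased value `phi = 0.5`.

```python
import math
import random

def simulate_ar1(n, phi, q, r, seed=0):
    """Simulate x_t = phi * x_{t-1} + eta_t and y_t = x_t + eps_t."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, math.sqrt(q / (1.0 - phi ** 2)))  # stationary start
    xs, ys = [], []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, math.sqrt(q))
        xs.append(x)
        ys.append(x + rng.gauss(0.0, math.sqrt(r)))
    return xs, ys

def kalman_filter(ys, phi, q, r):
    """Scalar Kalman filter; returns the filtered means of the hidden state."""
    m, p = 0.0, q / (1.0 - phi ** 2)  # prior mean and variance
    means = []
    for y in ys:
        m_pred = phi * m                # prediction step
        p_pred = phi * phi * p + q
        gain = p_pred / (p_pred + r)    # update step
        m = m_pred + gain * (y - m_pred)
        p = (1.0 - gain) * p_pred
        means.append(m)
    return means

def rmse(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)) / len(a))

xs, ys = simulate_ar1(n=5000, phi=0.9, q=1.0, r=1.0)
rmse_true = rmse(xs, kalman_filter(ys, phi=0.9, q=1.0, r=1.0))
rmse_biased = rmse(xs, kalman_filter(ys, phi=0.5, q=1.0, r=1.0))
print(rmse_true, rmse_biased)
```

The filter run with the biased parameter should show a larger error on the hidden state, which is the propagation phenomenon the paper quantifies.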
-
Parametric estimation of hidden stochastic model by contrast minimization and deconvolution: application to the Stochastic Volatility Model
Authors:
Salima El Kolei
Abstract:
We study a new parametric approach for particular hidden stochastic models such as the Stochastic Volatility model. This method is based on contrast minimization and deconvolution. After proving consistency and asymptotic normality of the estimator, leading to asymptotic confidence intervals, we provide a thorough numerical study, which compares most of the classical methods that are used in practice (Quasi Maximum Likelihood estimator, Simulated Expectation Maximization Likelihood estimator and Bayesian estimators). We prove that our estimator clearly outperforms the Maximum Likelihood Estimator in terms of computing time, but also most of the other methods. We also show that this contrast method is the most robust with respect to non-Gaussianity of the error and does not need any tuning parameter.
Submitted 14 March, 2013; v1 submitted 12 February, 2012;
originally announced February 2012.