-
An efficient Averaged Stochastic Gauss-Newton algorithm for estimating parameters of non linear regressions models
Authors:
Peggy Cénac,
Antoine Godichon-Baggioni,
Bruno Portier
Abstract:
Non linear regression models are a standard tool for modeling real phenomena, with applications in machine learning, ecology, and econometrics, among others. Estimating the parameters of the model has received a lot of attention for many years. We focus here on a recursive method for estimating the parameters of non linear regressions. Indeed, methods of this kind, the most famous of which are probably the stochastic gradient algorithm and its averaged version, make it possible to deal efficiently with massive data arriving sequentially. Nevertheless, they can be, in practice, very sensitive to the case where the eigenvalues of the Hessian of the functional we would like to minimize are at different scales. To avoid this problem, we first introduce an online Stochastic Gauss-Newton algorithm. In order to improve the behavior of the estimates in case of bad initialization, we also introduce a new Averaged Stochastic Gauss-Newton algorithm and prove its asymptotic efficiency.
Submitted 16 September, 2020; v1 submitted 23 June, 2020;
originally announced June 2020.
-
Variable Length Memory Chains: characterization of stationary probability measures
Authors:
Peggy Cénac,
Brigitte Chauvin,
Camille Noûs,
Frédéric Paccaut,
Nicolas Pouyanne
Abstract:
Variable Length Memory Chains (VLMC), which are generalizations of finite order Markov chains, turn out to be an essential tool to model random sequences in many domains, as well as an interesting object in contemporary probability theory. The question of the existence of stationary probability measures leads us to introduce a key combinatorial structure for words produced by a VLMC: the Longest Internal Suffix. This notion allows us to state a necessary and sufficient condition for a general VLMC to admit a unique invariant probability measure. This condition takes a much simpler form for a subclass of VLMC: the stable VLMC. This natural subclass, unlike the general case, enjoys a renewal property. Namely, a stable VLMC induces a semi-Markov chain on an at most countable state space. Unfortunately, this discrete time renewal process does not contain the whole information of the VLMC, which prevents the study of a stable VLMC from being reduced to the study of its induced semi-Markov chain. For a subclass of stable VLMC, the convergence in distribution of a VLMC towards its stationary probability measure is established. Finally, finite state space semi-Markov chains turn out to be very special stable VLMC, shedding some new light on their limit distributions.
Submitted 15 April, 2020;
originally announced April 2020.
-
Variable Length Markov Chains, Persistent Random Walks: a close encounter
Authors:
P. Cénac,
B. Chauvin,
F. Paccaut,
N. Pouyanne
Abstract:
This is the story of the encounter between two worlds: the world of random walks and the world of Variable Length Markov Chains (VLMC). The meeting point turns around the semi-Markov property of underlying processes.
Submitted 10 September, 2019;
originally announced September 2019.
-
Characterization of stationary probability measures for Variable Length Markov Chains
Authors:
Peggy Cénac,
Brigitte Chauvin,
Frédéric Paccaut,
Nicolas Pouyanne
Abstract:
By introducing a key combinatorial structure for words produced by a Variable Length Markov Chain (VLMC), the longest internal suffix, precise characterizations of existence and uniqueness of a stationary probability measure for a VLMC are given. These characterizations turn into necessary and sufficient conditions for VLMC associated with a subclass of probabilised context trees: the shift-stable context trees. As a by-product, we prove that a VLMC whose stabilized context tree is again a context tree has at most one stationary probability measure.
Submitted 3 July, 2018;
originally announced July 2018.
-
Recurrence of Multidimensional Persistent Random Walks. Fourier and Series Criteria
Authors:
Peggy Cénac,
Basile De Loynes,
Yoann Offret,
Arnaud Rousselle
Abstract:
The recurrence features of persistent random walks built from variable length Markov chains are investigated. We observe that these stochastic processes can be seen as Lévy walks for which the persistence times depend on some internal Markov chain: they admit Markov random walk skeletons. A recurrence versus transience dichotomy is highlighted. We first give a sufficient Fourier criterion for the recurrence, close to the usual Chung-Fuchs one, assuming in addition the positive recurrence of the driving chain, and a series criterion is derived. The key tool is the Nagaev-Guivarc'h method. Finally, we focus on particular two-dimensional persistent random walks, including directionally reinforced random walks, for which necessary and sufficient Fourier and series criteria are obtained. Inspired by \cite{Rainer2007}, we produce a genuine counterexample to the conjecture of \cite{Mauldin1996}. As for the one-dimensional situation studied in \cite{PRWI}, it is easier for a persistent random walk than its skeleton to be recurrent but here the difference is extremely thin. These results are based on a surprisingly novel -- to our knowledge -- upper bound for the Lévy concentration function associated with symmetric distributions.
Submitted 8 December, 2017;
originally announced December 2017.
-
Persistent random walks. II. Functional Scaling Limits
Authors:
Peggy Cénac,
Arnaud Le Ny,
Basile De Loynes,
Yoann Offret
Abstract:
We give a complete and unified description -- under some stability assumptions -- of the functional scaling limits associated with some persistent random walks for which the recurrent or transient type is studied in [1]. As a result, we highlight a phase transition phenomenon with respect to the memory. It turns out that the limit process is either Markovian or not according to -- to put it in a nutshell -- the rate of decrease of the distribution tails corresponding to the persistence times. In the memoryless situation, the limits are classical strictly stable Lévy processes of infinite variation. However, we point out that the description of the critical Cauchy case fills a gap even in the closely related context of Directionally Reinforced Random Walks (DRRWs), for which it has not been considered yet. Besides, we need to introduce a relevant generalized drift -- extending the classical one -- in order to study the critical case but also the situation when the limit is no longer Markovian. It appears to be in full generality a drift in mean for the Persistent Random Walk (PRW). The limit processes keeping some memory -- given by some variable length Markov chain -- of the underlying PRW are called arcsine Lamperti anomalous diffusions due to their marginal distributions, which are computed explicitly here. To this end, we make the connection with the governing equations for Lévy walks, the occupation times of skew Bessel processes and a more general class modelled on Lamperti processes. We also clarify some misunderstanding regarding this marginal distribution in the framework of DRRWs. Finally, we stress that the latter situation is more flexible -- as in the first paper -- in the sense that the results can be easily generalized to a wider class of PRWs without renewal pattern.
Submitted 1 December, 2016;
originally announced December 2016.
-
Persistent random walks
Authors:
Peggy Cénac,
Basile De Loynes,
Arnaud Le Ny,
Yoann Offret
Abstract:
We consider a walker that at each step keeps the same direction with a probability that depends on the time already spent in the direction the walker is currently moving. In this paper, we study some asymptotic properties of this persistent random walk and give the conditions of recurrence or transience in terms of "transition" probabilities to keep the same direction or to change, without assuming that the latter admits any stationary probability. Examples are exhibited in which this process is recurrent even if the random walk is not symmetric.
Submitted 13 September, 2015;
originally announced September 2015.
-
Online estimation of the geometric median in Hilbert spaces : non asymptotic confidence balls
Authors:
Hervé Cardot,
Peggy Cénac,
Antoine Godichon
Abstract:
Estimation procedures based on recursive algorithms are interesting and powerful techniques that are able to deal rapidly with (very) large samples of high dimensional data. The collected data may be contaminated by noise so that robust location indicators, such as the geometric median, may be preferred to the mean. In this context, an estimator of the geometric median based on a fast and efficient averaged non linear stochastic gradient algorithm has been developed by Cardot, Cénac and Zitt (2013). This work aims at studying more precisely the non asymptotic behavior of this algorithm by giving non asymptotic confidence balls. This new result is based on the derivation of improved $L^2$ rates of convergence as well as an exponential inequality for the martingale terms of the recursive non linear Robbins-Monro algorithm.
Submitted 27 January, 2015;
originally announced January 2015.
-
Almost sure central limit theorems for random ratios and applications to LSE for fractional Ornstein-Uhlenbeck processes
Authors:
Peggy Cénac,
Khalifa Es-Sebaiy
Abstract:
We investigate an almost sure central limit theorem (ASCLT) for sequences of random variables having the form of a ratio of two terms such that the numerator satisfies the ASCLT and the denominator is a positive term which converges almost surely to 1. This result leads to the ASCLT for least squares estimators for Ornstein-Uhlenbeck processes driven by fractional Brownian motion.
Submitted 1 September, 2012;
originally announced September 2012.
-
Persistent random walks, variable length Markov chains and piecewise deterministic Markov processes
Authors:
Peggy Cénac,
Brigitte Chauvin,
Samuel Herrmann,
Pierre Vallois
Abstract:
A classical random walk $(S_t, t\in\mathbb{N})$ is defined by $S_t:=\displaystyle\sum_{n=0}^t X_n$, where $(X_n)$ are i.i.d. When the increments $(X_n)_{n\in\mathbb{N}}$ are a first-order Markov chain, a short memory is introduced in the dynamics of $(S_t)$. This so-called "persistent" random walk is no longer Markovian and, under suitable conditions, the rescaled process converges towards the integrated telegraph noise (ITN) as the time-scale and space-scale parameters tend to zero (see Herrmann and Vallois, 2010; Tapiero-Vallois, Tapiero-Vallois2). The ITN process is effectively non-Markovian too. The aim is to consider persistent random walks $(S_t)$ whose increments are Markov chains with variable order which can be infinite. This variable memory is made explicit by a one-to-one correspondence between $(X_n)$ and a suitable Variable Length Markov Chain (VLMC), since for a VLMC the dependency on the past can be unbounded.
The key idea is to consider the non Markovian letter process $(X_n)$ as the margin of a pair $(X_n,M_n)_{n\ge 0}$ where $(M_n)_{n\ge 0}$ stands for the memory of the process $(X_n)$. We prove that, under a suitable rescaling, $(S_n,X_n,M_n)$ converges in distribution towards a continuous-time process $(S^0(t),X(t),M(t))$. The process $(S^0(t))$ is a semi-Markov and Piecewise Deterministic Markov Process whose paths are piecewise linear.
Submitted 16 August, 2012;
originally announced August 2012.
-
Recursive estimation of the conditional geometric median in Hilbert spaces
Authors:
Hervé Cardot,
Peggy Cénac,
Pierre-André Zitt
Abstract:
A recursive estimator of the conditional geometric median in Hilbert spaces is studied. It is based on a stochastic gradient algorithm whose aim is to minimize a weighted L1 criterion and is consequently well adapted for robust online estimation. The weights are controlled by a kernel function and an associated bandwidth. Almost sure convergence and L2 rates of convergence are proved under general conditions on the conditional distribution as well as the sequence of descent steps of the algorithm and the sequence of bandwidths. Asymptotic normality is also proved for the averaged version of the algorithm with an optimal rate of convergence. A simulation study confirms the interest of this new and fast algorithm when the sample sizes are large. Finally, the ability of these recursive algorithms to deal with very high-dimensional data is illustrated on the robust estimation of television audience profiles conditional on the total time spent watching television over a period of 24 hours.
Submitted 14 April, 2012;
originally announced April 2012.
-
Uncommon Suffix Tries
Authors:
Peggy Cénac,
Brigitte Chauvin,
Frédéric Paccaut,
Nicolas Pouyanne
Abstract:
Common assumptions on the source producing the words inserted in a suffix trie with $n$ leaves lead to a $\log n$ height and saturation level. We provide an example of a suffix trie whose height increases faster than a power of $n$ and another one whose saturation level is negligible with respect to $\log n$. Both are built from VLMC (Variable Length Markov Chain) probabilistic sources; they are easily extended to families of sources having the same properties. The first example corresponds to a "logarithmic infinite comb" and enjoys a non uniform polynomial mixing. The second one corresponds to a "factorial infinite comb" for which mixing is uniform and exponential.
Submitted 20 December, 2011; v1 submitted 18 December, 2011;
originally announced December 2011.
-
Efficient and fast estimation of the geometric median in Hilbert spaces with an averaged stochastic gradient algorithm
Authors:
Hervé Cardot,
Peggy Cénac,
Pierre-André Zitt
Abstract:
With the progress of measurement apparatus and the development of automatic sensors it is not unusual anymore to get thousands of samples of observations taking values in high-dimensional spaces such as functional spaces. In such large samples of high-dimensional data, outlying curves may not be uncommon and even a few individuals may corrupt simple statistical indicators such as the mean trajectory. We focus here on the estimation of the geometric median, which is a direct generalization of the real median and has nice robustness properties. The geometric median being defined as the minimizer of a simple convex functional that is differentiable everywhere when the distribution has no atoms, it is possible to estimate it with online gradient algorithms. Such algorithms are very fast and can deal with large samples. Furthermore, they can also be simply updated when the data arrive sequentially. We state the almost sure consistency and the L2 rates of convergence of the stochastic gradient estimator as well as the asymptotic normality of its averaged version. We show that the asymptotic distribution of the averaged version of the algorithm is the same as that of the classic estimators which are based on the minimization of the empirical loss function. The performance of our averaged sequential estimator, both in terms of computation speed and accuracy of the estimates, is evaluated with a small simulation study. Our approach is also illustrated on a sample of more than 5000 individual television audiences measured every second over a period of 24 hours.
Submitted 20 May, 2011; v1 submitted 22 January, 2011;
originally announced January 2011.
-
A fast and recursive algorithm for clustering large datasets with $k$-medians
Authors:
Hervé Cardot,
Peggy Cénac,
Jean-Marie Monnez
Abstract:
Clustering large samples of high-dimensional data with fast algorithms is an important challenge in computational statistics. Borrowing ideas from MacQueen (1967), who introduced a sequential version of the $k$-means algorithm, a new class of recursive stochastic gradient algorithms designed for the $k$-medians loss criterion is proposed. By their recursive nature, these algorithms are very fast and are well adapted to deal with large samples of data that are allowed to arrive sequentially. It is proved that the stochastic gradient algorithm converges almost surely to the set of stationary points of the underlying loss criterion. Particular attention is paid to the averaged versions, which are known to perform better, and a data-driven procedure that allows automatic selection of the value of the descent step is proposed.
The performance of the averaged sequential estimator is compared in a simulation study, both in terms of computation speed and accuracy of the estimates, with more classical partitioning techniques such as $k$-means, trimmed $k$-means and PAM (partitioning around medoids). Finally, this new online clustering technique is illustrated on determining television audience profiles with a sample of more than 5000 individual television audiences measured every minute over a period of 24 hours.
Submitted 18 October, 2011; v1 submitted 21 January, 2011;
originally announced January 2011.
-
Variable length Markov chains and dynamical sources
Authors:
Peggy Cénac,
Brigitte Chauvin,
Frédéric Paccaut,
Nicolas Pouyanne
Abstract:
Infinite random sequences of letters can be viewed as stochastic chains or as strings produced by a source, in the sense of information theory. The relationship between Variable Length Markov Chains (VLMC) and probabilistic dynamical sources is studied. We establish a probabilistic framework for context trees and VLMC and we prove that any VLMC is a dynamical source for which we explicitly build the mapping. On two examples, the "comb" and the "bamboo blossom", we find a necessary and sufficient condition for the existence and uniqueness of a stationary probability measure for the VLMC. These two examples are detailed in order to provide the associated Dirichlet series as well as the generating functions of word occurrences.
Submitted 18 July, 2010;
originally announced July 2010.
-
On the Almost Sure Central Limit Theorem for Vector Martingales: Convergence of Moments and Statistical Applications
Authors:
Bernard Bercu,
Peggy Cénac,
Guy Fayolle
Abstract:
We investigate the almost sure asymptotic properties of vector martingale transforms. Assuming some appropriate regularity conditions both on the increasing process and on the moments of the martingale, we prove that normalized moments of any even order converge in the almost sure central limit theorem for martingales. A conjecture about almost sure upper bounds under wider hypotheses is formulated. The theoretical results are supported by examples borrowed from statistical applications, including linear autoregressive models and branching processes with immigration, for which new asymptotic properties are established on estimation and prediction errors.
Submitted 18 December, 2008;
originally announced December 2008.
-
Digital search trees and chaos game representation
Authors:
Peggy Cénac,
Brigitte Chauvin,
Stéphane Ginouillac,
Nicolas Pouyanne
Abstract:
In this paper, we consider a possible representation of a DNA sequence in a quaternary tree, in which one can visualize repetitions of subwords. The CGR-tree turns a sequence of letters into a digital search tree (DST), obtained from the suffixes of the reversed sequence. Several results are known concerning the height and the insertion depth for DST built from i.i.d. successive sequences. Here, the successive inserted words are strongly dependent. We give the asymptotic behaviour of the insertion depth and of the length of branches for the CGR-tree obtained from the suffixes of a reversed i.i.d. or Markovian sequence. This behaviour turns out to be, at first order, the same as in the case of independent words. As a by-product, asymptotic results on the length of longest runs in a Markovian sequence are obtained.
Submitted 29 May, 2006;
originally announced May 2006.