-
Laplace Transform Based Low-Complexity Learning of Continuous Markov Semigroups
Authors:
Vladimir R. Kostic,
Karim Lounici,
Hélène Halconruy,
Timothée Devergne,
Pietro Novelli,
Massimiliano Pontil
Abstract:
Markov processes serve as a universal model for many real-world random processes. This paper presents a data-driven approach for learning these models through the spectral decomposition of the infinitesimal generator (IG) of the Markov semigroup. The unbounded nature of IGs complicates traditional methods such as vector-valued regression and Hilbert-Schmidt operator analysis. Existing techniques, including physics-informed kernel regression, are computationally expensive and limited in scope, with no recovery guarantees for transfer operator methods when the time-lag is small. We propose a novel method that leverages the IG's resolvent, characterized by the Laplace transform of transfer operators. This approach is robust to time-lag variations, ensuring accurate eigenvalue learning even for small time-lags. Our statistical analysis applies to a broader class of Markov processes than current methods while reducing computational complexity from quadratic to linear in the state dimension. Finally, we illustrate the behaviour of our method in two experiments.
Submitted 6 June, 2025; v1 submitted 18 October, 2024;
originally announced October 2024.
-
Learning the Infinitesimal Generator of Stochastic Diffusion Processes
Authors:
Vladimir R. Kostic,
Karim Lounici,
Helene Halconruy,
Timothee Devergne,
Massimiliano Pontil
Abstract:
We address data-driven learning of the infinitesimal generator of stochastic diffusion processes, essential for understanding numerical simulations of natural and physical systems. The unbounded nature of the generator poses significant challenges, rendering conventional analysis techniques for Hilbert-Schmidt operators ineffective. To overcome this, we introduce a novel framework based on the energy functional for these stochastic processes. Our approach integrates physical priors through an energy-based risk metric in both full and partial knowledge settings. We evaluate the statistical performance of a reduced-rank estimator in reproducing kernel Hilbert spaces (RKHS) in the partial knowledge setting. Notably, our approach provides learning bounds independent of the state space dimension and ensures non-spurious spectral estimation. Additionally, we elucidate how the distortion between the intrinsic energy-induced metric of the stochastic diffusion and the RKHS metric used for generator estimation impacts the spectral learning bounds.
Submitted 21 May, 2024;
originally announced May 2024.
-
Evolving privacy: drift parameter estimation for discretely observed i.i.d. diffusion processes under LDP
Authors:
Chiara Amorino,
Arnaud Gloter,
Hélène Halconruy
Abstract:
The problem of estimating a parameter in the drift coefficient is addressed for $N$ discretely observed independent and identically distributed stochastic differential equations (SDEs). This is done considering additional constraints, wherein only public data can be published and used for inference. The concept of local differential privacy (LDP) is formally introduced for a system of stochastic differential equations. The objective is to estimate the drift parameter by proposing a contrast function based on a pseudo-likelihood approach. A suitably scaled Laplace noise is incorporated to meet the privacy requirements. Our key findings encompass the derivation of explicit conditions tied to the privacy level. Under these conditions, we establish the consistency and asymptotic normality of the associated estimator. Notably, the convergence rate is intricately linked to the privacy level, and in some situations may be completely different from the case where privacy constraints are ignored. Our results hold true as the discretization step approaches zero and the number of processes $N$ tends to infinity.
Submitted 16 October, 2024; v1 submitted 31 January, 2024;
originally announced January 2024.
-
On a Projection Least Squares Estimator for Jump Diffusion Processes
Authors:
Hélène Halconruy,
Nicolas Marie
Abstract:
This paper deals with a projection least squares estimator of the drift function of a jump diffusion process $X$ computed from multiple independent copies of $X$ observed on $[0,T]$. Risk bounds are established on this estimator and on an associated adaptive estimator. Finally, some numerical experiments are provided.
Submitted 25 July, 2023; v1 submitted 24 October, 2022;
originally announced October 2022.
-
Robust density estimation with the $\mathbb{L}_{1}$-loss. Applications to the estimation of a density on the line satisfying a shape constraint
Authors:
Y. Baraud,
H. Halconruy,
G. Maillard
Abstract:
We solve the problem of estimating the distribution of presumed i.i.d. observations for the total variation loss. Our approach is based on density models and is versatile enough to cope with many different ones, including some density models for which the Maximum Likelihood Estimator (MLE for short) does not exist. We mainly illustrate the properties of our estimator on models of densities on the line that satisfy a shape constraint. We show that it possesses some similar optimality properties, with regard to some global rates of convergence, as the MLE does when it exists. It also enjoys some adaptation properties with respect to some specific target densities in the model for which our estimator is proven to converge at parametric rate. More important is the fact that our estimator is robust, not only with respect to model misspecification, but also to contamination, the presence of outliers among the dataset and the equidistribution assumption. This means that the estimator performs almost as well as if the data were i.i.d. with density $p$ in a situation where these data are only independent and most of their marginals are close enough in total variation to a distribution with density $p$. We also show that our estimator converges to the average density of the data, when this density belongs to the model, even when none of the marginal densities belongs to it. Our main result on the risk of the estimator takes the form of an exponential deviation inequality which is non-asymptotic and involves explicit numerical constants. We deduce from it several global rates of convergence, including some bounds for the minimax $\mathbb{L}_{1}$-risks over the sets of concave and log-concave densities. These bounds derive from some specific results on the approximation of densities which are monotone, convex, concave and log-concave. Such results may be of independent interest.
Submitted 4 January, 2024; v1 submitted 21 May, 2022;
originally announced May 2022.
-
The insider problem in the trinomial model: a discrete-time jump process approach
Authors:
Hélène Halconruy
Abstract:
In an incomplete market underpinned by the trinomial model, we consider two investors: an ordinary agent whose decisions are driven by public information and an insider who possesses from the beginning a surplus of information encoded through a random variable for which he or she knows the outcome. Through the definition of an auxiliary model based on a marked binomial process, we handle the trinomial model as a volatility one, and use the stochastic analysis and Malliavin calculus toolboxes available in that context. In particular, we express the information drift, that is, the drift to eliminate in order to preserve the martingale property within an initial enlargement of filtration, in terms of the Malliavin derivative. We solve explicitly the agent and the insider expected logarithmic utility maximisation problems and provide a hedging formula for replicable claims. We identify the insider expected additional utility with the Shannon entropy of the extra information, and then examine the existence of arbitrage opportunities for the insider.
Submitted 1 September, 2023; v1 submitted 29 June, 2021;
originally announced June 2021.
-
Malliavin calculus for marked binomial processes: portfolio optimisation in the trinomial model and compound Poisson approximation
Authors:
Hélène Halconruy
Abstract:
In this paper we develop a stochastic analysis for marked binomial processes, which can be viewed as the discrete analogues of marked Poisson processes. The starting point is the statement of a chaotic expansion for square-integrable (marked binomial) functionals, prior to the elaboration of a Markov-Malliavin structure within this framework. We take advantage of the new formalism to deal with two main applications. First, we revisit the Chen-Stein method for (compound) Poisson approximation, which we carry out within the Markov-Malliavin structure built here. Second, we study the problem of portfolio optimisation in the trinomial model.
Submitted 6 September, 2021; v1 submitted 2 April, 2021;
originally announced April 2021.
-
Kernel Selection in Nonparametric Regression
Authors:
Hélène Halconruy,
Nicolas Marie
Abstract:
In the regression model $Y = b(X) + \sigma(X)\varepsilon$, where $X$ has a density $f$, this paper deals with an oracle inequality for an estimator of $bf$, involving a kernel in the sense of Lerasle et al. (2016), selected via the PCO method. In addition to the bandwidth selection for kernel-based estimators already studied in Lacour, Massart and Rivoirard (2017) and Comte and Marie (2020), the dimension selection for anisotropic projection estimators of $f$ and $bf$ is covered.
Submitted 20 March, 2021; v1 submitted 13 June, 2020;
originally announced June 2020.
-
Malliavin and Dirichlet structures for independent random variables
Authors:
Laurent Decreusefond,
Hélène Halconruy
Abstract:
On any denumerable product of probability spaces, we construct a Malliavin gradient and then a divergence and a number operator. This yields a Dirichlet structure which can be shown to approach the usual structures for Poisson and Brownian processes. We obtain versions of almost all the classical functional inequalities in discrete settings which show that the Efron-Stein inequality can be interpreted as a Poincaré inequality or that Hoeffding decomposition of U-statistics can be interpreted as a chaos decomposition. We obtain a version of the Lyapounov central limit theorem for independent random variables without resorting to ad-hoc couplings, thus increasing the scope of the Stein method.
Submitted 27 July, 2018; v1 submitted 25 July, 2017;
originally announced July 2017.