-
Non-parametric estimation of net survival under dependence between death causes
Authors:
Oskar Laverny,
Nathalie Grafféo,
Roch Giorgi
Abstract:
Relative survival methodology deals with a competing risks survival model where the cause of death is unknown. This lack of information occurs regularly in population-based cancer studies. Non-parametric estimation of the net survival is possible through the Pohar Perme estimator. Derived similarly to Kaplan-Meier, it nevertheless relies on an untestable independence assumption. We propose here to relax this assumption and provide a generalized non-parametric estimator that works for other dependence structures, by leveraging the underlying stochastic processes and martingales. We formally derive the asymptotics of this estimator, providing variance estimation and log-rank-type tests. Our approach provides a new perspective on the Pohar Perme estimator and the acceptability of the underlying independence assumption. We highlight the impact of this dependence structure assumption in simulation studies and illustrate it through an application to registry data on colorectal cancer, before discussing potential extensions of our methodology.
Submitted 13 February, 2025;
originally announced February 2025.
-
Local moment matching with Erlang mixtures under automatic roughness penalization
Authors:
Oskar Laverny,
Philippe Lambert
Abstract:
We consider the class of Erlang mixtures for the task of density estimation on the positive real line when the only available information is given as local moments, that is, a histogram possibly supplemented by higher-order moments in some bins. By construction, the resulting moment problem is ill-posed and requires regularization. Several penalties can be used for such a task, such as a lasso penalty for sparsity of the representation, but we focus here on a simplified roughness penalty from the P-splines literature. We show that the corresponding hyperparameter can be selected without cross-validation through the computation of the so-called effective dimension of the estimator, which makes the estimator practical and well suited to such summarized-information settings. The flexibility of the local moments representation allows interesting additions such as the enforcement of Value-at-Risk and Tail Value-at-Risk constraints on the resulting estimator, making the procedure suitable for the estimation of heavy-tailed densities.
Submitted 24 February, 2024;
originally announced February 2024.
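As an illustration of the setting described in the abstract above, the following minimal sketch computes the zeroth-order local moments (bin probabilities) of an Erlang mixture. It assumes the common parameterization in which all components share one scale and differ only in their integer shape; this parameterization, and the names `erlang_cdf` and `mixture_bin_masses`, are assumptions made for illustration and are not taken from the paper. It covers only the forward map from mixture weights to bin masses, not the penalized fitting procedure.

```python
import math


def erlang_cdf(x, k, scale):
    """CDF of an Erlang distribution with integer shape k >= 1 and given scale.

    Uses the closed form 1 - exp(-z) * sum_{j<k} z**j / j!, with z = x / scale.
    """
    if x <= 0:
        return 0.0
    if math.isinf(x):
        return 1.0
    z = x / scale
    term = 1.0   # z**j / j! for j = 0
    total = 1.0
    for j in range(1, k):
        term *= z / j
        total += term
    return 1.0 - math.exp(-z) * total


def mixture_bin_masses(weights, scale, edges):
    """Probability mass of each bin [edges[i], edges[i+1]) under the mixture.

    weights[k-1] is the weight of the Erlang(k, scale) component
    (hypothetical layout chosen for this sketch); weights should sum to 1.
    """
    def cdf(x):
        return sum(w * erlang_cdf(x, k, scale)
                   for k, w in enumerate(weights, start=1))

    return [cdf(b) - cdf(a) for a, b in zip(edges, edges[1:])]
```

For instance, `mixture_bin_masses([0.5, 0.5], 1.0, [0.0, 1.0, 2.0, math.inf])` returns the three bin probabilities of an equal-weight mixture of Erlang(1) and Erlang(2) components with unit scale. A fitting procedure would compare such quantities with the observed histogram.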
-
Parametric divisibility of stochastic losses
Authors:
Oskar Laverny,
Alessandro Ferriero,
Ecaterina Nisipasu
Abstract:
A probability distribution is $n$-divisible if its $n$th convolution root exists. When modeling the dependence structure between several (re)insurance losses by an additive risk factor model, infinite divisibility, that is $n$-divisibility for all $n \in\mathbb N$, is a very desirable property. Moreover, the capacity to compute the distribution of a piece (i.e., a convolution root) is also desirable. Unfortunately, although many useful distributions are infinitely divisible, computing the distributions of their pieces is usually a challenging task that requires heavy numerical computations. However, in a few selected cases, particularly the Gamma case, the extraction of the distribution of the pieces can be performed fully parametrically, that is with negligible numerical cost and zero error. We show how this neat property of Gamma distributions can be leveraged to approximate the pieces of other distributions, and we provide several illustrations of the resulting algorithms.
Submitted 12 October, 2022;
originally announced October 2022.
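The Gamma case mentioned in the abstract above can be illustrated with a short sketch. It relies on the standard fact that the sum of $n$ independent Gamma variables with shape $\alpha/n$ and common scale $\theta$ is Gamma with shape $\alpha$ and scale $\theta$, so the $n$th convolution root only rescales the shape. The function names `gamma_piece_params` and `sum_of_pieces` are hypothetical and introduced here for illustration; they are not the paper's algorithms.

```python
import random


def gamma_piece_params(shape, scale, n):
    """(shape, scale) of the n-th convolution root of Gamma(shape, scale).

    Only the shape is divided by n; the scale is unchanged.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    return shape / n, scale


def sum_of_pieces(shape, scale, n, rng):
    """Draw n independent pieces and add them up.

    The result is distributed as Gamma(shape, scale). Note that
    random.gammavariate(alpha, beta) takes a shape alpha and a scale beta.
    """
    piece_shape, piece_scale = gamma_piece_params(shape, scale, n)
    return sum(rng.gammavariate(piece_shape, piece_scale) for _ in range(n))
```

As a sanity check, averaging many draws of `sum_of_pieces(3.0, 2.0, 4, random.Random(0))` should give a sample mean close to the Gamma(3, 2) mean, `shape * scale = 6`, and a sample variance close to `shape * scale**2 = 12`.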
-
Estimation of high dimensional Gamma convolutions through random projections
Authors:
Oskar Laverny
Abstract:
Multivariate generalized Gamma convolutions are distributions defined by a convolutional semi-parametric structure. Their flexible dependence structures, the range of possible marginals and their useful convolutional expression make them appealing to the practitioner. However, fitting such distributions when the dimension gets high is a challenge. We propose stochastic estimation procedures based on the approximation of a Laguerre integrated square error via (shifted) cumulants approximation, evaluated on random projections of the dataset. Through the analysis of our loss via tools from Grassmannian cubatures, sparse optimization on measures and Wasserstein gradient flows, we show the convergence of the stochastic gradient descent to a proper estimator of the high dimensional distribution. We provide several examples in both low- and high-dimensional settings.
Submitted 25 March, 2022;
originally announced March 2022.
-
Estimation of multivariate generalized gamma convolutions through Laguerre expansions
Authors:
Oskar Laverny,
Esterina Masiello,
Véronique Maume-Deschamps,
Didier Rullière
Abstract:
The generalized gamma convolutions class of distributions appeared in Thorin's work on the infinite divisibility of the log-Normal and Pareto distributions. Although these distributions have been extensively studied in the univariate case, the multivariate case and the dependence structures that can arise from it have received little attention in the literature. Furthermore, only one projection procedure for the univariate case was recently constructed, and no estimation procedures are available. By expanding the densities of multivariate generalized gamma convolutions into a tensorized Laguerre basis, we bridge the gap and provide performant estimation procedures for both the univariate and multivariate cases. We provide some insights into the performance of these procedures, and a convergent series for the density of multivariate gamma convolutions, which is shown to be more stable than Moschopoulos's and Mathai's univariate series. We furthermore discuss some examples.
Submitted 23 July, 2021; v1 submitted 4 March, 2021;
originally announced March 2021.
-
Dependence structure estimation using Copula Recursive Trees
Authors:
Oskar Laverny,
Esterina Masiello,
Véronique Maume-Deschamps,
Didier Rullière
Abstract:
We construct the COpula Recursive Tree (CORT) estimator: a flexible, consistent, piecewise linear estimator of a copula, leveraging the patchwork copula formalization and various piecewise constant density estimators. While the patchwork structure imposes a grid, the CORT estimator is data-driven and constructs the (possibly irregular) grid recursively from the data, minimizing a chosen distance on the copula space. The addition of the copula constraints makes usual density estimators unusable, whereas the CORT estimator is only concerned with dependence and guarantees the uniformity of margins. Refinements such as localized dimension reduction and bagging are developed, analyzed, and tested through simulated data.
Submitted 22 February, 2021; v1 submitted 6 May, 2020;
originally announced May 2020.