-
Stochastic ordering, attractiveness and couplings in non-conservative particle systems
Authors:
Raúl Gouet,
F. Javier López,
Gerardo Sanz
Abstract:
We analyse the stochastic comparison of interacting particle systems allowing for multiple arrivals, departures and non-conservative jumps of individuals between sites. That is, if $k$ individuals leave site $x$ for site $y$, a possibly different number $l$ arrive at destination. This setting includes new models, when compared to the conservative case, such as metapopulation models with deaths during migrations. It implies a sharp increase of technical complexity, given the numerous changes to consider. Known results are significantly generalised, even in the conservative case, as no particular form of the transition rates is assumed.
We obtain necessary and sufficient conditions on the rates for the stochastic comparison of the processes and prove their equivalence with the existence of an order-preserving Markovian coupling. As a corollary, we get necessary and sufficient conditions for the attractiveness of the processes. A salient feature of our approach lies in the presentation of the coupling in terms of solutions to network flow problems.
We illustrate the applicability of our results to a flexible family of population models described as interacting particle systems, with a range of parameters controlling births, deaths, catastrophes or migrations. We provide explicit conditions on the parameters for the stochastic comparison and attractiveness of the models, showing their usefulness in studying their limit behaviour. Additionally, we give three examples of constructing the coupling.
Submitted 4 April, 2025;
originally announced April 2025.
-
Characterisation of distributions through $δ$-records and martingales
Authors:
Raúl Gouet,
Miguel Lafuente,
F. Javier López,
Gerardo Sanz
Abstract:
Given parameters $c>0, δ\ne0$ and a sequence $(X_n)$ of real-valued, integrable, independent and identically $F$-distributed random variables, we characterise distributions $F$ such that $(N_n-cM_n)$ is a martingale, where $N_n$ denotes the number of observations $X_k$ among $X_1,\ldots,X_n$ such that $X_k>M_{k-1}+δ$, called $δ$-records, and $M_k=\max\{X_1,\ldots, X_k\}$.
The problem is recast as $1-F(x+δ)=c\int_{x}^{\infty}(1-F)(t)dt$, for $x\in T$, with $F(T)=1$. Unlike standard functional equations, where the equality must hold for all $x$ in a fixed set, our problem involves a domain that depends on $F$ itself, introducing complexity but allowing for more possibilities of solutions.
We find the explicit expressions of all solutions when $δ< 0$ and, when $δ> 0$, for distributions with bounded support. In the unbounded support case, we focus on continuous and lattice distributions. In the continuous setting, with support $\mathbb{R}_+$, we reduce the problem to a delay differential equation, showing that, besides particular cases of the exponential distribution, mixtures of exponential and gamma distributions and many others are solutions as well. The lattice case, with support $\mathbb{Z}_+$, is treated analogously and reduced to the study of a difference equation. Analogous results are obtained; in particular, mixtures of geometric and negative binomial distributions are found to solve the problem.
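As a quick consistency check of the functional equation above (a sketch that uses only the equation as stated, not the paper's derivation), assume $δ>0$ and take the exponential distribution $F(x)=1-e^{-\lambda x}$, $x\ge0$, where the rate $\lambda>0$ is a symbol introduced here for illustration. Then

$$1-F(x+δ)=e^{-\lambda(x+δ)},\qquad c\int_{x}^{\infty}(1-F)(t)\,dt=\frac{c}{\lambda}\,e^{-\lambda x},\qquad x\ge0,$$

so the equation holds on $T=[0,\infty)$ exactly when $c=\lambda e^{-\lambda δ}$. This agrees with the statement that only particular cases of the exponential distribution are solutions.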
Submitted 2 April, 2025;
originally announced April 2025.
-
Estimating hazard rates from $δ$-records in discrete distributions
Authors:
Martín Alcalde,
Miguel Lafuente,
F. Javier López,
Lina Maldonado,
Gerardo Sanz
Abstract:
This paper focuses on nonparametric statistical inference of the hazard rate function of discrete distributions based on $δ$-record data. We derive the explicit expression of the maximum likelihood estimator and determine its exact distribution, as well as some important characteristics such as its bias and mean squared error. We then discuss the construction of confidence intervals and goodness-of-fit tests. The performance of our proposals is evaluated using simulation methods. Applications to real data are given, as well. The estimation of the hazard rate function based on usual records has been studied in the literature, although many procedures require several samples of records. In contrast, our approach relies on a single sequence of $δ$-records, simplifying the experimental design and increasing the applicability of the methods.
Submitted 2 April, 2025;
originally announced April 2025.
-
Regularity of center-outward distribution functions in non-convex domains
Authors:
Eustasio del Barrio,
Alberto González Sanz
Abstract:
For a probability P in $R^d$, its center-outward distribution function $F_{\pm}$, introduced in Chernozhukov et al. (2017) and Hallin et al. (2021), is a new and successful concept of multivariate distribution function based on mass transportation theory. This work proves, for a probability P with density locally bounded away from zero and infinity in its support, the continuity of the center-outward map on the interior of the support of P and the continuity of its inverse, the quantile $Q_{\pm}$. This relaxes the convexity assumption in del Barrio et al. (2020). Some important consequences of this continuity are Glivenko-Cantelli type theorems and the characterisation of weak convergence by the stability of the center-outward map.
Submitted 5 April, 2023; v1 submitted 29 March, 2023;
originally announced March 2023.
-
Nonparametric Multiple-Output Center-Outward Quantile Regression
Authors:
Eustasio del Barrio,
Alberto Gonzalez Sanz,
Marc Hallin
Abstract:
Based on the novel concept of multivariate center-outward quantiles introduced recently in Chernozhukov et al. (2017) and Hallin et al. (2021), we consider the problem of nonparametric multiple-output quantile regression. Our approach defines nested conditional center-outward quantile regression contours and regions with given conditional probability content irrespective of the underlying distribution; their graphs constitute nested center-outward quantile regression tubes. Empirical counterparts of these concepts are constructed, yielding interpretable empirical regions and contours which are shown to consistently reconstruct their population versions in the Pompeiu-Hausdorff topology. Our method is entirely non-parametric and performs well in simulations including heteroskedasticity and nonlinear trends; its power as a data-analytic tool is illustrated on some real datasets.
Submitted 26 April, 2022; v1 submitted 25 April, 2022;
originally announced April 2022.
-
Data-Centric AI Requires Rethinking Data Notion
Authors:
Mustafa Hajij,
Ghada Zamzmi,
Karthikeyan Natesan Ramamurthy,
Aldo Guzman Saenz
Abstract:
The transition towards data-centric AI requires revisiting data notions from mathematical and implementational standpoints to obtain unified data-centric machine learning packages. Towards this end, this work proposes unifying principles offered by categorical and cochain notions of data, and discusses the importance of these principles in the transition to data-centric AI. In the categorical notion, data is viewed as a mathematical structure that we act upon via morphisms to preserve this structure. In the cochain notion, data can be viewed as a function defined on a discrete domain of interest and acted upon via operators. While these notions are almost orthogonal, they provide a unifying definition to view data, ultimately impacting the way machine learning packages are developed, implemented, and utilized by practitioners.
Submitted 2 December, 2021; v1 submitted 6 October, 2021;
originally announced October 2021.
-
Exact and asymptotic properties of $δ$-records in the linear drift model
Authors:
Raúl Gouet,
Miguel Lafuente,
F. Javier López,
Gerardo Sanz
Abstract:
The study of records in the Linear Drift Model (LDM) has attracted much attention recently due to applications in several fields. In the present paper we study $δ$-records in the LDM, defined as observations which are greater than all previous observations, plus a fixed real quantity $δ$. We give analytical properties of the probability of $δ$-records and study the correlation between $δ$-record events. We also analyse the asymptotic behaviour of the number of $δ$-records among the first $n$ observations and give conditions for convergence to the Gaussian distribution. As a consequence of our results, we solve a conjecture posed in J. Stat. Mech. 2010, P10013, regarding the total number of records in an LDM with negative drift. Examples of application to particular distributions, such as Gumbel or Pareto, are also provided. We illustrate our results with a real data set of summer temperatures in Spain, where the LDM is consistent with the global-warming phenomenon.
Submitted 11 June, 2020; v1 submitted 9 June, 2020;
originally announced June 2020.
-
Asymptotic normality for the counting process of weak records and δ-records in discrete models
Authors:
Raúl Gouet,
F. Javier López,
Gerardo Sanz
Abstract:
Let $\{X_n,n\ge1\}$ be a sequence of independent and identically distributed random variables, taking non-negative integer values, and call $X_n$ a $δ$-record if $X_n>\max\{X_1,...,X_{n-1}\}+δ$, where $δ$ is an integer constant. We use martingale arguments to show that the counting process of $δ$-records among the first $n$ observations, suitably centered and scaled, is asymptotically normally distributed for $δ\ne0$. In particular, taking $δ=-1$ we obtain a central limit theorem for the number of weak records.
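The definition of a $δ$-record above can be sketched in a few lines of Python. This is an illustrative sketch only, not code from the paper: the function name `count_delta_records` is hypothetical, and it assumes the convention that the first observation always counts as a $δ$-record (the maximum of an empty set is taken as $-\infty$).

```python
from typing import Optional, Sequence


def count_delta_records(xs: Sequence[int], delta: int) -> int:
    """Count observations x_n with x_n > max(x_1, ..., x_{n-1}) + delta.

    Assumption (not from the source): the first observation is always
    counted, i.e. the maximum of an empty set is treated as -infinity.
    """
    count = 0
    running_max: Optional[int] = None  # max of the observations seen so far
    for x in xs:
        # Compare against the maximum of the *previous* observations.
        if running_max is None or x > running_max + delta:
            count += 1
        # Update the running maximum only after the comparison.
        running_max = x if running_max is None else max(running_max, x)
    return count


if __name__ == "__main__":
    sample = [3, 3, 5, 4, 5, 7]
    print(count_delta_records(sample, 0))   # usual (strict) records
    print(count_delta_records(sample, -1))  # weak records: ties with the maximum count
    print(count_delta_records(sample, 1))   # must exceed the maximum by more than 1
```

With integer data, taking `delta = -1` makes the condition equivalent to $X_n \ge \max\{X_1,\ldots,X_{n-1}\}$, which is why that choice yields weak records.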
Submitted 5 September, 2007;
originally announced September 2007.
-
Dualities for Multi-State Probabilistic Cellular Automata
Authors:
F. J. Lopez,
G. Sanz,
M. Sobottka
Abstract:
In this paper a new form of duality for probabilistic cellular automata (PCA) is introduced. Using this duality, an ergodicity result for processes having a dual is proved. Also, conditions on the probabilities defining the evolution of the processes for the existence of a dual are provided. The results are applied to wide classes of PCA which include multi-opinion voter models, competition models and the Domany-Kinzel model.
Submitted 14 February, 2017; v1 submitted 7 July, 2006;
originally announced July 2006.