-
Large Deviations and the Peano Phenomenon in Stochastic Differential Equations with Homogeneous Drift
Authors:
Paola Bermolen,
Valeria Goicoechea,
José R. León
Abstract:
We consider a diffusion equation in $\mathbb{R}^d$ with drift equal to the gradient of a homogeneous potential of degree $1+\gamma$, with $0<\gamma<1$, and local variance equal to $\varepsilon^2$ with $\varepsilon\to 0$. The associated deterministic system for $\varepsilon=0$ has a potential that is not a Lipschitz function at the origin. Therefore, infinitely many solutions exist, a situation known as the Peano phenomenon. In this work, we study large deviations of first and second order for the system with noise, generalizing previous results for the particular potential $b(x)=x |x|^{\gamma-1}$. For the first-order large deviations, we recover the rate function from the well-known Freidlin-Wentzell work. For the second-order large deviation, we use a refinement of Carmona-Simon bounds for the eigenfunctions of a Schrödinger operator and prove that the exponential behavior of the process depends only on the ground state of such an operator. Moreover, a refined study of the ground state allows us to obtain the large deviation rate function explicitly and to deduce that the family of diffusions converges to the set of extreme solutions of the deterministic system.
Submitted 14 May, 2025; v1 submitted 7 May, 2025;
originally announced May 2025.
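As a rough illustration of the Peano phenomenon described in this abstract, the sketch below simulates the one-dimensional case with drift $b(x)=x|x|^{\gamma-1}$ started at the origin. The Euler-Maruyama scheme, the step size, the seed, and all function names are assumptions made for this illustration and are not taken from the paper. For $\varepsilon=0$ the two extreme solutions leaving the origin are $\pm((1-\gamma)t)^{1/(1-\gamma)}$, so for small $\varepsilon$ a simulated path is expected to end close to one of them.

```python
import math
import random

def drift(x, gamma):
    # b(x) = x |x|^(gamma - 1), written as sign(x) |x|^gamma so that x = 0 is safe.
    return math.copysign(abs(x) ** gamma, x)

def euler_maruyama(gamma, eps, horizon, steps, rng):
    """Simulate dX = b(X) dt + eps dW on [0, horizon], starting at X_0 = 0."""
    dt = horizon / steps
    x = 0.0
    for _ in range(steps):
        x += drift(x, gamma) * dt + eps * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return x

def extreme_solution(gamma, t):
    # Positive extreme solution of x' = b(x) with x(0) = 0.
    return ((1.0 - gamma) * t) ** (1.0 / (1.0 - gamma))

rng = random.Random(0)  # fixed seed, chosen arbitrarily
gamma, eps, horizon = 0.5, 0.01, 2.0
x_end = euler_maruyama(gamma, eps, horizon, steps=2000, rng=rng)
x_ref = extreme_solution(gamma, horizon)
print(abs(x_end), x_ref)
```

The sign of the endpoint depends on the noise realization, which is why the comparison uses its absolute value.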
-
Weighted Random Dot Product Graphs
Authors:
Bernardo Marenco,
Paola Bermolen,
Marcelo Fiori,
Federico Larroca,
Gonzalo Mateos
Abstract:
Modeling of intricate relational patterns has become a cornerstone of contemporary statistical research and related data science fields. Networks, represented as graphs, offer a natural framework for this analysis. This paper extends the Random Dot Product Graph (RDPG) model to accommodate weighted graphs, markedly broadening the model's scope to scenarios where edges exhibit heterogeneous weight distributions. We propose a nonparametric weighted (W)RDPG model that assigns a sequence of latent positions to each node. Inner products of these nodal vectors specify the moments of their incident edge weights' distribution via moment-generating functions. In this way, and unlike prior art, the WRDPG can discriminate between weight distributions that share the same mean but differ in higher-order moments. We derive statistical guarantees for an estimator of the nodes' latent positions adapted from the workhorse adjacency spectral embedding, establishing its consistency and asymptotic normality. We also contribute a generative framework that enables sampling of graphs that adhere to a (prescribed or data-fitted) WRDPG, facilitating, e.g., the analysis and testing of observed graph metrics using judicious reference distributions. The paper is organized to formalize the model's definition, the estimation (or nodal embedding) process and its guarantees, as well as the methodologies for generating weighted graphs, all complemented by illustrative and reproducible examples showcasing the WRDPG's effectiveness in various network analytic applications.
Submitted 6 May, 2025; v1 submitted 6 May, 2025;
originally announced May 2025.
-
Probabilistic Insights for Efficient Exploration Strategies in Reinforcement Learning
Authors:
Ernesto Garcia,
Paola Bermolen,
Matthieu Jonckheere,
Seva Shneer
Abstract:
We investigate efficient exploration strategies of environments with unknown stochastic dynamics and sparse rewards. Specifically, we first analyze the impact of parallel simulations on the probability of reaching rare states within a finite time budget. Using simplified models based on random walks and Lévy processes, we provide analytical results that demonstrate a phase transition in reaching probabilities as a function of the number of parallel simulations. We identify an optimal number of parallel simulations that balances exploration diversity and time allocation. Additionally, we analyze a restarting mechanism that exponentially enhances the probability of success by redirecting efforts toward more promising regions of the state space. Our findings contribute both qualitatively and quantitatively to the theory of some exploration schemes in reinforcement learning, offering insights into developing more efficient strategies for environments characterized by rare events.
Submitted 5 March, 2025;
originally announced March 2025.
-
Gradient-Based Spectral Embeddings of Random Dot Product Graphs
Authors:
Marcelo Fiori,
Bernardo Marenco,
Federico Larroca,
Paola Bermolen,
Gonzalo Mateos
Abstract:
The Random Dot Product Graph (RDPG) is a generative model for relational data, where nodes are represented via latent vectors in low-dimensional Euclidean space. RDPGs crucially postulate that edge formation probabilities are given by the dot product of the corresponding latent positions. Accordingly, the embedding task of estimating these vectors from an observed graph is typically posed as a low-rank matrix factorization problem. The workhorse Adjacency Spectral Embedding (ASE) enjoys solid statistical properties, but it is formally solving a surrogate problem and can be computationally intensive. In this paper, we bring to bear recent advances in non-convex optimization and demonstrate their impact on RDPG inference. We advocate first-order gradient descent methods to better solve the embedding problem, and to organically accommodate broader network embedding applications of practical relevance. Notably, we argue that RDPG embeddings of directed graphs lose interpretability unless the factor matrices are constrained to have orthogonal columns. We thus develop a novel feasible optimization method in the resulting manifold. The effectiveness of the graph representation learning framework is demonstrated on reproducible experiments with both synthetic and real network data. Our open-source algorithm implementations are scalable, and unlike the ASE they are robust to missing edge data and can track slowly varying latent positions from streaming graphs.
Submitted 8 December, 2023; v1 submitted 25 July, 2023;
originally announced July 2023.
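To make the RDPG model and the baseline Adjacency Spectral Embedding mentioned in this abstract concrete, here is a minimal NumPy sketch for undirected graphs. It is not the authors' gradient-based method or their open-source implementation; the function names, the two-group latent positions, and the seed are hypothetical choices for illustration only.

```python
import numpy as np

def sample_rdpg(X, rng):
    """Sample a symmetric, hollow adjacency matrix with P(A_ij = 1) = <x_i, x_j>."""
    P = X @ X.T
    n = P.shape[0]
    # Draw each pair i < j once, then mirror to obtain an undirected graph.
    upper = np.triu(rng.random((n, n)) < P, k=1).astype(float)
    return upper + upper.T

def adjacency_spectral_embedding(A, d):
    """Estimate latent positions from the d largest eigenpairs of A."""
    eigvals, eigvecs = np.linalg.eigh(A)  # eigenvalues in ascending order
    top_vals = np.maximum(eigvals[-d:], 0.0)
    return eigvecs[:, -d:] * np.sqrt(top_vals)

rng = np.random.default_rng(0)
n = 300
base = np.array([[0.7, 0.2], [0.2, 0.7]])  # two hypothetical groups of nodes
X = base[rng.integers(0, 2, size=n)]       # dot products are 0.53 or 0.28
A = sample_rdpg(X, rng)
X_hat = adjacency_spectral_embedding(A, d=2)

# Latent positions are identifiable only up to an orthogonal transformation,
# so compare the implied probability matrices rather than X and X_hat directly.
rel_err = np.linalg.norm(X_hat @ X_hat.T - X @ X.T) / np.linalg.norm(X @ X.T)
print(rel_err)
```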
-
Large deviations for the greedy exploration process on configuration models
Authors:
Paola Bermolen,
Valeria Goicoechea,
Matthieu Jonckheere
Abstract:
We prove a large deviation principle for the greedy exploration of configuration models, building on a time-discretized version of the method proposed by Bermolen et al. and Brightwell et al. for jointly constructing a random graph from a given degree sequence and its exploration. The proof of this result follows the general strategy to study large deviations of processes proposed by Feng and Kurtz, based on the convergence of non-linear semigroups. We provide an intuitive interpretation of the LD cost function using Cramér's theorem for the average of random variables with appropriate distribution, depending on the degree distribution of explored nodes. The rate function can be expressed in a closed-form formula, and the large deviations trajectories can be obtained through explicit associated optimization problems. We then deduce large deviations results for the size of the independent set constructed by the algorithm. As a particular case, we analyze these results for d-regular graphs.
Submitted 23 December, 2021;
originally announced December 2021.
-
Sequential Algorithms and Independent Sets Discovering on Large Sparse Random Graphs
Authors:
Paola Bermolen,
Matthieu Jonckheere,
Federico Larroca,
Manuel Saenz
Abstract:
Computing the size of maximum independent sets is an NP-hard problem for fixed graphs. Characterizing and designing efficient algorithms to estimate this independence number for random graphs are notoriously difficult and still largely open issues. In a companion paper, we showed that a low-complexity degree-greedy exploration is actually asymptotically optimal on a large class of sparse random graphs. Encouraged by this result, we present and study two variants of sequential exploration algorithms: static and dynamic degree-aware explorations. We derive hydrodynamic limits for both of them, which in turn allow us to compute the size of the resulting independent set. Whereas the former is simpler to compute, the latter may be used to arbitrarily approximate the degree-greedy algorithm. Both can be implemented in a distributed manner. The corresponding hydrodynamic limits constitute an efficient method to compute or bound the independence number for a large class of sparse random graphs. As an application, we then show how our method may be used to estimate the capacity of a large 802.11-based wireless network. We finally consider further indicators such as the fairness of the resulting configuration, and show how an unexpected trade-off between fairness and capacity can be achieved.
Submitted 30 September, 2020;
originally announced September 2020.
-
Large Deviation Principle for the Greedy Exploration Algorithm over Erdös-Rényi Graphs
Authors:
P. Bermolen,
V. Goicoechea,
M. Jonckheere,
E. Mordecki
Abstract:
We prove a large deviation principle for a greedy exploration process on an Erdös-Rényi (ER) graph when the number of nodes goes to infinity. To prove our main result, we use the general strategy to study large deviations of processes proposed by Feng and Kurtz, based on the convergence of non-linear semigroups. The rate function can be expressed in a closed-form formula, and associated optimization problems can be solved explicitly, providing the large deviation trajectory. Also, we derive an LDP for the size of the maximum independent set discovered by such an algorithm and analyze the probability that it exceeds known bounds for the maximal independent set. We also analyze the link between these results and the landscape complexity of the independent set and the exploration dynamic.
Submitted 8 October, 2021; v1 submitted 9 July, 2020;
originally announced July 2020.
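The greedy exploration process studied in this entry can be sketched in a few lines: repeatedly pick an unexplored vertex uniformly at random, make it active, and mark its neighbours as explored. The code below is only an illustrative simulation with hypothetical function names, graph size, and seed; it does not reproduce the paper's large deviation analysis. The reference value $\log(1+c)/c$ for an Erdös-Rényi graph with edge probability $c/n$ comes from a standard fluid-limit heuristic and is stated here as an assumption, not as a result quoted from the abstract.

```python
import math
import random

def erdos_renyi(n, p, rng):
    """Adjacency sets of G(n, p): each pair is linked independently with probability p."""
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def greedy_exploration(adj, rng):
    """Return the active vertices selected by a uniformly random greedy exploration."""
    order = list(range(len(adj)))
    # Scanning a uniformly shuffled order and skipping explored vertices has the same
    # law as picking uniformly among the unexplored vertices at each step.
    rng.shuffle(order)
    explored = set()
    active = []
    for v in order:
        if v in explored:
            continue
        active.append(v)     # v becomes active
        explored.add(v)
        explored |= adj[v]   # its neighbours become explored
    return active

rng = random.Random(1)  # fixed seed, chosen arbitrarily
n, c = 2000, 3.0
adj = erdos_renyi(n, c / n, rng)
active = greedy_exploration(adj, rng)
ratio = len(active) / n
heuristic = math.log(1.0 + c) / c  # assumed fluid-limit value for comparison
print(ratio, heuristic)
```

By construction the active vertices form a maximal independent set, which need not be a maximum one.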
-
Scaling Limits and Generic Bounds for Exploration Processes
Authors:
Paola Bermolen,
Matthieu Jonckheere,
Jaron Sanders
Abstract:
We consider exploration algorithms of the random sequential adsorption type both for homogeneous random graphs and random geometric graphs based on spatial Poisson processes. At each step, a vertex of the graph becomes active and its neighboring nodes become explored. Given an initial number of vertices $N$ growing to infinity, we study statistical properties of the proportion of explored nodes in time using scaling limits. We obtain exact limits for homogeneous graphs and prove an explicit central limit theorem for the final proportion of active nodes, known as the \emph{jamming constant}, through a diffusion approximation for the exploration process. We then focus on bounding the trajectories of such exploration processes on random geometric graphs, i.e. random sequential adsorption. As opposed to homogeneous random graphs, these do not allow for a reduction in dimensionality. Instead we build on a fundamental relationship between the number of explored nodes and the discovered volume in the spatial process, and obtain generic bounds: bounds that are independent of the dimension of space and the detailed shape of the volume associated to the discovered node. Lastly, we give two trajectorial interpretations of our bounds by constructing two coupled processes that have the same fluid limits.
Submitted 29 December, 2016;
originally announced December 2016.
-
Scaling limits for exploration algorithms
Authors:
Paola Bermolen,
Matthieu Jonckheere,
Jaron Sanders
Abstract:
We consider an exploration algorithm where, at each step, a random number of items become active while related items get explored. Given an initial number of items $N$ growing to infinity and building on a strong homogeneity assumption, we use scaling limits of Markovian processes to study statistical properties of the proportion of active nodes over time. This is a companion paper that rigorously establishes the claims and heuristics presented in [5].
[5] Jaron Sanders, Matthieu Jonckheere, and Servaas Kokkelmans. Sub-Poissonian statistics of jamming limits in Rydberg gases. 2015. To appear.
Submitted 9 April, 2015;
originally announced April 2015.
-
The Jamming Constant of Uniform Random Graphs
Authors:
Paola Bermolen,
Matthieu Jonckheere,
Pascal Moyal
Abstract:
By jointly constructing a random graph and an associated exploration process, we define the dynamics of a "parking process" on a class of uniform random graphs as a measure-valued Markov process, representing the empirical degree distribution of non-explored nodes. We then establish a functional law of large numbers for this process as the number of vertices grows to infinity, allowing us to assess the jamming constant of the considered random graphs, i.e. the size of the maximal independent set discovered by the exploration algorithm. This technique, which can be applied to any uniform random graph with a given degree distribution, can be seen as a generalization, in the space of measures, of the differential equation method introduced by Wormald.
Submitted 11 April, 2015; v1 submitted 31 October, 2013;
originally announced October 2013.