-
Low-rank bilinear autoregressive models for three-way criminal activity tensors
Authors:
Gregor Zens,
Carlos Díaz,
Daniele Durante,
Eleonora Patacchini
Abstract:
Criminal activity data are typically available via a three-way tensor encoding the reported frequencies of different crime categories across time and space. The challenges that arise in the design of interpretable, yet realistic, model-based representations of the complex dependencies within and across these three dimensions have led to an increasing adoption of black-box predictive strategies. Although this perspective has proved successful in producing accurate forecasts guiding targeted interventions, the lack of interpretable model-based characterizations of the dependence structures underlying criminal activity tensors prevents inference on the cascading effects of these interventions across the different dimensions. We address this gap through the design of a low-rank bilinear autoregressive model which achieves predictive performance comparable to that of black-box strategies, while allowing interpretable inference on the dependence structures of criminal activity reports across crime categories, time, and space. This representation incorporates the time dimension via an autoregressive construction, accounting for spatial effects and dependencies among crime categories through a separable low-rank bilinear formulation. When applied to Chicago police reports, the proposed model showcases remarkable predictive performance and also reveals interpretable dependence structures unveiling fundamental crime dynamics. These results facilitate the design of more refined intervention policies informed by cascading effects of the policy itself.
Submitted 2 May, 2025;
originally announced May 2025.
-
Bayesian local clustering of age-period mortality surfaces across multiple countries
Authors:
Giovanni Romanò,
Emanuele Aliverti,
Daniele Durante
Abstract:
Although traditional literature on mortality modeling has focused on single countries in isolation, recent contributions have progressively moved toward joint models for multiple countries. Besides favoring borrowing of information to improve age-period forecasts, this perspective also has the potential to infer local similarities among countries' mortality patterns in specific age classes and periods that could unveil unexplored demographic trends, while guiding the design of targeted policies. Advancements along this latter relevant direction are currently undermined by the lack of a multi-country model capable of incorporating the core structures of age-period mortality surfaces together with clustering patterns among countries that are not global, but rather vary locally across different combinations of ages and periods. We cover this gap by developing a novel Bayesian model for log-mortality rates that characterizes the age structure of mortality through a B-spline expansion whose country-specific dynamic coefficients encode both changes of this age structure across periods and also local clustering patterns among countries under a time-dependent random partition prior for these country-specific dynamic coefficients. While flexible, this formulation admits tractable posterior inference leveraging a suitably-designed Gibbs sampler. The application to mortality data from 14 countries unveils local similarities highlighting both previously-recognized demographic phenomena and also yet-unexplored trends.
Submitted 7 April, 2025;
originally announced April 2025.
-
Phylogenetic latent space models for network data
Authors:
Federico Pavone,
Daniele Durante,
Robin J. Ryder
Abstract:
Latent space models for network data characterize each node through a vector of latent features whose pairwise similarities define the edge probabilities among pairs of nodes. Although this formulation has led to successful implementations and impactful extensions, the overarching focus has been on directly inferring node embeddings through the latent features rather than learning the generative process underlying the embedding. This focus prevents borrowing of information among the features of different nodes and fails to infer complex higher-level architectures regulating the formation of the network itself. For example, routinely-studied networks often exhibit multiscale structures informing on nested modular hierarchies among nodes that could be learned via tree-based representations of dependencies among latent features. We pursue this direction by developing an innovative phylogenetic latent space model that explicitly characterizes the generative process of the nodes' feature vectors via a branching Brownian motion, with branching structure parametrized by a phylogenetic tree. This tree constitutes the main object of interest and is learned under a Bayesian perspective to infer tree-based modular hierarchies among nodes that explain heterogeneous multiscale patterns in the network. Identifiability results are derived along with posterior consistency theory, and the inference potential of the newly-proposed model is illustrated in simulations and two real-data applications from criminology and neuroscience, where our formulation learns core structures hidden to state-of-the-art alternatives.
Submitted 17 February, 2025;
originally announced February 2025.
-
Zero-inflated stochastic block modeling of efficiency-security tradeoffs in weighted criminal networks
Authors:
Chaoyi Lu,
Daniele Durante,
Nial Friel
Abstract:
Criminal networks arise from the unique attempt to balance the need to establish frequent ties among affiliates to facilitate the coordination of illegal activities, with the necessity to sparsify the overall connectivity architecture to hide from law enforcement. This efficiency-security tradeoff is also combined with the creation of groups of redundant criminals that exhibit similar connectivity patterns, thus guaranteeing resilient network architectures. State-of-the-art models for such data are not designed to infer these unique structures. In contrast to such solutions, we develop a computationally-tractable Bayesian zero-inflated Poisson stochastic block model (ZIP-SBM), which identifies groups of redundant criminals with similar connectivity patterns, and infers both overt and covert block interactions within and across such groups. This is accomplished by modeling weighted ties (corresponding to counts of interactions among pairs of criminals) via zero-inflated Poisson distributions with block-specific parameters that quantify complex patterns in the excess of zero ties in each block (security) relative to the distribution of the observed weighted ties within that block (efficiency). The performance of ZIP-SBM is illustrated in simulations and in a study of summit co-attendances in a complex Mafia organization, where we unveil efficiency-security structures adopted by the criminal organization that were hidden to previous analyses.
Submitted 31 October, 2024;
originally announced October 2024.
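As a rough illustration of the zero-inflated Poisson building block named in the abstract above, the sketch below evaluates the standard ZIP probability mass function for the tie counts of a single block pair. It is not the authors' ZIP-SBM implementation: the function names, the example counts, and the parameter values `lam` (Poisson rate) and `pi` (excess-zero probability) are all hypothetical, and the block allocations, priors, and posterior sampler of the actual model are omitted.

```python
import math


def zip_pmf(y: int, lam: float, pi: float) -> float:
    """P(Y = y) for a zero-inflated Poisson.

    With probability pi the count is a structural zero; otherwise it is
    drawn from a Poisson distribution with rate lam.
    """
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    if y == 0:
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_loglik(counts, lam: float, pi: float) -> float:
    """Log-likelihood of independent ZIP counts sharing the same (lam, pi)."""
    return sum(math.log(zip_pmf(y, lam, pi)) for y in counts)


# Hypothetical tie counts between members of two blocks: many zeros
# together with a few fairly large counts.
block_counts = [0, 0, 0, 3, 0, 1, 0, 4, 0, 0]

plain_poisson = zip_loglik(block_counts, lam=2.0, pi=0.0)
zero_inflated = zip_loglik(block_counts, lam=2.0, pi=0.6)
print(f"Poisson log-likelihood: {plain_poisson:.3f}")
print(f"ZIP log-likelihood:     {zero_inflated:.3f}")
```

For these made-up counts, allowing excess zeros gives a higher log-likelihood than a plain Poisson with the same rate, which is the kind of separation between sparsity and tie intensity that the abstract describes at the block level.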
-
Partially exchangeable stochastic block models for (node-colored) multilayer networks
Authors:
Daniele Durante,
Francesco Gaffi,
Antonio Lijoi,
Igor Prünster
Abstract:
Multilayer networks generalize single-layered connectivity data in several directions. These generalizations include, among others, settings where multiple types of edges are observed among the same set of nodes (edge-colored networks) or where a single notion of connectivity is measured between nodes belonging to different pre-specified layers (node-colored networks). While progress has been made in statistical modeling of edge-colored networks, principled approaches that flexibly account for both within and across layer block-connectivity structures while incorporating layer information through a rigorous probabilistic construction are still lacking for node-colored multilayer networks. We fill this gap by introducing a novel class of partially exchangeable stochastic block models specified in terms of a hierarchical random partition prior for the allocation of nodes to groups, whose number is learned by the model. This goal is achieved without jeopardizing probabilistic coherence, uncertainty quantification and derivation of closed-form predictive within- and across-layer co-clustering probabilities. Our approach facilitates prior elicitation, the understanding of theoretical properties and the development of yet-unexplored predictive strategies for both the connections and the allocations of future incoming nodes. Posterior inference proceeds via a tractable collapsed Gibbs sampler, while performance is illustrated in simulations and in a real-world criminal network application. The notable gains achieved over competitors clarify the importance of developing general stochastic block models based on suitable node-exchangeability structures coherent with the type of multilayer network being analyzed.
Submitted 14 May, 2025; v1 submitted 14 October, 2024;
originally announced October 2024.
-
Optimal lower bounds for logistic log-likelihoods
Authors:
Niccolò Anceschi,
Tommaso Rigon,
Giacomo Zanella,
Daniele Durante
Abstract:
The logit transform is arguably the most widely-employed link function beyond linear settings. This transformation routinely appears in regression models for binary data and provides, either explicitly or implicitly, a core building-block within state-of-the-art methodologies for both classification and regression. Its widespread use, combined with the lack of analytical solutions for the optimization of general losses involving the logit transform, still motivates active research in computational statistics. Among the directions explored, a central one has focused on the design of tangent lower bounds for logistic log-likelihoods that can be tractably optimized, while providing a tight approximation of these log-likelihoods. Although progress along these lines has led to the development of effective minorize-maximize (MM) algorithms for point estimation and coordinate ascent variational inference schemes for approximate Bayesian inference under several logit models, the overarching focus in the literature has been on tangent quadratic minorizers. In fact, it is still unclear whether tangent lower bounds sharper than quadratic ones can be derived without undermining the tractability of the resulting minorizer. This article addresses such a challenging question through the design and study of a novel piece-wise quadratic lower bound that uniformly improves any tangent quadratic minorizer, including the sharpest ones, while admitting a direct interpretation in terms of the classical generalized lasso problem. As illustrated in a ridge logistic regression, this unique connection facilitates more effective implementations than those provided by available piece-wise bounds, while improving the convergence speed of quadratic ones.
Submitted 14 October, 2024;
originally announced October 2024.
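To make the notion of a tangent quadratic minorizer concrete, the sketch below implements one classical example, the Jaakkola-Jordan lower bound on log σ(x), where σ is the logistic function. This is a well-known quadratic bound of the kind the abstract above takes as its baseline; it is not the piece-wise quadratic bound proposed in the article, and the function names are illustrative.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(1 / (1 + exp(-x)))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def quadratic_lower_bound(x: float, xi: float) -> float:
    """Jaakkola-Jordan quadratic minorizer of log_sigmoid.

    The bound is quadratic in x and touches log_sigmoid at x = xi and
    x = -xi. The curvature term tanh(xi / 2) / (4 * xi) tends to 1/8
    as xi goes to 0.
    """
    lam = 0.125 if xi == 0 else math.tanh(xi / 2.0) / (4.0 * xi)
    return log_sigmoid(xi) + (x - xi) / 2.0 - lam * (x * x - xi * xi)


# Gap between the target and the bound on a small grid, tangent point xi = 1.5.
xi = 1.5
for x in (-4.0, -1.5, 0.0, 1.5, 4.0):
    gap = log_sigmoid(x) - quadratic_lower_bound(x, xi)
    print(f"x = {x:+.1f}  gap = {gap:.6f}")
```

The gap is zero at the two tangent points and positive elsewhere. Because the bound is quadratic in x, maximizing it is a least-squares-type problem, which is what makes such minorizers convenient inside MM and coordinate ascent variational schemes.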
-
Skew-symmetric approximations of posterior distributions
Authors:
Francesco Pozza,
Daniele Durante,
Botond Szabo
Abstract:
Routinely-implemented deterministic approximations of posterior distributions from, e.g., Laplace method, variational Bayes and expectation-propagation, generally rely on symmetric approximating densities, often taken to be Gaussian. This choice facilitates optimization and inference, but typically affects the quality of the overall approximation. In fact, even in basic parametric models, the posterior distribution often displays asymmetries that yield bias and reduced accuracy when considering symmetric approximations. Recent research has moved towards more flexible approximating densities that incorporate skewness. However, current solutions are model-specific, lack general supporting theory, increase the computational complexity of the optimization problem, and do not provide a broadly-applicable solution to include skewness in any symmetric approximation. This article addresses such a gap by introducing a general and provably-optimal strategy to perturb any off-the-shelf symmetric approximation of a generic posterior distribution. Crucially, this novel perturbation is derived without additional optimization steps, and yields a similarly-tractable approximation within the class of skew-symmetric densities that provably enhances the finite-sample accuracy of the original symmetric approximation, and, under suitable assumptions, improves its convergence rate to the exact posterior by at least a $\sqrt{n}$ factor, in asymptotic regimes. These advancements are illustrated in numerical studies focusing on skewed perturbations of state-of-the-art Gaussian approximations.
Submitted 21 September, 2024;
originally announced September 2024.
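For readers unfamiliar with skew-symmetric densities, the sketch below shows the general construction 2 f(z) G(w(z)), with f a density symmetric about zero, G the distribution function of a symmetric random variable, and w an odd function. Here f is a Gaussian density, G the Gaussian distribution function, and w a linear function, which gives the classical skew-normal. The skewing function `linear_skew` and its slope are hypothetical choices for illustration only; the article derives its own perturbation, which is not reproduced here.

```python
import math


def normal_pdf(z: float) -> float:
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def skew_symmetric_pdf(x: float, center: float, scale: float, w) -> float:
    """Density 2 * f(z) * G(w(z)) / scale, with z = (x - center) / scale.

    w must be odd, w(-z) = -w(z), so that the result integrates to one.
    """
    z = (x - center) / scale
    return 2.0 * normal_pdf(z) * normal_cdf(w(z)) / scale


def linear_skew(z: float) -> float:
    """Hypothetical odd skewing function (slope chosen arbitrarily)."""
    return 3.0 * z


def integrate(f, lo: float, hi: float, n: int = 20000) -> float:
    """Midpoint-rule quadrature, accurate enough for this illustration."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


total = integrate(lambda x: skew_symmetric_pdf(x, 0.0, 1.0, linear_skew), -12.0, 12.0)
mean = integrate(lambda x: x * skew_symmetric_pdf(x, 0.0, 1.0, linear_skew), -12.0, 12.0)
print(f"total mass: {total:.6f}")
print(f"mean:       {mean:.6f}")
```

The perturbed density still integrates to one but its mean moves away from the center of the symmetric base, which is how a skew-symmetric family can capture posterior asymmetry while remaining as tractable as the symmetric approximation it starts from.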
-
Conjugacy properties of multivariate unified skew-elliptical distributions
Authors:
Maicon J. Karling,
Daniele Durante,
Marc G. Genton
Abstract:
The broad class of multivariate unified skew-normal (SUN) distributions has been recently shown to possess important conjugacy properties. When used as priors for the coefficients vector in probit, tobit, and multinomial probit models, these distributions yield posteriors that still belong to the SUN family. Although this result has led to important advancements in Bayesian inference and computation, its applicability beyond likelihoods associated with fully-observed, discretized, or censored realizations from multivariate Gaussian models remains yet unexplored. This article covers such a gap by proving that the wider family of multivariate unified skew-elliptical (SUE) distributions, which extends SUNs to more general perturbations of elliptical densities, guarantees conjugacy for broader classes of models, beyond those relying on fully-observed, discretized or censored Gaussians. Such a result leverages the closure under linear combinations, conditioning and marginalization of SUE to prove that this family is conjugate to the likelihood induced by multivariate regression models for fully-observed, censored or dichotomized realizations from skew-elliptical distributions. This advancement enlarges the set of models that enable conjugate Bayesian inference to general formulations arising from elliptical and skew-elliptical families, including the multivariate Student's t and skew-t, among others.
Submitted 4 August, 2024; v1 submitted 15 February, 2024;
originally announced February 2024.
-
One EURO for Uranus: the Elliptical Uranian Relativity Orbiter mission
Authors:
Lorenzo Iorio,
Athul P. Girija,
Daniele Durante
Abstract:
Recent years have seen increasing interest in sending a mission to Uranus, visited so far only by Voyager 2 in 1986. EURO (Elliptical Uranian Relativity Orbiter) is a preliminary mission concept investigating the possibility of dynamically measuring the planet's angular momentum by means of the Lense-Thirring effect affecting a putative Uranian orbiter. It is possible, at least in principle, to separate the relativistic precessions of the orbital inclination to the Celestial Equator and of the longitude of the ascending node of the spacecraft from its classical rates of the pericentre induced by the multipoles of the planet's gravity field by adopting an appropriate orbital configuration. For a wide and elliptical $2\,000\times 100\,000\,\mathrm{km}$ orbit, the gravitomagnetic signatures amount to tens of milliarcseconds per year, while, for a suitable choice of the initial conditions, the peak-to-peak amplitude of the range-rate shift can reach the level of $\simeq 1.5\times 10^{-3}$ millimetre per second in a single pericentre passage of a few hours. By lowering the apocentre height to $10\,000\,\mathrm{km}$, the Lense-Thirring precessions are enhanced to the level of hundreds of milliarcseconds per year. The uncertainties in the orientation of the planetary spin axis and in the inclination are major sources of systematic bias; it turns out that they should be determined with accuracies as good as $\simeq 0.1-1$ and $\simeq 1-10$ milliarcseconds, respectively.
Submitted 11 May, 2023; v1 submitted 1 March, 2023;
originally announced March 2023.
-
Skewed Bernstein-von Mises theorem and skew-modal approximations
Authors:
Daniele Durante,
Francesco Pozza,
Botond Szabo
Abstract:
Gaussian approximations are routinely employed in Bayesian statistics to ease inference when the target posterior is intractable. Although these approximations are asymptotically justified by Bernstein-von Mises type results, in practice the expected Gaussian behavior may poorly represent the shape of the posterior, thus affecting approximation accuracy. Motivated by these considerations, we derive an improved class of closed-form approximations of posterior distributions which arise from a new treatment of a third-order version of the Laplace method yielding approximations in a tractable family of skew-symmetric distributions. Under general assumptions which account for misspecified models and non-i.i.d. settings, this family of approximations is shown to have a total variation distance from the target posterior whose rate of convergence improves by at least one order of magnitude the one established by the classical Bernstein-von Mises theorem. Specializing this result to the case of regular parametric models shows that the same improvement in approximation accuracy can be also derived for polynomially bounded posterior functionals. Unlike other higher-order approximations, our results prove that it is possible to derive closed-form and valid densities which are expected to provide, in practice, a more accurate, yet similarly-tractable, alternative to Gaussian approximations of the target posterior, while inheriting its limiting frequentist properties. We strengthen such arguments by developing a practical skew-modal approximation for both joint and marginal posteriors that achieves the same theoretical guarantees of its theoretical counterpart by replacing the unknown model parameters with the corresponding MAP estimate. Empirical studies confirm that our theoretical results closely match the remarkable performance observed in practice, even in finite, possibly small, sample regimes.
Submitted 8 April, 2024; v1 submitted 8 January, 2023;
originally announced January 2023.
-
Design and performance of a Martian autonomous navigation system based on a smallsat constellation
Authors:
S. Molli,
D. Durante,
G. Boscagli,
G. Cascioli,
P. Racioppa,
E. M. Alessi,
S. Simonetti,
L. Vigna,
L. Iess
Abstract:
Deciphering the genesis and evolution of the Martian polar caps can provide crucial understanding of Mars' climate system. The growing scientific interest in the exploration of Mars at high latitudes, and the need to minimize the resources onboard landers and rovers, motivate the need for adequate navigation support from orbit. We propose a novel concept based on a constellation that can support autonomous navigation of different kinds of users devoted to scientific investigations of those regions. We study two constellations, which differ mainly in the semi-major axis, composed of 5 small satellites (based on the SmallSats design being developed at Argotec), offering dedicated coverage of the Mars polar regions. We focus on the architecture of the inter-satellite links (ISL), the key elements providing both ephemerides and time synchronization for the broadcasting of the navigation message. Our concept is based on suitably configured coherent links, able to suppress the adverse effects of on-board clock instabilities and to provide excellent range-rate accuracies between the constellation's nodes. The data quality allows attaining good positioning performance for both constellations with a largely autonomous system. Indeed, we show that ground support can be heavily reduced by employing an ISL communication architecture.
Submitted 3 December, 2022;
originally announced December 2022.
-
INPOP Planetary ephemerides and applications in the frame of the BepiColombo mission including new constraints on the graviton mass and dilaton parameters
Authors:
A. Fienga,
L. Bernus,
O. Minazzoli,
A. Hees,
L. Bigot,
C. Herrera,
V. Mariani,
A. Di Ruscio,
D. Durante,
D. Mary
Abstract:
We present here the new results obtained with the INPOP planetary ephemerides and BepiColombo radio-science simulations. We give new constraints for the classic General Relativity tests in terms of violation of the PPN parameters $β$ and $γ$ and the time variation of the gravitational constant G. We also present new limits for the mass of the graviton and finally we obtain new acceptable intervals for the dilaton parameters $α_{0}$, $α_{T}$ and $α_{G}$. Besides these tests of gravitation, we also study the possibility of detecting the Sun core rotation.
Submitted 9 November, 2022;
originally announced November 2022.
-
Learning and forecasting of age-specific period mortality via B-spline processes with locally-adaptive dynamic coefficients
Authors:
Federico Pavone,
Sirio Legramanti,
Daniele Durante
Abstract:
Although the analysis of human mortality has a well-established history, the attempt to accurately forecast future death-rate patterns for different age groups and time horizons still attracts active research. Such a predictive focus has motivated an increasing shift towards more flexible representations of age-specific period mortality trajectories at the cost of reduced interpretability. Although this perspective has led to successful predictive strategies, the inclusion of interpretable structures in modeling of human mortality can be, in fact, beneficial for improving forecasts. We pursue this direction via a novel B-spline process with locally-adaptive dynamic coefficients. Such a process outperforms state-of-the-art forecasting strategies by explicitly incorporating the core structures of period mortality within an interpretable formulation which enables inference on age-specific mortality trends and the corresponding rates of change across time. This is obtained by modeling the age-specific death counts via a Poisson log-normal model parameterized through a linear combination of B-spline bases with dynamic coefficients that characterize time changes in mortality rates via suitable stochastic differential equations. While flexible, the resulting formulation can be accurately approximated by a Gaussian state-space model that facilitates closed-form Kalman filtering, smoothing and forecasting, for both the trends of the spline coefficients and the corresponding first derivatives, which measure rates of change in mortality for different ages. As illustrated in applications to mortality data from different countries, our model outperforms state-of-the-art methods both in point forecasts and in calibration of predictive intervals. Moreover, it unveils substantial differences in mortality patterns across countries and ages, both in the past decades and during the COVID-19 pandemic.
Submitted 12 January, 2024; v1 submitted 24 September, 2022;
originally announced September 2022.
-
Bayesian conjugacy in probit, tobit, multinomial probit and extensions: A review and new results
Authors:
Niccolò Anceschi,
Augusto Fasano,
Daniele Durante,
Giacomo Zanella
Abstract:
A broad class of models that routinely appear in several fields can be expressed as partially or fully discretized Gaussian linear regressions. Besides including basic Gaussian response settings, this class also encompasses probit, multinomial probit and tobit regression, among others, thereby yielding one of the most widely-implemented families of models in applications. The relevance of such representations has stimulated decades of research in the Bayesian field, mostly motivated by the fact that, unlike for Gaussian linear regression, the posterior distribution induced by such models does not seem to belong to a known class, under the commonly-assumed Gaussian priors for the coefficients. This has motivated several solutions for posterior inference relying on sampling-based strategies or on deterministic approximations that, however, still experience computational and accuracy issues, especially in high dimensions. The scope of this article is to review, unify and extend recent advances in Bayesian inference and computation for this class of models. To address such a goal, we prove that the likelihoods induced by these formulations share a common analytical structure that implies conjugacy with a broad class of distributions, namely the unified skew-normals (SUN), that generalize Gaussians to skewed contexts. This result unifies and extends recent conjugacy properties for specific models within the class analyzed, and opens avenues for improved posterior inference, under a broader class of formulations and priors, via novel closed-form expressions, i.i.d. samplers from the exact SUN posteriors, and more accurate and scalable approximations from VB and EP. Such advantages are illustrated in simulations and are expected to facilitate the routine use of these core Bayesian models, while providing a novel framework to study theoretical properties and develop future extensions.
Submitted 5 March, 2023; v1 submitted 16 June, 2022;
originally announced June 2022.
-
Concentration of discrepancy-based approximate Bayesian computation via Rademacher complexity
Authors:
Sirio Legramanti,
Daniele Durante,
Pierre Alquier
Abstract:
There has been increasing interest on summary-free solutions for approximate Bayesian computation (ABC) which replace distances among summaries with discrepancies between the empirical distributions of the observed data and the synthetic samples generated under the proposed parameter values. The success of these strategies has motivated theoretical studies on the limiting properties of the induced…
▽ More
There has been increasing interest on summary-free solutions for approximate Bayesian computation (ABC) which replace distances among summaries with discrepancies between the empirical distributions of the observed data and the synthetic samples generated under the proposed parameter values. The success of these strategies has motivated theoretical studies on the limiting properties of the induced posteriors. However, there is still the lack of a theoretical framework for summary-free ABC that (i) is unified, instead of discrepancy-specific, (ii) does not require to constrain the analysis to data generating processes and statistical models meeting specific regularity conditions, but rather facilitates the derivation of limiting properties that hold uniformly, and (iii) relies on verifiable assumptions that provide explicit concentration bounds clarifying which factors govern the limiting behavior of the ABC posterior. We address this gap via a novel theoretical framework that introduces the concept of Rademacher complexity in the analysis of the limiting properties for discrepancy-based ABC posteriors, including in non-i.i.d. and misspecified settings. This yields a unified theory that relies on constructive arguments and provides more informative asymptotic results and uniform concentration bounds, even in settings not covered by current studies. These advancements are obtained by relating the asymptotic properties of summary-free ABC posteriors to the behavior of the Rademacher complexity associated with the chosen discrepancy in the family of integral probability semimetrics (IPS). The IPS class extends summary-based distances, and includes the Wasserstein distance and maximum mean discrepancy, among others. As clarified in specialized theoretical analyses of popular IPS discrepancies and via illustrative simulations, this perspective improves the understanding of summary-free ABC.
Submitted 24 January, 2025; v1 submitted 14 June, 2022;
originally announced June 2022.
-
Jupiter's inhomogeneous envelope
Authors:
Y. Miguel,
M. Bazot,
T. Guillot,
S. Howard,
E. Galanti,
Y. Kaspi,
W. B. Hubbard,
B. Militzer,
R. Helled,
S. K. Atreya,
J. E. P. Connerney,
D. Durante,
L. Kulowski,
J. I. Lunine,
D. Stevenson,
S. Bolton
Abstract:
While Jupiter's massive gas envelope consists mainly of hydrogen and helium, the key to understanding Jupiter's formation and evolution lies in the distribution of the remaining (heavy) elements. Before the Juno mission, the lack of high-precision gravity harmonics precluded the use of statistical analyses in a robust determination of the heavy-elements distribution in Jupiter's envelope. In this paper, we assemble the most comprehensive and diverse collection of Jupiter interior models to date and use it to study the distribution of heavy elements in the planet's envelope. We apply a Bayesian statistical approach to our interior model calculations, reproducing the Juno gravitational and atmospheric measurements and constraints from the deep zonal flows. Our results show that the gravity constraints lead to a deep entropy of Jupiter corresponding to a 1 bar temperature 5-15 K higher than traditionally assumed. We also find that uncertainties in the equation of state are crucial when determining the amount of heavy elements in Jupiter's interior. Our models put an upper limit of 7 Earth masses on Jupiter's inner compact core, independently of the structure model (with or without dilute core) and the equation of state considered. Furthermore, we robustly demonstrate that Jupiter's envelope is inhomogeneous, with a heavy-element enrichment in the interior relative to the outer envelope. This implies that heavy element enrichment continued through the gas accretion phase, with important implications for the formation of giant planets in our solar system and beyond.
Submitted 3 March, 2022;
originally announced March 2022.
-
Possible evidence of p-modes in Cassini measurements of Saturn's gravity field
Authors:
Steve Markham,
Daniele Durante,
Luciano Iess,
Dave Stevenson
Abstract:
We analyze the range rate residual data from Cassini's gravity experiment that cannot be explained with a static, zonally symmetric gravity field. In this paper we reproduce the data using a simple forward model of gravity perturbations from normal modes. To do this, we stack data from multiple flybys to improve sensitivity. We find a partially degenerate set of normal mode energy spectra which successfully reproduce the unknown gravity signal from Cassini's flybys. Although there is no unique solution, we find that the models most likely to fit the data are dominated by gravitational contributions from p-modes between 500-700uHz. Because f-modes at lower frequencies have stronger gravity signals for a given amplitude, this result would suggest strong frequency dependence in normal mode excitation on Saturn. We predict peak amplitudes for p-modes on the order of several kilometers, at least an order of magnitude larger than the peak amplitudes inferred by Earth-based observations of Jupiter. The large p-mode amplitudes we predict on Saturn, if they are indeed present and steady state, would imply weak damping with a lower bound of Q>1e7 for these modes, consistent with theoretical predictions.
Submitted 8 June, 2021;
originally announced June 2021.
-
Keys of a Mission to Uranus or Neptune, the Closest Ice Giants
Authors:
Tristan Guillot,
Jonathan Fortney,
Emily Rauscher,
Mark S. Marley,
Vivien Parmentier,
Mike Line,
Hannah Wakeford,
Yohai Kaspi,
Ravit Helled,
Masahiro Ikoma,
Heather Knutson,
Kristen Menou,
Diana Valencia,
Daniele Durante,
Shigeru Ida,
Scott J. Bolton,
Cheng Li,
Kevin B. Stevenson,
Jacob Bean,
Nicolas B. Cowan,
Mark D. Hofstadter,
Ricardo Hueso,
Jeremy Leconte,
Liming Li,
Christoph Mordasini
, et al. (4 additional authors not shown)
Abstract:
Uranus and Neptune are the archetypes of "ice giants", a class of planets that may be among the most common in the Galaxy. They hold the keys to understand the atmospheric dynamics and structure of planets with hydrogen atmospheres inside and outside the solar system; however, they are also the last unexplored planets of the Solar System. Their atmospheres are active and storms are believed to be fueled by methane condensation which is both extremely abundant and occurs at low optical depth. This means that mapping temperature and methane abundance as a function of position and depth will inform us on how convection organizes in an atmosphere with no surface and condensates that are heavier than the surrounding air, a general feature of giant planets. Owing to the spatial and temporal variability of these atmospheres, an orbiter is required. A probe would provide a reference atmospheric profile to lift ambiguities inherent to remote observations. It would also measure the abundances of noble gases which can be used to reconstruct the history of planet formation in the Solar System. Finally, mapping the planets' gravity and magnetic fields will be essential to constrain their global composition, atmospheric dynamics, structure and evolution. An exploration of Uranus or Neptune will be essential to understand these planets and will also be key to constrain and analyze data obtained at Jupiter, Saturn, and for numerous exoplanets with hydrogen atmospheres.
Submitted 17 December, 2020;
originally announced December 2020.
-
Bayesian Testing for Exogenous Partition Structures in Stochastic Block Models
Authors:
Sirio Legramanti,
Tommaso Rigon,
Daniele Durante
Abstract:
Network data often exhibit block structures characterized by clusters of nodes with similar patterns of edge formation. When such relational data are complemented by additional information on exogenous node partitions, these sources of knowledge are typically included in the model to supervise the cluster assignment mechanism or to improve inference on edge probabilities. Although these solutions are routinely implemented, there is a lack of formal approaches to test if a given external node partition is in line with the endogenous clustering structure encoding stochastic equivalence patterns among the nodes in the network. To fill this gap, we develop a formal Bayesian testing procedure which relies on the calculation of the Bayes factor between a stochastic block model with known grouping structure defined by the exogenous node partition and an infinite relational model that allows the endogenous clustering configurations to be unknown, random and fully revealed by the block-connectivity patterns in the network. A simple Markov chain Monte Carlo method for computing the Bayes factor and quantifying uncertainty in the endogenous groups is proposed. This routine is evaluated in simulations and in an application to study exogenous equivalence structures in brain networks of Alzheimer's patients.
Submitted 25 September, 2020;
originally announced September 2020.
-
Scalable computation of predictive probabilities in probit models with Gaussian process priors
Authors:
Jian Cao,
Daniele Durante,
Marc G. Genton
Abstract:
Predictive models for binary data are fundamental in various fields, and the growing complexity of modern applications has motivated several flexible specifications for modeling the relationship between the observed predictors and the binary responses. A widely-implemented solution is to express the probability parameter via a probit mapping of a Gaussian process indexed by predictors. However, unlike for continuous settings, there is a lack of closed-form results for predictive distributions in binary models with Gaussian process priors. Markov chain Monte Carlo methods and approximation strategies provide common solutions to this problem, but state-of-the-art algorithms are either computationally intractable or inaccurate in moderate-to-high dimensions. In this article, we aim to cover this gap by deriving closed-form expressions for the predictive probabilities in probit Gaussian processes that rely either on cumulative distribution functions of multivariate Gaussians or on functionals of multivariate truncated normals. To evaluate these quantities we develop novel scalable solutions based on tile-low-rank Monte Carlo methods for computing multivariate Gaussian probabilities, and on mean-field variational approximations of multivariate truncated normals. Closed-form expressions for the marginal likelihood and for the posterior distribution of the Gaussian process are also discussed. As shown in simulated and real-world empirical studies, the proposed methods scale to dimensions where state-of-the-art solutions are impractical.
Submitted 27 January, 2022; v1 submitted 3 September, 2020;
originally announced September 2020.
-
Extended Stochastic Block Models with Application to Criminal Networks
Authors:
Sirio Legramanti,
Tommaso Rigon,
Daniele Durante,
David B. Dunson
Abstract:
Reliably learning group structures among nodes in network data is challenging in several applications. We are particularly motivated by studying covert networks that encode relationships among criminals. These data are subject to measurement errors, and exhibit a complex combination of an unknown number of core-periphery, assortative and disassortative structures that may unveil key architectures of the criminal organization. The coexistence of these noisy block patterns limits the reliability of routinely-used community detection algorithms, and requires extensions of model-based solutions to realistically characterize the node partition process, incorporate information from node attributes, and provide improved strategies for estimation and uncertainty quantification. To cover these gaps, we develop a new class of extended stochastic block models (ESBM) that infer groups of nodes having common connectivity patterns via Gibbs-type priors on the partition process. This choice encompasses many realistic priors for criminal networks, covering solutions with fixed, random and infinite number of possible groups, and facilitates the inclusion of node attributes in a principled manner. Among the new alternatives in our class, we focus on the Gnedin process as a realistic prior that allows the number of groups to be finite, random and subject to a reinforcement process coherent with criminal networks. A collapsed Gibbs sampler is proposed for the whole ESBM class, and refined strategies for estimation, prediction, uncertainty quantification and model selection are outlined. The ESBM performance is illustrated in realistic simulations and in an application to an Italian mafia network, where we unveil key complex block structures, mostly hidden from state-of-the-art alternatives.
Submitted 4 April, 2022; v1 submitted 16 July, 2020;
originally announced July 2020.
-
A Class of Conjugate Priors for Multinomial Probit Models which Includes the Multivariate Normal One
Authors:
Augusto Fasano,
Daniele Durante
Abstract:
Multinomial probit models are routinely-implemented representations for learning how the class probabilities of categorical response data change with p observed predictors. Although several frequentist methods have been developed for estimation, inference and classification within such a class of models, Bayesian inference is still lagging behind. This is due to the apparent absence of a tractable class of conjugate priors, that may facilitate posterior inference on the multinomial probit coefficients. Such an issue has motivated increasing efforts toward the development of effective Markov chain Monte Carlo methods, but state-of-the-art solutions still face severe computational bottlenecks, especially in high dimensions. In this article, we show that the entire class of unified skew-normal (SUN) distributions is conjugate to several multinomial probit models. Leveraging this result and the SUN properties, we improve upon state-of-the-art solutions for posterior inference and classification both in terms of closed-form results for several functionals of interest, and also by developing novel computational methods relying either on independent and identically distributed samples from the exact posterior or on scalable and accurate variational approximations based on blocked partially-factorized representations. As illustrated in simulations and in a gastrointestinal lesions application, the magnitude of the improvements relative to current methods is particularly evident, in practice, when the focus is on high-dimensional studies.
Submitted 25 January, 2022; v1 submitted 14 July, 2020;
originally announced July 2020.
-
Scalable and Accurate Variational Bayes for High-Dimensional Binary Regression Models
Authors:
Augusto Fasano,
Daniele Durante,
Giacomo Zanella
Abstract:
Modern methods for Bayesian regression beyond the Gaussian response setting are often computationally impractical or inaccurate in high dimensions. In fact, as discussed in recent literature, bypassing such a trade-off is still an open problem even in routine binary regression models, and there is limited theory on the quality of variational approximations in high-dimensional settings. To address this gap, we study the approximation accuracy of routinely-used mean-field variational Bayes solutions in high-dimensional probit regression with Gaussian priors, obtaining novel and practically relevant results on the pathological behavior of such strategies in uncertainty quantification, point estimation and prediction. Motivated by these results, we further develop a new partially-factorized variational approximation for the posterior of the probit coefficients which leverages a representation with global and local variables but, unlike for classical mean-field assumptions, it avoids a fully factorized approximation, and instead assumes a factorization only for the local variables. We prove that the resulting approximation belongs to a tractable class of unified skew-normal densities that crucially incorporates skewness and, unlike for state-of-the-art mean-field solutions, converges to the exact posterior density as p goes to infinity. To solve the variational optimization problem, we derive a tractable CAVI algorithm that easily scales to p in the tens of thousands, and provably requires a number of iterations converging to 1 as p goes to infinity. Such findings are also illustrated in extensive empirical studies where our novel solution is shown to improve the approximation accuracy of mean-field variational Bayes for any n and p, with the magnitude of these gains being remarkable in those high-dimensional p>n settings where state-of-the-art methods are computationally impractical.
Submitted 13 April, 2022; v1 submitted 15 November, 2019;
originally announced November 2019.
-
Determining the depth of Jupiter's Great Red Spot with Juno: a Slepian approach
Authors:
Eli Galanti,
Yohai Kaspi,
Frederik J. Simons,
Daniele Durante,
Marzia Parisi,
Scott J. Bolton
Abstract:
One of Jupiter's most prominent atmospheric features, the Great Red Spot (GRS), has been observed for more than two centuries, yet little is known about its structure and dynamics below its observed cloud-level. While its anticyclonic vortex appearance suggests it might be a shallow weather-layer feature, the very long time span for which it was observed implies it is likely deeply rooted, otherwise it would have been sheared apart by Jupiter's turbulent atmosphere. Determining the GRS depth will shed light not only on the processes governing the GRS, but on the dynamics of Jupiter's atmosphere as a whole. The Juno mission single flyby over the GRS (PJ7) discovered using microwave radiometer measurements that the GRS is at least a couple hundred kilometers deep (Li et al. 2017). The next flybys over the GRS (PJ18 and PJ21), will allow high-precision gravity measurements that can be used to estimate how deep the GRS winds penetrate below the cloud-level. Here we propose a novel method to determine the depth of the GRS based on the new gravity measurements and a Slepian function approach that enables an effective representation of the wind-induced spatially-confined gravity signal, and an efficient determination of the GRS depth given the limited measurements. We show that with this method the gravity signal of the GRS should be detectable for wind depths deeper than 300 kilometers, with reasonable uncertainties that depend on depth (e.g., $\pm$100km for a GRS depth of 1000km).
Submitted 24 March, 2019;
originally announced March 2019.
-
A closed-form filter for binary time series
Authors:
Augusto Fasano,
Giovanni Rebaudo,
Daniele Durante,
Sonia Petrone
Abstract:
Non-Gaussian state-space models arise in several applications, and within this framework the binary time series setting provides a relevant example. However, unlike for Gaussian state-space models - where filtering, predictive and smoothing distributions are available in closed form - binary state-space models require approximations or sequential Monte Carlo strategies for inference and prediction. This is due to the apparent absence of conjugacy between the Gaussian states and the likelihood induced by the observation equation for the binary data. In this article we prove that the filtering, predictive and smoothing distributions in dynamic probit models with Gaussian state variables are, in fact, available and belong to a class of unified skew-normals (SUN) whose parameters can be updated recursively in time via analytical expressions. Also the key functionals of these distributions are, in principle, available, but their calculation requires the evaluation of multivariate Gaussian cumulative distribution functions. Leveraging SUN properties, we address this issue via novel Monte Carlo methods based on independent samples from the smoothing distribution, that can easily be adapted to the filtering and predictive case, thus improving state-of-the-art approximate and sequential Monte Carlo inference in small-to-moderate dimensional studies. Novel sequential Monte Carlo procedures that exploit the SUN properties are also developed to deal with online inference in high dimensions. Performance gains over competitors are outlined in a financial application.
Submitted 18 May, 2021; v1 submitted 19 February, 2019;
originally announced February 2019.
-
Bayesian cumulative shrinkage for infinite factorizations
Authors:
Sirio Legramanti,
Daniele Durante,
David B. Dunson
Abstract:
There is a wide variety of models in which the dimension of the parameter space is unknown. For example, in factor analysis the number of latent factors is typically not known and has to be inferred from the observed data. Although classical shrinkage priors are useful in these contexts, increasing shrinkage priors can provide a more effective option, which progressively penalizes expansions with growing complexity. In this article we propose a novel increasing shrinkage prior, named the cumulative shrinkage process, for the parameters controlling the dimension in over-complete formulations. Our construction has broad applicability, simple interpretation, and is based on a sequence of spike and slab distributions which assign increasing mass to the spike as model complexity grows. Using factor analysis as an illustrative example, we show that this formulation has theoretical and practical advantages over current competitors, including an improved ability to recover the model dimension. An adaptive Markov chain Monte Carlo algorithm is proposed, and the methods are evaluated in simulation studies and applied to personality traits data.
Submitted 10 September, 2020; v1 submitted 12 February, 2019;
originally announced February 2019.
-
Saturn's deep atmospheric flows revealed by the Cassini Grand Finale gravity measurements
Authors:
Eli Galanti,
Yohai Kaspi,
Yamila Miguel,
Tristan Guillot,
Daniele Durante,
Paolo Racioppa,
Luciano Iess
Abstract:
How deep Saturn's zonal winds penetrate below the cloud level has been a decades-long question, with important implications not only for the atmospheric dynamics, but also for the interior density structure, composition, magnetic field and core mass. The Cassini Grand Finale gravity experiment enables answering this question for the first time, with the premise that the planet's gravity harmonics are affected not only by the rigid body density structure but also by its flow field. Using a wide range of rigid body interior models and an adjoint-based thermal wind balance, we calculate the optimal flow structure below the cloud level and its depth. We find that a wind profile largely consistent with the observed winds, when extended to a depth of around 8,800 km, explains all the gravity harmonics measured by Cassini. This solution is in agreement with considerations of angular momentum conservation, and is consistent with magnetohydrodynamics constraints.
Submitted 12 February, 2019;
originally announced February 2019.
-
Conjugate Bayes for probit regression via unified skew-normal distributions
Authors:
Daniele Durante
Abstract:
Regression models for dichotomous data are ubiquitous in statistics. Besides being useful for inference on binary responses, these methods serve also as building blocks in more complex formulations, such as density regression, nonparametric classification and graphical models. Within the Bayesian framework, inference proceeds by updating the priors for the coefficients, typically set to be Gaussians, with the likelihood induced by probit or logit regressions for the responses. In this updating, the apparent absence of a tractable posterior has motivated a variety of computational methods, including Markov Chain Monte Carlo routines and algorithms which approximate the posterior. Despite being routinely implemented, Markov Chain Monte Carlo strategies face mixing or time-inefficiency issues in large p and small n studies, whereas approximate routines fail to capture the skewness typically observed in the posterior. This article proves that the posterior distribution for the probit coefficients has a unified skew-normal kernel, under Gaussian priors. Such a novel result allows efficient Bayesian inference for a wide class of applications, especially in large p and small-to-moderate n studies where state-of-the-art computational methods face notable issues. These advances are outlined in a genetic study, and further motivate the development of a wider class of conjugate priors for probit models along with methods to obtain independent and identically distributed samples from the unified skew-normal posterior.
Submitted 17 November, 2019; v1 submitted 26 February, 2018;
originally announced February 2018.
-
Conditionally conjugate mean-field variational Bayes for logistic models
Authors:
Daniele Durante,
Tommaso Rigon
Abstract:
Variational Bayes (VB) is a common strategy for approximate Bayesian inference, but simple methods are only available for specific classes of models including, in particular, representations having conditionally conjugate constructions within an exponential family. Models with logit components are an apparently notable exception to this class, due to the absence of conjugacy between the logistic likelihood and the Gaussian priors for the coefficients in the linear predictor. To facilitate approximate inference within this widely used class of models, Jaakkola and Jordan (2000) proposed a simple variational approach which relies on a family of tangent quadratic lower bounds of logistic log-likelihoods, thus restoring conjugacy between these approximate bounds and the Gaussian priors. This strategy is still implemented successfully, but fewer attempts have been made to formally understand the reasons underlying its excellent performance. To cover this key gap, we provide a formal connection between the above bound and a recent Pólya-gamma data augmentation for logistic regression. Such a result places the computational methods associated with the aforementioned bounds within the framework of variational inference for conditionally conjugate exponential family models, thereby allowing recent advances for this class to be inherited also by the methods relying on Jaakkola and Jordan (2000).
Submitted 25 October, 2018; v1 submitted 19 November, 2017;
originally announced November 2017.
-
A nested expectation-maximization algorithm for latent class models with covariates
Authors:
Daniele Durante,
Antonio Canale,
Tommaso Rigon
Abstract:
We develop a nested EM routine for latent class models with covariates which allows maximization of the full-model log-likelihood and, differently from current methods, guarantees monotone log-likelihood sequences along with improved convergence rates.
Submitted 2 August, 2018; v1 submitted 10 May, 2017;
originally announced May 2017.
-
Tractable Bayesian Density Regression via Logit Stick-Breaking Priors
Authors:
Tommaso Rigon,
Daniele Durante
Abstract:
There is a growing interest in learning how the distribution of a response variable changes with a set of predictors. Bayesian nonparametric dependent mixture models provide a flexible approach to address this goal. However, several formulations require computationally demanding algorithms for posterior inference. Motivated by this issue, we study a class of predictor-dependent infinite mixture models, which relies on a simple representation of the stick-breaking prior via sequential logistic regressions. This formulation maintains the same desirable properties of popular predictor-dependent stick-breaking priors, and leverages a recent Pólya-gamma data augmentation to facilitate the implementation of several computational methods for posterior inference. These routines include Markov chain Monte Carlo via Gibbs sampling, expectation-maximization algorithms, and mean-field variational Bayes for scalable inference, thereby stimulating a wider implementation of Bayesian density regression by practitioners. The algorithms associated with these methods are presented in detail and tested in a toxicology study.
Submitted 4 May, 2020; v1 submitted 11 January, 2017;
originally announced January 2017.
-
Convex Mixture Regression for Quantitative Risk Assessment
Authors:
Antonio Canale,
Daniele Durante,
David Dunson
Abstract:
There is wide interest in studying how the distribution of a continuous response changes with a predictor. We are motivated by environmental applications in which the predictor is the dose of an exposure and the response is a health outcome. A main focus in these studies is inference on dose levels associated with a given increase in risk relative to a baseline. Popular methods either dichotomize the continuous response or focus on modeling changes with the dose in the expectation of the outcome. Such choices may lead to information loss and provide inaccurate inference on dose-response relationships. We instead propose a Bayesian convex mixture regression model that allows the entire distribution of the health outcome to be unknown and changing with the dose. To balance flexibility and parsimony, we rely on a mixture model for the density at the extreme doses, and express the conditional density at each intermediate dose via a convex combination of these extremal densities. This representation generalizes classical dose-response models for quantitative outcomes, and provides a more parsimonious, but still powerful, formulation compared to nonparametric methods, thereby improving interpretability and efficiency in inference on risk functions. A Markov chain Monte Carlo algorithm for posterior inference is developed, and the benefits of our methods are outlined in simulations, along with a study on the impact of DDT exposure on gestational age.
Submitted 9 May, 2018; v1 submitted 11 January, 2017;
originally announced January 2017.
-
A note on the multiplicative gamma process
Authors:
Daniele Durante
Abstract:
Adaptive dimensionality reduction in high-dimensional problems is a key topic in statistics. The multiplicative gamma process takes a relevant step in this direction, but improved studies on its properties are required to ease implementation. This note addresses this aim.
Submitted 30 October, 2016; v1 submitted 11 October, 2016;
originally announced October 2016.
-
The effect of Jupiter oscillations on Juno gravity measurements
Authors:
D. Durante,
T. Guillot,
L. Iess
Abstract:
Seismology represents a unique method to probe the interiors of giant planets. Recently, Saturn's f-modes have been indirectly observed in its rings, and there is strong evidence for the detection of Jupiter global modes by means of ground-based, spatially-resolved, velocimetry measurements. We propose to exploit Juno's extremely accurate radio science data by looking at the gravity perturbations that Jupiter's acoustic modes would produce. We evaluate the perturbation to Jupiter's gravitational field using the oscillation spectrum of a polytrope with index 1 and the corresponding radial eigenfunctions. We show that Juno will be most sensitive to the fundamental mode ($n=0$), unless its amplitude is smaller than 0.5 cm/s, i.e. 100 times weaker than the $n \sim\ 4 - 11$ modes detected by spatially-resolved velocimetry. The oscillations yield contributions to Juno's measured gravitational coefficients similar to or larger than those expected from shallow zonal winds (extending to depths less than 300 km). In the case of a strong f-mode (radial velocity $\sim$ 30 cm/s), these contributions would become of the same order as those expected from deep zonal winds (extending to 3000 km), especially on the low degree zonal harmonics, therefore requiring a new approach to the analysis of Juno data.
Submitted 2 October, 2016;
originally announced October 2016.
-
Bayesian Learning of Dynamic Multilayer Networks
Authors:
Daniele Durante,
Nabanita Mukherjee,
Rebecca C. Steorts
Abstract:
A plethora of networks is being collected in a growing number of fields, including disease transmission, international relations, social interactions, and others. As data streams continue to grow, the complexity associated with these highly multidimensional connectivity data presents novel challenges. In this paper, we focus on the time-varying interconnections among a set of actors in multiple contexts, called layers. Current literature lacks flexible statistical models for dynamic multilayer networks, which can enhance quality in inference and prediction by efficiently borrowing information within each network, across time, and between layers. Motivated by this gap, we develop a Bayesian nonparametric model leveraging latent space representations. Our formulation characterizes the edge probabilities as a function of shared and layer-specific actors' positions in a latent space, with these positions changing in time via Gaussian processes. This representation facilitates dimensionality reduction and incorporates different sources of information in the observed data. In addition, we obtain tractable procedures for posterior computation, inference, and prediction. We provide theoretical results on the flexibility of our model. Our methods are tested on simulations and infection studies monitoring dynamic face-to-face contacts among individuals in multiple days, where we perform better than current methods in inference and prediction.
Submitted 30 December, 2016; v1 submitted 7 August, 2016;
originally announced August 2016.
-
Bayesian inference on group differences in multivariate categorical data
Authors:
Massimiliano Russo,
Daniele Durante,
Bruno Scarpa
Abstract:
Multivariate categorical data are common in many fields. We are motivated by election poll studies assessing evidence of changes in voters' opinions with their candidate preferences in the 2016 United States Presidential primaries or caucuses. Similar goals arise routinely in several applications, but current literature lacks a general methodology which combines flexibility, efficiency, and tractability in testing for group differences in multivariate categorical data at different (potentially complex) scales. We address this goal by leveraging a Bayesian representation which factorizes the joint probability mass function for the group variable and the multivariate categorical data as the product of the marginal probabilities for the groups, and the conditional probability mass function of the multivariate categorical data, given the group membership. To enhance flexibility, we define the conditional probability mass function of the multivariate categorical data via a group-dependent mixture of tensor factorizations, thus facilitating dimensionality reduction and borrowing of information, while providing tractable procedures for computation, and accurate tests assessing global and local group differences. We compare our methods with popular competitors, and discuss improved performance in simulations and in American election poll studies.
Submitted 9 August, 2017; v1 submitted 30 June, 2016;
originally announced June 2016.
-
Bayesian Network--Response Regression
Authors:
Lu Wang,
Daniele Durante,
Rex E. Jung,
David B. Dunson
Abstract:
There is increasing interest in learning how human brain networks vary as a function of a continuous trait, but flexible and efficient procedures to accomplish this goal are limited. We develop a Bayesian semiparametric model, which combines low-rank factorizations and flexible Gaussian process priors to learn changes in the conditional expectation of a network-valued random variable across the values of a continuous predictor, while including subject-specific random effects. The formulation leads to a general framework for inference on changes in brain network structures across human traits, facilitating borrowing of information and coherently characterizing uncertainty. We provide an efficient Gibbs sampler for posterior computation along with simple procedures for inference, prediction and goodness-of-fit assessments. The model is applied to learn how human brain networks vary across individuals with different intelligence scores. Results provide interesting insights on the association between intelligence and brain connectivity, while demonstrating good predictive performance.
Submitted 31 January, 2017; v1 submitted 2 June, 2016;
originally announced June 2016.
-
Unifying inference on brain network variations in neurological diseases: The Alzheimer's case
Authors:
Daniele Durante,
Madelaine Daianu,
Neda Jahanshad,
Paul M. Thompson,
David B. Dunson
Abstract:
There is growing interest in understanding how the structural interconnections among brain regions change with the occurrence of neurological diseases. Diffusion-weighted MRI has allowed researchers to non-invasively estimate a network of structural cortical connections made by white matter tracts, but current statistical methods for relating such networks to the presence or absence of a disease cannot exploit this rich network information. Standard practice considers each edge independently or summarizes the network with a few simple features. We enable dramatic gains in biological insight via a novel unifying methodology for inference on brain network variations associated with the occurrence of neurological diseases. The key to this approach is to define a probabilistic generative mechanism directly on the space of network configurations via dependent mixtures of low-rank factorizations, which efficiently exploit network information and allow the probability mass function for the brain network-valued random variable to vary flexibly across the group of patients characterized by a specific neurological disease and the one comprising age-matched cognitively healthy individuals.
Submitted 19 October, 2015;
originally announced October 2015.
-
Bayesian modeling of networks in complex business intelligence problems
Authors:
Daniele Durante,
Sally Paganin,
Bruno Scarpa,
David B. Dunson
Abstract:
Complex network data problems are increasingly common in many fields of application. Our motivation is drawn from strategic marketing studies monitoring customer choices of specific products, along with co-subscription networks encoding multiple purchasing behavior. Data are available for several agencies within the same insurance company, and our goal is to efficiently exploit co-subscription networks to inform targeted advertising of cross-sell strategies to currently mono-product customers. We address this goal by developing a Bayesian hierarchical model, which clusters agencies according to common mono-product customer choices and co-subscription networks. Within each cluster, we efficiently model customer behavior via a cluster-dependent mixture of latent eigenmodels. This formulation provides key information on mono-product customer choices and multiple purchasing behavior within each cluster, informing targeted cross-sell strategies. We develop simple algorithms for tractable inference, and assess performance in simulations and an application to business intelligence.
Submitted 28 March, 2016; v1 submitted 2 October, 2015;
originally announced October 2015.
-
Locally Adaptive Dynamic Networks
Authors:
Daniele Durante,
David B. Dunson
Abstract:
Our focus is on realistically modeling and forecasting dynamic networks of face-to-face contacts among individuals. Important aspects of such data that lead to problems with current methods include the tendency of the contacts to move between periods of slow and rapid changes, and the dynamic heterogeneity in the actors' connectivity behaviors. Motivated by this application, we develop a novel method for Locally Adaptive DYnamic (LADY) network inference. The proposed model relies on a dynamic latent space representation in which each actor's position evolves in time via stochastic differential equations. Using a state space representation for these stochastic processes and Pólya-gamma data augmentation, we develop an efficient MCMC algorithm for posterior inference along with tractable procedures for online updating and forecasting of future networks. We evaluate performance in simulation studies, and consider an application to face-to-face contacts among individuals in a primary school.
Submitted 18 August, 2016; v1 submitted 21 May, 2015;
originally announced May 2015.
-
The Compass for Statistical Researchers
Authors:
Daniele Durante,
Davide Vidotto,
Sabrina Vettori
Abstract:
We have hiked many miles alongside several professors as we traversed our statistical path -- a regime switching trail which changed direction following a class on the foundations of our discipline. As we play the game of research in that limbo between student and academic, one thing among Prof. Bernardi's teachings has never been more clear: to draw a route in the research map you not only need to know your destination, but you must also understand where you are and how you arrived there.
Submitted 8 January, 2015;
originally announced January 2015.
-
Bayesian Inference and Testing of Group Differences in Brain Networks
Authors:
Daniele Durante,
David B. Dunson
Abstract:
Network data are increasingly collected along with other variables of interest. Our motivation is drawn from neurophysiology studies measuring brain connectivity networks for a sample of individuals along with their membership to a low or high creative reasoning group. It is of paramount importance to develop statistical methods for testing of global and local changes in the structural interconnections among brain regions across groups. We develop a general Bayesian procedure for inference and testing of group differences in the network structure, which relies on a nonparametric representation for the conditional probability mass function associated with a network-valued random variable. By leveraging a mixture of low-rank factorizations, we allow simple global and local hypothesis testing adjusting for multiplicity. An efficient Gibbs sampler is defined for posterior computation. We provide theoretical results on the flexibility of the model and assess testing performance in simulations. The approach is applied to provide novel insights on the relationships between human brain networks and creativity.
Submitted 17 August, 2016; v1 submitted 24 November, 2014;
originally announced November 2014.
-
Nonparametric Bayes Modeling of Populations of Networks
Authors:
Daniele Durante,
David B. Dunson,
Joshua T. Vogelstein
Abstract:
Replicated network data are increasingly available in many research fields. In connectomic applications, inter-connections among brain regions are collected for each patient under study, motivating statistical models which can flexibly characterize the probabilistic generative mechanism underlying these network-valued data. Available models for a single network are not designed specifically for inference on the entire probability mass function of a network-valued random variable and therefore lack flexibility in characterizing the distribution of relevant topological structures. We propose a flexible Bayesian nonparametric approach for modeling the population distribution of network-valued data. The joint distribution of the edges is defined via a mixture model which reduces dimensionality and efficiently incorporates network information within each mixture component by leveraging latent space representations. The formulation leads to an efficient Gibbs sampler and provides simple and coherent strategies for inference and goodness-of-fit assessments. We provide theoretical results on the flexibility of our model and illustrate improved performance --- compared to state-of-the-art models --- in simulations and application to human brain networks.
Submitted 5 June, 2016; v1 submitted 30 June, 2014;
originally announced June 2014.
-
Bayesian semiparametric modelling of contraceptive behavior in India via sequential logistic regressions
Authors:
Tommaso Rigon,
Daniele Durante,
Nicola Torelli
Abstract:
Family planning has been characterized by highly different strategic programs in India, including method-specific contraceptive targets, coercive sterilization, and more recent target-free approaches. These major changes in family planning policies over time have motivated a considerable interest towards assessing the effectiveness of the different programs, while understanding which subsets of the population have not been properly addressed. Current studies consider specific aspects of the above policies, including, for example, the factors associated with the choice of alternative contraceptive methods other than sterilization, for women using contraceptives. Although these analyses produce relevant insights, they fail to provide a global overview of the different family planning policies, and the determinants underlying the contraceptive choices. Motivated by this consideration, we propose a Bayesian semiparametric model relying on a reparameterization of the multinomial probability mass function via a set of conditional Bernoulli choices. The sequential binary structure is defined to be consistent with the current family planning policies in India, and coherent with a reasonable process characterizing the contraceptive choices. This combination of flexible representations and careful reparameterizations allows a broader and interpretable overview of the different policies and contraceptive preferences in India, within a single model.
Submitted 31 July, 2017; v1 submitted 29 May, 2014;
originally announced May 2014.
-
Bayesian dynamic financial networks with time-varying predictors
Authors:
Daniele Durante,
David B. Dunson
Abstract:
We propose a Bayesian nonparametric model including time-varying predictors in dynamic network inference. The model is applied to infer the dependence structure among financial markets during the global financial crisis, estimating effects of verbal and material cooperation efforts. We interestingly learn contagion effects, with increasing influence of verbal relations during the financial crisis and opposite results during the United States housing bubble.
Submitted 10 March, 2014;
originally announced March 2014.
-
Nonparametric Bayes dynamic modeling of relational data
Authors:
Daniele Durante,
David B. Dunson
Abstract:
Symmetric binary matrices representing relations among entities are commonly collected in many areas. Our focus is on dynamically evolving binary relational matrices, with interest being in inference on the relationship structure and prediction. We propose a nonparametric Bayesian dynamic model, which reduces dimensionality in characterizing the binary matrix through a lower-dimensional latent space representation, with the latent coordinates evolving in continuous time via Gaussian processes. By using a logistic mapping function from the probability matrix space to the latent relational space, we obtain a flexible and computationally tractable formulation. Employing Pólya-gamma data augmentation, an efficient Gibbs sampler is developed for posterior computation, with the dimension of the latent space automatically inferred. We provide some theoretical results on flexibility of the model, and illustrate performance via simulation experiments. We also consider an application to co-movements in world financial markets.
Submitted 19 November, 2013;
originally announced November 2013.
-
Locally adaptive factor processes for multivariate time series
Authors:
Daniele Durante,
Bruno Scarpa,
David B. Dunson
Abstract:
In modeling multivariate time series, it is important to allow time-varying smoothness in the mean and covariance process. In particular, there may be certain time intervals exhibiting rapid changes and others in which changes are slow. If such time-varying smoothness is not accounted for, one can obtain misleading inferences and predictions, with over-smoothing across erratic time intervals and under-smoothing across times exhibiting slow variation. This can lead to mis-calibration of predictive intervals, which can be substantially too narrow or wide depending on the time. We propose a locally adaptive factor process for characterizing multivariate mean-covariance changes in continuous time, allowing locally varying smoothness in both the mean and covariance matrix. This process is constructed utilizing latent dictionary functions evolving in time through nested Gaussian processes and linearly related to the observed data with a sparse mapping. Using a differential equation representation, we bypass usual computational bottlenecks in obtaining MCMC and online algorithms for approximate Bayesian inference. The performance is assessed in simulations and illustrated in a financial application.
Submitted 21 June, 2013; v1 submitted 7 October, 2012;
originally announced October 2012.