Showing 1–17 of 17 results for author: Mesquita, D

  1. arXiv:2507.00647

    cs.LG

    Cooperative Sheaf Neural Networks

    Authors: André Ribeiro, Ana Luiza Tenório, Juan Belieni, Amauri H. Souza, Diego Mesquita

    Abstract: Sheaf diffusion has recently emerged as a promising design pattern for graph representation learning due to its inherent ability to handle heterophilic data and avoid oversmoothing. Meanwhile, cooperative message passing has also been proposed as a way to enhance the flexibility of information diffusion by allowing nodes to independently choose whether to propagate/gather information from/to neigh…

    Submitted 1 July, 2025; originally announced July 2025.

  2. arXiv:2504.08086

    cs.LG cs.CR cs.DB

    Differentially Private Selection using Smooth Sensitivity

    Authors: Iago Chaves, Victor Farias, Amanda Perez, Diego Mesquita, Javam Machado

    Abstract: Differentially private selection mechanisms offer strong privacy guarantees for queries aiming to identify the top-scoring element r from a finite set R, based on a dataset-dependent utility function. While selection queries are fundamental in data science, few mechanisms effectively ensure their privacy. Furthermore, most approaches rely on global sensitivity to achieve differential privacy (DP),…

    Submitted 10 April, 2025; originally announced April 2025.

    Comments: This is the full version of our paper "Differentially Private Selection using Smooth Sensitivity", which will appear in IEEE Security & Privacy 2025 as a regular research paper

  3. arXiv:2411.05899

    cs.LG

    Streaming Bayes GFlowNets

    Authors: Tiago da Silva, Daniel Augusto de Souza, Diego Mesquita

    Abstract: Bayes' rule naturally allows for inference refinement in a streaming fashion, without the need to recompute posteriors from scratch whenever new data arrives. In principle, Bayesian streaming is straightforward: we update our prior with the available data and use the resulting posterior as a prior when processing the next data chunk. In practice, however, this recipe entails i) approximating an in…

    Submitted 8 November, 2024; originally announced November 2024.

    Comments: 25 pages, 8 figures

  4. arXiv:2410.09355

    cs.LG stat.ML

    On Divergence Measures for Training GFlowNets

    Authors: Tiago da Silva, Eliezer de Souza da Silva, Diego Mesquita

    Abstract: Generative Flow Networks (GFlowNets) are amortized inference models designed to sample from unnormalized distributions over composable objects, with applications in generative modeling for tasks in fields such as causal discovery, NLP, and drug discovery. Traditionally, the training procedure for GFlowNets seeks to minimize the expected log-squared difference between a proposal (forward policy) an…

    Submitted 21 October, 2024; v1 submitted 11 October, 2024; originally announced October 2024.

    Comments: Accepted at NeurIPS 2024, https://openreview.net/forum?id=N5H4z0Pzvn

    MSC Class: 68T05 ACM Class: G.3; I.5.1; I.2.8; I.2.6

  5. arXiv:2406.03288

    cs.LG stat.ML

    Embarrassingly Parallel GFlowNets

    Authors: Tiago da Silva, Luiz Max Carvalho, Amauri Souza, Samuel Kaski, Diego Mesquita

    Abstract: GFlowNets are a promising alternative to MCMC sampling for discrete compositional random variables. Training GFlowNets requires repeated evaluations of the unnormalized target distribution or reward function. However, for large-scale posterior sampling, this may be prohibitive since it incurs traversing the data several times. Moreover, if the data are distributed across clients, employing standar…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted to ICML 2024

  6. arXiv:2403.04605

    cs.LG

    In-n-Out: Calibrating Graph Neural Networks for Link Prediction

    Authors: Erik Nascimento, Diego Mesquita, Samuel Kaski, Amauri H Souza

    Abstract: Deep neural networks are notoriously miscalibrated, i.e., their outputs do not reflect the true probability of the event we aim to predict. While networks for tabular or image data are usually overconfident, recent works have shown that graph neural networks (GNNs) show the opposite behavior for node-level classification. But what happens when we are predicting links? We show that, in this case, G…

    Submitted 8 March, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

    Comments: 18 pages, 4 figures, 8 tables

  7. arXiv:2310.11527

    stat.ML cs.LG

    Thin and Deep Gaussian Processes

    Authors: Daniel Augusto de Souza, Alexander Nikitin, ST John, Magnus Ross, Mauricio A. Álvarez, Marc Peter Deisenroth, João P. P. Gomes, Diego Mesquita, César Lincoln C. Mattos

    Abstract: Gaussian processes (GPs) can provide a principled approach to uncertainty quantification with easy-to-interpret kernel hyperparameters, such as the lengthscale, which controls the correlation distance of function values. However, selecting an appropriate kernel can be challenging. Deep GPs avoid manual kernel engineering by successively parameterizing kernels with GP layers, allowing them to learn…

    Submitted 17 October, 2023; originally announced October 2023.

    Comments: Accepted at the Conference on Neural Information Processing Systems (NeurIPS) 2023

  8. arXiv:2309.12032

    cs.LG stat.ML

    Human-in-the-Loop Causal Discovery under Latent Confounding using Ancestral GFlowNets

    Authors: Tiago da Silva, Eliezer Silva, António Góis, Dominik Heider, Samuel Kaski, Diego Mesquita, Adèle Ribeiro

    Abstract: Structure learning is the crux of causal inference. Notably, causal discovery (CD) algorithms are brittle when data is scarce, possibly inferring imprecise causal relations that contradict expert knowledge -- especially when considering latent confounders. To aggravate the issue, most CD methods do not provide uncertainty estimates, making it hard for users to interpret results and improve the inf…

    Submitted 1 November, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

  9. arXiv:2305.07334

    stat.ML cs.LG

    Locking and Quacking: Stacking Bayesian model predictions by log-pooling and superposition

    Authors: Yuling Yao, Luiz Max Carvalho, Diego Mesquita, Yann McLatchie

    Abstract: Combining predictions from different models is a central problem in Bayesian inference and machine learning more broadly. Currently, these predictive distributions are almost exclusively combined using linear mixtures such as Bayesian model averaging, Bayesian stacking, and mixture of experts. Such linear mixtures impose idiosyncrasies that might be undesirable for some applications, such as multi…

    Submitted 12 May, 2023; originally announced May 2023.

    Comments: An earlier version appeared at the NeurIPS 2022 Workshop on Score-Based Methods

  10. arXiv:2303.10139

    cs.LG cs.AI

    Distill n' Explain: explaining graph neural networks using simple surrogates

    Authors: Tamara Pereira, Erik Nascimento, Lucas E. Resck, Diego Mesquita, Amauri Souza

    Abstract: Explaining node predictions in graph neural networks (GNNs) often boils down to finding graph substructures that preserve predictions. Finding these structures usually implies back-propagating through the GNN, bonding the complexity (e.g., number of layers) of the GNN to the cost of explaining it. This naturally begs the question: Can we break this bond by explaining a simpler surrogate GNN? To an…

    Submitted 8 March, 2024; v1 submitted 17 March, 2023; originally announced March 2023.

    Comments: To appear in AISTATS 2023

    Journal ref: PMLR 206 (2023) 6199-6214; AISTATS 2023

  11. arXiv:2209.15059

    cs.LG

    Provably expressive temporal graph networks

    Authors: Amauri H. Souza, Diego Mesquita, Samuel Kaski, Vikas Garg

    Abstract: Temporal graph networks (TGNs) have gained prominence as models for embedding dynamic interactions, but little is known about their theoretical underpinnings. We establish fundamental results about the representational power and limits of the two main categories of TGNs: those that aggregate temporal walks (WA-TGNs), and those that augment local message passing with recurrent memory modules (MP-TG…

    Submitted 29 September, 2022; originally announced September 2022.

    Comments: Accepted to NeurIPS 2022

  12. arXiv:2202.11154

    stat.ML cs.LG stat.ME

    Parallel MCMC Without Embarrassing Failures

    Authors: Daniel Augusto de Souza, Diego Mesquita, Samuel Kaski, Luigi Acerbi

    Abstract: Embarrassingly parallel Markov Chain Monte Carlo (MCMC) exploits parallel computing to scale Bayesian inference to large datasets by using a two-step approach. First, MCMC is run in parallel on (sub)posteriors defined on data partitions. Then, a server combines local results. While efficient, this framework is very sensitive to the quality of subposterior sampling. Common sampling problems such as…

    Submitted 29 March, 2022; v1 submitted 22 February, 2022; originally announced February 2022.

    Comments: To appear in the 25th International Conference on Artificial Intelligence and Statistics (AISTATS 2022). For associated code, see https://github.com/spectraldani/pai/

  13. arXiv:2010.11418

    cs.LG cs.AI cs.CV

    Rethinking pooling in graph neural networks

    Authors: Diego Mesquita, Amauri H. Souza, Samuel Kaski

    Abstract: Graph pooling is a central component of a myriad of graph neural network (GNN) architectures. As an inheritance from traditional CNNs, most approaches formulate graph pooling as a cluster assignment problem, extending the idea of local patches in regular grids to graphs. Despite the wide adherence to this design choice, no work has rigorously evaluated its influence on the success of GNNs. In this…

    Submitted 21 October, 2020; originally announced October 2020.

    Comments: Accepted to NeurIPS 2020

  14. arXiv:2004.11231

    stat.ML cs.LG

    Federated Stochastic Gradient Langevin Dynamics

    Authors: Khaoula El Mekkaoui, Diego Mesquita, Paul Blomstedt, Samuel Kaski

    Abstract: Stochastic gradient MCMC methods, such as stochastic gradient Langevin dynamics (SGLD), employ fast but noisy gradient estimates to enable large-scale posterior sampling. Although we can easily extend SGLD to distributed settings, it suffers from two issues when applied to federated non-IID data. First, the variance of these estimates increases significantly. Second, delaying communication causes…

    Submitted 14 June, 2021; v1 submitted 23 April, 2020; originally announced April 2020.

    Comments: Accepted to UAI 2021

  15. arXiv:1907.01867

    stat.ML cs.LG

    Learning GPLVM with arbitrary kernels using the unscented transformation

    Authors: Daniel Augusto R. M. A. de Souza, Diego Mesquita, César Lincoln C. Mattos, João Paulo P. Gomes

    Abstract: Gaussian Process Latent Variable Model (GPLVM) is a flexible framework to handle uncertain inputs in Gaussian Processes (GPs) and incorporate GPs as components of larger graphical models. Nonetheless, the standard GPLVM variational inference approach is tractable only for a narrow family of kernel functions. The most popular implementations of GPLVM circumvent this limitation using quadrature meth…

    Submitted 10 November, 2020; v1 submitted 3 July, 2019; originally announced July 2019.

    Comments: 10 pages, currently under review

  16. arXiv:1905.00332

    stat.ML cs.LG

    LS-SVR as a Bayesian RBF network

    Authors: Diego P. P. Mesquita, Luis A. Freitas, João P. P. Gomes, César L. C. Mattos

    Abstract: We show theoretical similarities between the Least Squares Support Vector Regression (LS-SVR) model with a Radial Basis Functions (RBF) kernel and maximum a posteriori (MAP) inference on Bayesian RBF networks with a specific Gaussian prior on the regression weights. Although previous works have pointed out similar expressions between those learning approaches, we explicit and formally state the ex…

    Submitted 2 August, 2019; v1 submitted 1 May, 2019; originally announced May 2019.

    Comments: 14 pages, currently under review

  17. arXiv:1903.04556

    cs.LG stat.ML

    Embarrassingly parallel MCMC using deep invertible transformations

    Authors: Diego Mesquita, Paul Blomstedt, Samuel Kaski

    Abstract: While MCMC methods have become a main work-horse for Bayesian inference, scaling them to large distributed datasets is still a challenge. Embarrassingly parallel MCMC strategies take a divide-and-conquer stance to achieve this by writing the target posterior as a product of subposteriors, running MCMC for each of them in parallel and subsequently combining the results. The challenge then lies in d…

    Submitted 15 June, 2021; v1 submitted 11 March, 2019; originally announced March 2019.

    Comments: Accepted to UAI 2019