
Showing 1–50 of 134 results for author: Welling, M

Searching in archive stat.
  1. arXiv:2410.13821

    cs.LG cs.AI stat.ML

    Artificial Kuramoto Oscillatory Neurons

    Authors: Takeru Miyato, Sindy Löwe, Andreas Geiger, Max Welling

    Abstract: It has long been known in both neuroscience and AI that "binding" between neurons leads to a form of competitive learning where representations are compressed in order to represent more abstract concepts in deeper layers of the network. More recently, it was also hypothesized that dynamic (spatiotemporal) representations play an important role in both neuroscience and AI. Building on these ideas…

    Submitted 16 May, 2025; v1 submitted 17 October, 2024; originally announced October 2024.

    Comments: Accepted for Oral presentation at ICLR2025

  2. arXiv:2410.02667

    cs.LG hep-th stat.ML

    GUD: Generation with Unified Diffusion

    Authors: Mathis Gerdes, Max Welling, Miranda C. N. Cheng

    Abstract: Diffusion generative models transform noise into data by inverting a process that progressively adds noise to data samples. Inspired by concepts from the renormalization group in physics, which analyzes systems across different scales, we revisit diffusion models by exploring three key design aspects: 1) the choice of representation in which the diffusion process operates (e.g. pixel-, PCA-, Fouri…

    Submitted 3 October, 2024; originally announced October 2024.

    Comments: 11 pages, 8 figures

  3. arXiv:2406.04843

    cs.LG stat.ML

    Variational Flow Matching for Graph Generation

    Authors: Floor Eijkelboom, Grigory Bartosh, Christian Andersson Naesseth, Max Welling, Jan-Willem van de Meent

    Abstract: We present a formulation of flow matching as variational inference, which we refer to as variational flow matching (VFM). Based on this formulation we develop CatFlow, a flow matching method for categorical data. CatFlow is easy to implement, computationally efficient, and achieves strong results on graph generation tasks. In VFM, the objective is to approximate the posterior probability path, whi…

    Submitted 7 June, 2024; originally announced June 2024.

  4. arXiv:2402.00809

    cs.LG stat.ML

    Position: Bayesian Deep Learning is Needed in the Age of Large-Scale AI

    Authors: Theodore Papamarkou, Maria Skoularidou, Konstantina Palla, Laurence Aitchison, Julyan Arbel, David Dunson, Maurizio Filippone, Vincent Fortuin, Philipp Hennig, José Miguel Hernández-Lobato, Aliaksandr Hubin, Alexander Immer, Theofanis Karaletsos, Mohammad Emtiyaz Khan, Agustinus Kristiadi, Yingzhen Li, Stephan Mandt, Christopher Nemeth, Michael A. Osborne, Tim G. J. Rudner, David Rügamer, Yee Whye Teh, Max Welling, Andrew Gordon Wilson, Ruqi Zhang

    Abstract: In the current landscape of deep learning research, there is a predominant emphasis on achieving high predictive accuracy in supervised tasks involving large image and language datasets. However, a broader perspective reveals a multitude of overlooked metrics, tasks, and data types, such as uncertainty, active and continual learning, and scientific data, that demand attention. Bayesian deep learni…

    Submitted 6 August, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024

  5. arXiv:2310.10375

    cs.CV cs.AI cs.LG stat.ML

    GTA: A Geometry-Aware Attention Mechanism for Multi-View Transformers

    Authors: Takeru Miyato, Bernhard Jaeger, Max Welling, Andreas Geiger

    Abstract: As transformers are equivariant to the permutation of input tokens, encoding the positional information of tokens is necessary for many tasks. However, since existing positional encoding schemes have been initially designed for NLP tasks, their suitability for vision tasks, which typically exhibit different structural properties in their data, is questionable. We argue that existing positional enc…

    Submitted 7 June, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: Published as a conference paper at ICLR 2024

  6. arXiv:2203.17003

    cs.LG q-bio.QM stat.ML

    Equivariant Diffusion for Molecule Generation in 3D

    Authors: Emiel Hoogeboom, Victor Garcia Satorras, Clément Vignac, Max Welling

    Abstract: This work introduces a diffusion model for molecule generation in 3D that is equivariant to Euclidean transformations. Our E(3) Equivariant Diffusion Model (EDM) learns to denoise a diffusion process with an equivariant network that jointly operates on both continuous (atom coordinates) and categorical features (atom types). In addition, we provide a probabilistic analysis which admits likelihood…

    Submitted 16 June, 2022; v1 submitted 31 March, 2022; originally announced March 2022.

    Comments: Accepted at International Conference on Machine Learning (ICML) 2022

  7. arXiv:2111.13772

    cs.LG stat.ML

    Particle Dynamics for Learning EBMs

    Authors: Kirill Neklyudov, Priyank Jaini, Max Welling

    Abstract: Energy-based modeling is a promising approach to unsupervised learning, which yields many downstream applications from a single model. The main difficulty in learning energy-based models with the "contrastive approaches" is the generation of samples from the current energy function at each iteration. Many advances have been made to accomplish this subroutine cheaply. Nevertheless, all such samplin…

    Submitted 26 November, 2021; originally announced November 2021.

  8. arXiv:2111.10192

    cs.LG stat.ML

    An Expectation-Maximization Perspective on Federated Learning

    Authors: Christos Louizos, Matthias Reisser, Joseph Soriaga, Max Welling

    Abstract: Federated learning describes the distributed training of models across multiple clients while keeping the data private on-device. In this work, we view the server-orchestrated federated learning process as a hierarchical latent variable model where the server provides the parameters of a prior distribution over the client-specific model parameters. We show that with simple Gaussian priors and a ha…

    Submitted 19 November, 2021; originally announced November 2021.

  9. arXiv:2110.02905

    cs.LG cs.AI stat.ML

    Geometric and Physical Quantities Improve E(3) Equivariant Message Passing

    Authors: Johannes Brandstetter, Rob Hesselink, Elise van der Pol, Erik J Bekkers, Max Welling

    Abstract: Including covariant information, such as position, force, velocity or spin is important in many tasks in computational physics and chemistry. We introduce Steerable E(3) Equivariant Graph Neural Networks (SEGNNs) that generalise equivariant graph networks, such that node and edge attributes are not restricted to invariant scalars, but can contain covariant information, such as vectors or tensors.…

    Submitted 26 March, 2022; v1 submitted 6 October, 2021; originally announced October 2021.

    Comments: Published at ICLR 2022 (Spotlight paper), Github: https://github.com/RobDHess/Steerable-E3-GNN

  10. arXiv:2109.12561

    eess.SP cs.IT cs.LG stat.ML

    Neural Augmentation of Kalman Filter with Hypernetwork for Channel Tracking

    Authors: Kumar Pratik, Rana Ali Amjad, Arash Behboodi, Joseph B. Soriaga, Max Welling

    Abstract: We propose Hypernetwork Kalman Filter (HKF) for tracking applications with multiple different dynamics. The HKF combines generalization power of Kalman filters with expressive power of neural networks. Instead of keeping a bank of Kalman filters and choosing one based on approximating the actual dynamics, HKF adapts itself to each dynamics based on the observed sequence. Through extensive experime…

    Submitted 26 September, 2021; originally announced September 2021.

    Comments: Accepted at IEEE Globecom 2021. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

  11. arXiv:2106.10188

    stat.CO cs.LG

    Deterministic Gibbs Sampling via Ordinary Differential Equations

    Authors: Kirill Neklyudov, Roberto Bondesan, Max Welling

    Abstract: Deterministic dynamics is an essential part of many MCMC algorithms, e.g. Hybrid Monte Carlo or samplers utilizing normalizing flows. This paper presents a general construction of deterministic measure-preserving dynamics using autonomous ODEs and tools from differential geometry. We show how Hybrid Monte Carlo and other deterministic samplers follow as special cases of our theory. We then demonst…

    Submitted 18 June, 2021; originally announced June 2021.

  12. arXiv:2106.07832

    cs.LG cs.AI stat.ML

    Learning Equivariant Energy Based Models with Equivariant Stein Variational Gradient Descent

    Authors: Priyank Jaini, Lars Holdijk, Max Welling

    Abstract: We focus on the problem of efficient sampling and learning of probability densities by incorporating symmetries in probabilistic models. We first introduce Equivariant Stein Variational Gradient Descent algorithm -- an equivariant sampling method based on Stein's identity for sampling from densities with symmetries. Equivariant SVGD explicitly incorporates symmetry information in a density through…

    Submitted 29 July, 2021; v1 submitted 14 June, 2021; originally announced June 2021.

  13. arXiv:2106.06020

    cs.LG cs.CG cs.CV stat.ML

    Coordinate Independent Convolutional Networks -- Isometry and Gauge Equivariant Convolutions on Riemannian Manifolds

    Authors: Maurice Weiler, Patrick Forré, Erik Verlinde, Max Welling

    Abstract: Motivated by the vast success of deep convolutional networks, there is a great interest in generalizing convolutions to non-Euclidean manifolds. A major complication in comparison to flat spaces is that it is unclear in which alignment a convolution kernel should be applied on a manifold. The underlying reason for this ambiguity is that general manifolds do not come with a canonical choice of refe…

    Submitted 10 June, 2021; originally announced June 2021.

    Comments: The implementation of orientation independent Möbius convolutions is publicly available at https://github.com/mauriceweiler/MobiusCNNs

  14. arXiv:2105.09016

    cs.LG physics.chem-ph stat.ML

    E(n) Equivariant Normalizing Flows

    Authors: Victor Garcia Satorras, Emiel Hoogeboom, Fabian B. Fuchs, Ingmar Posner, Max Welling

    Abstract: This paper introduces a generative model equivariant to Euclidean symmetries: E(n) Equivariant Normalizing Flows (E-NFs). To construct E-NFs, we take the discriminative E(n) graph neural networks and integrate them as a differential equation to obtain an invertible equivariant function: a continuous-time normalizing flow. We demonstrate that E-NFs considerably outperform baselines and existing met…

    Submitted 14 January, 2022; v1 submitted 19 May, 2021; originally announced May 2021.

    Comments: Accepted at Neural Information Processing Systems (NeurIPS 2021)

  15. arXiv:2104.09459

    cs.LG math.DS stat.ML

    A Practical Method for Constructing Equivariant Multilayer Perceptrons for Arbitrary Matrix Groups

    Authors: Marc Finzi, Max Welling, Andrew Gordon Wilson

    Abstract: Symmetries and equivariance are fundamental to the generalization of neural networks on domains such as images, graphs, and point clouds. Existing work has primarily focused on a small number of groups, such as the translation, rotation, and permutation groups. In this work we provide a completely general algorithm for solving for the equivariant layers of matrix groups. In addition to recovering…

    Submitted 19 April, 2021; originally announced April 2021.

    Comments: Library: https://github.com/mfinzi/equivariant-MLP, Documentation: https://emlp.readthedocs.io/en/latest/, Examples: https://colab.research.google.com/github/mfinzi/equivariant-MLP/blob/master/docs/notebooks/colabs/all.ipynb

  16. arXiv:2103.06701

    cs.CR cs.LG stat.ML

    Diagnosing Vulnerability of Variational Auto-Encoders to Adversarial Attacks

    Authors: Anna Kuzina, Max Welling, Jakub M. Tomczak

    Abstract: In this work, we explore adversarial attacks on the Variational Autoencoders (VAE). We show how to modify data point to obtain a prescribed latent code (supervised attack) or just get a drastically different code (unsupervised attack). We examine the influence of model modifications ($β$-VAE, NVAE) on the robustness of VAEs and suggest metrics to quantify it.

    Submitted 6 May, 2021; v1 submitted 10 March, 2021; originally announced March 2021.

  17. arXiv:2103.04786

    stat.ML cs.AI cs.LG stat.ME

    Combining Interventional and Observational Data Using Causal Reductions

    Authors: Maximilian Ilse, Patrick Forré, Max Welling, Joris M. Mooij

    Abstract: Unobserved confounding is one of the main challenges when estimating causal effects. We propose a causal reduction method that, given a causal model, replaces an arbitrary number of possibly high-dimensional latent confounders with a single latent confounder that takes values in the same space as the treatment variable, without changing the observational and interventional distributions the causal…

    Submitted 22 February, 2023; v1 submitted 8 March, 2021; originally announced March 2021.

  18. arXiv:2102.13382

    stat.ML cs.LG

    Batch Bayesian Optimization on Permutations using the Acquisition Weighted Kernel

    Authors: Changyong Oh, Roberto Bondesan, Efstratios Gavves, Max Welling

    Abstract: In this work we propose a batch Bayesian optimization method for combinatorial problems on permutations, which is well suited for expensive-to-evaluate objectives. We first introduce LAW, an efficient batch acquisition method based on determinantal point processes using the acquisition weighted kernel. Relying on multiple parallel evaluations, LAW enables accelerated search on combinatorial spaces…

    Submitted 25 January, 2023; v1 submitted 26 February, 2021; originally announced February 2021.

    Comments: NeurIPS 2022

  19. arXiv:2102.12792

    stat.ML cs.LG

    Mixed Variable Bayesian Optimization with Frequency Modulated Kernels

    Authors: Changyong Oh, Efstratios Gavves, Max Welling

    Abstract: The sample efficiency of Bayesian optimization (BO) is often boosted by Gaussian Process (GP) surrogate models. However, on mixed variable spaces, surrogate models other than GPs are prevalent, mainly due to the lack of kernels which can model complex dependencies across different types of variables. In this paper, we propose the frequency modulated (FM) kernel flexibly modeling dependencies among d…

    Submitted 18 July, 2022; v1 submitted 25 February, 2021; originally announced February 2021.

    Comments: Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, PMLR 161:950-960, 2021

  20. arXiv:2102.11756

    cs.LG stat.ML

    Deep Policy Dynamic Programming for Vehicle Routing Problems

    Authors: Wouter Kool, Herke van Hoof, Joaquim Gromicho, Max Welling

    Abstract: Routing problems are a class of combinatorial problems with many practical applications. Recently, end-to-end deep learning methods have been proposed to learn approximate solution heuristics for such problems. In contrast, classical dynamic programming (DP) algorithms guarantee optimal solutions, but scale badly with the problem size. We propose Deep Policy Dynamic Programming (DPDP), which aims…

    Submitted 2 December, 2021; v1 submitted 23 February, 2021; originally announced February 2021.

    Comments: 21 pages

  21. arXiv:2102.09844

    cs.LG stat.ML

    E(n) Equivariant Graph Neural Networks

    Authors: Victor Garcia Satorras, Emiel Hoogeboom, Max Welling

    Abstract: This paper introduces a new model to learn graph neural networks equivariant to rotations, translations, reflections and permutations called E(n)-Equivariant Graph Neural Networks (EGNNs). In contrast with existing methods, our work does not require computationally expensive higher-order representations in intermediate layers while it still achieves competitive or better performance. In addition,…

    Submitted 16 February, 2022; v1 submitted 19 February, 2021; originally announced February 2021.

  22. arXiv:2102.05379

    stat.ML cs.CL cs.LG

    Argmax Flows and Multinomial Diffusion: Learning Categorical Distributions

    Authors: Emiel Hoogeboom, Didrik Nielsen, Priyank Jaini, Patrick Forré, Max Welling

    Abstract: Generative flows and diffusion models have been predominantly trained on ordinal data, for example natural images. This paper introduces two extensions of flows and diffusion for categorical data such as language or image segmentation: Argmax Flows and Multinomial Diffusion. Argmax Flows are defined by a composition of a continuous distribution (such as a normalizing flow), and an argmax function.…

    Submitted 22 October, 2021; v1 submitted 10 February, 2021; originally announced February 2021.

    Comments: Accepted at Neural Information Processing Systems (NeurIPS 2021)

  23. arXiv:2011.07248

    cs.LG cs.NE stat.ML

    Self Normalizing Flows

    Authors: T. Anderson Keller, Jorn W. T. Peters, Priyank Jaini, Emiel Hoogeboom, Patrick Forré, Max Welling

    Abstract: Efficient gradient computation of the Jacobian determinant term is a core problem in many machine learning settings, and especially so in the normalizing flow framework. Most proposed flow models therefore either restrict to a function class with easy evaluation of the Jacobian determinant, or an efficient estimator thereof. However, these restrictions limit the performance of such density models,…

    Submitted 9 June, 2021; v1 submitted 14 November, 2020; originally announced November 2020.

  24. arXiv:2010.08047

    cs.LG stat.CO

    Orbital MCMC

    Authors: Kirill Neklyudov, Max Welling

    Abstract: Markov Chain Monte Carlo (MCMC) algorithms ubiquitously employ complex deterministic transformations to generate proposal points that are then filtered by the Metropolis-Hastings-Green (MHG) test. However, the condition of the target measure invariance puts restrictions on the design of these transformations. In this paper, we first derive the acceptance test for the stochastic Markov kernel consi…

    Submitted 7 June, 2021; v1 submitted 15 October, 2020; originally announced October 2020.

  25. arXiv:2007.08349

    cs.LG cs.CV stat.ML

    Natural Graph Networks

    Authors: Pim de Haan, Taco Cohen, Max Welling

    Abstract: A key requirement for graph neural networks is that they must process a graph in a way that does not depend on how the graph is described. Traditionally this has been taken to mean that a graph network must be equivariant to node permutations. Here we show that instead of equivariance, the more general concept of naturality is sufficient for a graph network to be well-defined, opening up a larger…

    Submitted 23 November, 2020; v1 submitted 16 July, 2020; originally announced July 2020.

    Comments: Published at NeurIPS 2020

  26. arXiv:2007.04618

    cs.LG stat.ML

    Federated Learning of User Authentication Models

    Authors: Hossein Hosseini, Sungrack Yun, Hyunsin Park, Christos Louizos, Joseph Soriaga, Max Welling

    Abstract: Machine learning-based User Authentication (UA) models have been widely deployed in smart devices. UA models are trained to map input data of different users to highly separable embedding vectors, which are then used to accept or reject new inputs at test time. Training UA models requires having direct access to the raw inputs and embedding vectors of users, both of which are privacy-sensitive inf…

    Submitted 9 July, 2020; originally announced July 2020.

  27. arXiv:2007.02731

    cs.LG stat.ML

    SurVAE Flows: Surjections to Bridge the Gap between VAEs and Flows

    Authors: Didrik Nielsen, Priyank Jaini, Emiel Hoogeboom, Ole Winther, Max Welling

    Abstract: Normalizing flows and variational autoencoders are powerful generative models that can represent complicated density functions. However, they both impose constraints on the models: Normalizing flows use bijective transformations to model densities whereas VAEs learn stochastic transformations that are non-invertible and thus typically do not provide tractable estimates of the marginal likelihood.…

    Submitted 30 October, 2020; v1 submitted 6 July, 2020; originally announced July 2020.

  28. RE-MIMO: Recurrent and Permutation Equivariant Neural MIMO Detection

    Authors: Kumar Pratik, Bhaskar D. Rao, Max Welling

    Abstract: In this paper, we present a novel neural network for MIMO symbol detection. It is motivated by several important considerations in wireless communication systems; permutation equivariance and a variable number of users. The neural detector learns an iterative decoding algorithm that is implemented as a stack of iterative units. Each iterative unit is a neural computation module comprising of 3 sub…

    Submitted 23 January, 2021; v1 submitted 30 June, 2020; originally announced July 2020.

    Comments: copyright 2020 IEEE TSP. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

  29. arXiv:2006.16908

    cs.LG stat.ML

    MDP Homomorphic Networks: Group Symmetries in Reinforcement Learning

    Authors: Elise van der Pol, Daniel E. Worrall, Herke van Hoof, Frans A. Oliehoek, Max Welling

    Abstract: This paper introduces MDP homomorphic networks for deep reinforcement learning. MDP homomorphic networks are neural networks that are equivariant under symmetries in the joint state-action space of an MDP. Current approaches to deep reinforcement learning do not usually exploit knowledge about such structure. By building this prior knowledge into policy and value networks using an equivariance con…

    Submitted 20 January, 2021; v1 submitted 30 June, 2020; originally announced June 2020.

  30. arXiv:2006.16653

    cs.LG stat.CO stat.ME stat.ML

    Involutive MCMC: a Unifying Framework

    Authors: Kirill Neklyudov, Max Welling, Evgenii Egorov, Dmitry Vetrov

    Abstract: Markov Chain Monte Carlo (MCMC) is a computational approach to fundamental problems such as inference, integration, optimization, and simulation. The field has developed a broad spectrum of algorithms, varying in the way they are motivated, the way they are applied and how efficiently they sample. Despite all the differences, many of them share the same core principle, which we unify as the Involu…

    Submitted 30 June, 2020; originally announced June 2020.

  31. arXiv:2006.10833

    cs.LG stat.ML

    Amortized Causal Discovery: Learning to Infer Causal Graphs from Time-Series Data

    Authors: Sindy Löwe, David Madras, Richard Zemel, Max Welling

    Abstract: On time-series data, most causal discovery methods fit a new model whenever they encounter samples from a new underlying causal graph. However, these samples often share relevant information which is lost when following this approach. Specifically, different samples may share the dynamics which describe the effects of their causal relations. We propose Amortized Causal Discovery, a novel framework…

    Submitted 21 February, 2022; v1 submitted 18 June, 2020; originally announced June 2020.

    Comments: Accepted as a conference paper at CLeaR 2022

  32. arXiv:2006.10503

    cs.LG stat.ML

    SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks

    Authors: Fabian B. Fuchs, Daniel E. Worrall, Volker Fischer, Max Welling

    Abstract: We introduce the SE(3)-Transformer, a variant of the self-attention module for 3D point clouds and graphs, which is equivariant under continuous 3D roto-translations. Equivariance is important to ensure stable and predictable performance in the presence of nuisance transformations of the data input. A positive corollary of equivariance is increased weight-tying within the model. The SE(3)-Transfor…

    Submitted 24 November, 2020; v1 submitted 18 June, 2020; originally announced June 2020.

  33. arXiv:2006.01910

    cs.LG cs.CV stat.ML

    The Convolution Exponential and Generalized Sylvester Flows

    Authors: Emiel Hoogeboom, Victor Garcia Satorras, Jakub M. Tomczak, Max Welling

    Abstract: This paper introduces a new method to build linear flows, by taking the exponential of a linear transformation. This linear transformation does not need to be invertible itself, and the exponential has the following desirable properties: it is guaranteed to be invertible, its inverse is straightforward to compute and the log Jacobian determinant is equal to the trace of the linear transformation.…

    Submitted 26 October, 2020; v1 submitted 2 June, 2020; originally announced June 2020.

    Comments: Accepted to Neural Information Processing Systems (NeurIPS) 2020

  34. arXiv:2005.07093

    cs.LG cs.CV stat.ML

    Bayesian Bits: Unifying Quantization and Pruning

    Authors: Mart van Baalen, Christos Louizos, Markus Nagel, Rana Ali Amjad, Ying Wang, Tijmen Blankevoort, Max Welling

    Abstract: We introduce Bayesian Bits, a practical method for joint mixed precision quantization and pruning through gradient based optimization. Bayesian Bits employs a novel decomposition of the quantization operation, which sequentially considers doubling the bit width. At each new bit width, the residual error between the full precision value and the previously rounded value is quantized. We then decide…

    Submitted 27 October, 2020; v1 submitted 14 May, 2020; originally announced May 2020.

  35. arXiv:2004.09691

    cs.LG eess.IV stat.ML

    A Data and Compute Efficient Design for Limited-Resources Deep Learning

    Authors: Mirgahney Mohamed, Gabriele Cesa, Taco S. Cohen, Max Welling

    Abstract: Thanks to their improved data efficiency, equivariant neural networks have gained increased interest in the deep learning community. They have been successfully applied in the medical domain where symmetries in the data can be effectively exploited to build more accurate and robust models. To be able to reach a much larger body of patients, mobile, on-device implementations of deep learning soluti…

    Submitted 8 July, 2020; v1 submitted 20 April, 2020; originally announced April 2020.

    Comments: Accepted for poster presentation at the Practical Machine Learning for Developing Countries (PML4DC) workshop, ICLR 2020

  36. arXiv:2003.05425

    cs.LG cs.CV stat.ML

    Gauge Equivariant Mesh CNNs: Anisotropic convolutions on geometric graphs

    Authors: Pim de Haan, Maurice Weiler, Taco Cohen, Max Welling

    Abstract: A common approach to define convolutions on meshes is to interpret them as a graph and apply graph convolutional networks (GCNs). Such GCNs utilize isotropic kernels and are therefore insensitive to the relative orientation of vertices and thus to the geometry of the mesh as a whole. We propose Gauge Equivariant Mesh CNNs which generalize GCNs to apply anisotropic gauge equivariant kernels. Since…

    Submitted 19 November, 2021; v1 submitted 11 March, 2020; originally announced March 2020.

    Comments: Published at ICLR 2021

  37. arXiv:2003.01998

    cs.LG stat.ML

    Neural Enhanced Belief Propagation on Factor Graphs

    Authors: Victor Garcia Satorras, Max Welling

    Abstract: A graphical model is a structured representation of locally dependent random variables. A traditional method to reason over these random variables is to perform inference using belief propagation. When provided with the true data generating process, belief propagation can infer the optimal posterior probability estimates in tree structured factor graphs. However, in many cases we may only have acc…

    Submitted 16 March, 2021; v1 submitted 4 March, 2020; originally announced March 2020.

  38. arXiv:2002.11963

    cs.LG stat.ML

    Plannable Approximations to MDP Homomorphisms: Equivariance under Actions

    Authors: Elise van der Pol, Thomas Kipf, Frans A. Oliehoek, Max Welling

    Abstract: This work exploits action equivariance for representation learning in reinforcement learning. Equivariance under actions states that transitions in the input space are mirrored by equivalent transitions in latent space, while the map and transition functions should also commute. We introduce a contrastive loss function that enforces action equivariance on the learned representations. We prove that…

    Submitted 27 February, 2020; originally announced February 2020.

    Comments: To appear in Proceedings of the International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS 2020)

  39. arXiv:2002.07520

    cs.LG stat.ML

    Gradient $\ell_1$ Regularization for Quantization Robustness

    Authors: Milad Alizadeh, Arash Behboodi, Mart van Baalen, Christos Louizos, Tijmen Blankevoort, Max Welling

    Abstract: We analyze the effect of quantizing weights and activations of neural networks on their loss and derive a simple regularization scheme that improves robustness against post-training quantization. By training quantization-ready networks, our approach enables storing a single set of weights that can be quantized on-demand to different bit-widths as energy and memory requirements of the application c…

    Submitted 18 February, 2020; originally announced February 2020.

    Comments: ICLR 2020

  40. arXiv:2002.06043

    cs.LG stat.ML

    Estimating Gradients for Discrete Random Variables by Sampling without Replacement

    Authors: Wouter Kool, Herke van Hoof, Max Welling

    Abstract: We derive an unbiased estimator for expectations over discrete random variables based on sampling without replacement, which reduces variance as it avoids duplicate samples. We show that our estimator can be derived as the Rao-Blackwellization of three different estimators. Combining our estimator with REINFORCE, we obtain a policy gradient estimator and we reduce its variance using a built-in con…

    Submitted 14 February, 2020; originally announced February 2020.

    Comments: ICLR 2020

  41. arXiv:2002.05582

    cs.LG stat.ML

    Learning to Predict Error for MRI Reconstruction

    Authors: Shi Hu, Nicola Pezzotti, Max Welling

    Abstract: In healthcare applications, predictive uncertainty has been used to assess predictive accuracy. In this paper, we demonstrate that predictive uncertainty estimated by the current methods does not highly correlate with prediction error by decomposing the latter into random and systematic errors, and showing that the former is equivalent to the variance of the random error. In addition, we observe t…

    Submitted 6 July, 2021; v1 submitted 13 February, 2020; originally announced February 2020.

    Comments: Accepted to MICCAI 2021

  42. arXiv:1912.09802  [pdf, other]

    cs.LG cs.CV stat.ML

    Taxonomy and Evaluation of Structured Compression of Convolutional Neural Networks

    Authors: Andrey Kuzmin, Markus Nagel, Saurabh Pitre, Sandeep Pendyam, Tijmen Blankevoort, Max Welling

    Abstract: The success of deep neural networks in many real-world applications is leading to new challenges in building more efficient architectures. One effective way of making networks more efficient is neural network compression. We provide an overview of existing neural network compression methods that can be used to make neural networks more efficient by changing the architecture of the network. First,…

    Submitted 20 December, 2019; originally announced December 2019.

  43. arXiv:1912.00042  [pdf, other]

    cs.LG cs.CV stat.ML

    Learning Likelihoods with Conditional Normalizing Flows

    Authors: Christina Winkler, Daniel Worrall, Emiel Hoogeboom, Max Welling

    Abstract: Normalizing Flows (NFs) are able to model complicated distributions p(y) with strong inter-dimensional correlations and high multimodality by transforming a simple base density p(z) through an invertible neural network under the change of variables formula. Such behavior is desirable in multivariate structured prediction tasks, where handcrafted per-pixel loss-based methods inadequately capture st…

    Submitted 12 November, 2023; v1 submitted 29 November, 2019; originally announced December 2019.

    Comments: 18 pages, 8 Tables, 9 Figures, Preprint

  44. arXiv:1911.12247  [pdf, other]

    stat.ML cs.AI cs.LG

    Contrastive Learning of Structured World Models

    Authors: Thomas Kipf, Elise van der Pol, Max Welling

    Abstract: A structured understanding of our world in terms of objects, relations, and hierarchies is an important component of human cognition. Learning such a structured world model from raw sensory data remains a challenge. As a step towards this goal, we introduce Contrastively-trained Structured World Models (C-SWMs). C-SWMs utilize a contrastive approach for representation learning in environments with…

    Submitted 5 January, 2020; v1 submitted 27 November, 2019; originally announced November 2019.

    Comments: ICLR 2020

  45. arXiv:1911.10914  [pdf, other]

    cs.LG stat.ML

    Invert to Learn to Invert

    Authors: Patrick Putzky, Max Welling

    Abstract: Iterative learning to infer approaches have become popular solvers for inverse problems. However, their memory requirements during training grow linearly with model depth, limiting in practice model expressiveness. In this work, we propose an iterative inverse model with constant memory that relies on invertible networks to avoid storing intermediate activations. As a result, the proposed approach…

    Submitted 25 November, 2019; originally announced November 2019.

  46. arXiv:1910.06924  [pdf, other]

    cs.LG stat.ML

    DP-MAC: The Differentially Private Method of Auxiliary Coordinates for Deep Learning

    Authors: Frederik Harder, Jonas Köhler, Max Welling, Mijung Park

    Abstract: Developing a differentially private deep learning algorithm is challenging, due to the difficulty in analyzing the sensitivity of objective functions that are typically used to train deep neural networks. Many existing methods resort to the stochastic gradient descent algorithm and apply a pre-defined sensitivity to the gradients for privatizing weights. However, their slow convergence typically y…

    Submitted 15 October, 2019; originally announced October 2019.

  47. arXiv:1907.09557  [pdf, other]

    cs.LG stat.ML

    Relational Generalized Few-Shot Learning

    Authors: Xiahan Shi, Leonard Salewski, Martin Schiegg, Zeynep Akata, Max Welling

    Abstract: Transferring learned models to novel tasks is a challenging problem, particularly if only very few labeled examples are available. Although this few-shot learning setup has received a lot of attention recently, most proposed methods focus on discriminating novel classes only. Instead, we consider the extended setup of generalized few-shot learning (GFSL), where the model is required to perform cla…

    Submitted 15 September, 2020; v1 submitted 22 July, 2019; originally announced July 2019.

  48. arXiv:1907.06627  [pdf, other]

    cs.LG cs.CV stat.ML

    Batch-Shaping for Learning Conditional Channel Gated Networks

    Authors: Babak Ehteshami Bejnordi, Tijmen Blankevoort, Max Welling

    Abstract: We present a method that trains large capacity neural networks with significantly improved accuracy and lower dynamic computational cost. We achieve this by gating the deep-learning architecture on a fine-grained-level. Individual convolutional maps are turned on/off conditionally on features in the network. To achieve this, we introduce a new residual block architecture that gates convolutional c…

    Submitted 3 April, 2020; v1 submitted 15 July, 2019; originally announced July 2019.

    Comments: Published as a conference paper at ICLR 2020

  49. arXiv:1907.01949  [pdf, other]

    cs.LG cs.CV stat.ML

    Supervised Uncertainty Quantification for Segmentation with Multiple Annotations

    Authors: Shi Hu, Daniel Worrall, Stefan Knegt, Bas Veeling, Henkjan Huisman, Max Welling

    Abstract: The accurate estimation of predictive uncertainty carries importance in medical scenarios such as lung node segmentation. Unfortunately, most existing works on predictive uncertainty do not return calibrated uncertainty estimates, which could be used in practice. In this work we exploit multi-grader annotation variability as a source of 'groundtruth' aleatoric uncertainty, which can be treated as…

    Submitted 27 May, 2022; v1 submitted 3 July, 2019; originally announced July 2019.

    Comments: MICCAI 2019. Fixed a few typos

  50. arXiv:1906.08324  [pdf, ps, other]

    cs.LG stat.ML

    The Functional Neural Process

    Authors: Christos Louizos, Xiahan Shi, Klamer Schutte, Max Welling

    Abstract: We present a new family of exchangeable stochastic processes, the Functional Neural Processes (FNPs). FNPs model distributions over functions by learning a graph of dependencies on top of latent representations of the points in the given dataset. In doing so, they define a Bayesian model without explicitly positing a prior distribution over latent global parameters; they instead adopt priors over…

    Submitted 4 November, 2019; v1 submitted 19 June, 2019; originally announced June 2019.

    Comments: Published as a conference paper at NeurIPS 2019