
Showing 1–4 of 4 results for author: Paulus, M B

  1. arXiv:2206.13414  [pdf, other]

    cs.LG math.OC stat.ML

    Learning To Cut By Looking Ahead: Cutting Plane Selection via Imitation Learning

    Authors: Max B. Paulus, Giulia Zarpellon, Andreas Krause, Laurent Charlin, Chris J. Maddison

    Abstract: Cutting planes are essential for solving mixed-integer linear problems (MILPs), because they facilitate bound improvements on the optimal solution value. For selecting cuts, modern solvers rely on manually designed heuristics that are tuned to gauge the potential effectiveness of cuts. We show that a greedy selection rule explicitly looking ahead to select cuts that yield the best bound improvemen…

    Submitted 27 June, 2022; originally announced June 2022.

    Comments: ICML 2022

  2. arXiv:2110.01515  [pdf, other]

    cs.LG stat.ML

    A Review of the Gumbel-max Trick and its Extensions for Discrete Stochasticity in Machine Learning

    Authors: Iris A. M. Huijben, Wouter Kool, Max B. Paulus, Ruud J. G. van Sloun

    Abstract: The Gumbel-max trick is a method to draw a sample from a categorical distribution, given by its unnormalized (log-)probabilities. Over the past years, the machine learning community has proposed several extensions of this trick to facilitate, e.g., drawing multiple samples, sampling from structured domains, or gradient estimation for error backpropagation in neural network optimization. The goal o…

    Submitted 8 March, 2022; v1 submitted 4 October, 2021; originally announced October 2021.

    Comments: Accepted as a survey article in IEEE TPAMI

  3. arXiv:2010.04838  [pdf, other]

    stat.ML cs.LG

    Rao-Blackwellizing the Straight-Through Gumbel-Softmax Gradient Estimator

    Authors: Max B. Paulus, Chris J. Maddison, Andreas Krause

    Abstract: Gradient estimation in models with discrete latent variables is a challenging problem, because the simplest unbiased estimators tend to have high variance. To counteract this, modern estimators either introduce bias, rely on multiple function evaluations, or use learned, input-dependent baselines. Thus, there is a need for estimators that require minimal tuning, are computationally cheap, and have…

    Submitted 9 October, 2020; originally announced October 2020.

  4. arXiv:2006.08063  [pdf, other]

    stat.ML cs.LG

    Gradient Estimation with Stochastic Softmax Tricks

    Authors: Max B. Paulus, Dami Choi, Daniel Tarlow, Andreas Krause, Chris J. Maddison

    Abstract: The Gumbel-Max trick is the basis of many relaxed gradient estimators. These estimators are easy to implement and low variance, but the goal of scaling them comprehensively to large combinatorial distributions is still outstanding. Working within the perturbation model framework, we introduce stochastic softmax tricks, which generalize the Gumbel-Softmax trick to combinatorial spaces. Our framewor…

    Submitted 28 February, 2021; v1 submitted 14 June, 2020; originally announced June 2020.

    Comments: NeurIPS 2020, final copy
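
Note on the Gumbel-max trick: the abstract of entry 2 (arXiv:2110.01515) describes it as a way to draw a sample from a categorical distribution given by its unnormalized (log-)probabilities. The block below is a minimal sketch of that basic trick only, not code from any of the listed papers. It adds independent standard Gumbel noise to each log-probability and returns the index of the largest perturbed value. The function names `sample_gumbel` and `gumbel_max_sample` are hypothetical, chosen for illustration, and only the Python standard library is assumed.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one standard Gumbel(0, 1) variate via inverse transform sampling."""
    u = rng.random()
    # random() returns values in [0, 1); reject 0 so that log(u) is defined.
    while u == 0.0:
        u = rng.random()
    return -math.log(-math.log(u))


def gumbel_max_sample(logits, rng):
    """Return an index drawn with probability proportional to exp(logits[i]).

    `logits` are unnormalized log-probabilities; no normalization is needed
    because a shared additive constant does not change the argmax.
    """
    perturbed = [logit + sample_gumbel(rng) for logit in logits]
    return max(range(len(perturbed)), key=perturbed.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    # Unnormalized weights 1 : 2 : 7, i.e. target probabilities 0.1, 0.2, 0.7.
    logits = [math.log(1.0), math.log(2.0), math.log(7.0)]
    n = 20000
    counts = [0] * len(logits)
    for _ in range(n):
        counts[gumbel_max_sample(logits, rng)] += 1
    print([c / n for c in counts])
```

With enough draws, the printed frequencies should approach the normalized probabilities of the three categories. The extensions mentioned in the abstracts (multiple samples, structured domains, relaxed gradient estimators) are not covered by this sketch.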