Showing 1–17 of 17 results for author: Dillon, J V

  1. arXiv:2308.00679  [pdf, other]

    math.NA math.OC stat.CO

    Sharp Taylor Polynomial Enclosures in One Dimension

    Authors: Matthew Streeter, Joshua V. Dillon

    Abstract: It is often useful to have polynomial upper or lower bounds on a one-dimensional function that are valid over a finite interval, called a trust region. A classical way to produce polynomial bounds of degree $k$ involves bounding the range of the $k$th derivative over the trust region, but this produces suboptimal bounds. We improve on this by deriving sharp polynomial upper and lower bounds for a…

    Submitted 1 August, 2023; originally announced August 2023.

    Comments: 28 pages, 5 figures

  2. arXiv:2212.11429  [pdf, other]

    cs.LG math.OC stat.CO

    Automatically Bounding the Taylor Remainder Series: Tighter Bounds and New Applications

    Authors: Matthew Streeter, Joshua V. Dillon

    Abstract: We present a new algorithm for automatically bounding the Taylor remainder series. In the special case of a scalar function $f: \mathbb{R} \to \mathbb{R}$, our algorithm takes as input a reference point $x_0$, trust region $[a, b]$, and integer $k \ge 1$, and returns an interval $I$ such that $f(x) - \sum_{i=0}^{k-1} \frac {1} {i!} f^{(i)}(x_0) (x - x_0)^i \in I (x - x_0)^k$ for all…

    Submitted 2 August, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

    Comments: Previous version has been split into 3 articles: arXiv:2308.00679, arXiv:2308.00190, and this article

  3. arXiv:2211.09981  [pdf, other]

    cs.LG cs.AI stat.ML

    Weighted Ensemble Self-Supervised Learning

    Authors: Yangjun Ruan, Saurabh Singh, Warren Morningstar, Alexander A. Alemi, Sergey Ioffe, Ian Fischer, Joshua V. Dillon

    Abstract: Ensembling has proven to be a powerful technique for boosting model performance, uncertainty estimation, and robustness in supervised learning. Advances in self-supervised learning (SSL) enable leveraging large unlabeled corpora for state-of-the-art few-shot and supervised learning performance. In this paper, we explore how ensemble methods can improve recent SSL techniques by developing a framewo…

    Submitted 9 April, 2023; v1 submitted 17 November, 2022; originally announced November 2022.

    Comments: Accepted by ICLR 2023

  4. arXiv:2011.08711  [pdf, other]

    stat.ML cs.LG

    VIB is Half Bayes

    Authors: Alexander A Alemi, Warren R Morningstar, Ben Poole, Ian Fischer, Joshua V Dillon

    Abstract: In discriminative settings such as regression and classification there are two random variables at play, the inputs X and the targets Y. Here, we demonstrate that the Variational Information Bottleneck can be viewed as a compromise between fully empirical and fully Bayesian objectives, attempting to minimize the risks due to finite sampling of Y only. We argue that this approach provides some of t…

    Submitted 17 November, 2020; originally announced November 2020.

  5. arXiv:2010.09629  [pdf, other]

    cs.LG stat.ML

    PAC$^m$-Bayes: Narrowing the Empirical Risk Gap in the Misspecified Bayesian Regime

    Authors: Warren R. Morningstar, Alexander A. Alemi, Joshua V. Dillon

    Abstract: The Bayesian posterior minimizes the "inferential risk" which itself bounds the "predictive risk". This bound is tight when the likelihood and prior are well-specified. However since misspecification induces a gap, the Bayesian posterior predictive distribution may have poor generalization performance. This work develops a multi-sample loss (PAC$^m$) which can close the gap by spanning a trade-off…

    Submitted 23 May, 2022; v1 submitted 19 October, 2020; originally announced October 2020.

    Comments: Accepted at AISTATS2022

    Journal ref: International Conference on Artificial Intelligence and Statistics, 8270-8298, (2022)

  6. arXiv:2006.09273  [pdf, other]

    cs.LG stat.ML

    Density of States Estimation for Out-of-Distribution Detection

    Authors: Warren R. Morningstar, Cusuh Ham, Andrew G. Gallagher, Balaji Lakshminarayanan, Alexander A. Alemi, Joshua V. Dillon

    Abstract: Perhaps surprisingly, recent studies have shown probabilistic model likelihoods have poor specificity for out-of-distribution (OOD) detection and often assign higher likelihoods to OOD data than in-distribution data. To ameliorate this issue we propose DoSE, the density of states estimator. Drawing on the statistical physics notion of "density of states," the DoSE decision rule avoids direct com…

    Submitted 22 June, 2020; v1 submitted 16 June, 2020; originally announced June 2020.

    Comments: Submitted to NeurIPS. Corrected footnote from: "34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada" to "Preprint. Under review."

  7. arXiv:2003.01687  [pdf, other]

    cs.LG stat.ML

    Automatic Differentiation Variational Inference with Mixtures

    Authors: Warren R. Morningstar, Sharad M. Vikram, Cusuh Ham, Andrew Gallagher, Joshua V. Dillon

    Abstract: Automatic Differentiation Variational Inference (ADVI) is a useful tool for efficiently learning probabilistic models in machine learning. Generally approximate posteriors learned by ADVI are forced to be unimodal in order to facilitate use of the reparameterization trick. In this paper, we show how stratified sampling may be used to enable mixture distributions as the approximate posterior, and d…

    Submitted 24 June, 2020; v1 submitted 3 March, 2020; originally announced March 2020.

    Comments: Submitted to NeurIPS 2020, Corrected footnote from: "34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada" to "Preprint. Under review."

  8. arXiv:2002.02655  [pdf, other]

    cs.LG stat.ML

    The k-tied Normal Distribution: A Compact Parameterization of Gaussian Mean Field Posteriors in Bayesian Neural Networks

    Authors: Jakub Swiatkowski, Kevin Roth, Bastiaan S. Veeling, Linh Tran, Joshua V. Dillon, Jasper Snoek, Stephan Mandt, Tim Salimans, Rodolphe Jenatton, Sebastian Nowozin

    Abstract: Variational Bayesian Inference is a popular methodology for approximating posterior distributions over Bayesian neural network weights. Recent work developing this class of methods has explored ever richer parameterizations of the approximate posterior in the hope of improving performance. In contrast, here we share a curious experimental finding that suggests instead restricting the variational d…

    Submitted 5 July, 2020; v1 submitted 7 February, 2020; originally announced February 2020.

  9. arXiv:2002.01184  [pdf, ps, other]

    stat.CO cs.PL stat.ML

    tfp.mcmc: Modern Markov Chain Monte Carlo Tools Built for Modern Hardware

    Authors: Junpeng Lao, Christopher Suter, Ian Langmore, Cyril Chimisov, Ashish Saxena, Pavel Sountsov, Dave Moore, Rif A. Saurous, Matthew D. Hoffman, Joshua V. Dillon

    Abstract: Markov chain Monte Carlo (MCMC) is widely regarded as one of the most important algorithms of the 20th century. Its guarantees of asymptotic convergence, stability, and estimator-variance bounds using only unnormalized probability functions make it indispensable to probabilistic programming. In this paper, we introduce the TensorFlow Probability MCMC toolkit, and discuss some of the considerations…

    Submitted 4 February, 2020; originally announced February 2020.

    Comments: Based on extended abstract submitted to PROBPROG 2020

  10. arXiv:2001.11819  [pdf, ps, other]

    cs.PL cs.LG stat.CO stat.ML

    Joint Distributions for TensorFlow Probability

    Authors: Dan Piponi, Dave Moore, Joshua V. Dillon

    Abstract: A central tenet of probabilistic programming is that a model is specified exactly once in a canonical representation which is usable by inference algorithms. We describe JointDistributions, a family of declarative representations of directed graphical models in TensorFlow Probability.

    Submitted 21 January, 2020; originally announced January 2020.

    Comments: Based on extended abstract submitted to PROBPROG 2020

  11. arXiv:2001.04694  [pdf, other]

    cs.LG stat.ML

    Hydra: Preserving Ensemble Diversity for Model Distillation

    Authors: Linh Tran, Bastiaan S. Veeling, Kevin Roth, Jakub Swiatkowski, Joshua V. Dillon, Jasper Snoek, Stephan Mandt, Tim Salimans, Sebastian Nowozin, Rodolphe Jenatton

    Abstract: Ensembles of models have been empirically shown to improve predictive performance and to yield robust measures of uncertainty. However, they are expensive in computation and memory. Therefore, recent research has focused on distilling ensembles into a single compact model, reducing the computational and memory burden of the ensemble while trying to preserve its predictive behavior. Most existing d…

    Submitted 19 March, 2021; v1 submitted 14 January, 2020; originally announced January 2020.

    Comments: Accepted to ICML 2020 Workshop on Uncertainty and Robustness in Deep Learning

  12. arXiv:1906.02845  [pdf, other]

    stat.ML cs.LG

    Likelihood Ratios for Out-of-Distribution Detection

    Authors: Jie Ren, Peter J. Liu, Emily Fertig, Jasper Snoek, Ryan Poplin, Mark A. DePristo, Joshua V. Dillon, Balaji Lakshminarayanan

    Abstract: Discriminative neural networks offer little or no performance guarantees when deployed on data not generated by the same process as the training distribution. On such out-of-distribution (OOD) inputs, the prediction may not only be erroneous, but confidently so, limiting the safe deployment of classifiers in real-world applications. One such challenging application is bacteria identification based…

    Submitted 5 December, 2019; v1 submitted 6 June, 2019; originally announced June 2019.

    Comments: Accepted to NeurIPS 2019

  13. arXiv:1906.02530  [pdf, other]

    stat.ML cs.LG

    Can You Trust Your Model's Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift

    Authors: Yaniv Ovadia, Emily Fertig, Jie Ren, Zachary Nado, D Sculley, Sebastian Nowozin, Joshua V. Dillon, Balaji Lakshminarayanan, Jasper Snoek

    Abstract: Modern machine learning methods including deep learning have achieved great success in predictive accuracy for supervised learning tasks, but may still fall short in giving useful estimates of their predictive uncertainty. Quantifying uncertainty is especially critical in real-world settings, which often involve input distributions that are shifted from the training distribution due to a var…

    Submitted 17 December, 2019; v1 submitted 6 June, 2019; originally announced June 2019.

    Comments: Advances in Neural Information Processing Systems, 2019

  14. arXiv:1903.03704  [pdf, other]

    stat.CO stat.ML

    NeuTra-lizing Bad Geometry in Hamiltonian Monte Carlo Using Neural Transport

    Authors: Matthew Hoffman, Pavel Sountsov, Joshua V. Dillon, Ian Langmore, Dustin Tran, Srinivas Vasudevan

    Abstract: Hamiltonian Monte Carlo is a powerful algorithm for sampling from difficult-to-normalize posterior distributions. However, when the geometry of the posterior is unfavorable, it may take many expensive evaluations of the target distribution and its gradient to converge and mix. We propose neural transport (NeuTra) HMC, a technique for learning to correct this sort of unfavorable geometry using inve…

    Submitted 8 March, 2019; originally announced March 2019.

  15. arXiv:1807.00906  [pdf, other]

    cs.LG stat.ML

    Uncertainty in the Variational Information Bottleneck

    Authors: Alexander A. Alemi, Ian Fischer, Joshua V. Dillon

    Abstract: We present a simple case study, demonstrating that Variational Information Bottleneck (VIB) can improve a network's classification calibration as well as its ability to detect out-of-distribution data. Without explicitly being designed to do so, VIB gives two natural metrics for handling and quantifying uncertainty.

    Submitted 2 July, 2018; originally announced July 2018.

    Comments: 10 pages, 7 figures. Accepted to UAI 2018 - Uncertainty in Deep Learning Workshop

  16. arXiv:1711.10604  [pdf, ps, other]

    cs.LG cs.AI cs.PL stat.ML

    TensorFlow Distributions

    Authors: Joshua V. Dillon, Ian Langmore, Dustin Tran, Eugene Brevdo, Srinivas Vasudevan, Dave Moore, Brian Patton, Alex Alemi, Matt Hoffman, Rif A. Saurous

    Abstract: The TensorFlow Distributions library implements a vision of probability theory adapted to the modern deep-learning paradigm of end-to-end differentiable computation. Building on two basic abstractions, it offers flexible building blocks for probabilistic computation. Distributions provide fast, numerically stable methods for generating samples and computing statistics, e.g., log density. Bijectors…

    Submitted 28 November, 2017; originally announced November 2017.

  17. arXiv:1711.00464  [pdf, other]

    cs.LG stat.ML

    Fixing a Broken ELBO

    Authors: Alexander A. Alemi, Ben Poole, Ian Fischer, Joshua V. Dillon, Rif A. Saurous, Kevin Murphy

    Abstract: Recent work in unsupervised representation learning has focused on learning deep directed latent-variable models. Fitting these models by maximizing the marginal likelihood or evidence is typically intractable, thus a common approximation is to maximize the evidence lower bound (ELBO) instead. However, maximum likelihood training (whether exact or approximate) does not necessarily result in a good…

    Submitted 13 February, 2018; v1 submitted 1 November, 2017; originally announced November 2017.

    Comments: 21 pages, 9 figures