Showing 1–13 of 13 results for author: Sountsov, P

  1. arXiv:2411.04260  [pdf, other]

    stat.CO

    Running Markov Chain Monte Carlo on Modern Hardware and Software

    Authors: Pavel Sountsov, Colin Carroll, Matthew D. Hoffman

    Abstract: Today, cheap numerical hardware offers huge amounts of parallel computing power, much of which is used for the task of fitting neural networks to data. Adoption of this hardware to accelerate statistical Markov chain Monte Carlo (MCMC) applications has been much slower. In this chapter, we suggest some patterns for speeding up MCMC workloads using the hardware (e.g., GPUs, TPUs) and software (e.g.…

    Submitted 6 November, 2024; originally announced November 2024.

  2. arXiv:2402.01915  [pdf, other]

    cs.CV stat.CO

    Robust Inverse Graphics via Probabilistic Inference

    Authors: Tuan Anh Le, Pavel Sountsov, Matthew D. Hoffman, Ben Lee, Brian Patton, Rif A. Saurous

    Abstract: How do we infer a 3D scene from a single image in the presence of corruptions like rain, snow or fog? Straightforward domain randomization relies on knowing the family of corruptions ahead of time. Here, we propose a Bayesian approach, dubbed robust inverse graphics (RIG), that relies on a strong scene prior and an uninformative uniform corruption prior, making it applicable to a wide range of corru…

    Submitted 11 June, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: ICML submission. Reworked main body, new appendix figures

  3. arXiv:2312.02179  [pdf, other]

    cs.LG cs.AI cs.CL

    Training Chain-of-Thought via Latent-Variable Inference

    Authors: Du Phan, Matthew D. Hoffman, David Dohan, Sholto Douglas, Tuan Anh Le, Aaron Parisi, Pavel Sountsov, Charles Sutton, Sharad Vikram, Rif A. Saurous

    Abstract: Large language models (LLMs) solve problems more accurately and interpretably when instructed to work out the answer step by step using a "chain-of-thought" (CoT) prompt. One can also improve LLMs' performance on a specific task by supervised fine-tuning, i.e., by using gradient ascent on some tunable parameters to maximize the average log-likelihood of correct answers from a labeled training se…

    Submitted 28 November, 2023; originally announced December 2023.

    Comments: 23 pages, 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  4. arXiv:2210.17415  [pdf, other]

    cs.CV cs.LG

    ProbNeRF: Uncertainty-Aware Inference of 3D Shapes from 2D Images

    Authors: Matthew D. Hoffman, Tuan Anh Le, Pavel Sountsov, Christopher Suter, Ben Lee, Vikash K. Mansinghka, Rif A. Saurous

    Abstract: The problem of inferring object shape from a single 2D image is underconstrained. Prior knowledge about what objects are plausible can help, but even given such prior knowledge there may still be uncertainty about the shapes of occluded parts of objects. Recently, conditional neural radiance field (NeRF) models have been developed that can learn to infer good point estimates of 3D models from sing…

    Submitted 27 October, 2022; originally announced October 2022.

    Comments: 18 pages, 18 figures, 1 table; submitted to the 26th International Conference on Artificial Intelligence and Statistics (AISTATS 2023)

    MSC Class: 62F15 (Primary) 68T45 (Secondary)

    ACM Class: G.3; I.5.1; I.4.10

  5. arXiv:2210.12200  [pdf, other]

    stat.CO

    Adaptive Tuning for Metropolis Adjusted Langevin Trajectories

    Authors: Lionel Riou-Durand, Pavel Sountsov, Jure Vogrinc, Charles C. Margossian, Sam Power

    Abstract: Hamiltonian Monte Carlo (HMC) is a widely used sampler for continuous probability distributions. In many cases, the underlying Hamiltonian dynamics exhibit a phenomenon of resonance which decreases the efficiency of the algorithm and makes it very sensitive to hyperparameter values. This issue can be tackled efficiently, either via the use of trajectory length randomization (RHMC) or via partial m…

    Submitted 22 February, 2023; v1 submitted 21 October, 2022; originally announced October 2022.

    Comments: Improve figure colors; clarify text; add extra supplement figure

  6. arXiv:2110.13017  [pdf, other]

    stat.ME

    Nested $\hat R$: Assessing the convergence of Markov chain Monte Carlo when running many short chains

    Authors: Charles C. Margossian, Matthew D. Hoffman, Pavel Sountsov, Lionel Riou-Durand, Aki Vehtari, Andrew Gelman

    Abstract: Recent developments in parallel Markov chain Monte Carlo (MCMC) algorithms allow us to run thousands of chains almost as quickly as a single chain, using hardware accelerators such as GPUs. While each chain still needs to forget its initial point during a warmup phase, the subsequent sampling phase can be shorter than in classical settings, where we run only a few chains. To determine if the resul…

    Submitted 30 May, 2024; v1 submitted 25 October, 2021; originally announced October 2021.

    Journal ref: Bayesian Analysis 2024

  7. arXiv:2110.11576  [pdf, other]

    stat.CO

    Focusing on Difficult Directions for Learning HMC Trajectory Lengths

    Authors: Pavel Sountsov, Matt D. Hoffman

    Abstract: Hamiltonian Monte Carlo (HMC) is a premier Markov Chain Monte Carlo (MCMC) algorithm for continuous target distributions. Its full potential can only be unleashed when its problem-dependent hyperparameters are tuned well. The adaptation of one such hyperparameter, trajectory length ($τ$), has been closely examined by many research programs with the No-U-Turn Sampler (NUTS) coming out as the prefer…

    Submitted 6 May, 2022; v1 submitted 21 October, 2021; originally announced October 2021.

    Comments: Improved exposition. Fixed Figure 2

  8. arXiv:2006.06897  [pdf, other]

    stat.ML cs.LG

    MCMC Should Mix: Learning Energy-Based Model with Neural Transport Latent Space MCMC

    Authors: Erik Nijkamp, Ruiqi Gao, Pavel Sountsov, Srinivas Vasudevan, Bo Pang, Song-Chun Zhu, Ying Nian Wu

    Abstract: Learning energy-based model (EBM) requires MCMC sampling of the learned model as an inner loop of the learning algorithm. However, MCMC sampling of EBMs in high-dimensional data space is generally not mixing, because the energy function, which is usually parametrized by a deep network, is highly multi-modal in the data space. This is a serious handicap for both theory and practice of EBMs. In this…

    Submitted 16 March, 2022; v1 submitted 11 June, 2020; originally announced June 2020.

  9. arXiv:2002.01184  [pdf, ps, other]

    stat.CO cs.PL stat.ML

    tfp.mcmc: Modern Markov Chain Monte Carlo Tools Built for Modern Hardware

    Authors: Junpeng Lao, Christopher Suter, Ian Langmore, Cyril Chimisov, Ashish Saxena, Pavel Sountsov, Dave Moore, Rif A. Saurous, Matthew D. Hoffman, Joshua V. Dillon

    Abstract: Markov chain Monte Carlo (MCMC) is widely regarded as one of the most important algorithms of the 20th century. Its guarantees of asymptotic convergence, stability, and estimator-variance bounds using only unnormalized probability functions make it indispensable to probabilistic programming. In this paper, we introduce the TensorFlow Probability MCMC toolkit, and discuss some of the considerations…

    Submitted 4 February, 2020; originally announced February 2020.

    Comments: Based on extended abstract submitted to PROBPROG 2020

  10. arXiv:2001.05035  [pdf, ps, other]

    stat.CO

    FunMC: A functional API for building Markov Chains

    Authors: Pavel Sountsov, Alexey Radul, Srinivas Vasudevan

    Abstract: Constant-memory algorithms, also loosely called Markov chains, power the vast majority of probabilistic inference and machine learning applications today. A lot of progress has been made in constructing user-friendly APIs around these algorithms. Such APIs, however, rarely make it easy to research new algorithms of this type. In this work we present FunMC, a minimal Python library for doing method…

    Submitted 26 May, 2021; v1 submitted 14 January, 2020; originally announced January 2020.

    Comments: Updated source code to reflect API; updated link to point to new location

  11. arXiv:2001.05033  [pdf, other]

    stat.CO

    Hamiltonian Monte Carlo Swindles

    Authors: Dan Piponi, Matthew D. Hoffman, Pavel Sountsov

    Abstract: Hamiltonian Monte Carlo (HMC) is a powerful Markov chain Monte Carlo (MCMC) algorithm for estimating expectations with respect to continuous un-normalized probability distributions. MCMC estimators typically have higher variance than classical Monte Carlo with i.i.d. samples due to autocorrelations; most MCMC research tries to reduce these autocorrelations. In this work, we explore a complementary…

    Submitted 2 March, 2020; v1 submitted 14 January, 2020; originally announced January 2020.

    Comments: To be published in AISTATS 2020

  12. arXiv:1903.03704  [pdf, other]

    stat.CO stat.ML

    NeuTra-lizing Bad Geometry in Hamiltonian Monte Carlo Using Neural Transport

    Authors: Matthew Hoffman, Pavel Sountsov, Joshua V. Dillon, Ian Langmore, Dustin Tran, Srinivas Vasudevan

    Abstract: Hamiltonian Monte Carlo is a powerful algorithm for sampling from difficult-to-normalize posterior distributions. However, when the geometry of the posterior is unfavorable, it may take many expensive evaluations of the target distribution and its gradient to converge and mix. We propose neural transport (NeuTra) HMC, a technique for learning to correct this sort of unfavorable geometry using inve…

    Submitted 8 March, 2019; originally announced March 2019.

  13. arXiv:1606.03402  [pdf, other]

    cs.AI cs.CL

    Length bias in Encoder Decoder Models and a Case for Global Conditioning

    Authors: Pavel Sountsov, Sunita Sarawagi

    Abstract: Encoder-decoder networks are popular for modeling sequences probabilistically in many applications. These models use the power of the Long Short-Term Memory (LSTM) architecture to capture the full dependence among variables, unlike earlier models like CRFs that typically assumed conditional independence among non-adjacent variables. However, in practice encoder-decoder models exhibit a bias towards…

    Submitted 21 September, 2016; v1 submitted 10 June, 2016; originally announced June 2016.