Showing 1–28 of 28 results for author: Wibisono, A

Searching in archive math.
  1. arXiv:2505.12553  [pdf, ps, other]

    math.OC cs.LG stat.ML

    Hamiltonian Descent Algorithms for Optimization: Accelerated Rates via Randomized Integration Time

    Authors: Qiang Fu, Andre Wibisono

    Abstract: We study the Hamiltonian flow for optimization (HF-opt), which simulates the Hamiltonian dynamics for some integration time and resets the velocity to $0$ to decrease the objective function; this is the optimization analogue of the Hamiltonian Monte Carlo algorithm for sampling. For short integration time, HF-opt has the same convergence rates as gradient descent for minimizing strongly and weakly…

    Submitted 17 September, 2025; v1 submitted 18 May, 2025; originally announced May 2025.

  2. arXiv:2502.05623  [pdf, ps, other]

    cs.IT cs.LG math.OC math.ST

    Mixing Time of the Proximal Sampler in Relative Fisher Information via Strong Data Processing Inequality

    Authors: Andre Wibisono

    Abstract: We study the mixing time guarantee for sampling in relative Fisher information via the Proximal Sampler algorithm, which is an approximate proximal discretization of the Langevin dynamics. We show that when the target probability distribution is strongly log-concave, the relative Fisher information converges exponentially fast along the Proximal Sampler; this matches the exponential convergence ra…

    Submitted 27 June, 2025; v1 submitted 8 February, 2025; originally announced February 2025.

    Comments: v2: Extended abstract accepted for presentation at Conference on Learning Theory (COLT) 2025

  3. arXiv:2412.20471  [pdf, ps, other]

    cs.GT cs.LG math.OC stat.ML

    On the Convergence of Min-Max Langevin Dynamics and Algorithm

    Authors: Yang Cai, Siddharth Mitra, Xiuyuan Wang, Andre Wibisono

    Abstract: We study zero-sum games in the space of probability distributions over the Euclidean space $\mathbb{R}^d$ with entropy regularization, in the setting when the interaction function between the players is smooth and strongly convex-strongly concave. We prove an exponential convergence guarantee for the mean-field min-max Langevin dynamics to compute the equilibrium distribution of the zero-sum game.…

    Submitted 27 June, 2025; v1 submitted 29 December, 2024; originally announced December 2024.

    Comments: v3: Accepted for presentation at the Conference on Learning Theory (COLT) 2025. v2: Revised introduction and presentation of results

  4. arXiv:2412.18701  [pdf, other]

    stat.CO math.ST stat.ML

    High-accuracy sampling from constrained spaces with the Metropolis-adjusted Preconditioned Langevin Algorithm

    Authors: Vishwak Srinivasan, Andre Wibisono, Ashia Wilson

    Abstract: In this work, we propose a first-order sampling method called the Metropolis-adjusted Preconditioned Langevin Algorithm for approximate sampling from a target distribution whose support is a proper convex subset of $\mathbb{R}^{d}$. Our proposed method is the result of applying a Metropolis-Hastings filter to the Markov chain formed by a single step of the preconditioned Langevin algorithm with a…

    Submitted 25 February, 2025; v1 submitted 24 December, 2024; originally announced December 2024.

    Comments: 55 pages, 5 figures, 2 tables. Shorter version without experiments accepted at ALT 2025. v3: fixes minor typographical errors

  5. arXiv:2410.10699  [pdf, ps, other]

    math.ST cs.IT stat.ML

    Fast Convergence of $Φ$-Divergence Along the Unadjusted Langevin Algorithm and Proximal Sampler

    Authors: Siddharth Mitra, Andre Wibisono

    Abstract: We study the mixing time of two popular discrete-time Markov chains in continuous space, the Unadjusted Langevin Algorithm and the Proximal Sampler, which are discretizations of the Langevin dynamics. We extend mixing time analyses for these Markov chains to hold in $Φ$-divergence. We show that any $Φ$-divergence arising from a twice-differentiable strictly convex function $Φ$ converges to $0$ exp…

    Submitted 12 February, 2025; v1 submitted 14 October, 2024; originally announced October 2024.

    Comments: v2: Minor changes and reorganization. Accepted at ALT 2025

  6. arXiv:2405.03472  [pdf, other]

    math.OC cs.GT cs.LG math.DS math.NA

    A Symplectic Analysis of Alternating Mirror Descent

    Authors: Jonas Katona, Xiuyuan Wang, Andre Wibisono

    Abstract: Motivated by understanding the behavior of the Alternating Mirror Descent (AMD) algorithm for bilinear zero-sum games, we study the discretization of continuous-time Hamiltonian flow via the symplectic Euler method. We provide a framework for analysis using results from Hamiltonian dynamics, Lie algebra, and symplectic numerical integrators, with an emphasis on the existence and properties of a co…

    Submitted 17 May, 2025; v1 submitted 6 May, 2024; originally announced May 2024.

    Comments: 94 pages, 3 figures

  7. arXiv:2402.17067  [pdf, ps, other]

    math.ST cs.IT stat.ML

    Characterizing Dependence of Samples along the Langevin Dynamics and Algorithms via Contraction of $Φ$-Mutual Information

    Authors: Jiaming Liang, Siddharth Mitra, Andre Wibisono

    Abstract: The mixing time of a Markov chain determines how fast the iterates of the Markov chain converge to the stationary distribution; however, it does not control the dependencies between samples along the Markov chain. In this paper, we study the question of how fast the samples become approximately independent along popular Markov chains for continuous-space sampling: the Langevin dynamics in continuo…

    Submitted 26 June, 2025; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: Accepted at COLT 2025. 54 pages

  8. arXiv:2402.07747  [pdf, ps, other]

    math.ST stat.ML

    Optimal score estimation via empirical Bayes smoothing

    Authors: Andre Wibisono, Yihong Wu, Kaylee Yingxi Yang

    Abstract: We study the problem of estimating the score function of an unknown probability distribution $ρ^*$ from $n$ independent and identically distributed observations in $d$ dimensions. Assuming that $ρ^*$ is subgaussian and has a Lipschitz-continuous score function $s^*$, we establish the optimal rate of $\tilde Θ(n^{-\frac{2}{d+4}})$ for this estimation problem under the loss function…

    Submitted 12 June, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    Comments: COLT 2024; added the new results on extending to beta-Holder scores with beta <= 1

  9. arXiv:2312.08823  [pdf, other]

    stat.CO cs.DS cs.LG math.ST stat.ML

    Fast sampling from constrained spaces using the Metropolis-adjusted Mirror Langevin algorithm

    Authors: Vishwak Srinivasan, Andre Wibisono, Ashia Wilson

    Abstract: We propose a new method called the Metropolis-adjusted Mirror Langevin algorithm for approximate sampling from distributions whose support is a compact and convex set. This algorithm adds an accept-reject filter to the Markov chain induced by a single step of the Mirror Langevin algorithm (Zhang et al., 2020), which is a basic discretisation of the Mirror Langevin dynamics. Due to the inclusion of…

    Submitted 21 June, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: 49 pages, 6 figures, 2 tables. Shorter version without experiments accepted to COLT 2024

  10. arXiv:2309.14155  [pdf, other]

    math.OC cs.LG

    Extragradient Type Methods for Riemannian Variational Inequality Problems

    Authors: Zihao Hu, Guanghui Wang, Xi Wang, Andre Wibisono, Jacob Abernethy, Molei Tao

    Abstract: Riemannian convex optimization and minimax optimization have recently drawn considerable attention. Their appeal lies in their capacity to adeptly manage the non-convexity of the objective function as well as constraints inherent in the feasible set in the Euclidean sense. In this work, we delve into monotone Riemannian Variational Inequality Problems (RVIPs), which encompass both Riemannian conve…

    Submitted 1 June, 2024; v1 submitted 25 September, 2023; originally announced September 2023.

    Comments: Published in Proceedings of The 27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024)

  11. arXiv:2306.13801  [pdf, ps, other]

    math.ST

    On a Class of Gibbs Sampling over Networks

    Authors: Bo Yuan, Jiaojiao Fan, Jiaming Liang, Andre Wibisono, Yongxin Chen

    Abstract: We consider the sampling problem from a composite distribution whose potential (negative log density) is $\sum_{i=1}^n f_i(x_i)+\sum_{j=1}^m g_j(y_j)+\sum_{i=1}^n\sum_{j=1}^m\frac{σ_{ij}}{2η} \Vert x_i-y_j \Vert^2_2$ where each of $x_i$ and $y_j$ is in $\mathbb{R}^d$, $f_1, f_2, \ldots, f_n, g_1, g_2, \ldots, g_m$ are strongly convex functions, and $\{σ_{ij}\}$ encodes a network structure. % mot…

    Submitted 23 June, 2023; originally announced June 2023.

    Comments: Accepted in COLT 2023

  12. arXiv:2302.07851  [pdf, other]

    math.OC cs.LG

    Continuized Acceleration for Quasar Convex Functions in Non-Convex Optimization

    Authors: Jun-Kun Wang, Andre Wibisono

    Abstract: Quasar convexity is a condition that allows some first-order methods to efficiently minimize a function even when the optimization landscape is non-convex. Previous works develop near-optimal accelerated algorithms for minimizing this class of functions, however, they require a subroutine of binary search which results in multiple calls to gradient evaluations in each iteration, and consequently t…

    Submitted 15 February, 2023; originally announced February 2023.

    Comments: Accepted at ICLR (International Conference on Learning Representations), 2023

  13. arXiv:2211.01512  [pdf, ps, other]

    cs.LG math.ST

    Convergence of the Inexact Langevin Algorithm and Score-based Generative Models in KL Divergence

    Authors: Kaylee Yingxi Yang, Andre Wibisono

    Abstract: We study the Inexact Langevin Dynamics (ILD), Inexact Langevin Algorithm (ILA), and Score-based Generative Modeling (SGM) when utilizing estimated score functions for sampling. Our focus lies in establishing stable biased convergence guarantees in terms of the Kullback-Leibler (KL) divergence. To achieve these guarantees, we impose two key assumptions: 1) the target distribution satisfies the log-…

    Submitted 2 June, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

  14. arXiv:2206.11872  [pdf, other]

    math.OC cs.LG

    Provable Acceleration of Heavy Ball beyond Quadratics for a Class of Polyak-Łojasiewicz Functions when the Non-Convexity is Averaged-Out

    Authors: Jun-Kun Wang, Chi-Heng Lin, Andre Wibisono, Bin Hu

    Abstract: Heavy Ball (HB) nowadays is one of the most popular momentum methods in non-convex optimization. It has been widely observed that incorporating the Heavy Ball dynamic in gradient-based methods accelerates the training process of modern machine learning models. However, the progress on establishing its theoretical foundation of acceleration is apparently far behind its empirical success. Existing p…

    Submitted 29 August, 2023; v1 submitted 22 June, 2022; originally announced June 2022.

    Comments: (ICML 2022) Proceedings of the 39th International Conference on Machine Learning

  15. arXiv:2206.04160  [pdf, other]

    cs.GT cs.LG math.DS

    Alternating Mirror Descent for Constrained Min-Max Games

    Authors: Andre Wibisono, Molei Tao, Georgios Piliouras

    Abstract: In this paper we study two-player bilinear zero-sum games with constrained strategy spaces. An instance of natural occurrences of such constraints is when mixed strategies are used, which correspond to a probability simplex constraint. We propose and analyze the alternating mirror descent algorithm, in which each player takes turns to take action following the mirror descent algorithm for constrai…

    Submitted 8 June, 2022; originally announced June 2022.

  16. arXiv:2202.06386  [pdf, ps, other]

    math.ST stat.ML

    Improved analysis for a proximal algorithm for sampling

    Authors: Yongxin Chen, Sinho Chewi, Adil Salim, Andre Wibisono

    Abstract: We study the proximal sampler of Lee, Shen, and Tian (2021) and obtain new convergence guarantees under weaker assumptions than strong log-concavity: namely, our results hold for (1) weakly log-concave targets, and (2) targets satisfying isoperimetric assumptions which allow for non-log-concavity. We demonstrate our results by obtaining new state-of-the-art sampling guarantees for several classes…

    Submitted 13 February, 2022; originally announced February 2022.

    Comments: 34 pages

  17. arXiv:2109.12077  [pdf, ps, other]

    cs.DS cs.LG math.ST stat.ML

    The Mirror Langevin Algorithm Converges with Vanishing Bias

    Authors: Ruilin Li, Molei Tao, Santosh S. Vempala, Andre Wibisono

    Abstract: The technique of modifying the geometry of a problem from Euclidean to Hessian metric has proved to be quite effective in optimization, and has been the subject of study for sampling. The Mirror Langevin Diffusion (MLD) is a sampling analogue of mirror flow in continuous time, and it has nice convergence properties under log-Sobolev or Poincare inequalities relative to the Hessian metric, as shown…

    Submitted 11 October, 2021; v1 submitted 24 September, 2021; originally announced September 2021.

  18. arXiv:1906.02027  [pdf, other]

    math.OC cs.GT cs.LG stat.ML

    Last-iterate convergence rates for min-max optimization

    Authors: Jacob Abernethy, Kevin A. Lai, Andre Wibisono

    Abstract: While classic work in convex-concave min-max optimization relies on average-iterate convergence results, the emergence of nonconvex applications such as training Generative Adversarial Networks has led to renewed interest in last-iterate convergence guarantees. Proving last-iterate convergence is challenging because many natural algorithms, such as Simultaneous Gradient Descent/Ascent, provably di…

    Submitted 25 October, 2019; v1 submitted 5 June, 2019; originally announced June 2019.

  19. arXiv:1903.08568  [pdf, other]

    cs.DS cs.LG math.PR stat.ML

    Rapid Convergence of the Unadjusted Langevin Algorithm: Isoperimetry Suffices

    Authors: Santosh S. Vempala, Andre Wibisono

    Abstract: We study the Unadjusted Langevin Algorithm (ULA) for sampling from a probability distribution $ν= e^{-f}$ on $\mathbb{R}^n$. We prove a convergence guarantee in Kullback-Leibler (KL) divergence assuming $ν$ satisfies a log-Sobolev inequality and the Hessian of $f$ is bounded. Notably, we do not assume convexity or bounds on higher derivatives. We also prove convergence guarantees in Rényi divergen…

    Submitted 2 March, 2022; v1 submitted 20 March, 2019; originally announced March 2019.

    Comments: v4: Updated discussion and added properties of biased limit. v3: Simplified analysis of Rényi divergence, improved exposition, and added figures. v2: Added analysis of Rényi divergence and Poincaré assumption

  20. arXiv:1902.08825  [pdf, other]

    math.OC

    Accelerating Rescaled Gradient Descent: Fast Optimization of Smooth Functions

    Authors: Ashia Wilson, Lester Mackey, Andre Wibisono

    Abstract: We present a family of algorithms, called descent algorithms, for optimizing convex and non-convex functions. We also introduce a new first-order algorithm, called rescaled gradient descent (RGD), and show that RGD achieves a faster convergence rate than gradient descent provided the function is strongly smooth -- a natural generalization of the standard smoothness assumption on the objective func…

    Submitted 4 January, 2020; v1 submitted 23 February, 2019; originally announced February 2019.

  21. arXiv:1802.08089  [pdf, ps, other]

    math.OC cs.IT cs.LG stat.ML

    Sampling as optimization in the space of measures: The Langevin dynamics as a composite optimization problem

    Authors: Andre Wibisono

    Abstract: We study sampling as optimization in the space of measures. We focus on gradient flow-based optimization with the Langevin dynamics as a case study. We investigate the source of the bias of the unadjusted Langevin algorithm (ULA) in discrete time, and consider how to remove or reduce the bias. We point out the difficulty is that the heat flow is exactly solvable, but neither its forward nor backwa…

    Submitted 6 June, 2018; v1 submitted 22 February, 2018; originally announced February 2018.

    Comments: To appear at the Conference on Learning Theory (COLT), July 2018

  22. arXiv:1702.03656  [pdf, ps, other]

    cs.IT math.ST

    Information and estimation in Fokker-Planck channels

    Authors: Andre Wibisono, Varun Jog, Po-Ling Loh

    Abstract: We study the relationship between information- and estimation-theoretic quantities in time-evolving systems. We focus on the Fokker-Planck channel defined by a general stochastic differential equation, and show that the time derivatives of entropy, KL divergence, and mutual information are characterized by estimation-theoretic quantities involving an appropriate generalization of the Fisher inform…

    Submitted 13 February, 2017; originally announced February 2017.

  23. arXiv:1603.04245  [pdf, ps, other]

    math.OC cs.LG stat.ML

    A Variational Perspective on Accelerated Methods in Optimization

    Authors: Andre Wibisono, Ashia C. Wilson, Michael I. Jordan

    Abstract: Accelerated gradient methods play a central role in optimization, achieving optimal rates in many settings. While many generalizations and extensions of Nesterov's original acceleration method have been proposed, it is not yet clear what is the natural scope of the acceleration concept. In this paper, we study accelerated methods from a continuous-time perspective. We show that there is a Lagrangi…

    Submitted 14 March, 2016; originally announced March 2016.

    Comments: 38 pages. Subsumes an earlier working draft arXiv:1509.03616

  24. arXiv:1509.03616  [pdf, other]

    math.OC

    On Accelerated Methods in Optimization

    Authors: Andre Wibisono, Ashia C. Wilson

    Abstract: In convex optimization, there is an acceleration phenomenon in which we can boost the convergence rate of certain gradient-based algorithms. We can observe this phenomenon in Nesterov's accelerated gradient descent, accelerated mirror descent, and accelerated cubic-regularized Newton's method, among others. In this paper, we show that the family of higher-order gradient methods in discrete t…

    Submitted 11 September, 2015; originally announced September 2015.

    Comments: 42 pages, 2 figures

  25. arXiv:1410.7098  [pdf, ps, other]

    stat.ML math.ST

    Concavity of reweighted Kikuchi approximation

    Authors: Po-Ling Loh, Andre Wibisono

    Abstract: We analyze a reweighted version of the Kikuchi approximation for estimating the log partition function of a product distribution defined over a region graph. We establish sufficient conditions for the concavity of our reweighted objective function in terms of weight assignments in the Kikuchi expansion, and show that a reweighted version of the sum product algorithm applied to the Kikuchi region g…

    Submitted 26 October, 2014; originally announced October 2014.

    Comments: To appear at the Neural Information Processing Systems (NIPS) conference, December 2014

  26. arXiv:1312.2139  [pdf, ps, other]

    math.OC cs.IT stat.ML

    Optimal rates for zero-order convex optimization: the power of two function evaluations

    Authors: John C. Duchi, Michael I. Jordan, Martin J. Wainwright, Andre Wibisono

    Abstract: We consider derivative-free algorithms for stochastic and non-stochastic convex optimization problems that use only function values rather than gradients. Focusing on non-asymptotic bounds on convergence rates, we show that if pairs of function values are available, algorithms for $d$-dimensional optimization that use gradient estimates based on random perturbations suffer a factor of at most…

    Submitted 20 August, 2014; v1 submitted 7 December, 2013; originally announced December 2013.

    Comments: 34 pages

  27. arXiv:1301.3321  [pdf, other]

    math.ST math.PR q-bio.NC

    Maximum entropy distributions on graphs

    Authors: Christopher Hillar, Andre Wibisono

    Abstract: Inspired by applications to theories of coding and communication in networks of nervous tissue, we study maximum entropy distributions on weighted graphs with a given expected degree sequence. These distributions are characterized by independent edge weights parameterized by a shared vector of vertex potentials. Using the general theory of exponential family distributions, we derive the existence…

    Submitted 16 December, 2018; v1 submitted 15 January, 2013; originally announced January 2013.

    Comments: 36 pages

  28. arXiv:1203.6812  [pdf, other]

    math.FA

    Inverses of symmetric, diagonally dominant positive matrices and applications

    Authors: Christopher J. Hillar, Shaowei Lin, Andre Wibisono

    Abstract: We prove tight bounds for the $\infty$-norm of the inverse of symmetric, diagonally dominant positive matrices. We also prove a new lower-bound form of Hadamard's inequality for the determinant of diagonally dominant positive matrices and an improved upper bound for diagonally balanced positive matrices. Applications of our results include numerical stability for linear systems, bounds on inverses…

    Submitted 8 March, 2013; v1 submitted 29 March, 2012; originally announced March 2012.

    Comments: 18 pages