Showing 1–50 of 145 results for author: Osher, S

  1. arXiv:2506.05723  [pdf, ps, other]

    math.OC

    Simulating Fokker-Planck equations via mean field control of score-based normalizing flows

    Authors: Mo Zhou, Stanley Osher, Wuchen Li

    Abstract: The Fokker-Planck (FP) equation governs the evolution of densities for stochastic dynamics of physical systems, such as the Langevin dynamics and the Lorenz system. This work simulates FP equations through a mean field control (MFC) problem. We first formulate the FP equation as a continuity equation, where the velocity field consists of the drift function and the score function, i.e., the gradien…

    Submitted 6 June, 2025; originally announced June 2025.

    MSC Class: 34K35; 37N35; 58E25; 93C15; 93C20 ACM Class: G.1.7

  2. arXiv:2506.00674  [pdf, ps, other]

    cs.LO cs.AI cs.LG math.OC

    Thinking Out of the Box: Hybrid SAT Solving by Unconstrained Continuous Optimization

    Authors: Zhiwei Zhang, Samy Wu Fung, Anastasios Kyrillidis, Stanley Osher, Moshe Y. Vardi

    Abstract: The Boolean satisfiability (SAT) problem lies at the core of many applications in combinatorial optimization, software verification, cryptography, and machine learning. While state-of-the-art solvers have demonstrated high efficiency in handling conjunctive normal form (CNF) formulas, numerous applications require non-CNF (hybrid) constraints, such as XOR, cardinality, and Not-All-Equal constraint…

    Submitted 31 May, 2025; originally announced June 2025.

  3. arXiv:2505.13499  [pdf, ps, other]

    cs.LG cs.AI math.OC

    Optimal Control for Transformer Architectures: Enhancing Generalization, Robustness and Efficiency

    Authors: Kelvin Kan, Xingjian Li, Benjamin J. Zhang, Tuhin Sahai, Stanley Osher, Markos A. Katsoulakis

    Abstract: We study Transformers through the perspective of optimal control theory, using tools from continuous-time formulations to derive actionable insights into training and architecture design. This framework improves the performance of existing Transformer models while providing desirable theoretical guarantees, including generalization and robustness. Our framework is designed to be plug-and-play, ena…

    Submitted 15 May, 2025; originally announced May 2025.

  4. arXiv:2502.20833  [pdf, other]

    math.NA

    Recent Advances in Numerical Solutions for Hamilton-Jacobi PDEs

    Authors: Tingwei Meng, Siting Liu, Samy Wu Fung, Stanley Osher

    Abstract: Hamilton-Jacobi partial differential equations (HJ PDEs) play a central role in many applications such as economics, physics, and engineering. These equations describe the evolution of a value function which encodes valuable information about the system, such as action, cost, or level sets of a dynamic process. Their importance lies in their ability to model diverse phenomena, ranging from the pro…

    Submitted 28 February, 2025; originally announced February 2025.

  5. arXiv:2502.16773  [pdf, ps, other]

    stat.CO

    Splitting Regularized Wasserstein Proximal Algorithms for Nonsmooth Sampling Problems

    Authors: Fuqun Han, Stanley Osher, Wuchen Li

    Abstract: Sampling from nonsmooth target probability distributions is essential in various applications, including the Bayesian Lasso. We propose a splitting-based sampling algorithm for the time-implicit discretization of the probability flow for the Fokker-Planck equation, where the score function defined as the gradient logarithm of the current probability density function, is approximated by the regular…

    Submitted 23 February, 2025; originally announced February 2025.

  6. arXiv:2502.06026  [pdf, other]

    cs.LG math.NA

    A Multimodal PDE Foundation Model for Prediction and Scientific Text Descriptions

    Authors: Elisa Negrini, Yuxuan Liu, Liu Yang, Stanley J. Osher, Hayden Schaeffer

    Abstract: Neural networks are one tool for approximating non-linear differential equations used in scientific computing tasks such as surrogate modeling, real-time predictions, and optimal control. PDE foundation models utilize neural networks to train approximations to multiple differential equations simultaneously and are thus a general purpose solver that can be adapted to downstream tasks. Current PDE f…

    Submitted 9 February, 2025; originally announced February 2025.

  7. arXiv:2501.19351  [pdf, other]

    cs.LG

    Neural Implicit Solution Formula for Efficiently Solving Hamilton-Jacobi Equations

    Authors: Yesom Park, Stanley Osher

    Abstract: This paper presents an implicit solution formula for the Hamilton-Jacobi partial differential equation (HJ PDE). The formula is derived using the method of characteristics and is shown to coincide with the Hopf and Lax formulas in the case where either the Hamiltonian or the initial function is convex. It provides a simple and efficient numerical approach for computing the viscosity solution of HJ…

    Submitted 31 January, 2025; originally announced January 2025.

  8. arXiv:2501.18793  [pdf, other]

    cs.LG cs.AI

    OT-Transformer: A Continuous-time Transformer Architecture with Optimal Transport Regularization

    Authors: Kelvin Kan, Xingjian Li, Stanley Osher

    Abstract: Transformers have achieved state-of-the-art performance in numerous tasks. In this paper, we propose a continuous-time formulation of transformers. Specifically, we consider a dynamical system whose governing equation is parametrized by transformer blocks. We leverage optimal transport theory to regularize the training problem, which enhances stability in training and improves generalization of th…

    Submitted 30 January, 2025; originally announced January 2025.

  9. arXiv:2501.15106  [pdf, other]

    q-fin.TR cs.LG math.OC q-fin.CP

    In-Context Operator Learning for Linear Propagator Models

    Authors: Tingwei Meng, Moritz Voß, Nils Detering, Giulio Farolfi, Stanley Osher, Georg Menz

    Abstract: We study operator learning in the context of linear propagator models for optimal order execution problems with transient price impact à la Bouchaud et al. (2004) and Gatheral (2010). Transient price impact persists and decays over time according to some propagator kernel. Specifically, we propose to use In-Context Operator Networks (ICON), a novel transformer-based neural network architecture int…

    Submitted 25 January, 2025; originally announced January 2025.

    Comments: 25 pages, 10 figures

    MSC Class: 93E20; 91G60; 68T07

  10. arXiv:2412.11485  [pdf, ps, other]

    math.OC

    Inexact Proximal Point Algorithms for Zeroth-Order Global Optimization

    Authors: Minxin Zhang, Fuqun Han, Yat Tin Chow, Stanley Osher, Hayden Schaeffer

    Abstract: This work concerns the zeroth-order global minimization of continuous nonconvex functions with a unique global minimizer and possibly multiple local minimizers. We formulate a theoretical framework for inexact proximal point (IPP) methods for global optimization, establishing convergence guarantees under mild assumptions when either deterministic or stochastic estimates of proximal operators are u…

    Submitted 2 June, 2025; v1 submitted 16 December, 2024; originally announced December 2024.

    MSC Class: 49M37; 65K05; 90C26; 90C56

  11. arXiv:2411.16063  [pdf, other]

    cs.LG math.NA physics.flu-dyn

    VICON: Vision In-Context Operator Networks for Multi-Physics Fluid Dynamics Prediction

    Authors: Yadi Cao, Yuxuan Liu, Liu Yang, Rose Yu, Hayden Schaeffer, Stanley Osher

    Abstract: In-Context Operator Networks (ICONs) have demonstrated the ability to learn operators across diverse partial differential equations using few-shot, in-context learning. However, existing ICONs process each spatial point as an individual token, severely limiting computational efficiency when handling dense data in higher spatial dimensions. We propose Vision In-Context Operator Networks (VICON), wh…

    Submitted 19 May, 2025; v1 submitted 24 November, 2024; originally announced November 2024.

    Comments: update 1 more baseline + 1 more experiment setup (performance for temporal measurements with dropped frames); updated to NeurIPS format. Refined writing and presentations

  12. arXiv:2411.06278  [pdf, ps, other]

    math.NA cs.LG math.OC

    A Natural Primal-Dual Hybrid Gradient Method for Adversarial Neural Network Training on Solving Partial Differential Equations

    Authors: Shu Liu, Stanley Osher, Wuchen Li

    Abstract: We propose a scalable preconditioned primal-dual hybrid gradient algorithm for solving partial differential equations (PDEs). We multiply the PDE with a dual test function to obtain an inf-sup problem whose loss functional involves lower-order differential operators. The Primal-Dual Hybrid Gradient (PDHG) algorithm is then leveraged for this saddle point problem. By introducing suitable preconditi…

    Submitted 24 December, 2024; v1 submitted 9 November, 2024; originally announced November 2024.

    Comments: Several typos have been corrected. We welcome your comments and suggestions

  13. Fried deconvolution

    Authors: Jerome Gilles, Stanley Osher

    Abstract: In this paper we present a new approach to deblur the effect of atmospheric turbulence in the case of long range imaging. Our method is based on an analytical formulation, the Fried kernel, of the atmosphere modulation transfer function (MTF) and a framelet based deconvolution algorithm. An important parameter is the refractive index structure which requires specific measurements to be known. Then…

    Submitted 5 November, 2024; originally announced November 2024.

    Journal ref: SPIE Defense, Security and Sensing conference, Baltimore, USA, Proceedings Volume 8355, Infrared Imaging Systems: Design, Analysis, Modeling, and Testing XXIII; 83550G, April 2012

  14. arXiv:2410.23533  [pdf, other]

    math.FA cs.CV eess.IV

    2D Empirical Transforms. Wavelets, Ridgelets and Curvelets revisited

    Authors: Jerome Gilles, Giang Tran, Stanley Osher

    Abstract: A recently developed new approach, called ``Empirical Wavelet Transform'', aims to build 1D adaptive wavelet frames accordingly to the analyzed signal. In this paper, we present several extensions of this approach to 2D signals (images). We revisit some well-known transforms (tensor wavelets, Littlewood-Paley wavelets, ridgelets and curvelets) and show that it is possible to build their empirical…

    Submitted 30 October, 2024; originally announced October 2024.

    Journal ref: SIAM Journal on Imaging Sciences, Vol.7, No.1, 157--186, January 2014

  15. Wavelet Burst Accumulation for turbulence mitigation

    Authors: Jerome Gilles, Stanley Osher

    Abstract: In this paper, we investigate the extension of the recently proposed weighted Fourier burst accumulation (FBA) method into the wavelet domain. The purpose of FBA is to reconstruct a clean and sharp image from a sequence of blurred frames. This concept lies in the construction of weights to amplify dominant frequencies in the Fourier spectrum of each frame. The reconstructed image is then obtained…

    Submitted 30 October, 2024; originally announced October 2024.

    Journal ref: Journal of Electronic Imaging, Vol.25, No.3, 033003-1--033003-9, May 2016

  16. arXiv:2410.22777  [pdf, ps, other]

    cs.CV math.FA

    Bregman implementation of Meyer's $G-$norm for cartoon + textures decomposition

    Authors: Jerome Gilles, Stanley Osher

    Abstract: In this paper, we design a very simple algorithm based on Split Bregman iterations to numerically solve the cartoon + textures decomposition model of Meyer. This results in a significant gain in speed compared to Chambolle's nonlinear projectors.

    Submitted 30 October, 2024; originally announced October 2024.

  17. arXiv:2410.08987  [pdf, other]

    math.OC math.NA

    Gradient-adjusted underdamped Langevin dynamics for sampling

    Authors: Xinzhe Zuo, Stanley Osher, Wuchen Li

    Abstract: Sampling from a target distribution is a fundamental problem. Traditional Markov chain Monte Carlo (MCMC) algorithms, such as the unadjusted Langevin algorithm (ULA), derived from the overdamped Langevin dynamics, have been extensively studied. From an optimization perspective, the Kolmogorov forward equation of the overdamped Langevin dynamics can be treated as the gradient flow of the relative e…

    Submitted 26 October, 2024; v1 submitted 11 October, 2024; originally announced October 2024.

    Comments: added references, discussion on preconditioner

  18. arXiv:2409.16471  [pdf, other]

    math.OC cs.LG

    Score-based Neural Ordinary Differential Equations for Computing Mean Field Control Problems

    Authors: Mo Zhou, Stanley Osher, Wuchen Li

    Abstract: Classical neural ordinary differential equations (ODEs) are powerful tools for approximating the log-density functions in high-dimensional spaces along trajectories, where neural networks parameterize the velocity fields. This paper proposes a system of neural differential equations representing first- and second-order score functions along trajectories based on deep neural networks. We reformulat…

    Submitted 29 January, 2025; v1 submitted 24 September, 2024; originally announced September 2024.

    MSC Class: 34H05 ACM Class: G.1.7

  19. arXiv:2409.01567  [pdf, ps, other]

    math.NA math.ST

    Convergence of Noise-Free Sampling Algorithms with Regularized Wasserstein Proximals

    Authors: Fuqun Han, Stanley Osher, Wuchen Li

    Abstract: In this work, we investigate the convergence properties of the backward regularized Wasserstein proximal (BRWP) method for sampling a target distribution. The BRWP approach can be shown as a semi-implicit time discretization for a probability flow ODE with the score function whose density satisfies the Fokker-Planck equation of the overdamped Langevin dynamics. Specifically, the evolution of the s…

    Submitted 2 September, 2024; originally announced September 2024.

  20. arXiv:2408.03532  [pdf, other]

    math.NA

    Fast Partial Fourier Transforms for Large-Scale Ptychography

    Authors: Ricardo Parada, Samy Wu Fung, Stanley Osher

    Abstract: Ptychography is a popular imaging technique that combines diffractive imaging with scanning microscopy. The technique consists of a coherent beam that is scanned across an object in a series of overlapping positions, leading to reliable and improved reconstructions. Ptychographic microscopes allow for large fields to be imaged at high resolution at additional computational expense. In this work, w…

    Submitted 7 August, 2024; originally announced August 2024.

  21. arXiv:2406.13781  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV stat.ML

    A Primal-Dual Framework for Transformers and Neural Networks

    Authors: Tan M. Nguyen, Tam Nguyen, Nhat Ho, Andrea L. Bertozzi, Richard G. Baraniuk, Stanley J. Osher

    Abstract: Self-attention is key to the remarkable success of transformers in sequence modeling tasks including many applications in natural language processing and computer vision. Like neural network layers, these attention mechanisms are often developed by heuristics and experience. To provide a principled framework for constructing attention layers in transformers, we show that the self-attention corresp…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Accepted to ICLR 2023, 26 pages, 4 figures, 14 tables

  22. arXiv:2406.02003  [pdf, other]

    math.OC

    Laplace Meets Moreau: Smooth Approximation to Infimal Convolutions Using Laplace's Method

    Authors: Ryan J. Tibshirani, Samy Wu Fung, Howard Heaton, Stanley Osher

    Abstract: We study approximations to the Moreau envelope -- and infimal convolutions more broadly -- based on Laplace's method, a classical tool in analysis which ties certain integrals to suprema of their integrands. We believe the connection between Laplace's method and infimal convolutions is generally deserving of more attention in the study of optimization and partial differential equations, since it b…

    Submitted 4 June, 2024; originally announced June 2024.

  23. arXiv:2405.10922  [pdf, other]

    math.OC

    Kernel Expansions for High-Dimensional Mean-Field Control with Non-local Interactions

    Authors: Alexander Vidal, Samy Wu Fung, Stanley Osher, Luis Tenorio, Levon Nurbekyan

    Abstract: Mean-field control (MFC) problems aim to find the optimal policy to control massive populations of interacting agents. These problems are crucial in areas such as economics, physics, and biology. We consider the non-local setting, where the interactions between agents are governed by a suitable kernel. For $N$ agents, the interaction cost has $\mathcal{O}(N^2)$ complexity, which can be prohibitive…

    Submitted 17 May, 2024; originally announced May 2024.

  24. arXiv:2404.01586  [pdf, other]

    math.OC

    Efficient Computation of Mean field Control based Barycenters from Reaction-Diffusion Systems

    Authors: Arjun Vijaywargiya, Guosheng Fu, Stanley Osher, Wuchen Li

    Abstract: We develop a class of barycenter problems based on mean field control problems in three dimensions with associated reactive-diffusion systems of unnormalized multi-species densities. This problem is the generalization of the Wasserstein barycenter problem for single probability density functions. The primary objective is to present a comprehensive framework for efficiently computing the proposed v…

    Submitted 1 April, 2024; originally announced April 2024.

  25. arXiv:2403.02468  [pdf, other]

    math.OC math.NA

    A Primal-dual hybrid gradient method for solving optimal control problems and the corresponding Hamilton-Jacobi PDEs

    Authors: Tingwei Meng, Siting Liu, Wuchen Li, Stanley Osher

    Abstract: Optimal control problems are crucial in various domains, including path planning, robotics, and humanoid control, demonstrating their broad applicability. The connection between optimal control and Hamilton-Jacobi (HJ) partial differential equations (PDEs) underscores the need for solving HJ PDEs to address these control problems effectively. While numerous numerical methods exist for tackling HJ…

    Submitted 4 March, 2024; originally announced March 2024.

  26. arXiv:2402.17745  [pdf, other]

    physics.comp-ph cs.CV physics.optics

    Low-light phase retrieval with implicit generative priors

    Authors: Raunak Manekar, Elisa Negrini, Minh Pham, Daniel Jacobs, Jaideep Srivastava, Stanley J. Osher, Jianwei Miao

    Abstract: Phase retrieval (PR) is fundamentally important in scientific imaging and is crucial for nanoscale techniques like coherent diffractive imaging (CDI). Low radiation dose imaging is essential for applications involving radiation-sensitive samples. However, most PR methods struggle in low-dose scenarios due to high shot noise. Recent advancements in optical data acquisition setups, such as in-situ C…

    Submitted 23 August, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    MSC Class: 68T10 68T07 78A46

  27. arXiv:2402.16821  [pdf, other]

    math.NA math.OC

    Numerical Analysis on Neural Network Projected Schemes for Approximating One Dimensional Wasserstein Gradient Flows

    Authors: Xinzhe Zuo, Jiaxi Zhao, Shu Liu, Stanley Osher, Wuchen Li

    Abstract: We provide a numerical analysis and computation of neural network projected schemes for approximating one dimensional Wasserstein gradient flows. We approximate the Lagrangian mapping functions of gradient flows by the class of two-layer neural network functions with ReLU (rectified linear unit) activation functions. The numerical scheme is based on a projected gradient method, namely the Wasserst…

    Submitted 26 February, 2024; originally announced February 2024.

  28. arXiv:2402.06162  [pdf, other]

    stat.ML cs.LG

    Wasserstein proximal operators describe score-based generative models and resolve memorization

    Authors: Benjamin J. Zhang, Siting Liu, Wuchen Li, Markos A. Katsoulakis, Stanley J. Osher

    Abstract: We focus on the fundamental mathematical structure of score-based generative models (SGMs). We first formulate SGMs in terms of the Wasserstein proximal operator (WPO) and demonstrate that, via mean-field games (MFGs), the WPO formulation reveals mathematical structure that describes the inductive bias of diffusion and score-based models. In particular, MFGs yield optimality conditions in the form…

    Submitted 8 February, 2024; originally announced February 2024.

  29. arXiv:2401.14602  [pdf, other]

    math.NA math.OC

    Numerical analysis of a first-order computational algorithm for reaction-diffusion equations via the primal-dual hybrid gradient method

    Authors: Shu Liu, Xinzhe Zuo, Stanley Osher, Wuchen Li

    Abstract: In arXiv:2305.03945 [math.NA], a first-order optimization algorithm has been introduced to solve time-implicit schemes of reaction-diffusion equations. In this research, we conduct theoretical studies on this first-order algorithm equipped with a quadratic regularization term. We provide sufficient conditions under which the proposed algorithm and its time-continuous limit converge exponentially f…

    Submitted 28 March, 2025; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: Revised version, comments and suggestions are welcome

  30. arXiv:2401.13125  [pdf, ps, other]

    math.OC math.NA

    Tensor train based sampling algorithms for approximating regularized Wasserstein proximal operators

    Authors: Fuqun Han, Stanley Osher, Wuchen Li

    Abstract: We present a tensor train (TT) based algorithm designed for sampling from a target distribution and employ TT approximation to capture the high-dimensional probability density evolution of overdamped Langevin dynamics. This involves utilizing the regularized Wasserstein proximal operator, which exhibits a simple kernel integration formulation, i.e., the softmax formula of the traditional proximal…

    Submitted 12 March, 2025; v1 submitted 23 January, 2024; originally announced January 2024.

    Comments: Revised version

  31. arXiv:2401.09547  [pdf, other]

    math.OC

    A deep learning algorithm for computing mean field control problems via forward-backward score dynamics

    Authors: Mo Zhou, Stanley Osher, Wuchen Li

    Abstract: We propose a deep learning approach to compute mean field control problems with individual noises. The problem consists of the Fokker-Planck (FP) equation and the Hamilton-Jacobi-Bellman (HJB) equation. Using the differential of the entropy, namely the score function, we first formulate the deterministic forward-backward characteristics for the mean field control system, which is different from th…

    Submitted 17 May, 2025; v1 submitted 17 January, 2024; originally announced January 2024.

    MSC Class: 49N80 (Primary) 35Q89 (Secondary) ACM Class: G.1.6; G.1.8

  32. arXiv:2401.07364  [pdf, other]

    cs.LG cs.AI math.NA

    PDE Generalization of In-Context Operator Networks: A Study on 1D Scalar Nonlinear Conservation Laws

    Authors: Liu Yang, Stanley J. Osher

    Abstract: Can we build a single large model for a wide range of PDE-related scientific learning tasks? Can this model generalize to new PDEs, even of new forms, without any fine-tuning? In-context operator learning and the corresponding model In-Context Operator Networks (ICON) represent an initial exploration of these questions. The capability of ICON regarding the first question has been demonstrated prev…

    Submitted 21 January, 2024; v1 submitted 14 January, 2024; originally announced January 2024.

  33. arXiv:2310.12513  [pdf, other]

    physics.med-ph

    Real space iterative reconstruction for vector tomography (RESIRE-V)

    Authors: Minh Pham, Xingyuan Lu, Arjun Rana, Stanley Osher, Jianwei Miao

    Abstract: Tomography has had an important impact on the physical, biological, and medical sciences. To date, most tomographic applications have been focused on 3D scalar reconstructions. However, in some crucial applications, vector tomography is required to reconstruct 3D vector fields such as the electric and magnetic fields. Over the years, several vector tomography methods have been developed. Here, we…

    Submitted 19 October, 2023; originally announced October 2023.

  34. arXiv:2310.01605  [pdf, other]

    math.NA math.OC

    Primal-dual hybrid gradient algorithms for computing time-implicit Hamilton-Jacobi equations

    Authors: Tingwei Meng, Wenbo Hao, Siting Liu, Stanley J. Osher, Wuchen Li

    Abstract: Hamilton-Jacobi (HJ) partial differential equations (PDEs) have diverse applications spanning physics, optimal control, game theory, and imaging sciences. This research introduces a first-order optimization-based technique for HJ PDEs, which formulates the time-implicit update of HJ PDEs as saddle point problems. We remark that the saddle point formulation for HJ equations is aligned with the prim…

    Submitted 2 October, 2023; originally announced October 2023.

  35. arXiv:2308.14945  [pdf, other]

    stat.ML cs.LG stat.CO

    Noise-Free Sampling Algorithms via Regularized Wasserstein Proximals

    Authors: Hong Ye Tan, Stanley Osher, Wuchen Li

    Abstract: We consider the problem of sampling from a distribution governed by a potential function. This work proposes an explicit score based MCMC method that is deterministic, resulting in a deterministic evolution for particles rather than a stochastic differential equation evolution. The score term is given in closed form by a regularized Wasserstein proximal, using a kernel convolution that is approxim…

    Submitted 2 October, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

    MSC Class: 65C05; 62G07

  36. arXiv:2308.05061  [pdf, other]

    cs.LG math.NA stat.ML

    Fine-Tune Language Models as Multi-Modal Differential Equation Solvers

    Authors: Liu Yang, Siting Liu, Stanley J. Osher

    Abstract: In the growing domain of scientific machine learning, in-context operator learning has shown notable potential in building foundation models, as in this framework the model is trained to learn operators and solve differential equations using prompted data, during the inference stage without weight updates. However, the current model's overdependence on function data overlooks the invaluable human…

    Submitted 1 February, 2024; v1 submitted 9 August, 2023; originally announced August 2023.

  37. arXiv:2306.11283  [pdf]

    physics.optics cond-mat.soft physics.app-ph physics.bio-ph

    Computational Microscopy beyond Perfect Lenses

    Authors: Xingyuan Lu, Minh Pham, Elisa Negrini, Damek Davis, Stanley J. Osher, Jianwei Miao

    Abstract: We demonstrate that in situ coherent diffractive imaging (CDI), which harnesses the coherent interference between a strong and a weak beam illuminating a static and dynamic structure, can be a very dose-efficient imaging method. At low doses, in situ CDI can achieve higher resolution than perfect lenses with the point spread function as a delta function. Both our numerical simulation and experimen…

    Submitted 3 May, 2024; v1 submitted 20 June, 2023; originally announced June 2023.

  38. arXiv:2306.06287  [pdf, other]

    math.OC math.NA

    Generalized optimal transport and mean field control problems for reaction-diffusion systems with high-order finite element computation

    Authors: Guosheng Fu, Stanley Osher, Will Pazner, Wuchen Li

    Abstract: We design and compute a class of optimal control problems for reaction-diffusion systems. They form mean field control problems related to multi-density reaction-diffusion systems. To solve proposed optimal control problems numerically, we first apply high-order finite element methods to discretize the space-time domain and then solve the optimal control problem using augmented Lagrangian methods…

    Submitted 9 June, 2023; originally announced June 2023.

    Comments: 40 pages, 12 figures

  39. arXiv:2305.03945  [pdf, other]

    math.NA

    A first-order computational algorithm for reaction-diffusion type equations via primal-dual hybrid gradient method

    Authors: Shu Liu, Siting Liu, Stanley Osher, Wuchen Li

    Abstract: We propose an easy-to-implement iterative method for resolving the implicit (or semi-implicit) schemes arising in solving reaction-diffusion (RD) type equations. We formulate the nonlinear time implicit scheme as a min-max saddle point problem and then apply the primal-dual hybrid gradient (PDHG) method. Suitable precondition matrices are applied to the PDHG method to accelerate the convergence of…

    Submitted 6 May, 2023; originally announced May 2023.

    Comments: Any feedback or comments are welcome

  40. Primal-Dual Damping algorithms for optimization

    Authors: X. Zuo, S. Osher, W. Li

    Abstract: We propose an unconstrained optimization method based on the well-known primal-dual hybrid gradient (PDHG) algorithm. We first formulate the optimality condition of the unconstrained optimization problem as a saddle point problem. We then compute the minimizer by applying generalized primal-dual hybrid gradient algorithms. Theoretically, we demonstrate the continuous-time limit of the proposed alg…

    Submitted 8 May, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

    Comments: fixed typo in eq 2.6(b)

    Journal ref: Annals of Mathematical Sciences and Applications, vol. 9, no. 2, pp. 467-504, 2024

  41. arXiv:2304.07993  [pdf, other]

    cs.LG math.NA stat.ML

    In-Context Operator Learning with Data Prompts for Differential Equation Problems

    Authors: Liu Yang, Siting Liu, Tingwei Meng, Stanley J. Osher

    Abstract: This paper introduces a new neural-network-based approach, namely In-Context Operator Networks (ICON), to simultaneously learn operators from the prompted data and apply it to new questions during the inference stage, without any weight update. Existing methods are limited to using a neural network to approximate a specific equation solution or a specific operator, requiring retraining when switch…

    Submitted 19 September, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

    Comments: The second and third authors contributed equally. This is an outdated preprint. Please refer to the updated version published in PNAS: www.pnas.org/doi/10.1073/pnas.2310142120 See code in https://github.com/LiuYangMage/in-context-operator-networks

  42. High order spatial discretization for variational time implicit schemes: Wasserstein gradient flows and reaction-diffusion systems

    Authors: Guosheng Fu, Stanley Osher, Wuchen Li

    Abstract: We design and compute first-order implicit-in-time variational schemes with high-order spatial discretization for initial value gradient flows in generalized optimal transport metric spaces. We first review some examples of gradient flows in generalized optimal transport spaces from the Onsager principle. We then use a one-step time relaxation optimization problem for time-implicit schemes, namely…

    Submitted 15 March, 2023; originally announced March 2023.

  43. High order computation of optimal transport, mean field planning, and mean field games

    Authors: Guosheng Fu, Siting Liu, Stanley Osher, Wuchen Li

    Abstract: Mean-field games (MFGs) have shown strong modeling capabilities for large systems in various fields, driving growth in computational methods for mean-field game problems. However, high order methods have not been thoroughly investigated. In this work, we explore applying general high-order numerical schemes with finite element methods in the space-time domain for computing the optimal transport (O…

    Submitted 5 February, 2023; originally announced February 2023.

    MSC Class: 65M60

  44. arXiv:2301.10301  [pdf, other]

    math.OC math.NA

    A kernel formula for regularized Wasserstein proximal operators

    Authors: Wuchen Li, Siting Liu, Stanley Osher

    Abstract: We study a class of regularized proximal operators in Wasserstein-2 space. We derive their solutions by kernel integration formulas. We obtain the Wasserstein proximal operator using a pair of forward-backward partial differential equations consisting of a continuity equation and a Hamilton-Jacobi equation with a terminal time potential function and an initial time density function. We regularize…

    Submitted 24 January, 2023; originally announced January 2023.

  45. arXiv:2301.00437  [pdf, other]

    cs.LG stat.ML

    Neural Collapse in Deep Linear Networks: From Balanced to Imbalanced Data

    Authors: Hien Dang, Tho Tran, Stanley Osher, Hung Tran-The, Nhat Ho, Tan Nguyen

    Abstract: Modern deep neural networks have achieved impressive performance on tasks from image classification to natural language processing. Surprisingly, these complex systems with massive amounts of parameters exhibit the same structural properties in their last-layer features and classifiers across canonical datasets when training until convergence. In particular, it has been observed that the last-laye…

    Submitted 18 June, 2023; v1 submitted 1 January, 2023; originally announced January 2023.

    Comments: 75 pages, 20 figures, 4 tables. Hien Dang and Tho Tran contributed equally to this work

  46. arXiv:2211.16757  [pdf, other]

    math.OC cs.LG

    Taming Hyperparameter Tuning in Continuous Normalizing Flows Using the JKO Scheme

    Authors: Alexander Vidal, Samy Wu Fung, Luis Tenorio, Stanley Osher, Levon Nurbekyan

    Abstract: A normalizing flow (NF) is a mapping that transforms a chosen probability distribution to a normal distribution. Such flows are a common technique used for data generation and density estimation in machine learning and data science. The density estimate obtained with a NF requires a change of variables formula that involves the computation of the Jacobian determinant of the NF transformation. In o…

    Submitted 30 November, 2022; originally announced November 2022.

  47. arXiv:2211.15779  [pdf, other]

    cs.LG stat.ML

    Revisiting Over-smoothing and Over-squashing Using Ollivier-Ricci Curvature

    Authors: Khang Nguyen, Hieu Nong, Vinh Nguyen, Nhat Ho, Stanley Osher, Tan Nguyen

    Abstract: Graph Neural Networks (GNNs) had been demonstrated to be inherently susceptible to the problems of over-smoothing and over-squashing. These issues prohibit the ability of GNNs to model complex graph interactions by limiting their effectiveness in taking into account distant information. Our study reveals the key connection between the local graph geometry and the occurrence of both of these issues…

    Submitted 31 May, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

    Comments: Accepted at ICML 2023; 24 pages, 4 figures

  48. A Hamilton-Jacobi-based Proximal Operator

    Authors: Stanley Osher, Howard Heaton, Samy Wu Fung

    Abstract: First-order optimization algorithms are widely used today. Two standard building blocks in these algorithms are proximal operators (proximals) and gradients. Although gradients can be computed for a wide array of functions, explicit proximal formulas are only known for limited classes of functions. We provide an algorithm, HJ-Prox, for accurately approximating such proximals. This is derived from…

    Submitted 28 May, 2023; v1 submitted 23 November, 2022; originally announced November 2022.

  49. arXiv:2209.15092  [pdf, other]

    cs.LG stat.ML

    Improving Generative Flow Networks with Path Regularization

    Authors: Anh Do, Duy Dinh, Tan Nguyen, Khuong Nguyen, Stanley Osher, Nhat Ho

    Abstract: Generative Flow Networks (GFlowNets) are recently proposed models for learning stochastic policies that generate compositional objects by sequences of actions with the probability proportional to a given reward function. The central problem of GFlowNets is to improve their exploration and generalization. In this work, we propose a novel path regularization method based on optimal transport theory…

    Submitted 29 September, 2022; originally announced September 2022.

    Comments: 28 pages, 2 figures, 5 tables. Anh Do, Duy Dinh, and Tan Nguyen contributed equally to this work

  50. arXiv:2208.00579  [pdf, other]

    cs.LG math.NA

    Momentum Transformer: Closing the Performance Gap Between Self-attention and Its Linearization

    Authors: Tan Nguyen, Richard G. Baraniuk, Robert M. Kirby, Stanley J. Osher, Bao Wang

    Abstract: Transformers have achieved remarkable success in sequence modeling and beyond but suffer from quadratic computational and memory complexities with respect to the length of the input sequence. Leveraging techniques include sparse and linear attention and hashing tricks; efficient transformers have been proposed to reduce the quadratic complexity of transformers but significantly degrade the accurac…

    Submitted 31 July, 2022; originally announced August 2022.

    Comments: 22 pages, 5 figures. arXiv admin note: substantial text overlap with arXiv:2110.07034

    MSC Class: 65Pxx