Search | arXiv e-print repository

Inexact JKO and proximal-gradient algorithms in the Wasserstein space

Authors: Simone Di Marino, Emanuele Naldi, Silvia Villa

Abstract: This paper studies the convergence properties of the inexact Jordan-Kinderlehrer-Otto (JKO) scheme and proximal-gradient algorithm in the context of Wasserstein spaces. The JKO scheme, a widely-used method for approximating solutions to gradient flows in Wasserstein spaces, typically assumes exact solutions to iterative minimization problems. However, practical applications often require approximate solutions due to computational limitations. This work focuses on the convergence of the scheme to minimizers for the underlying functional and addresses these challenges by analyzing two types of inexactness: errors in Wasserstein distance and errors in energy functional evaluations. The paper provides rigorous convergence guarantees under controlled error conditions, demonstrating that weak convergence can still be achieved with inexact steps. The analysis is further extended to proximal-gradient algorithms, showing that convergence is preserved under inexact evaluations.

Submitted 29 May, 2025; originally announced May 2025.

arXiv:2504.10999 [pdf, other]

Splitting the Forward-Backward Algorithm: A Full Characterization

Authors: Anton Åkerman, Enis Chenchene, Pontus Giselsson, Emanuele Naldi

Abstract: We study frugal splitting algorithms with minimal lifting for solving monotone inclusion problems involving sums of maximal monotone and cocoercive operators. Building on a foundational result by Ryu, we fully characterize all methods that use only individual resolvent evaluations, direct evaluations of cocoercive operators, and minimal memory resources while ensuring convergence via averaged fixed-point iterations. We show that all such methods are captured by a unified framework, which includes known schemes and enables new ones with promising features. Systematic numerical experiments lead us to propose three design heuristics to achieve excellent performances in practice, yielding significant gains over existing methods.

Submitted 15 April, 2025; originally announced April 2025.

MSC Class: 47N10; 47H05; 47H09; 65K10; 90C25

arXiv:2503.24141 [pdf, other]

The Influence of an Adjoint Mismatch on the Primal-Dual Douglas-Rachford Method

Authors: Emanuele Naldi, Felix Schneppe

Abstract: The primal-dual Douglas-Rachford method is a well-known algorithm to solve optimization problems written as convex-concave saddle-point problems. Each iteration involves solving a linear system involving a linear operator and its adjoint. However, in practical applications it is often computationally favorable to replace the adjoint operator by a computationally more efficient approximation. This leads to an adjoint mismatch. In this paper, we analyze the convergence of the primal-dual Douglas-Rachford method under the presence of an adjoint mismatch. We provide mild conditions that guarantee the existence of a fixed point and find an upper bound on the error of the primal solution. Furthermore, we establish step sizes in the strongly convex setting that guarantee linear convergence under mild conditions. Additionally, we provide an alternative method that can also be derived from the Douglas-Rachford method and is also guaranteed to converge in this setting. Moreover, we illustrate our results both for an academic and a real-world inspired example.

Submitted 31 March, 2025; originally announced March 2025.

Comments: 37 pages, 10 figures

MSC Class: 49M29; 90C25; 65K10

arXiv:2410.14591 [pdf, ps, other]

A Lipschitz spaces view of infinitely wide shallow neural networks

Authors: Francesca Bartolucci, Marcello Carioni, José A. Iglesias, Yury Korolev, Emanuele Naldi, Stefano Vigogna

Abstract: We revisit the mean field parametrization of shallow neural networks, using signed measures on unbounded parameter spaces and duality pairings that take into account the regularity and growth of activation functions. This setting directly leads to the use of unbalanced Kantorovich-Rubinstein norms defined by duality with Lipschitz functions, and of spaces of measures dual to those of continuous functions with controlled growth. These allow us to make transparent the need for total variation and moment bounds or penalization to obtain existence of minimizers of variational formulations, under which we prove a compactness result in strong Kantorovich-Rubinstein norm, and in the absence of which we show several examples demonstrating undesirable behavior. Further, the Kantorovich-Rubinstein setting enables us to combine the advantages of a completely linear parametrization and ensuing reproducing kernel Banach space framework with optimal transport insights. We showcase this synergy with representer theorems and uniform large data limits for empirical risk minimization, and in proposed formulations for distillation and fusion applications.

Submitted 18 October, 2024; originally announced October 2024.

Comments: 39 pages, 1 table

MSC Class: 68T07; 46E27; 46B20

arXiv:2407.14156 [pdf, other]

Learning Firmly Nonexpansive Operators

Authors: Kristian Bredies, Jonathan Chirinos-Rodriguez, Emanuele Naldi

Abstract: This paper proposes a data-driven approach for constructing firmly nonexpansive operators. We demonstrate its applicability in Plug-and-Play methods, where classical algorithms such as forward-backward splitting, Chambolle-Pock primal-dual iteration, Douglas-Rachford iteration or alternating directions method of multipliers (ADMM) are modified by replacing one proximal map by a learned firmly nonexpansive operator. We provide sound mathematical background to the problem of learning such an operator via expected and empirical risk minimization. We prove that, as the number of training points increases, the empirical risk minimization problem converges (in the sense of Gamma-convergence) to the expected risk minimization problem. Further, we derive a solution strategy that ensures firmly nonexpansive and piecewise affine operators within the convex envelope of the training set. We show that this operator converges to the best empirical solution as the number of points in the envelope increases in an appropriate sense. Finally, the experimental section details practical implementations of the method and presents an application in image denoising.

Submitted 19 July, 2024; originally announced July 2024.

MSC Class: 65J20; 65K10; 46N10; 52A05

arXiv:2407.05893 [pdf, other]

A general framework for inexact splitting algorithms with relative errors and applications to Chambolle-Pock and Davis-Yin methods

Authors: M. Marques Alves, Dirk A. Lorenz, Emanuele Naldi

Abstract: In this work we apply the recently introduced framework of degenerate preconditioned proximal point algorithms to the hybrid proximal extragradient (HPE) method for maximal monotone inclusions. The latter is a method that allows inexact proximal (or resolvent) steps where the error is controlled by a relative-error criterion. Recently the HPE framework has been extended to the Douglas-Rachford method by Eckstein and Yao. In this paper we further extend the applicability of the HPE framework to splitting methods. To this end we use the framework of degenerate preconditioners, which allows one to write a large class of splitting methods as preconditioned proximal point algorithms. In this way, we modify many splitting methods such that one or more of the resolvents can be computed inexactly with an error that is controlled by an adaptive criterion. Further, we illustrate the algorithmic framework in the case of Chambolle-Pock's primal-dual hybrid gradient method and Davis-Yin's forward Douglas-Rachford method. In both cases, the inexact computation of the resolvent shows clear advantages in computing time and accuracy.

Submitted 1 April, 2025; v1 submitted 8 July, 2024; originally announced July 2024.

Comments: 34 pages, 6 figures

arXiv:2304.14039 [pdf, other]

On extreme points and representer theorems for the Lipschitz unit ball on finite metric spaces

Authors: Kristian Bredies, Jonathan Chirinos Rodriguez, Emanuele Naldi

Abstract: In this note, we provide a characterization for the set of extreme points of the Lipschitz unit ball in a specific vectorial setting. While the analysis of the case of real-valued functions is covered extensively in the literature, no information about the vectorial case has been provided to date. Here, we aim at partially filling this gap by considering functions mapping from a finite metric space to a strictly convex Banach space that satisfy the Lipschitz condition. As a consequence, we present a representer theorem for such functions. In this setting, the number of extreme points needed to express any point inside the ball is independent of the dimension, improving the classical result from Carathéodory.

Submitted 27 April, 2023; originally announced April 2023.

Comments: 6 pages

MSC Class: 46A55 (Primary) 46N10; 52A05 (Secondary)

arXiv:2302.13128 [pdf, other]

The Degenerate Variable Metric Proximal Point Algorithm and Adaptive Stepsizes for Primal-Dual Douglas-Rachford

Authors: Dirk A. Lorenz, Jannis Marquardt, Emanuele Naldi

Abstract: In this paper the degenerate preconditioned proximal point algorithm will be combined with the idea of varying preconditioners, leading to the degenerate variable metric proximal point algorithm. The weak convergence of the resulting iteration will be proven. From the perspective of the degenerate variable metric proximal point algorithm, a version of the primal-dual Douglas-Rachford method with varying preconditioners will be derived, and a proof of its weak convergence, based on the previous results for the proximal point algorithm, is also provided. After that, we derive a heuristic on how to choose those varying preconditioners in order to increase the convergence speed of the method.

Submitted 25 February, 2023; originally announced February 2023.

MSC Class: 47H05; 65K05; 90C25

arXiv:2211.04782 [pdf, other]

Graph and distributed extensions of the Douglas-Rachford method

Authors: Kristian Bredies, Enis Chenchene, Emanuele Naldi

Abstract: In this paper, we propose several graph-based extensions of the Douglas-Rachford splitting (DRS) method to solve monotone inclusion problems involving the sum of $N$ maximal monotone operators. Our construction is based on a two-layer architecture that we refer to as bilevel graphs, to which we associate a generalization of the DRS algorithm that presents the prescribed structure. The resulting schemes can be understood as unconditionally stable frugal resolvent splitting methods with a minimal lifting in the sense of Ryu [Math Program 182(1):233-273, 2020], as well as instances of the (degenerate) Preconditioned Proximal Point method, which provides robust convergence guarantees. We further describe how the graph-based extensions of the DRS method can be leveraged to design new fully distributed protocols. Applications to a congested optimal transport problem and to distributed Support Vector Machines show interesting connections with the underlying graph topology and highly competitive performances with state-of-the-art distributed optimization approaches.

Submitted 9 November, 2022; originally announced November 2022.

Comments: 23 pages, 4 figures

arXiv:2109.11481 [pdf, other]

Degenerate Preconditioned Proximal Point algorithms

Authors: Kristian Bredies, Enis Chenchene, Dirk A. Lorenz, Emanuele Naldi

Abstract: In this paper we describe a systematic procedure to analyze the convergence of degenerate preconditioned proximal point algorithms. We establish weak convergence results under mild assumptions that can be easily employed in the context of splitting methods for monotone inclusion and convex minimization problems. Moreover, we show that the degeneracy of the preconditioner allows for a reduction of the variables involved in the iteration updates. We show the strength of the proposed framework in the context of splitting algorithms, providing new simplified proofs of convergence and highlighting the link between existing schemes, such as Chambolle-Pock, Forward Douglas-Rachford and Peaceman-Rachford, that we study from a preconditioned proximal point perspective. The proposed framework allows one to devise new flexible schemes and provides new ways to generalize existing splitting schemes to the case of the sum of many terms. As an example, we present a new sequential generalization of Forward Douglas-Rachford along with numerical experiments that demonstrate its interest in the context of nonsmooth convex optimization.

Submitted 23 September, 2021; originally announced September 2021.

MSC Class: 47H05; 47H09; 47N10; 90C25

arXiv:2104.06121 [pdf, other]

Weak topology and Opial property in Wasserstein spaces, with applications to Gradient Flows and Proximal Point Algorithms of geodesically convex functionals

Authors: Emanuele Naldi, Giuseppe Savaré

Abstract: In this paper we discuss how to define an appropriate notion of weak topology in the Wasserstein space $(\mathcal{P}_2(H),W_2)$ of Borel probability measures with finite quadratic moment on a separable Hilbert space $H$. We will show that such a topology inherits many features of the usual weak topology in Hilbert spaces, in particular the weak closedness of geodesically convex closed sets and the Opial property characterizing weakly convergent sequences. We apply this notion to the approximation of fixed points for a non-expansive map in a weakly closed subset of $\mathcal{P}_2(H)$ and of minimizers of a lower semicontinuous and geodesically convex functional $φ:\mathcal{P}_2(H)\to(-\infty,+\infty]$ attaining its minimum. In particular, we will show that every solution to the Wasserstein gradient flow of $φ$ weakly converges to a minimizer of $φ$ as the time goes to $+\infty$. Similarly, if $φ$ is also convex along generalized geodesics, every sequence generated by the proximal point algorithm converges to a minimizer of $φ$ with respect to the weak topology of $\mathcal{P}_2(H)$.

Submitted 13 April, 2021; originally announced April 2021.

Comments: $\textit{Dedicated to the memory of Claudio Baiocchi, outstanding mathematician and beloved mentor}$

Showing 1–11 of 11 results for author: Naldi, E