-
Sinkhorn Algorithm for Sequentially Composed Optimal Transports
Authors:
Kazuki Watanabe,
Noboru Isobe
Abstract:
The Sinkhorn algorithm is the de facto standard approximation algorithm for optimal transport and has been applied in a variety of areas, including image processing and natural language processing. In theory, the proof of its convergence follows from the convergence of the Sinkhorn--Knopp algorithm for the matrix scaling problem, and Altschuler et al. show that its worst-case time complexity is near-linear. Very recently, sequentially composed optimal transports were proposed by Watanabe and Isobe as a hierarchical extension of optimal transports. In this paper, we present an efficient approximation algorithm for their entropic regularization, namely the Sinkhorn algorithm for sequentially composed optimal transports. Furthermore, we present a theoretical analysis of the Sinkhorn algorithm, namely (i) its exponential convergence to the optimal solution with respect to the Hilbert pseudometric, and (ii) a worst-case complexity analysis for the case of one sequential composition.
Submitted 12 January, 2025; v1 submitted 4 December, 2024;
originally announced December 2024.
-
Last Iterate Convergence in Monotone Mean Field Games
Authors:
Noboru Isobe,
Kenshi Abe,
Kaito Ariu
Abstract:
Mean Field Games (MFGs) are a framework for modeling and approximating the behavior of large numbers of agents. Computing equilibria in MFGs has been of interest in multi-agent reinforcement learning. Theoretical guarantees that the last updated policy converges to an equilibrium have been limited. We propose the use of a simple, proximal-point (PP) type method to compute equilibria for MFGs. We then provide the first last-iterate convergence (LIC) guarantee under the Lasry--Lions-type monotonicity condition. We also propose an approximation of the update rule of PP ($\mathtt{APP}$) based on the observation that it is equivalent to solving the regularized MFG, which can be solved by mirror descent. We further establish that the regularized mirror descent achieves LIC at an exponential rate. Our numerical experiment demonstrates that $\mathtt{APP}$ efficiently computes the equilibrium.
Submitted 31 January, 2025; v1 submitted 7 October, 2024;
originally announced October 2024.
-
String Diagram of Optimal Transports
Authors:
Kazuki Watanabe,
Noboru Isobe
Abstract:
We present a novel hierarchical framework for optimal transport (OT) using string diagrams, namely string diagrams of optimal transports. This framework reduces complex hierarchical OT problems to standard OT problems, allowing efficient synthesis of optimal hierarchical transportation plans. Our approach uses algebraic compositions of cost matrices to effectively model hierarchical structures. We also study an adversarial situation with multiple choices in the cost matrices, where we present a polynomial-time algorithm for a relaxation of the problem. Experimental results confirm the efficiency and performance advantages of our proposed algorithm over the naive method.
Submitted 24 January, 2025; v1 submitted 16 August, 2024;
originally announced August 2024.
-
Flow matching achieves almost minimax optimal convergence
Authors:
Kenji Fukumizu,
Taiji Suzuki,
Noboru Isobe,
Kazusato Oko,
Masanori Koyama
Abstract:
Flow matching (FM) has gained significant attention as a simulation-free generative model. Unlike diffusion models, which are based on stochastic differential equations, FM employs a simpler approach by solving an ordinary differential equation with an initial condition from a normal distribution, thus streamlining the sample generation process. This paper discusses the convergence properties of FM for large sample sizes under the $p$-Wasserstein distance, a measure of distributional discrepancy. We establish that FM can achieve an almost minimax optimal convergence rate for $1 \leq p \leq 2$, presenting the first theoretical evidence that FM can reach convergence rates comparable to those of diffusion models. Our analysis extends existing frameworks by examining a broader class of mean and variance functions for the vector fields and identifies specific conditions necessary to attain almost optimal rates.
Submitted 10 October, 2024; v1 submitted 31 May, 2024;
originally announced May 2024.
-
Extended Flow Matching: a Method of Conditional Generation with Generalized Continuity Equation
Authors:
Noboru Isobe,
Masanori Koyama,
Jinzhe Zhang,
Kohei Hayashi,
Kenji Fukumizu
Abstract:
The task of conditional generation is one of the most important applications of generative models, and numerous methods have been developed to date based on the celebrated flow-based models. However, many flow-based models in use today are not built to allow one to introduce an explicit inductive bias to how the conditional distribution to be generated changes with respect to conditions. This can result in unexpected behavior in the task of style transfer, for example. In this research, we introduce extended flow matching (EFM), a direct extension of flow matching that learns a "matrix field" corresponding to the continuous map from the space of conditions to the space of distributions. We show that we can introduce inductive bias to the conditional generation through the matrix field and demonstrate this fact with MMOT-EFM, a version of EFM that aims to minimize the Dirichlet energy or the sensitivity of the distribution with respect to conditions. We will present our theory along with experimental results that support the competitiveness of EFM in conditional generation.
Submitted 5 July, 2024; v1 submitted 28 February, 2024;
originally announced February 2024.
-
A convergence result of a continuous model of deep learning via Łojasiewicz--Simon inequality
Authors:
Noboru Isobe
Abstract:
This study focuses on a Wasserstein-type gradient flow, which represents an optimization process of a continuous model of a Deep Neural Network (DNN). First, we establish the existence of a minimizer for an average loss of the model under $L^2$-regularization. Subsequently, we show the existence of a curve of maximal slope of the loss. Our main result is the convergence of the flow to a critical point of the loss as time goes to infinity. An essential step in proving this result is establishing the Łojasiewicz--Simon gradient inequality for the loss. We derive this inequality by assuming the analyticity of NNs and loss functions. Our proofs offer a new approach for analyzing the asymptotic behavior of Wasserstein-type gradient flows for nonconvex functionals.
Submitted 14 April, 2024; v1 submitted 26 November, 2023;
originally announced November 2023.
-
Variational formulations of ODE-Net as a mean-field optimal control problem and existence results
Authors:
Noboru Isobe,
Mizuho Okumura
Abstract:
This paper presents a mathematical analysis of ODE-Net, a continuum model of deep neural networks (DNNs). In recent years, machine learning researchers have introduced ideas of replacing the deep structure of DNNs with ODEs as a continuum limit. These studies regard the "learning" of ODE-Net as the minimization of a "loss" constrained by a parametric ODE. Although the existence of a minimizer for this minimization problem needs to be assumed, only a few studies have investigated its existence analytically in detail. In the present paper, the existence of a minimizer is discussed based on a formulation of ODE-Net as a measure-theoretic mean-field optimal control problem. First, the existence result is proved when a neural network, which describes a vector field of ODE-Net, is linear with respect to learnable parameters. The proof employs the measure-theoretic formulation combined with the direct method of Calculus of Variations. Second, an idealized minimization problem is proposed to remove the above linearity assumption. Such a problem is inspired by a kinetic regularization associated with the Benamou--Brenier formula and universal approximation theorems for neural networks. The proofs of these existence results use variational methods, differential equations, and mean-field optimal control theory. They offer a new analytic way to investigate the learning process of deep neural networks.
Submitted 20 October, 2024; v1 submitted 8 March, 2023;
originally announced March 2023.