-
Nonlinear Distributionally Robust Optimization
Authors:
Mohammed Rayyan Sheriff,
Peyman Mohajerin Esfahani
Abstract:
This article focuses on a class of distributionally robust optimization (DRO) problems where, unlike the growing body of the literature, the objective function is potentially nonlinear in the distribution. Existing methods to optimize nonlinear functions in probability space use the Frechet derivatives, which present theoretical and computational challenges. Motivated by this, we propose an altern…
▽ More
This article focuses on a class of distributionally robust optimization (DRO) problems where, unlike the growing body of the literature, the objective function is potentially nonlinear in the distribution. Existing methods to optimize nonlinear functions in probability space use the Frechet derivatives, which present theoretical and computational challenges. Motivated by this, we propose an alternative notion for the derivative and corresponding smoothness based on Gateaux (G)-derivative for generic risk measures. These concepts are explained via three running risk measure examples of variance, entropic risk, and risk on finite support sets. We then propose a G-derivative-based Frank-Wolfe (FW) algorithm for generic nonlinear optimization problems in probability spaces and establish its convergence under the proposed notion of smoothness in a completely norm-independent manner. We use the set-up of the FW algorithm to devise a methodology to compute a saddle point of the nonlinear DRO problem. Finally, we validate our theoretical results on two cases of the $entropic$ and $variance$ risk measures in the context of portfolio selection problems. In particular, we analyze their regularity conditions and "sufficient statistic", compute the respective FW-oracle in various settings, and confirm the theoretical outcomes through numerical validation.
△ Less
Submitted 5 November, 2024; v1 submitted 5 June, 2023;
originally announced June 2023.
-
Fast Algorithm for Constrained Linear Inverse Problems
Authors:
Mohammed Rayyan Sheriff,
Floor Fenne Redel,
Peyman Mohajerin Esfahani
Abstract:
We consider the constrained Linear Inverse Problem (LIP), where a certain atomic norm (like the $\ell_1 $ norm) is minimized subject to a quadratic constraint. Typically, such cost functions are non-differentiable which makes them not amenable to the fast optimization methods existing in practice. We propose two equivalent reformulations of the constrained LIP with improved convex regularity: (i)…
▽ More
We consider the constrained Linear Inverse Problem (LIP), where a certain atomic norm (like the $\ell_1 $ norm) is minimized subject to a quadratic constraint. Typically, such cost functions are non-differentiable which makes them not amenable to the fast optimization methods existing in practice. We propose two equivalent reformulations of the constrained LIP with improved convex regularity: (i) a smooth convex minimization problem, and (ii) a strongly convex min-max problem. These problems could be solved by applying existing acceleration-based convex optimization methods which provide better $ O \left( \frac{1}{k^2} \right) $ theoretical convergence guarantee, improving upon the current best rate of $ O \left( \frac{1}{k} \right) $. We also provide a novel algorithm named the Fast Linear Inverse Problem Solver (FLIPS), which is tailored to maximally exploit the structure of the reformulations. We demonstrate the performance of FLIPS on the classical problems of Binary Selection, Compressed Sensing, and Image Denoising. We also provide open source \texttt{MATLAB} package for these three examples, which can be easily adapted to other LIPs.
△ Less
Submitted 24 January, 2024; v1 submitted 2 December, 2022;
originally announced December 2022.
-
Novel min-max reformulations of Linear Inverse Problems
Authors:
Mohammed Rayyan Sheriff,
Debasish Chatterjee
Abstract:
In this article, we dwell into the class of so-called ill-posed Linear Inverse Problems (LIP) which simply refers to the task of recovering the entire signal from its relatively few random linear measurements. Such problems arise in a variety of settings with applications ranging from medical image processing, recommender systems, etc. We propose a slightly generalized version of the error constra…
▽ More
In this article, we dwell into the class of so-called ill-posed Linear Inverse Problems (LIP) which simply refers to the task of recovering the entire signal from its relatively few random linear measurements. Such problems arise in a variety of settings with applications ranging from medical image processing, recommender systems, etc. We propose a slightly generalized version of the error constrained linear inverse problem and obtain a novel and equivalent convex-concave min-max reformulation by providing an exposition to its convex geometry. Saddle points of the min-max problem are completely characterized in terms of a solution to the LIP, and vice versa. Applying simple saddle point seeking ascend-descent type algorithms to solve the min-max problems provides novel and simple algorithms to find a solution to the LIP. Moreover, the reformulation of an LIP as the min-max problem provided in this article is crucial in developing methods to solve the dictionary learning problem with almost sure recovery constraints.
△ Less
Submitted 5 July, 2020;
originally announced July 2020.
-
Dictionary Learning with Almost Sure Error Constraints
Authors:
Mohammed Rayyan Sheriff,
Debasish Chatterjee
Abstract:
A dictionary is a database of standard vectors, so that other vectors / signals are expressed as linear combinations of dictionary vectors, and the task of learning a dictionary for a given data is to find a good dictionary so that the representation of data points has desirable features. Dictionary learning and the related matrix factorization methods have gained significant prominence recently d…
▽ More
A dictionary is a database of standard vectors, so that other vectors / signals are expressed as linear combinations of dictionary vectors, and the task of learning a dictionary for a given data is to find a good dictionary so that the representation of data points has desirable features. Dictionary learning and the related matrix factorization methods have gained significant prominence recently due to their applications in Wide variety of fields like machine learning, signal processing, statistics etc. In this article we study the dictionary learning problem for achieving desirable features in the representation of a given data with almost sure recovery constraints. We impose the constraint that every sample is reconstructed properly to within a predefined threshold. This problem formulation is more challenging than the conventional dictionary learning, which is done by minimizing a regularised cost function. We make use of the duality results for linear inverse problems to obtain an equivalent reformulation in the form of a convex-concave min-max problem. The resulting min-max problem is then solved using gradient descent-ascent like algorithms.
△ Less
Submitted 7 July, 2020; v1 submitted 19 October, 2019;
originally announced October 2019.
-
On Convex Duality in Linear Inverse Problems
Authors:
Mohammed Rayyan Sheriff,
Debasish Chatterjee
Abstract:
In this article we dwell into the class of so called ill posed Linear Inverse Problems (LIP) in machine learning, which has become almost a classic in recent times. The fundamental task in an LIP is to recover the entire signal / data from its relatively few random linear measurements. Such problems arise in variety of settings with applications ranging from medical image processing, recommender s…
▽ More
In this article we dwell into the class of so called ill posed Linear Inverse Problems (LIP) in machine learning, which has become almost a classic in recent times. The fundamental task in an LIP is to recover the entire signal / data from its relatively few random linear measurements. Such problems arise in variety of settings with applications ranging from medical image processing, recommender systems etc. We provide an exposition to the convex duality of the linear inverse problems, and obtain a novel and equivalent convex-concave min-max reformulation that gives rise to simple ascend-descent type algorithms to solve an LIP. Moreover, such a reformulation is crucial in developing methods to solve the dictionary learning problem with almost sure recovery constraints.
△ Less
Submitted 8 January, 2020; v1 submitted 16 August, 2019;
originally announced August 2019.
-
A complete characterization of optimal dictionaries for least squares representation
Authors:
Mohammed Rayyan Sheriff,
Debasish Chatterjee
Abstract:
Dictionaries are collections of vectors used for representations of elements in Euclidean spaces. While recent research on optimal dictionaries is focussed on providing sparse (i.e., $\ell_0$-optimal,) representations, here we consider the problem of finding optimal dictionaries such that representations of samples of a random vector are optimal in an $\ell_2$-sense. For us, optimality of represen…
▽ More
Dictionaries are collections of vectors used for representations of elements in Euclidean spaces. While recent research on optimal dictionaries is focussed on providing sparse (i.e., $\ell_0$-optimal,) representations, here we consider the problem of finding optimal dictionaries such that representations of samples of a random vector are optimal in an $\ell_2$-sense. For us, optimality of representation is equivalent to minimization of the average $\ell_2$-norm of the coefficients used to represent the random vector, with the lengths of the dictionary vectors being specified a priori. With the help of recent results on rank-$1$ decompositions of symmetric positive semidefinite matrices and the theory of majorization, we provide a complete characterization of $\ell_2$-optimal dictionaries. Our results are accompanied by polynomial time algorithms that construct $\ell_2$-optimal dictionaries from given data.
△ Less
Submitted 18 October, 2017;
originally announced October 2017.
-
Optimal dictionary for least squares representation
Authors:
Mohammed Rayyan Sheriff,
Debasish Chatterjee
Abstract:
Dictionaries are collections of vectors used for representations of random vectors in Euclidean spaces. Recent research on optimal dictionaries is focused on constructing dictionaries that offer sparse representations, i.e., $\ell_0$-optimal representations. Here we consider the problem of finding optimal dictionaries with which representations of samples of a random vector are optimal in an…
▽ More
Dictionaries are collections of vectors used for representations of random vectors in Euclidean spaces. Recent research on optimal dictionaries is focused on constructing dictionaries that offer sparse representations, i.e., $\ell_0$-optimal representations. Here we consider the problem of finding optimal dictionaries with which representations of samples of a random vector are optimal in an $\ell_2$-sense: optimality of representation is defined as attaining the minimal average $\ell_2$-norm of the coefficients used to represent the random vector. With the help of recent results on rank-$1$ decompositions of symmetric positive semidefinite matrices, we provide an explicit description of $\ell_2$-optimal dictionaries as well as their algorithmic constructions in polynomial time.
△ Less
Submitted 1 May, 2017; v1 submitted 7 March, 2016;
originally announced March 2016.