-
Infinity Search: Approximate Vector Search with Projections on q-Metric Spaces
Authors:
Antonio Pariente,
Ignacio Hounie,
Santiago Segarra,
Alejandro Ribeiro
Abstract:
Despite the ubiquity of vector search applications, prevailing search algorithms overlook the metric structure of vector embeddings, treating it as a constraint rather than exploiting its underlying properties. In this paper, we demonstrate that in $q$-metric spaces, metric trees can leverage a stronger version of the triangle inequality to reduce comparisons for exact search. Notably, as $q$ approaches infinity, the search complexity becomes logarithmic. Therefore, we propose a novel projection method that embeds vector datasets with arbitrary dissimilarity measures into $q$-metric spaces while preserving the nearest neighbor. We propose to learn an approximation of this projection to efficiently transform query points to a space where Euclidean distances satisfy the desired properties. Our experimental results with text and image vector embeddings show that learning $q$-metric approximations enables classic metric tree algorithms -- which typically underperform with high-dimensional data -- to achieve competitive performance against state-of-the-art search methods.
Submitted 6 June, 2025;
originally announced June 2025.
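For reference, the stronger triangle inequality mentioned in the abstract can be written out as below. This is a sketch assuming the usual definition of a $q$-metric; the symbols $d$, $x$, $y$, $z$ are illustrative and do not appear in the listing.

```latex
% q-triangle inequality for a dissimilarity d (assumed standard definition);
% q = 1 recovers the ordinary triangle inequality
d(x,z)^q \le d(x,y)^q + d(y,z)^q \quad \text{for all } x, y, z,
% taking q-th roots and letting q \to \infty gives the ultrametric (strong) inequality
d(x,z) \le \max\{\, d(x,y),\ d(y,z) \,\}.
```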
-
Alignment of large language models with constrained learning
Authors:
Botong Zhang,
Shuo Li,
Ignacio Hounie,
Osbert Bastani,
Dongsheng Ding,
Alejandro Ribeiro
Abstract:
We study the problem of computing an optimal large language model (LLM) policy for a constrained alignment problem, where the goal is to maximize a primary reward objective while satisfying constraints on secondary utilities. Despite the popularity of Lagrangian-based LLM policy search in constrained alignment, iterative primal-dual methods often fail to converge, and non-iterative dual-based methods do not achieve optimality in the LLM parameter space. To address these challenges, we employ Lagrangian duality to develop an iterative dual-based alignment method that alternates between updating the LLM policy via Lagrangian maximization and updating the dual variable via dual descent. In theory, we characterize the primal-dual gap between the primal value in the distribution space and the dual value in the LLM parameter space. We further quantify the optimality gap of the learned LLM policies at near-optimal dual variables with respect to both the objective and the constraint functions. These results prove that dual-based alignment methods can find an optimal constrained LLM policy, up to an LLM parametrization gap. We demonstrate the effectiveness and merits of our approach through extensive experiments conducted on the PKU-SafeRLHF dataset.
Submitted 25 May, 2025;
originally announced May 2025.
-
Efficient Optimization Algorithms for Linear Adversarial Training
Authors:
Antônio H. Ribeiro,
Thomas B. Schön,
Dave Zachariah,
Francis Bach
Abstract:
Adversarial training can be used to learn models that are robust against perturbations. For linear models, it can be formulated as a convex optimization problem. Compared to methods proposed in the context of deep learning, leveraging the optimization structure allows significantly faster convergence rates. Still, the use of generic convex solvers can be inefficient for large-scale problems. Here, we propose tailored optimization algorithms for the adversarial training of linear models, which render large-scale regression and classification problems more tractable. For regression problems, we propose a family of solvers based on iterative ridge regression and, for classification, a family of solvers based on projected gradient descent. The methods are based on extended variable reformulations of the original problem. We illustrate their efficiency in numerical examples.
Submitted 19 March, 2025; v1 submitted 16 October, 2024;
originally announced October 2024.
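The convexity claim for linear models can be illustrated with a standard identity: for $\ell_\infty$-bounded input perturbations of radius `delta`, the worst-case squared error of a linear predictor equals $(|y - x^\top\beta| + \delta\,\|\beta\|_1)^2$. The sketch below checks that identity numerically on one sample. It is not the solvers proposed in the paper, and the function names and sample values are hypothetical.

```python
import itertools

import numpy as np


def adversarial_sq_loss(x, y, beta, delta):
    """Closed form of max over ||dx||_inf <= delta of (y - (x + dx) @ beta)**2."""
    return (abs(y - x @ beta) + delta * np.linalg.norm(beta, 1)) ** 2


def brute_force_sq_loss(x, y, beta, delta):
    """Enumerate the corners of the l_inf ball.

    The loss is convex in dx, so its maximum over the ball is attained at a corner.
    """
    best = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=len(x)):
        dx = delta * np.array(signs)
        best = max(best, (y - (x + dx) @ beta) ** 2)
    return best


# Hypothetical sample and parameter vector, chosen only for illustration.
x = np.array([1.0, -2.0, 0.5])
beta = np.array([0.3, -0.7, 1.1])
y = 0.4
delta = 0.1

closed = adversarial_sq_loss(x, y, beta, delta)
brute = brute_force_sq_loss(x, y, beta, delta)
print(closed, brute)
```

Because the closed form is a convex function of `beta`, summing it over training samples gives a convex training objective, which is the structure that tailored solvers can exploit.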
-
Fejér* monotonicity in optimization algorithms
Authors:
Roger Behling,
Yunier Bello-Cruz,
Alfredo Noel Iusem,
Ademir Alves Ribeiro,
Luiz-Rafael Santos
Abstract:
Fejér monotonicity is a well-established property commonly observed in sequences generated by optimization algorithms. In this paper, we introduce an extension of this property, called Fejér* monotonicity, which was initially proposed in [SIAM J. Optim., 34(3), 2535-2556 (2024)]. We discuss and build upon the concept by exploring its behavior within Hilbert spaces, presenting an illustrative example and insightful results regarding weak and strong convergence. We also compare Fejér* monotonicity with other weak notions of Fejér-like monotonicity, to better establish the role of Fejér* monotonicity in optimization algorithms.
Submitted 10 October, 2024;
originally announced October 2024.
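For orientation, the classical property that the abstract extends can be stated as follows. Only standard Fejér monotonicity is written out; the Fejér* variant is defined in the cited paper and is not reproduced here. The symbols $x_k$, $S$, and $s$ are illustrative.

```latex
% A sequence (x_k) in a Hilbert space is Fejér monotone with respect to
% a nonempty set S if no iterate moves farther from any point of S:
\| x_{k+1} - s \| \le \| x_k - s \| \quad \text{for all } s \in S \text{ and all } k \ge 0.
```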
-
Constrained Diffusion Models via Dual Training
Authors:
Shervin Khalafi,
Dongsheng Ding,
Alejandro Ribeiro
Abstract:
Diffusion models have attained prominence for their ability to synthesize a probability distribution for a given dataset via a diffusion process, enabling the generation of new data points with high fidelity. However, diffusion processes are prone to generating samples that reflect biases in a training dataset. To address this issue, we develop constrained diffusion models by imposing diffusion constraints based on desired distributions that are informed by requirements. Specifically, we cast the training of diffusion models under requirements as a constrained distribution optimization problem that aims to reduce the distribution difference between original and generated data while obeying constraints on the distribution of generated data. We show that our constrained diffusion models generate new data from a mixture data distribution that achieves the optimal trade-off among objective and constraints. To train constrained diffusion models, we develop a dual training algorithm and characterize the optimality of the trained constrained diffusion model. We empirically demonstrate the effectiveness of our constrained models in two constrained generation tasks: (i) we consider a dataset with one or more underrepresented classes where we train the model with constraints to ensure fairly sampling from all classes during inference; (ii) we fine-tune a pre-trained diffusion model to sample from a new dataset while avoiding overfitting.
Submitted 22 November, 2024; v1 submitted 27 August, 2024;
originally announced August 2024.
-
Deterministic Policy Gradient Primal-Dual Methods for Continuous-Space Constrained MDPs
Authors:
Sergio Rozada,
Dongsheng Ding,
Antonio G. Marques,
Alejandro Ribeiro
Abstract:
We study the problem of computing deterministic optimal policies for constrained Markov decision processes (MDPs) with continuous state and action spaces, which are widely encountered in constrained dynamical systems. Designing deterministic policy gradient methods in continuous state and action spaces is particularly challenging due to the lack of enumerable state-action pairs and the adoption of deterministic policies, hindering the application of existing policy gradient methods. To this end, we develop a deterministic policy gradient primal-dual method to find an optimal deterministic policy with non-asymptotic convergence. Specifically, we leverage regularization of the Lagrangian of the constrained MDP to propose a deterministic policy gradient primal-dual (D-PGPD) algorithm that updates the deterministic policy via a quadratic-regularized gradient ascent step and the dual variable via a quadratic-regularized gradient descent step. We prove that the primal-dual iterates of D-PGPD converge at a sub-linear rate to an optimal regularized primal-dual pair. We instantiate D-PGPD with function approximation and prove that the primal-dual iterates of D-PGPD converge at a sub-linear rate to an optimal regularized primal-dual pair, up to a function approximation error. Furthermore, we demonstrate the effectiveness of our method in two continuous control problems: robot navigation and fluid control. This appears to be the first work that proposes a deterministic policy search method for continuous-space constrained MDPs.
Submitted 4 April, 2025; v1 submitted 19 August, 2024;
originally announced August 2024.
-
Neural Optimization with Adaptive Heuristics for Intelligent Marketing System
Authors:
Changshuai Wei,
Benjamin Zelditch,
Joyce Chen,
Andre Assuncao Silva T Ribeiro,
Jingyi Kenneth Tay,
Borja Ocejo Elizondo,
Keerthi Selvaraj,
Aman Gupta,
Licurgo Benemann De Almeida
Abstract:
Computational marketing has become increasingly important in today's digital world, facing challenges such as massive heterogeneous data, multi-channel customer journeys, and limited marketing budgets. In this paper, we propose a general framework for marketing AI systems, the Neural Optimization with Adaptive Heuristics (NOAH) framework. NOAH is the first general framework for marketing optimization that considers both to-business (2B) and to-consumer (2C) products, as well as both owned and paid channels. We describe key modules of the NOAH framework, including prediction, optimization, and adaptive heuristics, providing examples for bidding and content optimization. We then detail the successful application of NOAH to LinkedIn's email marketing system, showcasing significant wins over the legacy ranking system. Additionally, we share details and insights that are broadly useful, particularly on: (i) addressing delayed feedback with lifetime value, (ii) performing large-scale linear programming with randomization, (iii) improving retrieval with audience expansion, (iv) reducing signal dilution in targeting tests, and (v) handling zero-inflated heavy-tail metrics in statistical testing.
Submitted 25 June, 2024; v1 submitted 16 May, 2024;
originally announced May 2024.
-
Near-Optimal Solutions of Constrained Learning Problems
Authors:
Juan Elenter,
Luiz F. O. Chamon,
Alejandro Ribeiro
Abstract:
With the widespread adoption of machine learning systems, the need to curtail their behavior has become increasingly apparent. This is evidenced by recent advancements towards developing models that satisfy robustness, safety, and fairness requirements. These requirements can be imposed (with generalization guarantees) by formulating constrained learning problems that can then be tackled by dual ascent algorithms. Yet, though these algorithms converge in objective value, even in non-convex settings, they cannot guarantee that their outcome is feasible. Doing so requires randomizing over all iterates, which is impractical in virtually any modern application. Still, final iterates have been observed to perform well in practice. In this work, we address this gap between theory and practice by characterizing the constraint violation of Lagrangian minimizers associated with optimal dual variables, despite lack of convexity. To do this, we leverage the fact that non-convex, finite-dimensional constrained learning problems can be seen as parametrizations of convex, functional problems. Our results show that rich parametrizations effectively mitigate the issue of feasibility in dual methods, shedding light on prior empirical successes of dual learning. We illustrate our findings in fair learning tasks.
Submitted 18 March, 2024;
originally announced March 2024.
-
Resilient Constrained Reinforcement Learning
Authors:
Dongsheng Ding,
Zhengyan Huan,
Alejandro Ribeiro
Abstract:
We study a class of constrained reinforcement learning (RL) problems in which multiple constraint specifications are not identified before training. It is challenging to identify appropriate constraint specifications due to the undefined trade-off between the reward maximization objective and the constraint satisfaction, which is ubiquitous in constrained decision-making. To tackle this issue, we propose a new constrained RL approach that searches for policy and constraint specifications together. This method features the adaptation of relaxing the constraint according to a relaxation cost introduced in the learning objective. Since this feature mimics how ecological systems adapt to disruptions by altering operation, our approach is termed resilient constrained RL. Specifically, we provide a set of sufficient conditions that balance the constraint satisfaction and the reward maximization in the notion of resilient equilibrium, propose a tractable formulation of resilient constrained policy optimization that takes this equilibrium as an optimal solution, and advocate two resilient constrained policy search algorithms with non-asymptotic convergence guarantees on the optimality gap and constraint satisfaction. Furthermore, we demonstrate the merits and the effectiveness of our approach in computational experiments.
Submitted 29 December, 2023; v1 submitted 28 December, 2023;
originally announced December 2023.
-
Regularization properties of adversarially-trained linear regression
Authors:
Antônio H. Ribeiro,
Dave Zachariah,
Francis Bach,
Thomas B. Schön
Abstract:
State-of-the-art machine learning models can be vulnerable to very small input perturbations that are adversarially constructed. Adversarial training is an effective approach to defend against it. Formulated as a min-max problem, it searches for the best solution when the training data were corrupted by the worst-case attacks. Linear models are among the simple models where vulnerabilities can be observed and are the focus of our study. In this case, adversarial training leads to a convex optimization problem which can be formulated as the minimization of a finite sum. We provide a comparative analysis between the solution of adversarial training in linear regression and other regularization methods. Our main findings are that: (A) Adversarial training yields the minimum-norm interpolating solution in the overparameterized regime (more parameters than data), as long as the maximum disturbance radius is smaller than a threshold. And, conversely, the minimum-norm interpolator is the solution to adversarial training with a given radius. (B) Adversarial training can be equivalent to parameter shrinking methods (ridge regression and Lasso). This happens in the underparametrized region, for an appropriate choice of adversarial radius and zero-mean symmetrically distributed covariates. (C) For $\ell_\infty$-adversarial training -- as in square-root Lasso -- the choice of adversarial radius for optimal bounds does not depend on the additive noise variance. We confirm our theoretical findings with numerical examples.
Submitted 16 October, 2023;
originally announced October 2023.
-
Navigation with shadow prices to optimize multi-commodity flow rates
Authors:
Ignacio Boero,
Igor Spasojevic,
Mariana del Castillo,
George Pappas,
Vijay Kumar,
Alejandro Ribeiro
Abstract:
We propose a method for providing communication network infrastructure in autonomous multi-agent teams. In particular, we consider a set of communication agents that are placed alongside regular agents from the system in order to improve the rate of information transfer between the latter. In order to find the optimal positions to place such agents, we define a flexible performance function that adapts to network requirements for different systems. We provide an algorithm based on shadow prices of a related convex optimization problem in order to drive the configuration of the complete system towards a local maximum. We apply our method to three different performance functions associated with three practical scenarios in which we show both the performance of the algorithm and the flexibility it allows for optimizing different network requirements.
Submitted 25 September, 2023;
originally announced September 2023.
-
Revisited convexity notions for $L^\infty$ variational problems
Authors:
Ana Margarida Ribeiro,
Elvira Zappale
Abstract:
We address a deep study of the convexity notions that arise in the study of weak* lower semicontinuity of supremal functionals as well as those raised by the power-law approximation of such functionals. Our quest is motivated by the knowledge we have on the analogous integral functionals and aims at establishing a solid groundwork to ease any research in the $L^\infty$ context.
Submitted 18 September, 2023;
originally announced September 2023.
-
Last-Iterate Convergent Policy Gradient Primal-Dual Methods for Constrained MDPs
Authors:
Dongsheng Ding,
Chen-Yu Wei,
Kaiqing Zhang,
Alejandro Ribeiro
Abstract:
We study the problem of computing an optimal policy of an infinite-horizon discounted constrained Markov decision process (constrained MDP). Despite the popularity of Lagrangian-based policy search methods used in practice, the oscillation of policy iterates in these methods has not been fully understood, bringing out issues such as violation of constraints and sensitivity to hyper-parameters. To fill this gap, we employ the Lagrangian method to cast a constrained MDP into a constrained saddle-point problem in which max/min players correspond to primal/dual variables, respectively, and develop two single-time-scale policy-based primal-dual algorithms with non-asymptotic convergence of their policy iterates to an optimal constrained policy. Specifically, we first propose a regularized policy gradient primal-dual (RPG-PD) method that updates the policy using an entropy-regularized policy gradient, and the dual variable via a quadratic-regularized gradient ascent, simultaneously. We prove that the policy primal-dual iterates of RPG-PD converge to a regularized saddle point with a sublinear rate, while the policy iterates converge sublinearly to an optimal constrained policy. We further instantiate RPG-PD in large state or action spaces by including function approximation in policy parametrization, and establish similar sublinear last-iterate policy convergence. Second, we propose an optimistic policy gradient primal-dual (OPG-PD) method that employs the optimistic gradient method to update primal/dual variables, simultaneously. We prove that the policy primal-dual iterates of OPG-PD converge to a saddle point that contains an optimal constrained policy, with a linear rate. To the best of our knowledge, this work appears to be the first non-asymptotic policy last-iterate convergence result for single-time-scale algorithms in constrained MDPs.
Submitted 16 January, 2024; v1 submitted 20 June, 2023;
originally announced June 2023.
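The saddle-point view in the abstract, with a maximizing primal player and a minimizing dual player, can be sketched on a one-dimensional toy problem. The code below is a generic simultaneous gradient ascent-descent on a Lagrangian, not RPG-PD or OPG-PD. The objective, the constraint, the step size, and the function name are all assumptions made for illustration.

```python
def primal_dual(steps=500, eta=0.1):
    """Toy problem: maximize -(x - 2)**2 subject to 1 - x >= 0.

    Lagrangian: L(x, lam) = -(x - 2)**2 + lam * (1 - x), with lam >= 0.
    The primal variable ascends L and the dual variable descends L,
    both with the same step size (single time scale).
    """
    x, lam = 0.0, 0.0
    for _ in range(steps):
        grad_x = -2.0 * (x - 2.0) - lam   # dL/dx
        grad_lam = 1.0 - x                # dL/dlam
        x_next = x + eta * grad_x         # primal ascent step
        lam = max(0.0, lam - eta * grad_lam)  # projected dual descent step
        x = x_next
    return x, lam


x_final, lam_final = primal_dual()
print(x_final, lam_final)
```

The toy objective is strongly concave, so the plain iterates settle at the constrained optimum `x = 1` with multiplier `lam = 2`. In general constrained MDPs the unregularized iterates can oscillate, which is the behavior the regularized and optimistic updates in the abstract are meant to address.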
-
A Networked Multi-Agent System for Mobile Wireless Infrastructure on Demand
Authors:
Miguel Calvo-Fullana,
Mikhail Gerasimenko,
Daniel Mox,
Leopoldo Agorio,
Mariana del Castillo,
Vijay Kumar,
Alejandro Ribeiro,
Juan Andres Bazerque
Abstract:
Despite the prevalence of wireless connectivity in urban areas around the globe, there remain numerous and diverse situations where connectivity is insufficient or unavailable. To address this, we introduce mobile wireless infrastructure on demand, a system of UAVs that can be rapidly deployed to establish an ad-hoc wireless network. This network has the capability of reconfiguring itself dynamically to satisfy and maintain the required quality of communication. The system optimizes the positions of the UAVs and the routing of data flows throughout the network to achieve this quality of service (QoS). By these means, task agents using the network simply request a desired QoS, and the system adapts accordingly while allowing them to move freely. We have validated this system both in simulation and in real-world experiments. The results demonstrate that our system effectively offers mobile wireless infrastructure on demand, extending the operational range of task agents and supporting complex mobility patterns, all while ensuring connectivity and being resilient to agent failures.
Submitted 16 September, 2024; v1 submitted 14 June, 2023;
originally announced June 2023.
-
A Robust Scientific Machine Learning for Optimization: A Novel Robustness Theorem
Authors:
Luana P. Queiroz,
Carine M. Rebello,
Erber A. Costa,
Vinicius V. Santana,
Alirio E. Rodrigues,
Ana M. Ribeiro,
Idelfonso B. R. Nogueira
Abstract:
Scientific machine learning (SciML) is a field of increasing interest in several different application fields. In an optimization context, SciML-based tools have enabled the development of more efficient optimization methods. However, implementing SciML tools for optimization must be rigorously evaluated and performed with caution. This work proposes the deductions of a robustness test that guarantees the robustness of multiobjective SciML-based optimization by showing that its results respect the universal approximator theorem. The test is applied in the framework of a novel methodology which is evaluated in a series of benchmarks illustrating its consistency. Moreover, the proposed methodology results are compared with feasible regions of rigorous optimization, which requires a significantly higher computational effort. Hence, this work provides a robustness test for guaranteed robustness in applying SciML tools in multiobjective optimization with lower computational effort than the existent alternative.
Submitted 13 September, 2022;
originally announced September 2022.
-
Surprises in adversarially-trained linear regression
Authors:
Antônio H. Ribeiro,
Dave Zachariah,
Thomas B. Schön
Abstract:
State-of-the-art machine learning models can be vulnerable to very small input perturbations that are adversarially constructed. Adversarial training is an effective approach to defend against such examples. It is formulated as a min-max problem, searching for the best solution when the training data was corrupted by the worst-case attacks. For linear regression problems, adversarial training can be formulated as a convex problem. We use this reformulation to make two technical contributions: First, we formulate the training problem as an instance of robust regression to reveal its connection to parameter-shrinking methods, specifically that $\ell_\infty$-adversarial training produces sparse solutions. Secondly, we study adversarial training in the overparameterized regime, i.e. when there are more parameters than data. We prove that adversarial training with small disturbances gives the solution with the minimum-norm that interpolates the training data. Ridge regression and lasso approximate such interpolating solutions as their regularization parameter vanishes. By contrast, for adversarial training, the transition into the interpolation regime is abrupt and for non-zero values of disturbance. This result is proved and illustrated with numerical examples.
Submitted 20 October, 2022; v1 submitted 25 May, 2022;
originally announced May 2022.
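The statement that ridge regression approaches the minimum-norm interpolator as its regularization parameter vanishes can be checked numerically. The sketch below covers only that ridge limit and does not implement adversarial training. The data are random, and the dimensions, seed, and function name are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 5, 20                      # overparameterized: more parameters than data
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

# Minimum-norm solution of X @ beta = y via the Moore-Penrose pseudoinverse.
beta_min_norm = np.linalg.pinv(X) @ y


def ridge(X, y, lam):
    """Minimizer of ||y - X @ beta||^2 + lam * ||beta||^2, in its n-by-n dual form."""
    n_samples = X.shape[0]
    return X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n_samples), y)


beta_ridge = ridge(X, y, 1e-8)
gap = np.linalg.norm(beta_ridge - beta_min_norm)
print(gap)
```

The dual form `X.T @ solve(X @ X.T + lam * I, y)` is used because it only requires an `n`-by-`n` solve, which stays well conditioned here even as `lam` shrinks.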
-
On strong second-order optimality conditions under relaxed constant rank constraint qualification
Authors:
Ademir Alves Ribeiro,
Mael Sachine
Abstract:
We discuss the (first- and second-order) optimality conditions for nonlinear programming under the relaxed constant rank constraint qualification. This condition generalizes the so-called linear independence constraint qualification. Although the optimality conditions are well established in the literature, the proofs presented here are based solely on the well-known inverse function theorem. This is the only prerequisite from real analysis used to establish two auxiliary results needed to prove the optimality conditions, thereby making this paper totally self-contained.
Submitted 26 April, 2022;
originally announced April 2022.
-
Overparameterized Linear Regression under Adversarial Attacks
Authors:
Antônio H. Ribeiro,
Thomas B. Schön
Abstract:
We study the error of linear regression in the face of adversarial attacks. In this framework, an adversary changes the input to the regression model in order to maximize the prediction error. We provide bounds on the prediction error in the presence of an adversary as a function of the parameter norm and the error in the absence of such an adversary. We show how these bounds make it possible to study the adversarial error using analysis from non-adversarial setups. The obtained results shed light on the robustness of overparameterized linear models to adversarial attacks. Adding features might be either a source of additional robustness or brittleness. On the one hand, we use asymptotic results to illustrate how double-descent curves can be obtained for the adversarial error. On the other hand, we derive conditions under which the adversarial error can grow to infinity as more features are added, while at the same time, the test error goes to zero. We show this behavior is caused by the fact that the norm of the parameter vector grows with the number of features. It is also established that $\ell_\infty$ and $\ell_2$-adversarial attacks might behave fundamentally differently due to how the $\ell_1$ and $\ell_2$-norms of random projections concentrate. We also show how our reformulation allows for solving adversarial training as a convex optimization problem. This fact is then exploited to establish similarities between adversarial training and parameter-shrinking methods and to study how the training might affect the robustness of the estimated models.
Submitted 27 January, 2023; v1 submitted 13 April, 2022;
originally announced April 2022.
-
A Lagrangian Duality Approach to Active Learning
Authors:
Juan Elenter,
Navid NaderiAlizadeh,
Alejandro Ribeiro
Abstract:
We consider the pool-based active learning problem, where only a subset of the training data is labeled, and the goal is to query a batch of unlabeled samples to be labeled so as to maximally improve model performance. We formulate the problem using constrained learning, where a set of constraints bounds the performance of the model on labeled samples. Considering a primal-dual approach, we optimize the primal variables, corresponding to the model parameters, as well as the dual variables, corresponding to the constraints. As each dual variable indicates how significantly the perturbation of the respective constraint affects the optimal value of the objective function, we use it as a proxy of the informativeness of the corresponding training sample. Our approach, which we refer to as Active Learning via Lagrangian dualitY, or ALLY, leverages this fact to select a diverse set of unlabeled samples with the highest estimated dual variables as our query set. We demonstrate the benefits of our approach in a variety of classification and regression tasks and discuss its limitations depending on the capacity of the model used and the degree of redundancy in the dataset. We also examine the impact of the distribution shift induced by active sampling and show that ALLY can be used in a generative mode to create novel, maximally-informative samples.
Submitted 29 October, 2022; v1 submitted 8 February, 2022;
originally announced February 2022.
-
Linear Quadratic Control with Risk Constraints
Authors:
Anastasios Tsiamis,
Dionysios S. Kalogerias,
Alejandro Ribeiro,
George J. Pappas
Abstract:
We propose a new risk-constrained formulation of the classical Linear Quadratic (LQ) stochastic control problem for general partially-observed systems. Our framework is motivated by the fact that the risk-neutral LQ controllers, although optimal in expectation, might be ineffective under relatively infrequent, yet statistically significant extreme events. To effectively trade between average and extreme event performance, we introduce a new risk constraint, which explicitly restricts the total expected predictive variance of the state penalty by a user-prescribed level. We show that, under certain conditions on the process noise, the optimal risk-aware controller can be evaluated explicitly and in closed form. In fact, it is affine relative to the minimum mean square error (MMSE) state estimate. The affine term pushes the state away from directions where the noise exhibits heavy tails, by exploiting the third-order moment (skewness) of the noise. The linear term regulates the state more strictly in riskier directions, where both the prediction error (conditional) covariance and the state penalty are simultaneously large; this is achieved by inflating the state penalty within a new filtered Riccati difference equation. We also prove that the new risk-aware controller is internally stable, regardless of parameter tuning, in the special cases of i) fully-observed systems, and ii) partially-observed systems with Gaussian noise. The properties of the proposed risk-aware LQ framework are lastly illustrated via indicative numerical examples.
Submitted 14 December, 2021;
originally announced December 2021.
-
A note on gradient Ricci soliton warped metrics
Authors:
José N. V. Gomes,
Marcus A. M. Marrocos,
Adrian V. C. Ribeiro
Abstract:
In this note, we prove triviality and nonexistence results for gradient Ricci soliton warped metrics. The proofs stem from the construction of gradient Ricci solitons that are realized as warped products, from which we know that the base spaces of these products are Ricci-Hessian type manifolds. We study this latter class of manifolds as the most appropriate setting to prove our results.
Submitted 20 September, 2021;
originally announced September 2021.
-
Constrained Learning with Non-Convex Losses
Authors:
Luiz F. O. Chamon,
Santiago Paternain,
Miguel Calvo-Fullana,
Alejandro Ribeiro
Abstract:
Though learning has become a core component of modern information processing, there is now ample evidence that it can lead to biased, unsafe, and prejudiced systems. The need to impose requirements on learning is therefore paramount, especially as it reaches critical applications in social, industrial, and medical domains. However, the non-convexity of most modern statistical problems is only exacerbated by the introduction of constraints. Whereas good unconstrained solutions can often be learned using empirical risk minimization, even obtaining a model that satisfies statistical constraints can be challenging. All the more so, a good one. In this paper, we overcome this issue by learning in the empirical dual domain, where constrained statistical learning problems become unconstrained and deterministic. We analyze the generalization properties of this approach by bounding the empirical duality gap -- i.e., the difference between our approximate, tractable solution and the solution of the original (non-convex) statistical problem -- and provide a practical constrained learning algorithm. These results establish a constrained counterpart to classical learning theory, enabling the explicit use of constraints in learning. We illustrate this theory and algorithm in rate-constrained learning applications arising in fairness and adversarial robustness.
Submitted 19 October, 2022; v1 submitted 8 March, 2021;
originally announced March 2021.
-
State Augmented Constrained Reinforcement Learning: Overcoming the Limitations of Learning with Rewards
Authors:
Miguel Calvo-Fullana,
Santiago Paternain,
Luiz F. O. Chamon,
Alejandro Ribeiro
Abstract:
A common formulation of constrained reinforcement learning involves multiple rewards that must individually accumulate to given thresholds. In this class of problems, we show a simple example in which the desired optimal policy cannot be induced by any weighted linear combination of rewards. Hence, there exist constrained reinforcement learning problems for which neither regularized nor classical primal-dual methods yield optimal policies. This work addresses this shortcoming by augmenting the state with Lagrange multipliers and reinterpreting primal-dual methods as the portion of the dynamics that drives the multipliers' evolution. This approach provides a systematic state augmentation procedure that is guaranteed to solve reinforcement learning problems with constraints. Thus, as we illustrate by an example, while previous methods can fail at finding optimal policies, running the dual dynamics while executing the augmented policy yields an algorithm that provably samples actions from the optimal policy.
Submitted 21 September, 2023; v1 submitted 23 February, 2021;
originally announced February 2021.
-
Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning
Authors:
Luiz F. O. Chamon,
Santiago Paternain,
Alejandro Ribeiro
Abstract:
Prediction credibility measures, in the form of confidence intervals or probability distributions, are fundamental in statistics and machine learning to characterize model robustness, detect out-of-distribution samples (outliers), and protect against adversarial attacks. To be effective, these measures should (i) account for the wide variety of models used in practice, (ii) be computable for trained models or at least avoid modifying established training procedures, (iii) forgo the use of data, which can expose them to the same robustness issues and attacks as the underlying model, and (iv) be followed by theoretical guarantees. These principles underlie the framework developed in this work, which expresses the credibility as a risk-fit trade-off, i.e., a compromise between how much fit can be improved by perturbing the model input and the magnitude of this perturbation (risk). Using a constrained optimization formulation and duality theory, we analyze this compromise and show that this balance can be determined counterfactually, without having to test multiple perturbations. This results in an unsupervised, a posteriori method of assigning prediction credibility for any (possibly non-convex) differentiable model, from RKHS-based solutions to any architecture of (feedforward, convolutional, graph) neural network. Its use is illustrated in data filtering and defense against adversarial attacks.
Submitted 24 November, 2020;
originally announced November 2020.
-
A sequential optimality condition for Mathematical Programs with Cardinality Constraints
Authors:
Evelin H. M. Krulikovski,
Ademir A. Ribeiro,
Mael Sachine
Abstract:
In this paper we propose an Approximate Weak stationarity ($AW$-stationarity) concept designed to deal with Mathematical Programs with Cardinality Constraints (MPCaC), and we prove that it is a legitimate optimality condition independently of any constraint qualification. Such a sequential optimality condition improves weaker stationarity conditions presented in a previous work. Much research on sequential optimality conditions has been conducted for nonlinear constrained optimization in the last few years, including some works in the context of MPCC, but, as far as we know, no sequential optimality condition has been proposed for MPCaC problems. We also establish some relationships between our $AW$-stationarity and other usual sequential optimality conditions, such as AKKT, CAKKT and PAKKT. We point out that, despite the computational appeal of the sequential optimality conditions, in this work we are not concerned with algorithmic consequences. Our aim is purely to discuss theoretical aspects of such conditions for MPCaC problems.
Submitted 5 August, 2020;
originally announced August 2020.
-
On the weak stationarity conditions for Mathematical Programs with Cardinality Constraints: a unified approach
Authors:
Evelin H. M. Krulikovski,
Ademir A. Ribeiro,
Mael Sachine
Abstract:
In this paper, we study a class of optimization problems, called Mathematical Programs with Cardinality Constraints (MPCaC). This kind of problem is generally difficult to deal with, because it involves a constraint that is neither continuous nor convex, but provides sparse solutions. Thereby we reformulate MPCaC in a suitable way, by modeling it as a mixed-integer problem and then addressing its continuous counterpart, which will be referred to as the relaxed problem. We investigate the relaxed problem by analyzing the classical constraints in two cases: linear and nonlinear. In the linear case, we propose a general approach and present a discussion of the Guignard and Abadie constraint qualifications, proving in this case that every minimizer of the relaxed problem satisfies the Karush-Kuhn-Tucker (KKT) conditions. On the other hand, in the nonlinear case, we show that some standard constraint qualifications may be violated. Therefore, we cannot make assertions about KKT points. Motivated to find a minimizer for the MPCaC problem, we define new and weaker stationarity conditions, by proposing a unified approach that goes from the weakest to the strongest stationarity.
Submitted 31 July, 2020;
originally announced August 2020.
-
Zeroth-order Deterministic Policy Gradient
Authors:
Harshat Kumar,
Dionysios S. Kalogerias,
George J. Pappas,
Alejandro Ribeiro
Abstract:
Deterministic Policy Gradient (DPG) removes a level of randomness from standard randomized-action Policy Gradient (PG), and demonstrates substantial empirical success for tackling complex dynamic problems involving Markov decision processes. At the same time, though, DPG loses its ability to learn in a model-free (i.e., actor-only) fashion, frequently necessitating the use of critics in order to obtain consistent estimates of the associated policy-reward gradient. In this work, we introduce Zeroth-order Deterministic Policy Gradient (ZDPG), which approximates policy-reward gradients via two-point stochastic evaluations of the $Q$-function, constructed by properly designed low-dimensional action-space perturbations. Exploiting the idea of random horizon rollouts for obtaining unbiased estimates of the $Q$-function, ZDPG lifts the dependence on critics and restores true model-free policy learning, while enjoying built-in and provable algorithmic stability. Additionally, we present new finite sample complexity bounds for ZDPG, which improve upon existing results by up to two orders of magnitude. Our findings are supported by several numerical experiments, which showcase the effectiveness of ZDPG in a practical setting, and its advantages over both PG and Baseline PG.
Submitted 11 July, 2020; v1 submitted 12 June, 2020;
originally announced June 2020.
-
Probably Approximately Correct Constrained Learning
Authors:
Luiz F. O. Chamon,
Alejandro Ribeiro
Abstract:
As learning solutions reach critical applications in social, industrial, and medical domains, the need to curtail their behavior has become paramount. There is now ample evidence that without explicit tailoring, learning can lead to biased, unsafe, and prejudiced solutions. To tackle these problems, we develop a generalization theory of constrained learning based on the probably approximately correct (PAC) learning framework. In particular, we show that imposing requirements does not make a learning problem harder in the sense that any PAC learnable class is also PAC constrained learnable using a constrained counterpart of the empirical risk minimization (ERM) rule. For typical parametrized models, however, this learner involves solving a constrained non-convex optimization program for which even obtaining a feasible solution is challenging. To overcome this issue, we prove that under mild conditions the empirical dual problem of constrained learning is also a PAC constrained learner that now leads to a practical constrained learning algorithm based solely on solving unconstrained problems. We analyze the generalization properties of this solution and use it to illustrate how constrained learning can address problems in fair and robust classification.
Submitted 17 February, 2021; v1 submitted 9 June, 2020;
originally announced June 2020.
-
Risk-Constrained Linear-Quadratic Regulators
Authors:
Anastasios Tsiamis,
Dionysios S. Kalogerias,
Luiz F. O. Chamon,
Alejandro Ribeiro,
George J. Pappas
Abstract:
We propose a new risk-constrained reformulation of the standard Linear Quadratic Regulator (LQR) problem. Our framework is motivated by the fact that the classical (risk-neutral) LQR controller, although optimal in expectation, might be ineffective under relatively infrequent, yet statistically significant (risky) events. To effectively trade between average and extreme event performance, we introduce a new risk constraint, which explicitly restricts the total expected predictive variance of the state penalty by a user-prescribed level. We show that, under rather minimal conditions on the process noise (i.e., finite fourth-order moments), the optimal risk-aware controller can be evaluated explicitly and in closed form. In fact, it is affine relative to the state, and is always internally stable regardless of parameter tuning. Our new risk-aware controller: i) pushes the state away from directions where the noise exhibits heavy tails, by exploiting the third-order moment (skewness) of the noise; ii) inflates the state penalty in riskier directions, where both the noise covariance and the state penalty are simultaneously large. The properties of the proposed risk-aware LQR framework are also illustrated via indicative numerical examples.
Submitted 28 October, 2020; v1 submitted 9 April, 2020;
originally announced April 2020.
-
Resilient Control: Compromising to Adapt
Authors:
Luiz F. O. Chamon,
Alexandre Amice,
Santiago Paternain,
Alejandro Ribeiro
Abstract:
In optimal control problems, disturbances are typically dealt with using robust solutions, such as H-infinity or tube model predictive control, that plan control actions feasible for the worst-case disturbance. Yet, planning for every contingency can lead to over-conservative, poorly performing solutions or even, in extreme cases, to infeasibility. Resilience addresses these shortcomings by adapting the underlying control problem, e.g., by relaxing its specifications, to obtain a feasible, possibly still valuable trajectory. Despite their different aspects, robustness and resilience are often conflated in the context of dynamical systems and control. The goal of this paper is to formalize, in the context of optimal control, the concept of resilience understood as above, i.e., in terms of adaptation. To do so, we introduce a resilient formulation of optimal control by allowing disruption-dependent modifications of the requirements that induce the desired resilient behavior. We then propose a framework to design these behaviors automatically by trading off control performance and requirement violations. We analyze this resilience-by-compromise method to obtain inverse optimality results and quantify the effect of disturbances on the induced requirement relaxations. By proving that robustness and resilience optimize different objectives, we show that these are in fact distinct system properties. We conclude by illustrating the effect of resilience in different control problems.
Submitted 25 August, 2020; v1 submitted 7 April, 2020;
originally announced April 2020.
-
Approximately Supermodular Scheduling Subject to Matroid Constraints
Authors:
Luiz F. O. Chamon,
Alexandre Amice,
Alejandro Ribeiro
Abstract:
Control scheduling refers to the problem of assigning agents or actuators to act upon a dynamical system at specific times so as to minimize a quadratic control cost, such as the objective of the Linear-quadratic-Gaussian (LQG) or the Linear Quadratic Regulator (LQR). When budget or operational constraints are imposed on the schedule, this problem is in general NP-hard and its solution can therefore only be approximated even for moderately sized systems. The quality of this approximation depends on the structure of both the constraints and the objective. This work shows that greedy scheduling is near-optimal when the constraints can be represented as an intersection of matroids, algebraic structures that encode requirements such as limits on the number of agents deployed per time slot, total number of actuator uses, and duty cycle restrictions. To do so, it proves that the LQG cost function is alpha-supermodular and provides new alpha/(alpha + P)-optimality certificates for the greedy minimization of such functions over an intersection of P matroids. These certificates are shown to approach the 1/(1+P) guarantee of supermodular functions in relevant settings. These results support the use of greedy algorithms in non-supermodular quadratic control problems as opposed to typical heuristics such as convex relaxations and surrogate figures of merit, e.g., the logdet of the controllability Gramian.
Submitted 29 March, 2021; v1 submitted 19 March, 2020;
originally announced March 2020.
-
The empirical duality gap of constrained statistical learning
Authors:
Luiz F. O. Chamon,
Santiago Paternain,
Miguel Calvo-Fullana,
Alejandro Ribeiro
Abstract:
This paper is concerned with the study of constrained statistical learning problems, the unconstrained versions of which are at the core of virtually all of modern information processing. Accounting for constraints, however, is paramount to incorporate prior knowledge and impose desired structural and statistical properties on the solutions. Still, solving constrained statistical problems remains challenging and guarantees scarce, leaving them to be tackled using regularized formulations. Though practical and effective, selecting regularization parameters so as to satisfy requirements is challenging, if at all possible, due to the lack of a straightforward relation between parameters and constraints. In this work, we propose to directly tackle the constrained statistical problem, overcoming its infinite dimensionality, unknown distributions, and constraints by leveraging finite dimensional parameterizations, sample averages, and duality theory. Aside from making the problem tractable, these tools allow us to bound the empirical duality gap, i.e., the difference between our approximate tractable solutions and the actual solutions of the original statistical problem. We demonstrate the effectiveness and usefulness of this constrained formulation in a fair learning application.
Submitted 12 February, 2020;
originally announced February 2020.
-
Counterfactual Programming for Optimal Control
Authors:
Luiz F. O. Chamon,
Santiago Paternain,
Alejandro Ribeiro
Abstract:
In recent years, considerable work has been done to tackle the issue of designing control laws based on observations to allow unknown dynamical systems to perform pre-specified tasks. At least as important for autonomy, however, is the issue of learning which tasks can be performed in the first place. This is particularly critical in situations where multiple (possibly conflicting) tasks and requirements are demanded from the agent, resulting in infeasible specifications. Such situations arise due to over-specification or dynamic operating conditions and are only aggravated when the dynamical system model is learned through simulations. Often, these issues are tackled using regularization and penalties tuned based on application-specific expert knowledge. Nevertheless, this solution becomes impractical for large-scale systems, unknown operating conditions, and/or in online settings where expert input would be needed during the system operation. Instead, this work enables agents to autonomously pose, tune, and solve optimal control problems by compromising between performance and specification costs. Leveraging duality theory, it puts forward a counterfactual optimization algorithm that directly determines the specification trade-off while solving the optimal control problem.
Submitted 5 May, 2020; v1 submitted 29 January, 2020;
originally announced January 2020.
-
Approximate Supermodularity of Kalman Filter Sensor Selection
Authors:
Luiz F. O. Chamon,
George J. Pappas,
Alejandro Ribeiro
Abstract:
This work considers the problem of selecting sensors in a large scale system to minimize the error in estimating its states. More specifically, it considers the state estimation mean-square error (MSE) and worst-case error for Kalman filtering and smoothing. Such selection problems are in general NP-hard, i.e., their solution can only be approximated in practice even for moderately large problems. Due to their low complexity and iterative nature, greedy algorithms are often used to obtain these approximations by selecting one sensor at a time, choosing at each step the one that minimizes the estimation performance metric. When this metric is supermodular, this solution is guaranteed to be (1-1/e)-optimal. This is however not the case for the MSE or the worst-case error. This issue is often circumvented by using supermodular surrogates, such as the logdet, despite the fact that minimizing the logdet is not equivalent to minimizing the MSE. Here, this issue is addressed by leveraging the concept of approximate supermodularity to derive near-optimality certificates for greedily minimizing the estimation mean-square and worst-case error. In typical application scenarios, these certificates approach the (1-1/e) guarantee obtained for supermodular functions, thus demonstrating that no change to the original problem is needed to obtain guaranteed good performance.
Submitted 21 February, 2020; v1 submitted 8 December, 2019;
originally announced December 2019.
-
Risk-Aware MMSE Estimation
Authors:
Dionysios S. Kalogerias,
Luiz F. O. Chamon,
George J. Pappas,
Alejandro Ribeiro
Abstract:
Despite the simplicity and intuitive interpretation of Minimum Mean Squared Error (MMSE) estimators, their effectiveness in certain scenarios is questionable. Indeed, minimizing squared errors on average does not provide any form of stability, as the volatility of the estimation error is left unconstrained. When this volatility is statistically significant, the average and realized performance of the MMSE estimator can be drastically different. To address this issue, we introduce a new risk-aware MMSE formulation which trades between mean performance and risk by explicitly constraining the expected predictive variance of the involved squared error. We show that, under mild moment boundedness conditions, the corresponding risk-aware optimal solution can be evaluated explicitly, and has the form of an appropriately biased nonlinear MMSE estimator. We further illustrate the effectiveness of our approach via several numerical examples, which also showcase the advantages of risk-aware MMSE estimation against risk-neutral MMSE estimation, especially in models involving skewed, heavy-tailed distributions.
Submitted 5 December, 2019;
originally announced December 2019.
-
Safe Policies for Reinforcement Learning via Primal-Dual Methods
Authors:
Santiago Paternain,
Miguel Calvo-Fullana,
Luiz F. O. Chamon,
Alejandro Ribeiro
Abstract:
In this paper, we study the learning of safe policies in the setting of reinforcement learning problems. That is, we aim to control a Markov Decision Process (MDP) of which we do not know the transition probabilities, but we have access to sample trajectories through experience. We define safety as the agent remaining in a desired safe set with high probability during the operation time. We therefore consider a constrained MDP where the constraints are probabilistic. Since there is no straightforward way to optimize the policy with respect to the probabilistic constraint in a reinforcement learning framework, we propose an ergodic relaxation of the problem. The advantages of the proposed relaxation are threefold. (i) The safety guarantees are maintained in the case of episodic tasks and they are kept up to a given time horizon for continuing tasks. (ii) The constrained optimization problem, despite its non-convexity, has an arbitrarily small duality gap if the parametrization of the policy is rich enough. (iii) The gradients of the Lagrangian associated with the safe-learning problem can be easily computed using standard policy gradient results and stochastic approximation tools. Leveraging these advantages, we establish that primal-dual algorithms are able to find policies that are safe and optimal. We test the proposed approach in a navigation task in a continuous domain. The numerical results show that our algorithm is capable of dynamically adapting the policy to the environment and the required safety levels.
Submitted 12 January, 2022; v1 submitted 20 November, 2019;
originally announced November 2019.
-
Model-Free Learning of Optimal Ergodic Policies in Wireless Systems
Authors:
Dionysios S. Kalogerias,
Mark Eisen,
George J. Pappas,
Alejandro Ribeiro
Abstract:
Learning optimal resource allocation policies in wireless systems can be effectively achieved by formulating finite dimensional constrained programs which depend on system configuration, as well as the adopted learning parameterization. The interest here is in cases where system models are unavailable, prompting methods that probe the wireless system with candidate policies, and then use observed performance to determine better policies. This generic procedure is difficult because of the need to cull accurate gradient estimates out of these limited system queries. This paper constructs and exploits smoothed surrogates of constrained ergodic resource allocation problems, the gradients of the former being representable exactly as averages of finite differences that can be obtained through limited system probing. Leveraging this unique property, we develop a new model-free primal-dual algorithm for learning optimal ergodic resource allocations, while we rigorously analyze the relationships between original policy search problems and their surrogates, in both primal and dual domains. First, we show that both primal and dual domain surrogates are uniformly consistent approximations of their corresponding original finite dimensional counterparts. Upon further assuming the use of near-universal policy parameterizations, we also develop explicit bounds on the gap between optimal values of initial, infinite dimensional resource allocation problems, and dual values of their parameterized smoothed surrogates. In fact, we show that this duality gap decreases at a linear rate relative to smoothing and universality parameters. Thus, it can be made arbitrarily small at will, also justifying our proposed primal-dual algorithmic recipe. Numerical simulations confirm the effectiveness of our approach.
Submitted 10 November, 2019;
originally announced November 2019.
-
Metric Representations of Networks: A Uniqueness Result
Authors:
Santiago Segarra,
T. Mitchell Roddenberry,
Facundo Memoli,
Alejandro Ribeiro
Abstract:
In this paper, we consider the problem of projecting networks onto metric spaces. Networks are structures that encode relationships between pairs of elements or nodes. However, these relationships can be independent of each other, and need not be defined for every pair of nodes. This is in contrast to a metric space, which requires that a distance between every pair of elements in the space be defined. To understand how to project networks onto metric spaces, we take an axiomatic approach: we first state two axioms for projective maps from the set of all networks to the set of finite metric spaces, then show that only one projection satisfies these requirements. The developed technique is shown to be an effective method for finding approximate solutions to combinatorial optimization problems. Finally, we illustrate the use of metric trees for efficient search in projected networks.
Submitted 31 October, 2019;
originally announced November 2019.
-
Constrained Reinforcement Learning Has Zero Duality Gap
Authors:
Santiago Paternain,
Luiz F. O. Chamon,
Miguel Calvo-Fullana,
Alejandro Ribeiro
Abstract:
Autonomous agents must often deal with conflicting requirements, such as completing tasks using the least amount of time/energy, learning multiple tasks, or dealing with multiple opponents. In the context of reinforcement learning (RL), these problems are addressed by (i) designing a reward function that simultaneously describes all requirements or (ii) combining modular value functions that encode them individually. Though effective, these methods have critical downsides. Designing good reward functions that balance different objectives is challenging, especially as the number of objectives grows. Moreover, implicit interference between goals may lead to performance plateaus as they compete for resources, particularly when training on-policy. Similarly, selecting parameters to combine value functions is at least as hard as designing an all-encompassing reward, given that the effect of their values on the overall policy is not straightforward. The latter is generally addressed by formulating the conflicting requirements as a constrained RL problem and solving it using primal-dual methods. These algorithms are in general not guaranteed to converge to the optimal solution since the problem is not convex. This work provides theoretical support to these approaches by establishing that despite its non-convexity, this problem has zero duality gap, i.e., it can be solved exactly in the dual domain, where it becomes convex. Finally, we show that this result basically holds if the policy is described by a good parametrization (e.g., neural networks), connect it with primal-dual algorithms present in the literature, and establish convergence to the optimal solution.
Submitted 29 October, 2019;
originally announced October 2019.
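To make the primal-dual idea mentioned in the abstract above concrete, here is a minimal sketch of the generic iteration on a one-dimensional toy problem. It is not an RL problem and not the authors' algorithm: the objective, the constraint, the step size, and the function name `primal_dual` are all assumptions chosen so that the saddle point can be checked by hand.

```python
# Sketch: primal-dual iterations on a toy constrained problem.
# Assumed problem (not from the paper): maximize r(x) = -(x - 2)^2 subject to x <= 1.
# Lagrangian: L(x, lam) = r(x) - lam * (x - 1), with multiplier lam >= 0.

def primal_dual(steps=5000, step_size=0.05):
    x, lam = 0.0, 0.0
    for _ in range(steps):
        grad_x = -2.0 * (x - 2.0) - lam               # dL/dx at the current point
        x = x + step_size * grad_x                    # primal ascent on the Lagrangian
        lam = max(0.0, lam + step_size * (x - 1.0))   # dual update, projected onto lam >= 0
    return x, lam

x_star, lam_star = primal_dual()
print(x_star, lam_star)
```

For this toy problem the constrained optimum is x = 1 with multiplier 2, which is where the stationarity condition -2(x - 2) - lam = 0 and the active constraint meet. In the RL setting the primal step would be a policy-gradient update on the policy parameters rather than an exact gradient step.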
-
On the Sample Complexity of Actor-Critic Method for Reinforcement Learning with Function Approximation
Authors:
Harshat Kumar,
Alec Koppel,
Alejandro Ribeiro
Abstract:
Reinforcement learning, mathematically described by Markov Decision Problems, may be approached either through dynamic programming or policy search. Actor-critic algorithms combine the merits of both approaches by alternating between steps to estimate the value function and policy gradient updates. Due to the fact that the updates exhibit correlated noise and biased gradient updates, only the asymptotic behavior of actor-critic is known by connecting its behavior to dynamical systems. This work puts forth a new variant of actor-critic that employs Monte Carlo rollouts during the policy search updates, which results in controllable bias that depends on the number of critic evaluations. As a result, we are able to provide for the first time the convergence rate of actor-critic algorithms when the policy search step employs policy gradient, agnostic to the choice of policy evaluation technique. In particular, we establish conditions under which the sample complexity is comparable to stochastic gradient method for non-convex problems or slower as a result of the critic estimation error, which is the main complexity bottleneck. These results hold in continuous state and action spaces with linear function approximation for the value function. We then specialize these conceptual results to the case where the critic is estimated by Temporal Difference, Gradient Temporal Difference, and Accelerated Gradient Temporal Difference. These learning rates are then corroborated on a navigation problem involving an obstacle and the pendulum problem which provide insight into the interplay between optimization and generalization in reinforcement learning.
Submitted 27 January, 2023; v1 submitted 18 October, 2019;
originally announced October 2019.
-
Graph Policy Gradients for Large Scale Unlabeled Motion Planning with Constraints
Authors:
Arbaaz Khan,
Vijay Kumar,
Alejandro Ribeiro
Abstract:
In this paper, we present a learning method to solve the unlabelled motion problem with motion constraints and space constraints in 2D space for a large number of robots. To solve the problem of arbitrary dynamics and constraints we propose formulating the problem as a multi-agent problem. In contrast to previous works that propose using learning solutions for unlabelled motion planning with constraints, we are able to demonstrate the scalability of our methods for a large number of robots. The curse of dimensionality one encounters when working with a large number of robots is mitigated by employing a graph convolutional network (GCN) to parametrize policies for the robots. The GCN reduces the dimensionality of the problem by learning filters that aggregate information among robots locally, similar to how a convolutional neural network is able to learn local features in an image. Additionally, by employing a GCN we are also able to overcome the computational overhead of training policies for a large number of robots by first training graph filters for a small number of robots followed by zero-shot policy transfer to a larger number of robots. We demonstrate the effectiveness of our framework through various simulations.
Submitted 24 September, 2019;
originally announced September 2019.
-
Source Seeking in Unknown Environments with Convex Obstacles
Authors:
Bruno A. Angélico,
Luiz F. O. Chamon,
Santiago Paternain,
Alejandro Ribeiro,
George J. Pappas
Abstract:
Navigation tasks often cannot be defined in terms of a target, either because global position information is unavailable or unreliable or because target location is not explicitly known a priori. This task is then often defined indirectly as a source seeking problem in which the autonomous agent navigates so as to minimize the convex potential induced by a source while avoiding obstacles. This work addresses this problem when only scalar measurements of the potential are available, i.e., without gradient information. To do so, it constructs an artificial potential over which an exact gradient dynamics would generate a collision-free trajectory to the target in a world with convex obstacles. Then, leveraging extremum seeking control loops, it minimizes this artificial potential to navigate smoothly to the source location. We prove that the proposed solution not only finds the source, but does so while avoiding any obstacle. Numerical results with velocity-actuated particles, simulations with an omni-directional robot in ROS+Gazebo, and a robot-in-the-loop experiment are used to illustrate the performance of this approach.
Submitted 16 September, 2019;
originally announced September 2019.
-
Navigation of a Quadratic Potential with Ellipsoidal Obstacles
Authors:
Harshat Kumar,
Santiago Paternain,
Alejandro Ribeiro
Abstract:
Given a convex quadratic potential whose minimum is the agent's goal and a Euclidean space populated with ellipsoidal obstacles, one can construct a Rimon-Koditschek (RK) artificial potential to navigate. Its negative gradient attracts the agent toward the goal and repels the agent away from the boundary of the obstacles. This is a popular approach to navigation problems since it can be implemented with local spatial information that is acquired during operation time. However, navigation is only successful in situations where the obstacles are not too eccentric (flat). This paper proposes a modification to gradient dynamics that allows successful navigation of an environment with a quadratic cost and ellipsoidal obstacles regardless of their eccentricity. This is accomplished by altering gradient dynamics with a Hessian correction that is intended to imitate worlds with spherical obstacles in which RK potentials are known to work. The resulting dynamics simplify by the quadratic form of the obstacles. Convergence to the goal and obstacle avoidance is established from almost every initial position (up to a set of measure zero) in the free space, with mild conditions on the location of the target. Results are corroborated empirically with numerical simulations.
Submitted 12 September, 2022; v1 submitted 22 August, 2019;
originally announced August 2019.
-
Gradient flow formulations of discrete and continuous evolutionary models: a unifying perspective
Authors:
Fabio A. C. C. Chalub,
Léonard Monsaingeon,
Ana Margarida Ribeiro,
Max O. Souza
Abstract:
We consider three classical models of biological evolution: (i) the Moran process, an example of a reducible Markov Chain; (ii) the Kimura Equation, a particular case of a degenerated Fokker-Planck Diffusion; (iii) the Replicator Equation, a paradigm in Evolutionary Game Theory. While these approaches are not completely equivalent, they are intimately connected, since (ii) is the diffusion approximation of (i), and (iii) is obtained from (ii) in an appropriate limit. It is well known that the Replicator Dynamics for two strategies is a gradient flow with respect to the celebrated Shahshahani distance. We reformulate the Moran process and the Kimura Equation as gradient flows and in the sequel we discuss conditions such that the associated gradient structures converge: (i) to (ii) and (ii) to (iii). This provides a geometric characterisation of these evolutionary processes and provides a reformulation of the above examples as time minimization of free energy functionals.
Submitted 8 October, 2020; v1 submitted 2 July, 2019;
originally announced July 2019.
-
Beyond exploding and vanishing gradients: analysing RNN training using attractors and smoothness
Authors:
Antônio H. Ribeiro,
Koen Tiels,
Luis A. Aguirre,
Thomas B. Schön
Abstract:
The exploding and vanishing gradient problem has been the major conceptual principle behind most architecture and training improvements in recurrent neural networks (RNNs) during the last decade. In this paper, we argue that this principle, while powerful, might need some refinement to explain recent developments. We refine the concept of exploding gradients by reformulating the problem in terms of the cost function smoothness, which gives insight into higher-order derivatives and the existence of regions with many close local minima. We also clarify the distinction between vanishing gradients and the need for the RNN to learn attractors to fully use its expressive power. Through the lens of these refinements, we shed new light on recent developments in the RNN field, namely stable RNN and unitary (or orthogonal) RNNs.
Submitted 5 March, 2020; v1 submitted 20 June, 2019;
originally announced June 2019.
-
On the smoothness of nonlinear system identification
Authors:
Antônio H. Ribeiro,
Koen Tiels,
Jack Umenberger,
Thomas B. Schön,
Luis A. Aguirre
Abstract:
We shed new light on the smoothness of optimization problems arising in prediction error parameter estimation of linear and nonlinear systems. We show that for regions of the parameter space where the model is not contractive, the Lipschitz constant and $β$-smoothness of the objective function might blow up exponentially with the simulation length, making it hard to numerically find minima within those regions or, even, to escape from them. In addition to providing theoretical understanding of this problem, this paper also proposes the use of multiple shooting as a viable solution. The proposed method minimizes the error between a prediction model and the observed values. Rather than running the prediction model over the entire dataset, multiple shooting splits the data into smaller subsets and runs the prediction model over each subset, making the simulation length a design parameter and making it possible to solve problems that would be infeasible using a standard approach. The equivalence to the original problem is obtained by including constraints in the optimization. The new method is illustrated by estimating the parameters of nonlinear systems with chaotic or unstable behavior, as well as neural networks. We also present a comparative analysis of the proposed method with multi-step-ahead prediction error minimization.
Submitted 7 August, 2020; v1 submitted 2 May, 2019;
originally announced May 2019.
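The multiple shooting idea described in the abstract above can be written out schematically as follows. The notation is assumed for illustration and is not taken from the paper: a state-space model $x_{t+1} = f(x_t, u_t; θ)$ with output $g(x_t; θ)$, observations $y_t$, segment boundaries $t_1 < \dots < t_{M+1}$, and one free initial state $s_i$ per segment.

```latex
% Assumed notation: model x_{t+1} = f(x_t, u_t; \theta), output g(x_t; \theta),
% observations y_t, M segments with boundaries t_1 < ... < t_{M+1}.
\begin{aligned}
\min_{\theta,\; s_1, \dots, s_M} \quad
  & \sum_{i=1}^{M} \sum_{t=t_i}^{t_{i+1}-1}
    \bigl\| y_t - g\bigl(x_t^{(i)}; \theta\bigr) \bigr\|^2 \\
\text{s.t.} \quad
  & x_{t_i}^{(i)} = s_i, \qquad
    x_{t+1}^{(i)} = f\bigl(x_t^{(i)}, u_t; \theta\bigr),
    \qquad i = 1, \dots, M, \\
  & x_{t_{i+1}}^{(i)} = s_{i+1}, \qquad i = 1, \dots, M-1 .
\end{aligned}
```

Each segment is simulated only over $t_{i+1} - t_i$ steps, so the simulation length becomes a design parameter, while the last line is the continuity constraint that ties the segments back to a single trajectory.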
-
Calculus, constrained minimization and Lagrange multipliers: Is the optimal critical point a local minimizer?
Authors:
Ademir Alves Ribeiro,
Jose Renato Ramos Barbosa
Abstract:
In this short note, we discuss how the optimality conditions for the problem of minimizing a multivariate function subject to equality constraints have been dealt with in undergraduate Calculus. We are particularly interested in the 2 or 3-dimensional cases, which are the most common cases in Calculus courses. Besides giving sufficient conditions for a critical point to be a local minimizer, we also present and discuss counterexamples to some statements encountered in the undergraduate literature on Lagrange Multipliers, such as 'among the critical points, the ones which have the smallest image (under the function) are minimizers' or 'a single critical point (which is a local minimizer) is a global minimizer'.
△ Less
Submitted 10 April, 2019;
originally announced April 2019.
-
Distributed Constrained Online Learning
Authors:
Santiago Paternain,
Soomin Lee,
Michael M. Zavlanos,
Alejandro Ribeiro
Abstract:
In this paper, we consider groups of agents in a network that select actions in order to satisfy a set of constraints that vary arbitrarily over time and minimize a time-varying function of which they have only local observations. The selection of actions, also called a strategy, is causal and decentralized, i.e., the dynamical system that determines the actions of a given agent depends only on the constraints at the current time and on its own actions and those of its neighbors. To determine such a strategy, we propose a decentralized saddle point algorithm and show that the corresponding global fit and regret are bounded by functions of the order of $\sqrt{T}$. Specifically, we define the global fit of a strategy as a vector that integrates over time the global constraint violations as seen by a given node. The fit is a performance loss associated with online operation as opposed to offline clairvoyant operation, which can always select an action, if one exists, that satisfies the constraints at all times. If this fit grows sublinearly with the time horizon, it suggests that the strategy approaches the feasible set of actions. Likewise, we define the regret of a strategy as the difference between its accumulated cost and that of the best fixed action that one could select knowing beforehand the time evolution of the objective function. Numerical examples support the theoretical conclusions.
Submitted 14 March, 2019;
originally announced March 2019.
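The two performance measures defined in the abstract above, fit and regret, can be illustrated with a small single-agent sketch. Everything specific here is an assumption for the example and not the decentralized algorithm of the paper: scalar actions, costs f_t(x) = (x - theta_t)^2, constraints g_t(x) = x - b_t <= 0, a plain online saddle-point update, and a comparator taken to be the best fixed action that is feasible at all times.

```python
# Sketch: fit and regret of an online strategy on a toy 1-D problem.
# Assumed setup (not from the paper): f_t(x) = (x - theta_t)^2, g_t(x) = x - b_t <= 0.

def run_online(thetas, bounds, step_size=0.1):
    """Play one action per round, then update using that round's cost and constraint."""
    x, lam = 0.0, 0.0
    actions = []
    for theta, b in zip(thetas, bounds):
        actions.append(x)                          # act before f_t and g_t are revealed
        grad = 2.0 * (x - theta) + lam             # gradient in x of f_t(x) + lam * g_t(x)
        lam = max(0.0, lam + step_size * (x - b))  # dual ascent on the constraint value
        x = x - step_size * grad                   # primal descent
    return actions

def regret(actions, thetas, bounds):
    """Accumulated cost minus the cost of the best fixed feasible action in hindsight."""
    incurred = sum((x - th) ** 2 for x, th in zip(actions, thetas))
    # The summed quadratics are minimized at the mean; feasibility requires x <= min(b_t).
    best_fixed = min(sum(thetas) / len(thetas), min(bounds))
    best_cost = sum((best_fixed - th) ** 2 for th in thetas)
    return incurred - best_cost

def fit(actions, bounds):
    """Constraint values g_t(x_t) accumulated over time."""
    return sum(x - b for x, b in zip(actions, bounds))

THETAS = [1.0] * 50   # hypothetical cost targets
BOUNDS = [0.5] * 50   # hypothetical constraint bounds
actions = run_online(THETAS, BOUNDS)
print(regret(actions, THETAS, BOUNDS), fit(actions, BOUNDS))
```

Plotting these two quantities against the horizon T would be one way to check sublinear growth empirically; the sketch makes no claim about the rates of this particular update.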
-
A Stochastic Trust Region Method for Non-convex Minimization
Authors:
Zebang Shen,
Pan Zhou,
Cong Fang,
Alejandro Ribeiro
Abstract:
We target the problem of finding a local minimum in non-convex finite-sum minimization. Towards this goal, we first prove that the trust region method with inexact gradient and Hessian estimation can achieve a convergence rate of order $\mathcal{O}(1/{k^{2/3}})$ as long as those differential estimations are sufficiently accurate. Combining this result with a novel Hessian estimator, we propose the sample-efficient stochastic trust region (STR) algorithm which finds an $(ε, \sqrt{ε})$-approximate local minimum within $\mathcal{O}({\sqrt{n}}/{ε^{1.5}})$ stochastic Hessian oracle queries. This improves the state-of-the-art result by $\mathcal{O}(n^{1/6})$. Experiments verify theoretical conclusions and the efficiency of STR.
Submitted 4 March, 2019;
originally announced March 2019.
-
Functional Nonlinear Sparse Models
Authors:
Luiz F. O. Chamon,
Yonina C. Eldar,
Alejandro Ribeiro
Abstract:
Signal processing is rich in inherently continuous and often nonlinear applications, such as spectral estimation, optical imaging, and super-resolution microscopy, in which sparsity plays a key role in obtaining state-of-the-art results. Coping with the infinite dimensionality and non-convexity of these problems typically involves discretization and convex relaxations, e.g., using atomic norms. Nevertheless, grid mismatch and other coherence issues often lead to discretized versions of sparse signals that are not sparse. Even if they are, recovering sparse solutions using convex relaxations requires assumptions that may be hard to meet in practice. What is more, problems involving nonlinear measurements remain non-convex even after relaxing the sparsity objective. We address these issues by directly tackling the continuous, nonlinear problem cast as a sparse functional optimization program. We prove that when these problems are non-atomic, they have no duality gap and can therefore be solved efficiently using duality and (stochastic) convex optimization methods. We illustrate the wide range of applications of this approach by formulating and solving problems from nonlinear spectral estimation and robust classification.
Submitted 20 March, 2020; v1 submitted 1 November, 2018;
originally announced November 2018.