Search | arXiv e-print repository

On PI Controllers for Updating Lagrange Multipliers in Constrained Optimization

Authors: Motahareh Sohrabi, Juan Ramirez, Tianyue H. Zhang, Simon Lacoste-Julien, Jose Gallego-Posada

Abstract: Constrained optimization offers a powerful framework to prescribe desired behaviors in neural network models. Typically, constrained problems are solved via their min-max Lagrangian formulations, which exhibit unstable oscillatory dynamics when optimized using gradient descent-ascent. The adoption of constrained optimization techniques in the machine learning community is currently limited by the lack of reliable, general-purpose update schemes for the Lagrange multipliers. This paper proposes the $ν$PI algorithm and contributes an optimization perspective on Lagrange multiplier updates based on PI controllers, extending the work of Stooke, Achiam and Abbeel (2020). We provide theoretical and empirical insights explaining the inability of momentum methods to address the shortcomings of gradient descent-ascent, and contrast this with the empirical success of our proposed $ν$PI controller. Moreover, we prove that $ν$PI generalizes popular momentum methods for single-objective minimization. Our experiments demonstrate that $ν$PI reliably stabilizes the multiplier dynamics and its hyperparameters enjoy robust and predictable behavior.

Submitted 6 June, 2024; originally announced June 2024.

Comments: Published at ICML 2024. Code available at https://github.com/motahareh-sohrabi/nuPI

arXiv:2312.00928

The Hat Guessing Number of Cactus Graphs and Cycles

Authors: Jeremy Chizewer, I. M. J. McInnis, Mehrdad Sohrabi, Shriya Kaistha

Abstract: We study the hat guessing game on graphs. In this game, a player is placed on each vertex $v$ of a graph $G$ and assigned a colored hat from $h(v)$ possible colors. Each player makes a deterministic guess on their hat color based on the colors assigned to the players on neighboring vertices, and the players win if at least one player correctly guesses his assigned color. If there exists a strategy that ensures at least one player guesses correctly for every possible assignment of colors, the game defined by $\langle G,h\rangle$ is called winning. The hat guessing number of $G$ is the largest integer $q$ so that if $h(v)=q$ for all $v\in G$ then $\langle G,h\rangle$ is winning. In this note, we determine whether $\langle G,h\rangle $ is winning for any $h$ whenever $G$ is a cycle, resolving a conjecture of Kokhas and Latyshev in the affirmative and extending it. We then use this result to determine the hat guessing number of every cactus graph, graphs in which every pair of cycles share at most one vertex.

Submitted 1 December, 2023; originally announced December 2023.

Comments: 13 pages, 5 figures

MSC Class: 05C57
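
The winning condition described in the abstract above can be checked by brute force on very small graphs. The following is a minimal sketch, not taken from the paper: the function name `is_winning_strategy`, the dictionary-based graph encoding, and the two-player example strategy are all illustrative assumptions. It enumerates every hat assignment and verifies that at least one player guesses correctly.

```python
from itertools import product


def is_winning_strategy(neighbors, num_colors, strategy):
    """Return True if at least one player guesses correctly for every hat assignment.

    All three arguments are hypothetical encodings chosen for this sketch:
    neighbors[v]  : tuple of vertices adjacent to v (the hats player v can see)
    num_colors[v] : h(v), the number of possible hat colors at vertex v
    strategy[v]   : function mapping the tuple of neighbor colors to a guess
    """
    vertices = sorted(neighbors)
    # Enumerate every possible assignment of colors to vertices.
    for colors in product(*(range(num_colors[v]) for v in vertices)):
        hats = dict(zip(vertices, colors))
        someone_correct = any(
            strategy[v](tuple(hats[u] for u in neighbors[v])) == hats[v]
            for v in vertices
        )
        if not someone_correct:
            return False
    return True


# Two players joined by a single edge.
neighbors = {0: (1,), 1: (0,)}
strategy = {
    0: lambda seen: seen[0],      # player 0 guesses "same color as my neighbor"
    1: lambda seen: 1 - seen[0],  # player 1 guesses "the other color"
}

two_colors = {0: 2, 1: 2}
three_colors = {0: 3, 1: 3}
print(is_winning_strategy(neighbors, two_colors, strategy))
print(is_winning_strategy(neighbors, three_colors, strategy))
```

With two colors per player, exactly one of the two guesses is always right, so the check succeeds. The same strategy fails once each player has three colors, which only shows that this particular strategy does not win, not that none exists. The enumeration is exponential in the number of vertices, so the sketch is only practical for toy instances.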

arXiv:2311.03096

Weight-Sharing Regularization

Authors: Mehran Shakerinava, Motahareh Sohrabi, Siamak Ravanbakhsh, Simon Lacoste-Julien

Abstract: Weight-sharing is ubiquitous in deep learning. Motivated by this, we propose a "weight-sharing regularization" penalty on the weights $w \in \mathbb{R}^d$ of a neural network, defined as $\mathcal{R}(w) = \frac{1}{d - 1}\sum_{i > j}^d |w_i - w_j|$. We study the proximal mapping of $\mathcal{R}$ and provide an intuitive interpretation of it in terms of a physical system of interacting particles. We also parallelize existing algorithms for $\operatorname{prox}_\mathcal{R}$ (to run on GPU) and find that one of them is fast in practice but slow ($O(d)$) for worst-case inputs. Using the physical interpretation, we design a novel parallel algorithm which runs in $O(\log^3 d)$ when sufficient processors are available, thus guaranteeing fast training. Our experiments reveal that weight-sharing regularization enables fully connected networks to learn convolution-like filters even when pixels have been shuffled while convolutional neural networks fail in this setting. Our code is available on github.

Submitted 10 March, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

Comments: Our code is available at https://github.com/motahareh-sohrabi/weight-sharing-regularization
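
The penalty $\mathcal{R}(w)$ defined in the abstract above can be evaluated directly from its formula. The sketch below is illustrative only and is not the authors' code: the function names are made up, it computes the penalty value rather than the proximal mapping the paper studies, and returning 0 for $d < 2$ is an assumption, since the formula divides by $d - 1$. The faster variant relies on the identity that, for weights sorted in ascending order $s_0 \le \dots \le s_{d-1}$, the sum of pairwise absolute differences equals $\sum_k (2k - d + 1)\, s_k$.

```python
def weight_sharing_penalty_naive(w):
    """R(w) = 1/(d-1) * sum_{i>j} |w_i - w_j|, computed pair by pair in O(d^2)."""
    d = len(w)
    if d < 2:
        return 0.0  # assumption: the formula is undefined for d < 2
    total = sum(abs(w[i] - w[j]) for i in range(d) for j in range(i))
    return total / (d - 1)


def weight_sharing_penalty(w):
    """Same value in O(d log d) by sorting the weights first.

    After sorting ascending, s_k is the larger element in k pairs and the
    smaller element in (d - 1 - k) pairs, so its net coefficient is 2k - d + 1.
    """
    d = len(w)
    if d < 2:
        return 0.0  # assumption: the formula is undefined for d < 2
    s = sorted(w)
    total = sum((2 * k - d + 1) * x for k, x in enumerate(s))
    return total / (d - 1)


w = [1.0, 3.0, 3.0, 0.0]
print(weight_sharing_penalty_naive(w))
print(weight_sharing_penalty(w))
```

For the example vector, the six pairwise differences sum to 11, so both functions return 11/3. The penalty is zero exactly when all weights are equal, which matches the intuition that it encourages weights to share values.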

arXiv:2309.16413

Genetic Engineering Algorithm (GEA): An Efficient Metaheuristic Algorithm for Solving Combinatorial Optimization Problems

Authors: Majid Sohrabi, Amir M. Fathollahi-Fard, Vasilii A. Gromov

Abstract: Genetic Algorithms (GAs) are known for their efficiency in solving combinatorial optimization problems, thanks to their ability to explore diverse solution spaces, handle various representations, exploit parallelism, preserve good solutions, adapt to changing dynamics, handle combinatorial diversity, and provide heuristic search. However, limitations such as premature convergence, lack of problem-specific knowledge, and randomness of crossover and mutation operators make GAs generally inefficient in finding an optimal solution. To address these limitations, this paper proposes a new metaheuristic algorithm called the Genetic Engineering Algorithm (GEA) that draws inspiration from genetic engineering concepts. GEA redesigns the traditional GA while incorporating new search methods to isolate, purify, insert, and express new genes based on existing ones, leading to the emergence of desired traits and the production of specific chromosomes based on the selected genes. Comparative evaluations against state-of-the-art algorithms on benchmark instances demonstrate the superior performance of GEA, showcasing its potential as an innovative and efficient solution for combinatorial optimization problems.

Submitted 28 September, 2023; originally announced September 2023.

Comments: Accepted in Data Analytics and Management in Data Intensive Domains (DAMDID/RCDL 2023)

arXiv:2309.14399

Date-Driven Approach for Identifying State of Hemodialysis Fistulas: Entropy-Complexity and Formal Concept Analysis

Authors: Vasilii A. Gromov, E. I. Zvorykina, Yurii N. Beschastnov, Majid Sohrabi

Abstract: The paper explores mathematical methods that differentiate regular and chaotic time series, specifically for identifying pathological fistulas. It proposes a noise-resistant method for classifying responding rows of normally and pathologically functioning fistulas. This approach is grounded in the hypothesis that laminar blood flow signifies normal function, while turbulent flow indicates pathology. The study explores two distinct methods for distinguishing chaotic from regular time series. The first method involves mapping the time series onto the entropy-complexity plane and subsequently comparing it to established clusters. The second method, introduced by the authors, constructs a concepts-objects graph using formal concept analysis. Both of these methods exhibit high efficiency in determining the state of the fistula.

Submitted 25 September, 2023; originally announced September 2023.

Comments: Accepted in AIST-2023 conference. Yerevan, Armenia

Showing 1–5 of 5 results for author: Sohrabi, M