-
System Level Synthesis for Affine Control Policies: Model Based and Data-Driven Settings
Authors:
Lukas Schüepp,
Giulia De Pasquale,
Florian Dörfler,
Carmen Amo Alonso
Abstract:
There is an increasing need for effective control of systems with complex dynamics, particularly through data-driven approaches. System Level Synthesis (SLS) has emerged as a powerful framework that facilitates the control of large-scale systems while accounting for model uncertainties. SLS approaches are currently limited to linear systems and time-varying linear control policies, thus limiting the class of achievable control strategies. We introduce a novel closed-loop parameterization for time-varying affine control policies, extending the SLS framework to a broader class of systems and policies. We show that the closed-loop behavior under affine policies can be equivalently characterized using past system trajectories, enabling a fully data-driven formulation. This parameterization seamlessly integrates affine policies into optimal control problems, allowing for a closed-loop formulation of general Model Predictive Control (MPC) problems. To the best of our knowledge, this is the first work to extend SLS to affine policies in both model-based and data-driven settings, enabling an equivalent formulation of MPC problems using closed-loop maps. We validate our approach through numerical experiments, demonstrating that our model-based and data-driven affine SLS formulations achieve performance on par with traditional model-based MPC.
Submitted 2 April, 2025;
originally announced April 2025.
-
Lambda-Skip Connections: the architectural component that prevents Rank Collapse
Authors:
Federico Arangath Joseph,
Jerome Sieber,
Melanie N. Zeilinger,
Carmen Amo Alonso
Abstract:
Rank collapse, a phenomenon where embedding vectors in sequence models rapidly converge to a uniform token or equilibrium state, has recently gained attention in the deep learning literature. This phenomenon leads to reduced expressivity and potential training instabilities due to vanishing gradients. Empirical evidence suggests that architectural components like skip connections, LayerNorm, and MultiLayer Perceptrons (MLPs) play critical roles in mitigating rank collapse. While this issue is well-documented for transformers, alternative sequence models, such as State Space Models (SSMs), which have recently gained prominence, have not been thoroughly examined for similar vulnerabilities. This paper extends the theory of rank collapse from transformers to SSMs using a unifying framework that captures both architectures. We study how a parametrized version of the classic skip connection component, which we call \emph{lambda-skip connections}, provides guarantees for rank collapse prevention. Through analytical results, we present a sufficient condition to guarantee prevention of rank collapse across all the aforementioned architectures. We also study the necessity of this condition via ablation studies and analytical examples. To our knowledge, this is the first study that provides a general guarantee to prevent rank collapse, and that investigates rank collapse in the context of SSMs, offering valuable understanding for both theoreticians and practitioners. Finally, we validate our findings with experiments demonstrating the crucial role of architectural components such as skip connections and gating mechanisms in preventing rank collapse.
Submitted 13 February, 2025; v1 submitted 14 October, 2024;
originally announced October 2024.
-
Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks
Authors:
Jerome Sieber,
Carmen Amo Alonso,
Alexandre Didier,
Melanie N. Zeilinger,
Antonio Orvieto
Abstract:
Softmax attention is the principal backbone of foundation models for various artificial intelligence applications, yet its quadratic complexity in sequence length can limit its inference throughput in long-context settings. To address this challenge, alternative architectures such as linear attention, State Space Models (SSMs), and Recurrent Neural Networks (RNNs) have been considered as more efficient alternatives. While connections between these approaches exist, such models are commonly developed in isolation, and there is a lack of theoretical understanding of the shared principles underpinning these architectures and of their subtle differences, which greatly influence performance and scalability. In this paper, we introduce the Dynamical Systems Framework (DSF), which allows a principled investigation of all these architectures in a common representation. Our framework facilitates rigorous comparisons, providing new insights on the distinctive characteristics of each model class. For instance, we compare linear attention and selective SSMs, detailing their differences and conditions under which both are equivalent. We also provide principled comparisons between softmax attention and other model classes, discussing the theoretical conditions under which softmax attention can be approximated. Additionally, we substantiate these new insights with empirical validations and mathematical arguments. This shows the DSF's potential to guide the systematic development of more efficient and scalable future foundation models.
Submitted 8 December, 2024; v1 submitted 24 May, 2024;
originally announced May 2024.
-
Linearly Controlled Language Generation with Performative Guarantees
Authors:
Emily Cheng,
Marco Baroni,
Carmen Amo Alonso
Abstract:
The increasing prevalence of Large Language Models (LMs) in critical applications highlights the need for controlled language generation strategies that are not only computationally efficient but that also enjoy performance guarantees. To achieve this, we use a common model of concept semantics as linearly represented in an LM's latent space. In particular, we take the view that natural language generation traces a trajectory in this continuous semantic space, realized by the language model's hidden activations. This view permits a control-theoretic treatment of text generation in latent space, in which we propose a lightweight, gradient-free intervention that dynamically steers trajectories away from regions corresponding to undesired meanings. Crucially, we show that this intervention, which we compute in closed form, is guaranteed (in probability) to steer the output into the allowed region. Finally, we demonstrate on a toxicity avoidance objective that the intervention steers language away from undesired content while maintaining text quality.
Submitted 24 May, 2024;
originally announced May 2024.
-
State Space Models as Foundation Models: A Control Theoretic Overview
Authors:
Carmen Amo Alonso,
Jerome Sieber,
Melanie N. Zeilinger
Abstract:
In recent years, there has been a growing interest in integrating linear state-space models (SSMs) in deep neural network architectures of foundation models. This is exemplified by the recent success of Mamba, showing better performance than the state-of-the-art Transformer architectures in language tasks. Foundation models, such as GPT-4, aim to encode sequential data into a latent space in order to learn a compressed representation of the data. The same goal has been pursued by control theorists using SSMs to efficiently model dynamical systems. Therefore, SSMs can be naturally connected to deep sequence modeling, offering the opportunity to create synergies between the corresponding research areas. This paper is intended as a gentle introduction to SSM-based architectures for control theorists and summarizes the latest research developments. It provides a systematic review of the most successful SSM proposals and highlights their main features from a control theoretic perspective. Additionally, we present a comparative analysis of these models, evaluating their performance on a standardized benchmark designed for assessing a model's efficiency at learning long sequences.
Submitted 25 March, 2024;
originally announced March 2024.
-
NARRATE: Versatile Language Architecture for Optimal Control in Robotics
Authors:
Seif Ismail,
Antonio Arbues,
Ryan Cotterell,
René Zurbrügg,
Carmen Amo Alonso
Abstract:
The impressive capabilities of Large Language Models (LLMs) have led to various efforts to enable robots to be controlled through natural language instructions, opening exciting possibilities for human-robot interaction. The goal is for the motor-control task to be performed accurately, efficiently and safely while also enjoying the flexibility imparted by LLMs to specify and adjust the task through natural language. In this work, we demonstrate how a careful layering of an LLM in combination with a Model Predictive Control (MPC) formulation allows for accurate and flexible robotic control via natural language while taking into consideration safety constraints. In particular, we rely on the LLM to effectively frame constraints and objective functions as mathematical expressions, which are later used in the motor-control module via MPC. The transparency of the optimization formulation allows for interpretability of the task and enables adjustments through human feedback. We demonstrate the validity of our method through extensive experiments on long-horizon reasoning, contact-rich, and multi-object interaction tasks. Our evaluations show that NARRATE outperforms existing methods on these benchmarks and effectively transfers to the real world on two different embodiments. Videos, Code and Prompts at narrate-mpc.github.io
Submitted 15 March, 2024;
originally announced March 2024.
-
Global Performance Guarantees for Localized Model Predictive Control
Authors:
Jing Shuang Li,
Carmen Amo Alonso
Abstract:
Recent advances in model predictive control (MPC) leverage local communication constraints to produce localized MPC algorithms whose complexities scale independently of total network size. However, no characterization is available regarding global performance, i.e. whether localized MPC (with communication constraints) performs just as well as global MPC (no communication constraints). In this paper, we provide analysis and guarantees on global performance of localized MPC -- in particular, we derive sufficient conditions for optimal global performance in the presence of local communication constraints. We also present an algorithm to determine the communication structure for a given system that will preserve performance while minimizing computational complexity. The effectiveness of the algorithm is verified in simulations, and additional relationships between network properties and performance-preserving communication constraints are characterized. A striking finding is that in a network of 121 coupled pendula, each node only needs to communicate with its immediate neighbors to preserve optimal global performance. Overall, this work offers theoretical understanding on the effect of local communication on global performance, and provides practitioners with the tools necessary to deploy localized model predictive control by establishing a rigorous method of selecting local communication constraints. This work also demonstrates -- surprisingly -- that the inclusion of severe communication constraints need not compromise global performance.
Submitted 29 August, 2023; v1 submitted 20 March, 2023;
originally announced March 2023.
-
Distributed and Localized Model Predictive Control. Part II: Theoretical Guarantees
Authors:
Carmen Amo Alonso,
Jing Shuang Li,
Nikolai Matni,
James Anderson
Abstract:
Engineered cyberphysical systems are growing increasingly large and complex. These systems require scalable controllers that robustly satisfy state and input constraints in the presence of additive noise -- such controllers should also be accompanied by theoretical guarantees on feasibility and stability. In our companion paper, we introduced Distributed and Localized Model Predictive Control (DLMPC) for large-scale linear systems; DLMPC is a scalable closed-loop MPC scheme in which subsystems need only exchange local information in order to synthesize and implement local controllers. In this paper, we provide recursive feasibility and asymptotic stability guarantees for DLMPC. We leverage the System Level Synthesis framework to express the maximal positive robust invariant set for the closed-loop system and its corresponding Lyapunov function, both in terms of the closed-loop system responses. We use the invariant set as the terminal set for DLMPC, and show that this guarantees feasibility with minimal conservatism. We use the Lyapunov function as the terminal cost, and show that this guarantees stability. We provide fully distributed and localized algorithms to compute the terminal set offline, and also provide necessary additions to the online DLMPC algorithm to accommodate coupled terminal constraint and cost. In all algorithms, only local information exchanges are necessary, and computational complexity is independent of the global system size -- we demonstrate this analytically and experimentally. This is the first distributed MPC approach that provides minimally conservative yet fully distributed guarantees for recursive feasibility and asymptotic stability, for both nominal and robust settings.
Submitted 1 March, 2022;
originally announced March 2022.
-
Data-driven Distributed and Localized Model Predictive Control
Authors:
Carmen Amo Alonso,
Fengjun Yang,
Nikolai Matni
Abstract:
Motivated by large-scale but computationally constrained settings, e.g., the Internet of Things, we present a novel data-driven distributed control algorithm that is synthesized directly from trajectory data. Our method, data-driven Distributed and Localized Model Predictive Control (D$^3$LMPC), builds upon the data-driven System Level Synthesis (SLS) framework, which allows one to parameterize \emph{closed-loop} system responses directly from collected open-loop trajectories. The resulting model-predictive controller can be implemented with distributed computation and only local information sharing. By imposing locality constraints on the system response, we show that the amount of data needed for our synthesis problem is independent of the size of the global system. Moreover, we show that our algorithm enjoys theoretical guarantees for recursive feasibility and asymptotic stability. Finally, we also demonstrate the optimality and scalability of our algorithm in a simulation experiment.
Submitted 22 December, 2021;
originally announced December 2021.
-
Distributed and Localized Model Predictive Control. Part I: Synthesis and Implementation
Authors:
Carmen Amo Alonso,
Jing Shuang Li,
James Anderson,
Nikolai Matni
Abstract:
The increasing presence of large-scale distributed systems highlights the need for scalable control strategies where only local communication is required. Moreover, in safety-critical systems it is imperative that such control strategies handle constraints in the presence of disturbances. In response to this need, we present the Distributed and Localized Model Predictive Control (DLMPC) algorithm for large-scale linear systems. DLMPC is a distributed closed-loop model predictive control (MPC) scheme wherein only local state and model information needs to be exchanged between subsystems for the computation and implementation of control actions. We use the System Level Synthesis (SLS) framework to reformulate the centralized MPC problem, and show that this allows us to naturally impose localized communication constraints between sub-controllers. The structure of the resulting problem can be exploited to develop an Alternating Direction Method of Multipliers (ADMM) based algorithm that allows for distributed and localized computation of closed-loop control policies. We demonstrate that the computational complexity of the subproblems solved by each subsystem in DLMPC is independent of the size of the global system. To the best of our knowledge, DLMPC is the first MPC algorithm that allows for the scalable distributed computation as well as implementation of distributed closed-loop control policies, and seamlessly deals with additive disturbances. In our companion paper, we show that this approach enjoys recursive feasibility and asymptotic stability.
Submitted 14 March, 2022; v1 submitted 13 October, 2021;
originally announced October 2021.
-
Effective GPU Parallelization of Distributed and Localized Model Predictive Control
Authors:
Carmen Amo Alonso,
Shih-Hao Tseng
Abstract:
To effectively control large-scale distributed systems online, model predictive control (MPC) has to swiftly solve the underlying high-dimensional optimization. There are multiple techniques applied to accelerate the solving process in the literature, mainly attributed to software-based algorithmic advancements and hardware-assisted computation enhancements. However, those methods focus on arithmetic accelerations and overlook the benefits of the underlying system's structure. In particular, the existing decoupled software-hardware algorithm design that naively parallelizes the arithmetic operations by the hardware does not tackle the hardware overheads such as CPU-GPU and thread-to-thread communications in a principled manner. Also, the advantages of parallelizable subproblem decomposition in distributed MPC are not well recognized and exploited. As a result, we have not reached the full potential of hardware acceleration for MPC. In this paper, we explore those opportunities by leveraging GPUs to parallelize the distributed and localized MPC (DLMPC) algorithm. We exploit the locality constraints embedded in the DLMPC formulation to reduce the hardware-intrinsic communication overheads. Our parallel implementation achieves up to 50x faster runtime than its CPU counterparts under various parameters. Furthermore, we find that the locality-aware GPU parallelization could halve the optimization runtime compared to the naive acceleration. Overall, our results demonstrate the performance gains brought by software-hardware co-design with the information exchange structure in mind.
Submitted 27 March, 2021;
originally announced March 2021.
-
Robust Distributed and Localized Model Predictive Control
Authors:
Carmen Amo Alonso,
Jing Shuang Li,
Nikolai Matni,
James Anderson
Abstract:
We present a robust Distributed and Localized Model Predictive Control (rDLMPC) framework for large-scale structured linear systems. The proposed algorithm uses the System Level Synthesis (SLS) framework to provide a distributed closed-loop model predictive control scheme that is robust to exogenous disturbances. The resulting controllers require only local information exchange for both synthesis and implementation. We exploit the fact that for polytopic disturbance constraints, SLS-based distributed control problems have been shown to have structure amenable to distributed optimization techniques. We show that, similar to the disturbance-free DLMPC algorithm, the computational complexity of rDLMPC is independent of the size of the global system. To the best of our knowledge, robust DLMPC is the first MPC algorithm that allows for the scalable distributed computation of distributed closed-loop control policies in the presence of additive disturbances.
Submitted 25 March, 2021;
originally announced March 2021.
-
Distributed Linear Quadratic Regulator Robust to Communication Dropouts
Authors:
Carmen Amo Alonso,
Dimitar Ho,
Jose M. Maestre
Abstract:
We present a solution to deal with information package dropouts in distributed controllers for large-scale networks. We do this by leveraging the System Level Synthesis approach, a control framework particularly suitable for large-scale networks that addresses information exchange in a very transparent manner. To this end, we propose two different schemes for controller synthesis and implementation. The first one synthesizes a controller inherently robust to dropouts, which is later implemented in an offline fashion. For the second approach, we synthesize a collection of controllers offline and then switch between different controllers online depending on the current dropouts detected in the system. The two approaches are illustrated and compared by means of a simulation example.
Submitted 5 March, 2021;
originally announced March 2021.
-
Frontiers in Scalable Distributed Control: SLS, MPC, and Beyond
Authors:
Jing Shuang Li,
Carmen Amo Alonso,
John C. Doyle
Abstract:
The System Level Synthesis (SLS) approach facilitates distributed control of large cyberphysical networks in an easy-to-understand, computationally scalable way. We present an overview of the SLS approach and its associated extensions in nonlinear control, MPC, adaptive control, and learning for control. To illustrate the effectiveness of SLS-based methods, we present a case study motivated by the power grid, with communication constraints, actuator saturation, disturbances, and changing setpoints. This simple but challenging case study necessitates the use of model predictive control (MPC); however, standard MPC techniques often scale poorly to large systems and incur a heavy computational burden. To address this challenge, we combine two SLS-based controllers to form a layered MPC-like controller. Our controller has constant computational complexity with respect to the system size, gives a 20-fold reduction in online computation requirements, and still achieves performance that is within 3% of the centralized MPC controller.
Submitted 31 March, 2021; v1 submitted 3 October, 2020;
originally announced October 2020.
-
Explicit Distributed and Localized Model Predictive Control via System Level Synthesis
Authors:
Carmen Amo Alonso,
Nikolai Matni,
James Anderson
Abstract:
An explicit Model Predictive Control algorithm for large-scale structured linear systems is presented. We base our results on Distributed and Localized Model Predictive Control (DLMPC), a closed-loop model predictive control scheme based on the System Level Synthesis (SLS) framework wherein only local state and model information needs to be exchanged between subsystems for the computation and implementation of control actions. We provide an explicit solution for each of the subproblems resulting from the distributed MPC scheme. We show that given the separability of the problem, the explicit solution is only divided into three regions per state and input instantiation, making the point location problem very efficient. Moreover, given the locality constraints, the subproblems are of much smaller dimension than the full problem, which significantly reduces the computational overhead of explicit solutions. We conclude with numerical simulations to demonstrate the computational advantages of our method, in which we show a large improvement in runtime per MPC iteration as compared with the results of computing the optimization with a solver online.
Submitted 28 May, 2020;
originally announced May 2020.
-
Distributed and Localized Model Predictive Control via System Level Synthesis
Authors:
Carmen Amo Alonso,
Nikolai Matni
Abstract:
We present the Distributed and Localized Model Predictive Control (DLMPC) algorithm for large-scale structured linear systems, wherein only local state and model information needs to be exchanged between subsystems for the computation and implementation of control actions. We use the System Level Synthesis (SLS) framework to reformulate the MPC problem as an optimization problem over closed loop system responses, and show that this allows us to naturally impose localized communication constraints between sub-controllers, such that only local state and system model information needs to be exchanged for both computation and implementation of closed loop MPC control policies. In particular, we show that the structure of the resulting optimization problem can be exploited to develop an Alternating Direction Method of Multipliers (ADMM) based algorithm that allows for distributed and localized computation of control decisions. Moreover, our approach can accommodate constraints and objective functions that couple the behavior of different subsystems, so long as the coupled systems are able to communicate directly with each other, allowing for a broader class of MPC problems to be solved via distributed optimization. We conclude with numerical simulations to demonstrate the usefulness of our method, and in particular, we demonstrate that the computational complexity of the subproblems solved by each subsystem in DLMPC is independent of the size of the global system.
Submitted 10 September, 2020; v1 submitted 22 September, 2019;
originally announced September 2019.