-
Data-Driven Nonlinear Regulation: Gaussian Process Learning
Authors:
Telema Harry,
Martin Guay,
Shimin Wang,
Richard D. Braatz
Abstract:
This article addresses the output regulation problem for a class of nonlinear systems using a data-driven approach. An output feedback controller is proposed that integrates a traditional control component with a data-driven learning algorithm based on Gaussian Process (GP) regression to learn the nonlinear internal model. Specifically, a data-driven technique is employed to directly approximate the unknown internal model steady-state map from observed input-output data online. Our method does not rely on model-based observers utilized in previous studies, making it robust and suitable for systems with modelling errors and model uncertainties. Finally, we demonstrate through numerical examples and detailed stability analysis that, under suitable conditions, the closed-loop system remains bounded and converges to a compact set, with the size of this set decreasing as the accuracy of the data-driven model improves over time.
Submitted 10 June, 2025;
originally announced June 2025.
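As a rough illustration of the Gaussian Process regression step mentioned in the abstract above, the following minimal sketch computes a GP posterior mean from a handful of samples. It is not the authors' controller or learning law: the RBF kernel, the length scale, the noise variance, the sine test function, and all function names are assumptions chosen only for illustration.

```python
import math


def rbf(x1, x2, length_scale=1.0):
    """Squared-exponential (RBF) kernel between two scalar inputs."""
    return math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gp_posterior_mean(x_train, y_train, x_query, length_scale=1.0, noise_var=1e-6):
    """Posterior mean k(x*, X) (K + noise_var * I)^{-1} y of a zero-mean GP."""
    K = [
        [rbf(a, b, length_scale) + (noise_var if i == j else 0.0)
         for j, b in enumerate(x_train)]
        for i, a in enumerate(x_train)
    ]
    alpha = solve(K, y_train)
    return sum(rbf(x_query, xi, length_scale) * ai for xi, ai in zip(x_train, alpha))


# Hypothetical data: samples of an "unknown" map, here sin(x) as a stand-in.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [math.sin(x) for x in xs]
mean_at_sample = gp_posterior_mean(xs, ys, 1.0)
print(mean_at_sample)
```

With a small noise variance, the posterior mean nearly interpolates the training samples; an online scheme would repeat this fit as new input-output data arrive.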
-
Nonadaptive Output Regulation of Second-Order Nonlinear Uncertain Systems
Authors:
Maobin Lu,
Martin Guay,
Telema Harry,
Shimin Wang,
Jordan Cooper
Abstract:
This paper investigates the robust output regulation problem of second-order nonlinear uncertain systems with an unknown exosystem. Instead of the adaptive control approach, this paper resorts to a robust control methodology to solve the problem and thus avoid the bursting phenomenon. In particular, this paper constructs generic internal models for the steady-state state and input variables of the system. By introducing a coordinate transformation, this paper converts the robust output regulation problem into a nonadaptive stabilization problem of an augmented system composed of the second-order nonlinear uncertain system and the generic internal models. Then, we design the stabilization control law and construct a strict Lyapunov function that guarantees the robustness with respect to unmodeled disturbances. The analysis shows that the output zeroing manifold of the augmented system can be made attractive by the proposed nonadaptive control law, which solves the robust output regulation problem. Finally, we demonstrate the effectiveness of the proposed nonadaptive internal model approach by its application to the control of the Duffing system.
Submitted 27 May, 2025;
originally announced May 2025.
-
Optimal Output Feedback Learning Control for Discrete-Time Linear Quadratic Regulation
Authors:
Kedi Xie,
Martin Guay,
Shimin Wang,
Fang Deng,
Maobin Lu
Abstract:
This paper studies the linear quadratic regulation (LQR) problem of unknown discrete-time systems via dynamic output feedback learning control. In contrast to the state feedback, the optimality of the dynamic output feedback control for solving the LQR problem requires an implicit condition on the convergence of the state observer. Moreover, due to unknown system matrices and the existence of observer error, it is difficult to analyze the convergence and stability of most existing output feedback learning-based control methods. To tackle these issues, we propose a generalized dynamic output feedback learning control approach with guaranteed convergence, stability, and optimality performance for solving the LQR problem of unknown discrete-time linear systems. In particular, a dynamic output feedback controller is designed to be equivalent to a state feedback controller. This equivalence relationship is an inherent property without requiring convergence of the estimated state by the state observer, which plays a key role in establishing the off-policy learning control approaches. By value iteration and policy iteration schemes, the adaptive dynamic programming based learning control approaches are developed to estimate the optimal feedback control gain. In addition, a model-free stability criterion is provided by finding a nonsingular parameterization matrix, which contributes to establishing a switched iteration scheme. Furthermore, the convergence, stability, and optimality analyses of the proposed output feedback learning control approaches are given. Finally, the theoretical results are validated by two numerical examples.
Submitted 27 May, 2025; v1 submitted 8 March, 2025;
originally announced March 2025.
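For readers unfamiliar with value iteration in the LQR setting, the sketch below runs the classical model-based Riccati recursion for a scalar discrete-time system x[k+1] = a x[k] + b u[k] with stage cost q x^2 + r u^2. This is only a textbook baseline under the assumption that a and b are known; the output-feedback, model-free learning scheme described in the abstract above is not reproduced here, and all names and numerical values are illustrative.

```python
def lqr_value_iteration(a, b, q, r, tol=1e-12, max_iter=10_000):
    """Value iteration for the scalar discrete-time LQR problem.

    Iterates P <- q + a^2 P - (a b P)^2 / (r + b^2 P) starting from P = 0
    and returns the converged value P and the feedback gain K (u = -K x).
    """
    p = 0.0
    for _ in range(max_iter):
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    k = a * b * p / (r + b * b * p)
    return p, k


# Hypothetical example system: a = b = q = r = 1.
p, k = lqr_value_iteration(a=1.0, b=1.0, q=1.0, r=1.0)
print(p, k)
```

For this example the fixed point satisfies P^2 = P + 1, so P converges to the golden ratio and the closed-loop pole a - b K lies inside the unit circle.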
-
Deficient Excitation in Parameter Learning
Authors:
Ganghui Cao,
Shimin Wang,
Martin Guay,
Jinzhi Wang,
Zhisheng Duan,
Marios M. Polycarpou
Abstract:
This paper investigates parameter learning problems under deficient excitation (DE). The DE condition is a rank-deficient, and therefore more general, evolution of the well-known persistent excitation condition. Under the DE condition, a proposed online algorithm is able to calculate the identifiable and non-identifiable subspaces, and finally give an optimal parameter estimate in the sense of least squares. In particular, the learning error within the identifiable subspace exponentially converges to zero in the noise-free case, even without persistent excitation. The DE condition also provides a new perspective for solving distributed parameter learning problems, where the challenge is posed by local regressors that are often insufficiently excited. To improve knowledge of the unknown parameters, a cooperative learning protocol is proposed for a group of estimators that collect measured information under complementary DE conditions. This protocol allows each local estimator to operate locally in its identifiable subspace, and reach a consensus with neighbours in its non-identifiable subspace. As a result, the task of estimating unknown parameters can be achieved in a distributed way using cooperative local estimators. Application examples in system identification are given to demonstrate the effectiveness of the theoretical results developed in this paper.
Submitted 27 May, 2025; v1 submitted 3 March, 2025;
originally announced March 2025.
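The behaviour described in the abstract above, convergence inside the identifiable subspace without persistent excitation, can be illustrated with a standard normalized-gradient estimator. This is a generic textbook update, not the algorithm proposed in the paper; the two-parameter model, the regressor sequence, and the gain are assumptions made up for the example. The regressor always points along [1, 1], so only the sum of the two parameters is identifiable.

```python
def normalized_gradient_step(theta, phi, y, gain=0.5):
    """One normalized-gradient update of the estimate theta for y = phi^T theta."""
    error = y - sum(p * t for p, t in zip(phi, theta))
    norm_sq = sum(p * p for p in phi)
    if norm_sq == 0.0:
        return list(theta)  # no information in this sample
    return [t + gain * p * error / norm_sq for p, t in zip(phi, theta)]


theta_true = [2.0, -1.0]   # hypothetical unknown parameters
theta_hat = [0.0, 0.0]     # initial estimate

for step in range(60):
    scale = 1.0 + 0.5 * (step % 3)   # amplitude varies, direction stays [1, 1]
    phi = [scale, scale]             # rank-deficient excitation
    y = sum(p * t for p, t in zip(phi, theta_true))  # noise-free measurement
    theta_hat = normalized_gradient_step(theta_hat, phi, y)

err = [h - t for h, t in zip(theta_hat, theta_true)]
identifiable_error = err[0] + err[1]     # error component along [1, 1]
unidentifiable_error = err[0] - err[1]   # error component along [1, -1]
print(theta_hat, identifiable_error, unidentifiable_error)
```

Each update halves the error component along [1, 1] and leaves the component along [1, -1] at its initial value, which is why cooperation between estimators with complementary excitation directions can help.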
-
Learning-Enhanced Safeguard Control for High-Relative-Degree Systems: Robust Optimization under Disturbances and Faults
Authors:
Xinyang Wang,
Hongwei Zhang,
Shimin Wang,
Wei Xiao,
Martin Guay
Abstract:
Merely pursuing performance may adversely affect safety, while a conservative policy for safe exploration will degrade performance. How to balance safety and performance in learning-based control problems is an interesting yet challenging issue. This paper aims to enhance system performance with a safety guarantee in solving the reinforcement learning (RL)-based optimal control problems of nonlinear systems subject to high-relative-degree state constraints and unknown time-varying disturbance/actuator faults. First, to combine control barrier functions (CBFs) with RL, a new type of CBF, termed the high-order reciprocal control barrier function (HO-RCBF), is proposed to deal with high-relative-degree constraints during the learning process. Then, the concept of gradient similarity is proposed to quantify the relationship between the gradient of safety and the gradient of performance. Finally, gradient manipulation and adaptive mechanisms are introduced in the safe RL framework to enhance the performance with a safety guarantee. Two simulation examples illustrate that the proposed safe RL framework can address high-relative-degree constraints, enhance safety robustness, and improve system performance.
Submitted 25 January, 2025;
originally announced January 2025.
-
Generalized Pose Space Embeddings for Training In-the-Wild using Analysis-by-Synthesis
Authors:
Dominik Borer,
Jakob Buhmann,
Martin Guay
Abstract:
Modern pose estimation models are trained on large, manually-labelled datasets which are costly and may not cover the full extent of human poses and appearances in the real world. With advances in neural rendering, analysis-by-synthesis, with its ability not only to predict but also to render the pose, is becoming an appealing framework that could alleviate the need for large-scale manual labelling efforts. While recent work has shown the feasibility of this approach, the predictions admit many flips due to a simplistic intermediate skeleton representation, resulting in low precision and inhibiting the acquisition of any downstream knowledge such as three-dimensional positioning. We solve this problem with a more expressive intermediate skeleton representation capable of capturing the semantics of the pose (left and right), which significantly reduces flips. To successfully train this new representation, we extend the analysis-by-synthesis framework with a training protocol based on synthetic data. We show that our representation results in fewer flips and more accurate predictions. Our approach outperforms previous models trained with analysis-by-synthesis on standard benchmarks.
Submitted 13 November, 2024;
originally announced November 2024.
-
Nonlinear Bipartite Output Regulation with Application to Turing Pattern
Authors:
Dong Liang,
Martin Guay,
Shimin Wang
Abstract:
In this paper, a bipartite output regulation problem is solved for a class of nonlinear multi-agent systems subject to static signed communication networks. A nonlinear distributed observer is proposed for a nonlinear exosystem with cooperation-competition interactions to address the problem. Sufficient conditions are provided to guarantee its existence and stability. The exponential stability of the observer is established. As a practical application, a leader-following bipartite consensus problem is solved for a class of nonlinear multi-agent systems based on the observer. Finally, a network of multiple pendulum systems is treated to support the feasibility of the proposed design. The possible application of the approach to generate specific Turing patterns is also presented.
Submitted 24 May, 2023;
originally announced May 2023.
-
Rig-space Neural Rendering
Authors:
Dominik Borer,
Lu Yuhang,
Laura Wuelfroth,
Jakob Buhmann,
Martin Guay
Abstract:
Movie productions use high-resolution 3D characters with complex proprietary rigs to create the highest quality images possible for large displays. Unfortunately, these 3D assets are typically not compatible with real-time graphics engines used for games, mixed reality and real-time pre-visualization. Consequently, the 3D characters need to be re-modeled and re-rigged for these new applications, requiring weeks of work and artistic approval. Our solution to this problem is to learn a compact image-based rendering of the original 3D character, conditioned directly on the rig parameters. Our idea is to render the character in many different poses and views, and to train a deep neural network to render high-resolution images directly from the rig parameters. Many neural rendering techniques have been proposed to render from 2D skeletons, or geometry and UV maps. However, these require manual work and do not remain compatible with the animator workflow of manipulating rig widgets or with the real-time game engine pipeline of interpolating rig parameters. We extend our architecture to support dynamic re-lighting and composition with other 3D objects in the scene. We designed a network that efficiently generates multiple scene feature maps such as normals, depth, albedo and mask, which are composed with other scene objects to form the final image.
Submitted 22 March, 2020;
originally announced March 2020.
-
Animating an Autonomous 3D Talking Avatar
Authors:
Dominik Borer,
Dominik Lutz,
Martin Guay
Abstract:
One of the main challenges with embodying a conversational agent is annotating how and when motions can be played and composed together in real-time, without any visual artifact. The inherent problem is to do so, for a large number of motions, without introducing mistakes in the annotation. To our knowledge, there is no automatic method that can process animations and automatically label actions and the compatibility between them. In practice, a state machine, where clips are the actions, is created manually by setting connections between the states with the timing parameters for these connections. Authoring this state machine for a large number of motions leads to visual overflow and increases the number of possible mistakes. As a consequence, conversational agent embodiments are left with little variation and quickly become repetitive. In this paper, we address this problem with a compact taxonomy of chit-chat behaviors that we can use to simplify and partially automate the graph authoring process. We measured the time required to label the actions of an embodiment using our simple interface, compared to the standard state machine interface in Unreal Engine, and found that our approach is 7 times faster. We believe that our labeling approach could be a path to automated labeling: once a subset of motions is labeled (using our interface), we could learn a predictor that attributes a label to new clips, making it possible to really scale up virtual agent embodiments.
Submitted 13 March, 2019;
originally announced March 2019.