-
Data-Driven Near-Optimal Control of Nonlinear Systems Over Finite Horizon
Authors:
Vasanth Reddy,
Hoda Eldardiry,
Almuatazbellah Boker
Abstract:
We examine the problem of two-point boundary optimal control of nonlinear systems over finite-horizon time periods with unknown model dynamics by employing reinforcement learning. We use techniques from singular perturbation theory to decompose the control problem over the finite horizon into two sub-problems, each solved over an infinite horizon. In the process, we avoid the need to solve the time-varying Hamilton-Jacobi-Bellman equation. This decomposition makes a policy iteration method feasible, which we use to learn the controller gains of both sub-problems. The overall control is then formed by piecing together the solutions to the two sub-problems. We show that the performance of the proposed closed-loop system approaches that of the model-based optimal controller as the time horizon grows. Finally, we provide three simulation scenarios to support the paper's claims.
Submitted 8 June, 2023;
originally announced June 2023.
-
Singular Perturbation-based Reinforcement Learning of Two-Point Boundary Optimal Control Systems
Authors:
Vasanth Reddy,
Hoda Eldardiry,
Almuatazbellah Boker
Abstract:
This work presents a technique for learning systems, where the learning process is guided by knowledge of the physics of the system. In particular, we solve the two-point boundary optimal control problem for linear time-varying systems with unknown model dynamics using reinforcement learning. Borrowing techniques from singular perturbation theory, we transform the time-varying optimal control problem into a couple of time-invariant subproblems. This allows the utilization of an off-policy iteration method to learn the controller gains. We show that the performance of the learning-based controller approximates that of the model-based optimal controller and that the accuracy of the approximation improves as the time horizon of the control problem increases. Finally, we provide a simulation example to verify the results of the paper.
Submitted 29 April, 2021; v1 submitted 19 April, 2021;
originally announced April 2021.
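As a rough illustration of the policy iteration idea mentioned in the abstract above, the following is a minimal sketch for a single scalar, time-invariant linear-quadratic subproblem. It is model-based (it uses `a` and `b` directly), so it is not the authors' data-driven off-policy method; the function name `scalar_policy_iteration`, the scalar plant, and all numeric values are assumptions chosen for illustration.

```python
def scalar_policy_iteration(a, b, q, r, k0, iterations=20):
    """Kleinman-style policy iteration for dx/dt = a*x + b*u with u = -k*x.

    Cost: integral of q*x^2 + r*u^2 over an infinite horizon.
    Hypothetical helper; assumes k0 is a stabilizing gain (a - b*k0 < 0).
    """
    k = k0
    p = None
    for _ in range(iterations):
        closed_loop = a - b * k
        if closed_loop >= 0:
            raise ValueError("gain must stabilize the closed loop")
        # Policy evaluation: solve 2*(a - b*k)*p + q + r*k^2 = 0 for p.
        p = (q + r * k * k) / (-2.0 * closed_loop)
        # Policy improvement: update the feedback gain from the new cost.
        k = b * p / r
    return k, p


# Example with an unstable plant (a = 1) and a stabilizing initial gain.
gain, cost = scalar_policy_iteration(a=1.0, b=1.0, q=1.0, r=1.0, k0=2.0)
print(gain, cost)
```

For these values the gain should approach 1 + sqrt(2), the positive root of the scalar Riccati equation p^2 - 2p - 1 = 0. A data-driven variant would replace the policy evaluation step with an estimate obtained from measured trajectories.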
-
Predicting Coordinated Actuated Traffic Signal Change Times using LSTM Neural Networks
Authors:
Seifeldeen Eteifa,
Hesham A. Rakha,
Hoda Eldardiry
Abstract:
Vehicle acceleration and deceleration maneuvers at traffic signals result in significant fuel and energy consumption levels. Green light optimal speed advisory systems require reliable estimates of signal switching times to improve vehicle fuel efficiency. Obtaining these estimates is difficult for actuated signals where the length of each green indication changes to accommodate varying traffic conditions. This study details a four-step Long Short-Term Memory deep learning-based methodology that can be used to provide reasonable switching time estimates from green to red and vice versa while being robust to missing data. The four steps are data gathering, data preparation, machine learning model tuning, and model testing and evaluation. The inputs to the models include controller logic, signal timing parameters, time of day, traffic state from detectors, vehicle actuation data, and pedestrian actuation data. The methodology is applied and evaluated on data from an intersection in Northern Virginia. A comparative analysis is conducted between different loss functions used in LSTM, including the mean squared error, mean absolute error, and mean relative error, and a new loss function is proposed. The results show that while the proposed loss function outperforms conventional loss functions in terms of overall absolute error values, the choice of the loss function is dependent on the prediction horizon. In particular, the proposed loss function is outperformed by the mean relative error for very short prediction horizons and by the mean squared error for very long prediction horizons.
Submitted 10 August, 2020;
originally announced August 2020.
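The three conventional loss functions compared in the abstract above can be sketched as follows. This is a plain-Python illustration, not the authors' code: the function names are hypothetical, the mean relative error is assumed to be the mean of |error| / |true value| (the abstract does not define it), and the paper's proposed loss function is not reproduced because the abstract does not specify it.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared errors; penalizes large misses heavily."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average of absolute errors, in the same units as the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_relative_error(y_true, y_pred):
    """Average of |error| / |true value| (assumed definition).

    Requires nonzero true values; weights errors on short
    times-to-change more than the same error on long ones.
    """
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative (made-up) times to signal change, in seconds.
true_times = [10.0, 20.0, 40.0]
predicted_times = [12.0, 18.0, 44.0]
print(mean_squared_error(true_times, predicted_times))
print(mean_absolute_error(true_times, predicted_times))
print(mean_relative_error(true_times, predicted_times))
```

Because the relative error divides by the true value, a 2-second miss counts more at a 10-second horizon than at a 40-second horizon, which is consistent with the abstract's observation that the preferred loss depends on the prediction horizon.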
-
Two-stage building energy consumption clustering based on temporal and peak demand patterns
Authors:
Milad Afzalan,
Farrokh Jazizadeh,
Hoda Eldardiry
Abstract:
Analyzing smart meter data to understand energy consumption patterns helps utilities and energy providers perform customized demand response operations. Existing energy consumption segmentation techniques use assumptions that could result in reduced quality of clusters in representing their members. We address this limitation by introducing a two-stage clustering method that more accurately captures load shape temporal patterns and peak demands. In the first stage, load shapes are clustered by allowing a large number of clusters to accurately capture variations in energy use patterns, and cluster centroids are extracted by accounting for shape misalignments. In the second stage, clusters with similar centroids and power magnitude ranges are merged using Dynamic Time Warping. We used three datasets consisting of ~250 households (~15000 profiles) to demonstrate the performance improvement, compared to baseline methods, and discuss the impact on energy management.
Submitted 29 August, 2020; v1 submitted 10 August, 2020;
originally announced August 2020.
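The Dynamic Time Warping distance used to compare centroids in the second stage described above can be sketched as follows. This is the textbook unconstrained variant with an absolute-difference cost; the abstract does not state which variant, window, or cost the authors use, so the function name `dtw_distance` and these choices are assumptions.

```python
def dtw_distance(series_a, series_b):
    """Classic dynamic-programming DTW with absolute-difference cost.

    Hypothetical helper: no warping window and no normalization.
    """
    n, m = len(series_a), len(series_b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of the first i and first j points.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(series_a[i - 1] - series_b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # repeat a point of series_b
                cost[i][j - 1],      # repeat a point of series_a
                cost[i - 1][j - 1],  # advance both series
            )
    return cost[n][m]


# Two made-up load shapes with the same peak, shifted by one time step.
shape_a = [1.0, 1.0, 5.0, 1.0, 1.0]
shape_b = [1.0, 5.0, 1.0, 1.0, 1.0]
print(dtw_distance(shape_a, shape_b))
```

A point-by-point comparison would treat the two shifted shapes as very different, whereas DTW can align the peaks; this tolerance to misalignment is why it suits merging clusters whose centroids differ mainly in timing.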