-
Optimal control barrier functions for RL based safe powertrain control
Authors:
Habtamu Hailemichael,
Beshah Ayalew,
Andrej Ivanco
Abstract:
Reinforcement learning (RL) can improve control performance by seeking to learn optimal control policies in the end-use environment for vehicles and other systems. To accomplish this, RL algorithms need to sufficiently explore the state and action spaces. This presents inherent safety risks, and applying RL on safety-critical systems like vehicle powertrain control requires safety enforcement approaches. In this paper, we seek control-barrier function (CBF)-based safety certificates that demarcate safe regions where the RL agent could optimize the control performance. In particular, we derive optimal high-order CBFs that avoid conservatism while ensuring safety for a vehicle in traffic. We demonstrate the workings of the high-order CBF with an RL agent which uses a deep actor-critic architecture to learn to optimize fuel economy and other driver accommodation metrics. We find that the optimized high-order CBF allows the RL-based powertrain control agent to achieve higher total rewards without any crashes in training and evaluation while achieving better accommodation of driver demands compared to previously proposed exponential barrier function filters and model-based baseline controllers.
Submitted 18 May, 2024;
originally announced May 2024.
-
Combined film and pulse heating of lithium ion batteries to improve performance in low ambient temperature
Authors:
Habtamu Hailemichael,
Beshah Ayalew
Abstract:
Low ambient temperatures significantly reduce lithium-ion batteries' (LIBs') charge/discharge power and energy capacity, and cause rapid degradation through lithium plating. These limitations can be addressed by preheating the LIB with an external heat source or by exploiting the internal heat generation through the LIB's internal impedance. Fast external heating generates large temperature gradients across the LIB due to the low thermal conductivity of the cell, while internal impedance heating (usually through AC or pulse charging/discharging) tends to be relatively slow, although it can achieve a more uniform temperature distribution. This paper investigates the potential of combining externally sourced resistive film heating with bidirectional pulse heating to achieve fast preheating without causing steep temperature gradients. The LIB is modeled with the Doyle-Fuller-Newman (DFN) electrochemical model and a 1D thermal model, and reinforcement learning (RL) is used to optimize the pulse current amplitude and film voltage concurrently. The results indicate that the optimal policy for maximizing the rate of temperature rise while limiting temperature gradients has the film heating dominate the initial phases and create the ideal conditions for pulse heating to take over. In addition, the pulse component shares the heating load and reduces the energy rating of the auxiliary power source.
Submitted 18 May, 2024;
originally announced May 2024.
-
Safe Reinforcement Learning for an Energy-Efficient Driver Assistance System
Authors:
Habtamu Hailemichael,
Beshah Ayalew,
Lindsey Kerbel,
Andrej Ivanco,
Keith Loiselle
Abstract:
Reinforcement learning (RL)-based driver assistance systems seek to improve fuel consumption via continual improvement of powertrain control actions considering experiential data from the field. However, the need to explore diverse experiences in order to learn optimal policies often limits the application of RL techniques in safety-critical systems like vehicle control. In this paper, an exponential control barrier function (ECBF) is derived and utilized to filter unsafe actions proposed by an RL-based driver assistance system. The RL agent freely explores and optimizes the performance objectives while unsafe actions are projected to the closest actions in the safe domain. The reward is structured so that the driver's acceleration requests are met in a manner that boosts fuel economy and does not compromise comfort. The optimal gear and traction torque control actions that maximize the cumulative reward are computed via the Maximum a Posteriori Policy Optimization (MPO) algorithm configured for a hybrid action space. The proposed safe-RL scheme is trained and evaluated in car-following scenarios, where it is shown to effectively avoid collisions during both training and evaluation while delivering the expected fuel economy improvements for the driver assistance system.
Submitted 2 January, 2023;
originally announced January 2023.
-
Safety Filtering for Reinforcement Learning-based Adaptive Cruise Control
Authors:
Habtamu Hailemichael,
Beshah Ayalew,
Lindsey Kerbel,
Andrej Ivanco,
Keith Loiselle
Abstract:
Reinforcement learning (RL)-based adaptive cruise control (ACC) systems that learn and adapt to road, traffic and vehicle conditions are attractive for enhancing vehicle energy efficiency and traffic flow. However, the application of RL in safety-critical systems such as ACC requires strong safety guarantees, which are difficult to achieve with learning agents that have a fundamental need to explore. In this paper, we derive control barrier functions as safety filters that allow an RL-based ACC controller to explore freely within a collision-safe set. Specifically, we derive control barrier functions for high relative degree nonlinear systems to take into account inertia effects relevant to commercial vehicles. We also outline an algorithm for accommodating actuation saturation with these barrier functions. While any RL algorithm can be used as the performance ACC controller together with these filters, we implement the Maximum A Posteriori Policy Optimization (MPO) algorithm with a hybrid action space that learns fuel-optimal gear selection and torque control policies. The safety filtering RL approach is contrasted with a reward shaping RL approach that only learns to avoid collisions after sufficient training. Evaluations on different drive cycles demonstrate significant improvements in fuel economy with the proposed approach compared to baseline ACC algorithms.
Submitted 2 January, 2023;
originally announced January 2023.