-
An Attentive Graph Agent for Topology-Adaptive Cyber Defence
Authors:
Ilya Orson Sandoval,
Isaac Symes Thompson,
Vasilios Mavroudis,
Chris Hicks
Abstract:
As cyber threats grow increasingly sophisticated, reinforcement learning (RL) is emerging as a promising technique to create intelligent and adaptive cyber defence systems. However, most existing autonomous defensive agents have overlooked the inherent graph structure of computer networks subject to cyber attacks, potentially missing critical information and constraining their adaptability. To overcome these limitations, we developed a custom version of the Cyber Operations Research Gym (CybORG) environment, encoding network state as a directed graph with realistic low-level features. We employ a Graph Attention Network (GAT) architecture to process node, edge, and global features, and adapt its output to be compatible with policy gradient methods in RL. Our GAT-based approach offers key advantages over flattened alternatives: policies that demonstrate resilience to certain types of unexpected dynamic network topology changes, reasonable generalisation to networks of varying sizes within the same structural distribution, and interpretable defensive actions grounded in tangible network properties. We demonstrate that GAT defensive policies can be trained using our low-level directed graph observations, even when unexpected connections arise during simulation. Evaluations across networks of different sizes, but consistent subnetwork structure, show our policies achieve comparable performance to policies trained specifically for each network configuration. Our study contributes to the development of robust cyber defence systems that can better adapt to real-world network security challenges.
Submitted 15 April, 2025; v1 submitted 24 January, 2025;
originally announced January 2025.
-
The Automated Discovery of Kinetic Rate Models -- Methodological Frameworks
Authors:
Miguel Ángel de Carvalho Servia,
Ilya Orson Sandoval,
Klaus Hellgardt,
King Kuok Hii,
Dongda Zhang,
Ehecatl Antonio del Rio Chanona
Abstract:
The industrialization of catalytic processes requires reliable kinetic models for their design, optimization and control. Mechanistic models require significant domain knowledge, while data-driven and hybrid models lack interpretability. Automated knowledge discovery methods, such as ALAMO (Automated Learning of Algebraic Models for Optimization), SINDy (Sparse Identification of Nonlinear Dynamics), and genetic programming, have gained popularity but suffer from limitations such as needing model structure assumptions, exhibiting poor scalability, and displaying sensitivity to noise. To overcome these challenges, we propose two methodological frameworks, ADoK-S and ADoK-W (Automated Discovery of Kinetic rate models using a Strong/Weak formulation of symbolic regression), for the automated generation of catalytic kinetic models using a robust criterion for model selection. We leverage genetic programming for model generation and a sequential optimization routine for model refinement. The frameworks are tested against three case studies of increasing complexity, demonstrating their ability to retrieve the underlying kinetic rate model with limited noisy data from the catalytic systems, showcasing their potential for chemical reaction engineering applications.
Submitted 2 November, 2023; v1 submitted 26 January, 2023;
originally announced January 2023.
-
Neural ODEs as Feedback Policies for Nonlinear Optimal Control
Authors:
Ilya Orson Sandoval,
Panagiotis Petsagkourakis,
Ehecatl Antonio del Rio-Chanona
Abstract:
Neural ordinary differential equations (Neural ODEs) define continuous-time dynamical systems with neural networks. Interest in their application to modelling has grown recently, spanning hybrid system identification problems and time series analysis. In this work we propose the use of a neural control policy capable of satisfying state and control constraints to solve nonlinear optimal control problems. The control policy optimization is posed as a Neural ODE problem to efficiently exploit the availability of a dynamical system model. We showcase the efficacy of this type of deterministic neural policy in two constrained systems: the controlled Van der Pol system and a bioreactor control problem. This approach represents a practical approximation to the intractable closed-loop solution of nonlinear control problems.
Submitted 12 November, 2022; v1 submitted 20 October, 2022;
originally announced October 2022.
-
Chance Constrained Policy Optimization for Process Control and Optimization
Authors:
Panagiotis Petsagkourakis,
Ilya Orson Sandoval,
Eric Bradford,
Federico Galvanin,
Dongda Zhang,
Ehecatl Antonio del Rio-Chanona
Abstract:
Chemical process optimization and control are affected by 1) plant-model mismatch, 2) process disturbances, and 3) constraints for safe operation. Reinforcement learning by policy optimization would be a natural way to address these issues because it handles stochasticity and plant-model mismatch and directly accounts for the effect of future uncertainty and its feedback in a proper closed-loop manner, all without the need for an inner optimization loop. One of the main reasons why reinforcement learning has not been considered for industrial processes (or almost any engineering application) is that it lacks a framework to deal with safety-critical constraints. Present algorithms for policy optimization use difficult-to-tune penalty parameters, fail to reliably satisfy state constraints, or provide guarantees only in expectation. We propose a chance-constrained policy optimization (CCPO) algorithm that guarantees the satisfaction of joint chance constraints with high probability, which is crucial for safety-critical tasks. This is achieved by the introduction of constraint tightening (backoffs), which are computed simultaneously with the feedback policy. Backoffs are adjusted with Bayesian optimization using the empirical cumulative distribution function of the probabilistic constraints, and are therefore self-tuned. This results in a general methodology that can be incorporated into existing policy optimization algorithms to enable them to satisfy joint chance constraints with high probability. We present case studies that analyze the performance of the proposed approach.
Submitted 17 December, 2020; v1 submitted 30 July, 2020;
originally announced August 2020.
-
Constrained Reinforcement Learning for Dynamic Optimization under Uncertainty
Authors:
Panagiotis Petsagkourakis,
Ilya Orson Sandoval,
Eric Bradford,
Dongda Zhang,
Ehecatl Antonio del Río Chanona
Abstract:
Dynamic real-time optimization (DRTO) is a challenging task because optimal operating conditions must be computed in real time. The main bottleneck in the industrial application of DRTO is the presence of uncertainty. Many stochastic systems present the following obstacles: 1) plant-model mismatch, 2) process disturbances, and 3) the risk of violating process constraints. To accommodate these difficulties, we present a constrained reinforcement learning (RL) based approach. RL naturally handles the process uncertainty by computing an optimal feedback policy. However, state constraints cannot be introduced intuitively. To address this problem, we present a chance-constrained RL methodology. We use chance constraints to guarantee the probabilistic satisfaction of process constraints, which is accomplished by introducing backoffs, such that the optimal policy and backoffs are computed simultaneously. Backoffs are adjusted using the empirical cumulative distribution function to guarantee the satisfaction of a joint chance constraint. The advantages and performance of this strategy are illustrated through a stochastic dynamic bioprocess optimization problem, to produce sustainable high-value bioproducts.
Submitted 4 June, 2020;
originally announced June 2020.
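Illustrative note: the two chance-constraint abstracts above describe adjusting backoffs using the empirical cumulative distribution function of the constraints. The following is only a minimal sketch of how such an empirical estimate could be computed from Monte Carlo rollouts; it is not taken from the papers. The function names, the array layout, and the convention that a constraint value g <= 0 means "satisfied" are all assumptions made for illustration.

```python
import numpy as np


def joint_satisfaction_probability(g_samples):
    """Empirical probability that every constraint holds in a rollout.

    g_samples: array of shape (n_rollouts, n_constraints). Each entry is a
    sampled constraint value; g <= 0 means "satisfied" (assumed convention).
    """
    g = np.asarray(g_samples, dtype=float)
    worst_case = g.max(axis=1)  # largest constraint value in each rollout
    # Empirical CDF of the worst-case value, evaluated at zero.
    return float(np.mean(worst_case <= 0.0))


def backoff_is_sufficient(g_samples, alpha):
    """True if the estimated joint satisfaction probability reaches 1 - alpha."""
    return joint_satisfaction_probability(g_samples) >= 1.0 - alpha
```

In a scheme of this kind, an outer loop would enlarge the backoffs and re-run the rollouts until a check like `backoff_is_sufficient` passes; how that outer loop is organised (for example, with Bayesian optimization) is described in the papers themselves and is not reproduced here.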