Showing 1–32 of 32 results for author: Groš, S

  1. arXiv:2509.20161

    cs.CE

    Efficient Multi-Objective Constrained Bayesian Optimization of Bridge Girder

    Authors: Heine Havneraas Røstum, Joseph Morlier, Sebastien Gros, Ketil Aas-Jakobsen

    Abstract: The buildings and construction sector is a significant source of greenhouse gas emissions, with cement production alone contributing 7% of global emissions and the industry as a whole accounting for approximately 37%. Reducing emissions by optimizing structural design can achieve significant global benefits. This article introduces an efficient multi-objective constrained Bayesian optimization…

    Submitted 24 September, 2025; originally announced September 2025.

  2. arXiv:2509.12833

    cs.LG

    Safe Reinforcement Learning using Action Projection: Safeguard the Policy or the Environment?

    Authors: Hannah Markgraf, Shamburaj Sawant, Hanna Krasowski, Lukas Schäfer, Sebastien Gros, Matthias Althoff

    Abstract: Projection-based safety filters, which modify unsafe actions by mapping them to the closest safe alternative, are widely used to enforce safety constraints in reinforcement learning (RL). Two integration strategies are commonly considered: Safe environment RL (SE-RL), where the safeguard is treated as part of the environment, and safe policy RL (SP-RL), where it is embedded within the policy throu…

    Submitted 16 September, 2025; originally announced September 2025.

  3. arXiv:2508.02441

    eess.SY cs.LG

    Computationally efficient Gauss-Newton reinforcement learning for model predictive control

    Authors: Dean Brandner, Sebastien Gros, Sergio Lucia

    Abstract: Model predictive control (MPC) is widely used in process control due to its interpretability and ability to handle constraints. As a parametric policy in reinforcement learning (RL), MPC offers strong initial performance and low data requirements compared to black-box policies like neural networks. However, most RL methods rely on first-order updates, which scale well to large parameter spaces but…

    Submitted 4 August, 2025; originally announced August 2025.

    Comments: 14 pages, 8 figures, submitted to Elsevier

  4. arXiv:2507.04356

    math.OC cs.AI cs.RO

    Mission-Aligned Learning-Informed Control of Autonomous Systems: Formulation and Foundations

    Authors: Vyacheslav Kungurtsev, Gustav Sir, Akhil Anand, Sebastien Gros, Haozhe Tian, Homayoun Hamedmoghadam

    Abstract: Research, innovation and practical capital investment have been increasing rapidly toward the realization of autonomous physical agents. This includes industrial and service robots, unmanned aerial vehicles, embedded control devices, and a number of other realizations of cybernetic/mechatronic implementations of intelligent autonomous devices. In this paper, we consider a stylized version of robot…

    Submitted 6 July, 2025; originally announced July 2025.

  5. arXiv:2505.16242

    cs.LG eess.SY

    Offline Guarded Safe Reinforcement Learning for Medical Treatment Optimization Strategies

    Authors: Runze Yan, Xun Shen, Akifumi Wachi, Sebastien Gros, Anni Zhao, Xiao Hu

    Abstract: When applying offline reinforcement learning (RL) in healthcare scenarios, the out-of-distribution (OOD) issues pose significant risks, as inappropriate generalization beyond clinical expertise can result in potentially harmful recommendations. While existing methods like conservative Q-learning (CQL) attempt to address the OOD issue, their effectiveness is limited by only constraining action sele…

    Submitted 22 May, 2025; originally announced May 2025.

  6. arXiv:2505.01353

    math.OC cs.AI cs.LG

    Differentiable Nonlinear Model Predictive Control

    Authors: Jonathan Frey, Katrin Baumgärtner, Gianluca Frison, Dirk Reinhardt, Jasper Hoffmann, Leonard Fichtner, Sebastien Gros, Moritz Diehl

    Abstract: The efficient computation of parametric solution sensitivities is a key challenge in the integration of learning-enhanced methods with nonlinear model predictive control (MPC), as their availability is crucial for many learning algorithms. While approaches presented in the machine learning community are limited to convex or unconstrained formulations, this paper discusses the computation of soluti…

    Submitted 2 May, 2025; originally announced May 2025.

    Comments: 19 pages, 4 figures, 2 tables

  7. arXiv:2502.02133

    eess.SY cs.AI cs.LG

    Synthesis of Model Predictive Control and Reinforcement Learning: Survey and Classification

    Authors: Rudolf Reiter, Jasper Hoffmann, Dirk Reinhardt, Florian Messerer, Katrin Baumgärtner, Shamburaj Sawant, Joschka Boedecker, Moritz Diehl, Sebastien Gros

    Abstract: The fields of MPC and RL consider two successful control techniques for Markov decision processes. Both approaches are derived from similar fundamental principles, and both are widely used in practical applications, including robotics, process control, energy systems, and autonomous driving. Despite their similarities, MPC and RL follow distinct paradigms that emerged from diverse communities and…

    Submitted 4 February, 2025; originally announced February 2025.

  8. arXiv:2501.06086

    cs.AI cs.LG

    All AI Models are Wrong, but Some are Optimal

    Authors: Akhil S Anand, Shambhuraj Sawant, Dirk Reinhardt, Sebastien Gros

    Abstract: AI models that predict the future behavior of a system (a.k.a. predictive AI models) are central to intelligent decision-making. However, decision-making using predictive AI models often results in suboptimal performance. This is primarily because AI models are typically constructed to best fit the data, and hence to predict the most likely future rather than to enable high-performance decision-ma…

    Submitted 10 January, 2025; originally announced January 2025.

  9. arXiv:2411.18305

    eess.SY cs.AI cs.LG

    Application of Soft Actor-Critic Algorithms in Optimizing Wastewater Treatment with Time Delays Integration

    Authors: Esmaeel Mohammadi, Daniel Ortiz-Arroyo, Aviaja Anna Hansen, Mikkel Stokholm-Bjerregaard, Sebastien Gros, Akhil S Anand, Petar Durdevic

    Abstract: Wastewater treatment plants face unique challenges for process control due to their complex dynamics, slow time constants, and stochastic delays in observations and actions. These characteristics make conventional control methods, such as Proportional-Integral-Derivative controllers, suboptimal for achieving efficient phosphorus removal, a critical component of wastewater treatment to ensure envir…

    Submitted 27 November, 2024; originally announced November 2024.

    Journal ref: Expert Systems with Applications Volume 277, 5 June 2025, 127180

  10. arXiv:2410.06474

    cs.LG math.OC

    Flipping-based Policy for Chance-Constrained Markov Decision Processes

    Authors: Xun Shen, Shuo Jiang, Akifumi Wachi, Kaumune Hashimoto, Sebastien Gros

    Abstract: Safe reinforcement learning (RL) is a promising approach for many real-world decision-making problems where ensuring safety is a critical necessity. In safe RL research, while expected cumulative safety constraints (ECSCs) are typically the first choices, chance constraints are often more pragmatic for incorporating safety under uncertainties. This paper proposes a flipping-based policy f…

    Submitted 8 October, 2024; originally announced October 2024.

    Comments: Accepted to NeurIPS 2024

  11. arXiv:2401.00661

    eess.SY cs.GT

    Personalized Dynamic Pricing Policy for Electric Vehicles: Reinforcement learning approach

    Authors: Sangjun Bae, Balazs Kulcsar, Sebastien Gros

    Abstract: With the increasing number of fast-electric vehicle charging stations (fast-EVCSs) and the popularization of information technology, electricity price competition between fast-EVCSs is highly expected, in which the utilization of public and/or privacy-preserved information will play a crucial role. Self-interest electric vehicle (EV) users, on the other hand, try to select a fast-EVCS for charging…

    Submitted 31 December, 2023; originally announced January 2024.

  12. arXiv:2302.12667

    cs.AI cs.IT eess.SY

    Deep active learning for nonlinear system identification

    Authors: Erlend Torje Berg Lundby, Adil Rasheed, Ivar Johan Halvorsen, Dirk Reinhardt, Sebastien Gros, Jan Tommy Gravdahl

    Abstract: The exploding research interest for neural networks in modeling nonlinear dynamical systems is largely explained by the networks' capacity to model complex input-output relations directly from data. However, they typically need vast training data before they can be put to any good use. The data generation process for dynamical systems can be an expensive endeavor both in terms of time and resource…

    Submitted 24 February, 2023; originally announced February 2023.

  13. arXiv:2301.01667

    eess.SY cs.LG

    Learning-based MPC from Big Data Using Reinforcement Learning

    Authors: Shambhuraj Sawant, Akhil S Anand, Dirk Reinhardt, Sebastien Gros

    Abstract: This paper presents an approach for learning Model Predictive Control (MPC) schemes directly from data using Reinforcement Learning (RL) methods. The state-of-the-art learning methods use RL to improve the performance of parameterized MPC schemes. However, these learning algorithms are often gradient-based methods that require frequent evaluations of computationally expensive MPC schemes, thereby…

    Submitted 4 January, 2023; originally announced January 2023.

  14. Systematic review of automatic translation of high-level security policy into firewall rules

    Authors: Ivan Kovačević, Bruno Štengl, Stjepan Groš

    Abstract: Firewalls are security devices that perform network traffic filtering. They are ubiquitous in the industry and are a common method used to enforce organizational security policy. Security policy is specified on a high level of abstraction, with statements such as "web browsing is allowed only on workstations inside the office network", and needs to be translated into low-level firewall rules to be…

    Submitted 7 December, 2022; originally announced December 2022.

    Comments: 6 pages, 1 figure; Published in the 2022 45th Jubilee International Convention on Information, Communication and Electronic Technology (MIPRO)

  15. arXiv:2205.08856

    eess.SY cs.AI cs.LG

    Bridging the gap between QP-based and MPC-based RL

    Authors: Shambhuraj Sawant, Sebastien Gros

    Abstract: Reinforcement learning methods typically use Deep Neural Networks to approximate the value functions and policies underlying a Markov Decision Process. Unfortunately, DNN-based RL suffers from a lack of explainability of the resulting policy. In this paper, we instead approximate the policy and value functions using an optimization problem, taking the form of Quadratic Programs (QPs). We propose s…

    Submitted 18 May, 2022; originally announced May 2022.

  16. Interpretable Battery Cycle Life Range Prediction Using Early Degradation Data at Cell Level

    Authors: Huang Zhang, Yang Su, Faisal Altaf, Torsten Wik, Sebastien Gros

    Abstract: Battery cycle life prediction using early degradation data has many potential applications throughout the battery product life cycle. For that reason, various data-driven methods have been proposed for point prediction of battery cycle life with minimum knowledge of the battery degradation mechanisms. However, managing the rapidly increasing amounts of batteries at end-of-life with lower economic…

    Submitted 23 April, 2023; v1 submitted 26 April, 2022; originally announced April 2022.

  17. arXiv:2203.13854

    cs.LG eess.SY

    Quasi-Newton Iteration in Deterministic Policy Gradient

    Authors: Arash Bahari Kordabad, Hossein Nejatbakhsh Esfahani, Wenqi Cai, Sebastien Gros

    Abstract: This paper presents a model-free approximation for the Hessian of the performance of deterministic policies to use in the context of Reinforcement Learning based on Quasi-Newton steps in the policy parameters. We show that the approximate Hessian converges to the exact Hessian at the optimal policy, and allows for a superlinear convergence in the learning, provided that the policy parametrization…

    Submitted 25 March, 2022; originally announced March 2022.

    Comments: This paper has been accepted to 2022 American Control Conference (ACC). 6 pages

  18. arXiv:2111.04146

    eess.SY cs.LG cs.RO

    Optimization of the Model Predictive Control Meta-Parameters Through Reinforcement Learning

    Authors: Eivind Bøhn, Sebastien Gros, Signe Moe, Tor Arne Johansen

    Abstract: Model predictive control (MPC) is increasingly being considered for control of fast systems and embedded applications. However, the MPC has some significant challenges for such systems. Its high computational complexity results in high power consumption from the control algorithm, which could account for a significant share of the energy resources in battery-powered embedded systems. The MPC param…

    Submitted 7 November, 2021; originally announced November 2021.

    Comments: This work has been submitted to the IEEE for possible publication

  19. Automatically generating models of IT systems

    Authors: Ivan Kovačević, Stjepan Groš, Ante Đerek

    Abstract: Information technology system (ITS), informally, consists of hardware and software infrastructure (e.g., workstations, servers, laptops, installed software packages, databases, LANs, firewalls, etc.), along with physical and logical connections and inter-dependencies between various items. Nowadays, every company owns and operates an ITS, but detailed information about the system is rarely publicl…

    Submitted 31 January, 2022; v1 submitted 23 July, 2021; originally announced July 2021.

    Comments: 20 pages, 16 figures

    Journal ref: IEEE Access (2022)

  20. arXiv:2106.06000

    cs.CR

    Use of a non-peer reviewed sources in cyber-security scientific research

    Authors: Dalibor Gernhardt, Stjepan Groš

    Abstract: Most publicly available data on cyber incidents comes from private companies and non-academic sources. Common sources of information include various security bulletins, white papers, reports, court cases, and blog posts describing specific events, often from a single point of view, followed by occasional academic sources, usually conference proceedings. The main characteristics of the available da…

    Submitted 10 June, 2021; originally announced June 2021.

    Comments: 9 pages, 6 tables

  21. arXiv:2106.05702

    cs.CR

    Myths and Misconceptions about Attackers and Attacks

    Authors: Stjepan Groš

    Abstract: This paper is based on a three year project during which we studied attackers' behavior, reading military planning literature, and thinking on how would we do the same things they do, and what problems would we, as attackers, face. This research is still ongoing, but while participating in applications for other projects and talking to cyber security experts we constantly face the same issues, nam…

    Submitted 10 June, 2021; originally announced June 2021.

    Comments: 8 pages, 27 references. This paper is work in progress and as such may contain inaccuracies, missing or unfinished sentences and paragraphs

  22. arXiv:2106.01154

    cs.CR cs.SE

    Controlled Update of Software Components using Concurrent Exection of Patched and Unpatched Versions

    Authors: Stjepan Groš, Ivan Kovačević, Ivan Dujmić, Matej Petrinović

    Abstract: Software patching is a common method of removing vulnerabilities in software components to make IT systems more secure. However, there are many cases where software patching is not possible due to the critical nature of the application, especially when the vendor providing the application guarantees correct operation only in a specific configuration. In this paper, we propose a method to solve thi…

    Submitted 2 June, 2021; originally announced June 2021.

    Comments: 9 pages, 4 figures

  23. arXiv:2104.02743

    eess.SY cs.LG cs.RO

    Approximate Robust NMPC using Reinforcement Learning

    Authors: Hossein Nejatbakhsh Esfahani, Arash Bahari Kordabad, Sebastien Gros

    Abstract: We present a Reinforcement Learning-based Robust Nonlinear Model Predictive Control (RL-RNMPC) framework for controlling nonlinear systems in the presence of disturbances and uncertainties. An approximate Robust Nonlinear Model Predictive Control (RNMPC) of low computational complexity is used in which the state trajectory uncertainty is modelled via ellipsoids. Reinforcement Learning is then used…

    Submitted 6 April, 2021; originally announced April 2021.

    Comments: This paper has been accepted to 2021 European Control Conference (ECC)

  24. arXiv:2104.02411

    cs.LG eess.SY

    MPC-based Reinforcement Learning for Economic Problems with Application to Battery Storage

    Authors: Arash Bahari Kordabad, Wenqi Cai, Sebastien Gros

    Abstract: In this paper, we are interested in optimal control problems with purely economic costs, which often yield optimal policies having a (nearly) bang-bang structure. We focus on policy approximations based on Model Predictive Control (MPC) and the use of the deterministic policy gradient method to optimize the MPC closed-loop performance in the presence of unmodelled stochasticity or model error. Whe…

    Submitted 6 April, 2021; originally announced April 2021.

    Comments: This paper has been accepted to ECC2021. 6 pages

  25. arXiv:2102.11122

    eess.SY cs.LG

    Reinforcement Learning of the Prediction Horizon in Model Predictive Control

    Authors: Eivind Bøhn, Sebastien Gros, Signe Moe, Tor Arne Johansen

    Abstract: Model predictive control (MPC) is a powerful trajectory optimization control technique capable of controlling complex nonlinear systems while respecting system constraints and ensuring safe operation. The MPC's capabilities come at the cost of a high online computational complexity, the requirement of an accurate model of the system dynamics, and the necessity of tuning its parameters to the speci…

    Submitted 22 February, 2021; originally announced February 2021.

    Comments: This work has been submitted to IFAC NMPC 2021 for possible publication

  26. arXiv:2102.01383

    cs.LG eess.SY math.OC

    Stability-Constrained Markov Decision Processes Using MPC

    Authors: Mario Zanon, Sébastien Gros, Michele Palladino

    Abstract: In this paper, we consider solving discounted Markov Decision Processes (MDPs) under the constraint that the resulting policy is stabilizing. In practice MDPs are solved based on some form of policy approximation. We will leverage recent results proposing to use Model Predictive Control (MPC) as a structured policy in the context of Reinforcement Learning to make it possible to introduce stability…

    Submitted 2 February, 2021; originally announced February 2021.

  27. arXiv:2012.07369

    cs.LG eess.SY math.OC

    Learning for MPC with Stability & Safety Guarantees

    Authors: Sébastien Gros, Mario Zanon

    Abstract: The combination of learning methods with Model Predictive Control (MPC) has attracted a significant amount of attention in the recent literature. The hope of this combination is to reduce the reliance of MPC schemes on accurate models, and to tap into the fast developing machine learning and reinforcement learning tools to exploit the growing amount of data available for many systems. In particula…

    Submitted 22 July, 2022; v1 submitted 14 December, 2020; originally announced December 2020.

  28. arXiv:2011.13365

    eess.SY cs.LG

    Optimization of the Model Predictive Control Update Interval Using Reinforcement Learning

    Authors: Eivind Bøhn, Sebastien Gros, Signe Moe, Tor Arne Johansen

    Abstract: In control applications there is often a compromise that needs to be made with regards to the complexity and performance of the controller and the computational resources that are available. For instance, the typical hardware platform in embedded control applications is a microcontroller with limited memory and processing power, and for battery powered applications the control system can account f…

    Submitted 26 November, 2020; originally announced November 2020.

    Comments: Submitted to 3rd Annual Learning for Dynamics and Control Conference (L4DC 2021)

  29. arXiv:2004.01430

    eess.SY cs.AI cs.LG

    Reinforcement Learning for Mixed-Integer Problems Based on MPC

    Authors: Sebastien Gros, Mario Zanon

    Abstract: Model Predictive Control has been recently proposed as policy approximation for Reinforcement Learning, offering a path towards safe and explainable Reinforcement Learning. This approach has been investigated for Q-learning and actor-critic methods, both in the context of nominal Economic MPC and Robust (N)MPC, showing very promising results. In that context, actor-critic methods seem to be the mo…

    Submitted 3 April, 2020; originally announced April 2020.

    Comments: Accepted at IFAC 2020

  30. arXiv:2004.00915

    eess.SY cs.AI cs.LG

    Safe Reinforcement Learning via Projection on a Safe Set: How to Achieve Optimality?

    Authors: Sebastien Gros, Mario Zanon, Alberto Bemporad

    Abstract: For all its successes, Reinforcement Learning (RL) still struggles to deliver formal guarantees on the closed-loop behavior of the learned policy. Among other things, guaranteeing the safety of RL with respect to safety-critical systems is a very active research topic. Some recent contributions propose to rely on projections of the inputs delivered by the learned policy into a safe set, ensuring t…

    Submitted 2 April, 2020; originally announced April 2020.

    Comments: Accepted at IFAC 2020

  31. arXiv:2001.06616

    cs.CR

    Research Directions in Cyber Threat Intelligence

    Authors: Stjepan Groš

    Abstract: Cyber threat intelligence is a relatively new field that has grown from two distinct fields, cyber security and intelligence. As such, it draws knowledge from and mixes the two fields. Yet, looking into current scientific research on cyber threat intelligence research, it is relatively scarce, which opens up a lot of opportunities. In this paper we define what cyber threat intelligence is, briefly…

    Submitted 18 January, 2020; originally announced January 2020.

    Comments: 6 pages

  32. arXiv:1910.01721

    cs.CR

    A Critical View on CIS Controls

    Authors: Stjepan Groš

    Abstract: CIS Controls is a set of 20 controls and 171 sub-controls that were created with an idea of having a list of something to implement so that organizations can increase their security. While good in theory, it is a big question of how viable this approach is in practice, and does it really help. There is only a minor number of critical views of CIS Controls and since CIS Controls are marketed by two…

    Submitted 2 May, 2020; v1 submitted 3 October, 2019; originally announced October 2019.

    Comments: 7 pages